[Bug]: Relative URLs in the webpage not extracted properly #570

### crawl4ai version

0.4.247

### Expected Behavior

When generating the Markdown for a given webpage:
1. If the href in an anchor/img/link tag is a relative URL, it should be resolved against the base URL properly (or left relative).
2. If the href in an anchor/img/link tag is an absolute URL, it should not be combined with the base URL.

The code to extract the Markdown:
```python
from crawl4ai import AsyncWebCrawler

async def get_markdown(url: str) -> str:
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url)
        return result.markdown
```

This is the expected Markdown for the webpage "https://docs.crawl4ai.com/":

```
[Crawl4AI Documentation (v0.4.3b2)](https://docs.crawl4ai.com/)
  * [ Home ](https://docs.crawl4ai.com/)
  * [ Quick Start ](https://docs.crawl4ai.com/core/quickstart/)
  * [ Search ](https://docs.crawl4ai.com/#)


  * Home
  * Setup & Installation
    * [Installation](https://docs.crawl4ai.com/core/installation/)
    * [Docker Deployment](https://docs.crawl4ai.com/core/docker-deploymeny/)
  * [Quick Start](https://docs.crawl4ai.com/core/quickstart/)
```
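For reference, the link targets in the expected output above match what standard URL resolution produces. The following is a minimal sketch using `urllib.parse.urljoin` from the Python standard library; the `resolve` helper and `BASE_URL` constant are hypothetical names used only for illustration and are not part of crawl4ai.

```python
from urllib.parse import urljoin

# Assumption: the page URL is used as the base for resolution.
BASE_URL = "https://docs.crawl4ai.com/"

def resolve(href: str, base: str = BASE_URL) -> str:
    """Resolve href against base. Relative hrefs are joined to the base;
    absolute hrefs are not prefixed with it."""
    return urljoin(base, href)

if __name__ == "__main__":
    # hrefs taken from the page discussed in this issue
    for href in [".", "core/quickstart/", "core/installation/",
                 "https://docs.crawl4ai.com/"]:
        print(f"{href!r} -> {resolve(href)}")
```

With this approach a relative href such as `core/quickstart/` becomes `https://docs.crawl4ai.com/core/quickstart/`, while an href that is already absolute is not combined with the base.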


### Current Behavior


When generating the Markdown for a given webpage, relative URLs are not converted properly. Each relative URL is appended to the base URL as `base_url/<relative_url>`, with literal angle brackets (`<` and `>`) around the relative part.
Additionally, the href is appended to the base URL even when it already contains an absolute URL.

This is the current Markdown for the webpage "https://docs.crawl4ai.com/":

```
[Crawl4AI Documentation (v0.4.3b2)](https://docs.crawl4ai.com/<https:/docs.crawl4ai.com/>)
  * [ Home ](https://docs.crawl4ai.com/<.>)
  * [ Quick Start ](https://docs.crawl4ai.com/<core/quickstart/>)
  * [ Search ](https://docs.crawl4ai.com/<#>)


  * Home
  * Setup & Installation
    * [Installation](https://docs.crawl4ai.com/<core/installation/>)
    * [Docker Deployment](https://docs.crawl4ai.com/<core/docker-deploymeny/>)
  * [Quick Start](https://docs.crawl4ai.com/<core/quickstart/>)
```

Side note: there is a spelling mistake (`deploymeny`) in the URL `https://docs.crawl4ai.com/core/docker-deploymeny/`.
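Until this is fixed, the malformed targets could be repaired after the fact. The sketch below is a possible workaround, not the library's fix: it assumes every broken target has the exact shape `(base<href>)` seen in the output above, and that an absolute href may have had its `//` collapsed to `/` (as in `<https:/docs.crawl4ai.com/>`). The `repair_links` function and both regex names are hypothetical.

```python
import re
from urllib.parse import urljoin

# Matches a Markdown link target of the form "](base<href>)".
_BROKEN_TARGET = re.compile(r"\]\((?P<base>[^()<>\s]*)<(?P<href>[^<>\s]*)>\)")
# Matches a scheme whose "//" was collapsed to a single "/", e.g. "https:/host/".
_COLLAPSED_SCHEME = re.compile(r"^(https?):/(?!/)")

def repair_links(markdown: str) -> str:
    """Rewrite "](base<href>)" targets by resolving href against base."""
    def fix(match: re.Match) -> str:
        # Restore "https://" if it was collapsed, then resolve normally.
        href = _COLLAPSED_SCHEME.sub(r"\1://", match.group("href"))
        return "](" + urljoin(match.group("base"), href) + ")"
    return _BROKEN_TARGET.sub(fix, markdown)

if __name__ == "__main__":
    broken = "[Installation](https://docs.crawl4ai.com/<core/installation/>)"
    print(repair_links(broken))
```

Links that do not contain the angle-bracket pattern are left untouched, so the function can be applied to the whole `result.markdown` string.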

### Is this reproducible?

Yes

### Inputs Causing the Bug

```bash
URL: https://docs.crawl4ai.com/
```

### Steps to Reproduce

1. Run the code snippet below for the URL mentioned above:

```bash
python webpage_crawler.py https://docs.crawl4ai.com/ crawl4ai.md
```

2. Compare the generated Markdown with the raw HTML.
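The comparison in step 2 can also be automated. The following is a rough sketch using only the standard library: it collects `href`/`src` values from the HTML, resolves them against the base URL, and reports Markdown link targets that do not match any of them. All names here (`HrefCollector`, `expected_targets`, `markdown_targets`, `unexpected_targets`) are hypothetical, and the Markdown regex is a simplification that assumes targets contain no spaces or parentheses.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

class HrefCollector(HTMLParser):
    """Collects href/src attribute values from a, link, and img tags."""
    def __init__(self) -> None:
        super().__init__()
        self.urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("a", "link", "img"):
            for name, value in attrs:
                if name in ("href", "src") and value:
                    self.urls.append(value)

def expected_targets(html: str, base_url: str) -> set[str]:
    """Resolve every collected URL against base_url."""
    collector = HrefCollector()
    collector.feed(html)
    return {urljoin(base_url, u) for u in collector.urls}

def markdown_targets(markdown: str) -> set[str]:
    """Return everything between "](" and ")" in Markdown links."""
    return set(re.findall(r"\]\(([^)\s]+)\)", markdown))

def unexpected_targets(html: str, markdown: str, base_url: str) -> set[str]:
    """Markdown targets that do not correspond to any resolved HTML link."""
    return markdown_targets(markdown) - expected_targets(html, base_url)
```

Passing the page's HTML and the generated Markdown to `unexpected_targets` should return an empty set once links are resolved correctly; with the behavior described above, it would list the targets containing `<...>`.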

### Code snippets

```python
# filename: webpage_crawler.py

import asyncio
from crawl4ai import AsyncWebCrawler

async def get_markdown(url: str) -> str:
    async with AsyncWebCrawler() as crawler:
        try:
            result = await crawler.arun(
                url=url,
            )
            if result.url != url:
                print(f"Redirected to {result.url}")
            if not result.success:
                raise Exception(result.error_message)
            if result.status_code == 404:
                raise Exception("url not found")
            return result.markdown
        except Exception:
            print("Crawler failed for", url)
            raise  # re-raise with the original traceback
        

async def get_cleaned_html(url: str) -> str:
    async with AsyncWebCrawler() as crawler:
        try:
            result = await crawler.arun(
                url=url,
            )
            if result.url != url:
                print(f"Redirected to {result.url}")
            if not result.success:
                raise Exception(result.error_message)
            if result.status_code == 404:
                raise Exception("url not found")
            return result.cleaned_html
        except Exception:
            print("Crawler failed for", url)
            raise  # re-raise with the original traceback

if __name__ == "__main__":
    import sys
    
    if len(sys.argv) != 3:
        print("Usage: python webpage_crawler.py <url> <output_file>")
        sys.exit(1)
    
    url = sys.argv[1]
    output_file = sys.argv[2]
    
    markdown = asyncio.run(get_markdown(url))
    print(len(markdown))
    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(markdown)
```

### OS

Windows 10 (also observed on Linux)

### Python version

3.11

### Browser

Google Chrome

### Browser version

131.0.6778.265

### Error logs & Screenshots (if applicable)

_No response_
