Crawler API

Overview

The Crawler API crawls the specified URLs directly, with JavaScript rendering support. It is ideal for extracting content from a single page or from multiple URLs in one request.

Endpoint

GET https://api.crawleo.dev/api/v1/crawler

Parameters

Required Parameters

urls
string
required
Comma-separated list of URLs to crawl. Example: https://example.com,https://example.org

Output Format Parameters

raw_html
boolean
default: false
Return the original HTML source of each page.
markdown
boolean
default: false
Return content as structured Markdown (recommended for RAG/LLM).
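Both flags can be combined in one call. A minimal sketch in Python using the requests library (no request is actually sent; YOUR_API_KEY is a placeholder) that shows how the query string is assembled, including the percent-encoding applied to the urls value:

```python
import requests

# Build (without sending) a request that enables both output formats.
req = requests.Request(
    "GET",
    "https://api.crawleo.dev/api/v1/crawler",
    params={"urls": "https://example.com", "markdown": "true", "raw_html": "true"},
    headers={"Authorization": "Bearer YOUR_API_KEY"},
).prepare()

# The urls value is percent-encoded in the final query string.
print(req.url)
```

Passing the booleans as the strings "true"/"false" avoids requests serializing Python's True as "True" in the query string.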

Example Requests

Basic Crawl with Markdown Output

curl -X GET "https://api.crawleo.dev/api/v1/crawler?urls=https://example.com&markdown=true" \
  -H "Authorization: Bearer YOUR_API_KEY"

Crawl Multiple URLs

curl -X GET "https://api.crawleo.dev/api/v1/crawler?urls=https://example.com,https://example.org&markdown=true&raw_html=true" \
  -H "Authorization: Bearer YOUR_API_KEY"

Get Raw HTML Only

curl -X GET "https://api.crawleo.dev/api/v1/crawler?urls=https://example.com&raw_html=true" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

A successful response returns crawled content for each URL:
{
  "results": [
    {
      "url": "https://example.com",
      "status": 200,
      "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
      "raw_html": "<!DOCTYPE html><html>...</html>"
    }
  ]
}
results
array
Array of crawl result objects. Each object contains the url that was crawled, the HTTP status code, the requested output fields (markdown and/or raw_html), and an error message when the page could not be retrieved.
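When crawling several URLs, individual pages can fail even though the request as a whole succeeds. A hedged sketch that splits results by status (it assumes failed entries carry a non-200 status and an error field; the result data below is hypothetical):

```python
# Hypothetical crawl results for two URLs, one of which failed.
results = [
    {"url": "https://example.com", "status": 200, "markdown": "# Example Domain"},
    {"url": "https://example.org/missing", "status": 404, "error": "Not Found"},
]

# Partition into successful and failed crawls by HTTP status.
ok = [r for r in results if r["status"] == 200]
failed = [r for r in results if r["status"] != 200]

for r in ok:
    print(r["url"], "->", len(r["markdown"]), "chars of Markdown")
for r in failed:
    print(r["url"], "failed:", r.get("error", "unknown error"))
```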

Use Cases

Crawl documentation pages or knowledge bases and convert to Markdown for vector database ingestion.
# Example: Crawl docs for RAG
import requests

response = requests.get(
    "https://api.crawleo.dev/api/v1/crawler",
    params={
        "urls": "https://docs.example.com/guide,https://docs.example.com/api",
        "markdown": "true",  # pass as a string; requests would serialize True as "True"
    },
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)
response.raise_for_status()

for result in response.json()["results"]:
    # Add each page's Markdown to a vector database (vector_db is a placeholder client)
    vector_db.add(result["markdown"], metadata={"url": result["url"]})
Extract clean content from web pages for analysis or processing.
Scrape multiple pages in a single API call with JavaScript rendering support.
Provide AI agents with the ability to read and understand web pages.

Tips

For LLM applications, always use markdown=true to get clean, structured content that minimizes token usage.
Ensure you have permission to crawl the target URLs. Respect robots.txt and website terms of service.
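One way to honor robots.txt before submitting URLs is Python's standard-library parser. A sketch (the example rules and URLs are illustrative; real sites may also impose crawl delays or per-agent rules):

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Check a URL against already-fetched robots.txt rules."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, url)


# Example rules: everything under /private/ is off-limits to all agents.
rules = "User-agent: *\nDisallow: /private/"

print(allowed(rules, "https://example.com/docs"))       # public path
print(allowed(rules, "https://example.com/private/x"))  # disallowed path
```

In practice you would fetch https://<host>/robots.txt yourself (or use RobotFileParser.set_url plus read) and filter your URL list before calling the Crawler API.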