Overview
The Crawler API crawls the URLs you specify, with JavaScript rendering support. It is ideal for extracting content from a single page or from multiple URLs in one request.
Endpoint
GET https://api.crawleo.dev/api/v1/crawler
Parameters
Required Parameters
urls
Comma-separated list of URLs to crawl. Example: https://example.com,https://example.org
Optional Parameters
raw_html
Boolean. Return the original HTML source of each page.
markdown
Boolean. Return content as structured Markdown (recommended for RAG/LLM pipelines).
Example Requests
Basic Crawl with Markdown Output
```shell
curl -X GET "https://api.crawleo.dev/api/v1/crawler?urls=https://example.com&markdown=true" \
  -H "Authorization: Bearer YOUR_API_KEY"
```
Crawl Multiple URLs
```shell
curl -X GET "https://api.crawleo.dev/api/v1/crawler?urls=https://example.com,https://example.org&markdown=true&raw_html=true" \
  -H "Authorization: Bearer YOUR_API_KEY"
```
Get Raw HTML Only
```shell
curl -X GET "https://api.crawleo.dev/api/v1/crawler?urls=https://example.com&raw_html=true" \
  -H "Authorization: Bearer YOUR_API_KEY"
```
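The same requests can be issued from Python. The sketch below builds the query string with the `requests` library; `build_params` is an illustrative helper, and serializing the boolean flags as the strings `"true"`/`"false"` (matching the curl examples) is an assumption, not documented API behavior:

```python
import requests

API_URL = "https://api.crawleo.dev/api/v1/crawler"


def build_params(urls, markdown=True, raw_html=False):
    """Build Crawler API query parameters.

    Joins the URL list with commas and serializes boolean flags as
    lowercase strings, mirroring the curl examples above.
    """
    params = {"urls": ",".join(urls)}
    if markdown:
        params["markdown"] = "true"
    if raw_html:
        params["raw_html"] = "true"
    return params


if __name__ == "__main__":
    # Equivalent of the "Basic Crawl with Markdown Output" curl example.
    response = requests.get(
        API_URL,
        params=build_params(["https://example.com"]),
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        timeout=30,
    )
    print(response.json())
```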
Response
A successful response returns crawled content for each URL:
```json
{
  "results": [
    {
      "url": "https://example.com",
      "status": 200,
      "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
      "raw_html": "<!DOCTYPE html><html>...</html>"
    }
  ]
}
```
results
Array of crawl result objects. Each result object contains:
url
The URL that was crawled.
status
HTTP status code of the crawled page.
raw_html
Full HTML source (present only if raw_html=true).
markdown
Markdown content (present only if markdown=true).
error
Error message if the crawl failed for this URL.
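Because each result carries its own status and optional error field, a caller can separate successful crawls from failures in one pass. A minimal sketch (the `split_results` helper and the treatment of status codes >= 400 as failures are assumptions for illustration):

```python
def split_results(payload):
    """Separate successful crawl results from failed ones.

    A result is treated as failed if it carries an "error" message or
    an HTTP status of 400 or above (an illustrative convention).
    """
    ok, failed = [], []
    for result in payload.get("results", []):
        if result.get("error") or result.get("status", 0) >= 400:
            failed.append(result)
        else:
            ok.append(result)
    return ok, failed


# Sample payload shaped like the response documented above.
sample = {
    "results": [
        {"url": "https://example.com", "status": 200, "markdown": "# Example Domain"},
        {"url": "https://bad.example", "status": 0, "error": "DNS resolution failed"},
    ]
}
ok, failed = split_results(sample)
```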
Use Cases
Crawl documentation pages or knowledge bases and convert them to Markdown for vector database ingestion.

```python
import requests

# Example: crawl docs for RAG
response = requests.get(
    "https://api.crawleo.dev/api/v1/crawler",
    params={
        "urls": "https://docs.example.com/guide,https://docs.example.com/api",
        "markdown": True,
    },
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)

# vector_db is assumed to be an already-initialized vector database client.
for result in response.json()["results"]:
    # Add each page's Markdown to the vector database
    vector_db.add(result["markdown"], metadata={"url": result["url"]})
```
Scrape multiple pages in a single API call with JavaScript rendering support.
Provide AI agents with the ability to read and understand web pages.
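One way to expose the API to an agent is as a single "read this page" tool that returns Markdown. The sketch below is a hypothetical tool function, not part of the Crawler API itself; `read_webpage` and `extract_markdown` are illustrative names:

```python
import requests

API_URL = "https://api.crawleo.dev/api/v1/crawler"


def extract_markdown(payload: dict) -> str:
    """Pull the Markdown content from the first crawl result, raising on failure."""
    result = payload["results"][0]
    if result.get("error"):
        raise RuntimeError(f"Crawl failed for {result['url']}: {result['error']}")
    return result["markdown"]


def read_webpage(url: str, api_key: str) -> str:
    """Hypothetical agent tool: fetch a single page as Markdown via the Crawler API."""
    response = requests.get(
        API_URL,
        params={"urls": url, "markdown": "true"},
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=30,
    )
    response.raise_for_status()
    return extract_markdown(response.json())
```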
Tips
For LLM applications, always use markdown=true to get clean, structured content that minimizes token usage.
Ensure you have permission to crawl the target URLs. Respect robots.txt and website terms of service.