Overview
The Crawler API crawls specified URLs directly, with JavaScript rendering support. It is ideal for extracting content from one or more pages in a single request.
Endpoint
GET https://api.crawleo.dev/crawl
Parameters
Required Parameters
urls: URL(s) to crawl. Can be a single URL or a comma-separated list. Example: https://example.com or https://example.com,https://example.org
output_format: Output format for crawled content. Options:
raw_html - Original HTML source
enhanced_html - Clean HTML with ads, scripts, and tracking removed
markdown - Structured Markdown (recommended for RAG/LLM)
Localization Parameters
country: 2-letter country code for geo-targeted crawling (e.g., us, gb, de).
Advanced Options
Capture a screenshot of the page.
use_proxies: Use datacenter proxies for the request. Costs 2 credits per page (vs 1 credit without a proxy).
use_premium_proxies: Use residential proxies for the request (higher success rate for protected sites). Costs 10 credits per page.
use_proxies and use_premium_proxies are mutually exclusive. Only one can be set to true at a time.
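Because the API rejects requests that set both proxy flags, it can be convenient to enforce the rule client-side before sending anything. The helper below is a hypothetical sketch (not part of an official SDK); the parameter names match the docs above.

```python
def build_crawl_params(urls, output_format,
                       use_proxies=False, use_premium_proxies=False):
    """Assemble /crawl query parameters, enforcing proxy exclusivity."""
    if use_proxies and use_premium_proxies:
        raise ValueError(
            "use_proxies and use_premium_proxies are mutually exclusive")
    params = {"urls": urls, "output_format": output_format}
    if use_proxies:
        params["use_proxies"] = "true"
    if use_premium_proxies:
        params["use_premium_proxies"] = "true"
    return params
```

Failing fast locally saves a round trip (and avoids spending credits on a request the API would refuse).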
Example Requests
Basic Crawl with Markdown Output
curl -X GET "https://api.crawleo.dev/crawl?urls=https://example.com&output_format=markdown" \
-H "Authorization: Bearer YOUR_API_KEY"
Crawl Multiple URLs
curl -X GET "https://api.crawleo.dev/crawl?urls=https://example.com,https://example.org&output_format=markdown" \
-H "Authorization: Bearer YOUR_API_KEY"
Get Raw HTML Only
curl -X GET "https://api.crawleo.dev/crawl?urls=https://example.com&output_format=raw_html" \
-H "Authorization: Bearer YOUR_API_KEY"
Crawl with Proxy Support
curl -X GET "https://api.crawleo.dev/crawl?urls=https://example.com&output_format=markdown&use_premium_proxies=true&country=gb" \
-H "Authorization: Bearer YOUR_API_KEY"
Response
A successful response returns crawled content for each URL:
{
  "success": true,
  "results": [
    {
      "url": "https://example.com",
      "status": 200,
      "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
      "raw_html": "<!DOCTYPE html><html>...</html>"
    }
  ],
  "credits_used": 1,
  "credits_remaining": 9999
}
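Since each URL in a multi-URL request can succeed or fail independently, response handling should inspect every result object rather than just the top-level success flag. A minimal sketch, assuming the response shape shown above (the function name is illustrative):

```python
def split_results(payload):
    """Separate successful crawls from per-URL failures in a /crawl response."""
    ok, failed = [], []
    for result in payload.get("results", []):
        if result.get("error"):
            failed.append((result["url"], result["error"]))
        else:
            ok.append(result)
    return ok, failed
```

Failed URLs can then be retried, for example with use_premium_proxies=true for sites that blocked the first attempt.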
results: Array of crawl result objects. Each result object contains:
url: The URL that was crawled.
status: HTTP status code of the crawled page.
raw_html: Full HTML source (returned when output_format=raw_html).
markdown: Markdown content (returned when output_format=markdown).
error: Error message if the crawl failed for this URL.
credits_used: Number of credits consumed by this request. Varies based on proxy settings:
No proxy: 1 credit per URL
Standard proxy: 2 credits per URL
Premium proxy: 10 credits per URL
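The per-URL pricing above makes the cost of a request easy to estimate locally before sending it. The helper below is a convenience sketch only (actual billing is what the API reports in credits_used):

```python
# Per-URL pricing from the docs: no proxy, datacenter proxy, residential proxy.
CREDITS_PER_URL = {"none": 1, "standard": 2, "premium": 10}

def estimate_credits(urls, proxy="none"):
    """Estimate credit cost for a comma-separated urls string."""
    count = len([u for u in urls.split(",") if u.strip()])
    return count * CREDITS_PER_URL[proxy]
```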
credits_remaining: Your remaining credit balance for the current billing period.
Use Cases
Crawl documentation pages or knowledge bases and convert them to Markdown for vector database ingestion:

# Example: Crawl docs for RAG
import requests

response = requests.get(
    "https://api.crawleo.dev/crawl",
    params={
        "urls": "https://docs.example.com/guide,https://docs.example.com/api",
        "output_format": "markdown",
    },
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)

for result in response.json()["results"]:
    # Add each page's Markdown to the vector database
    vector_db.add(result["markdown"], metadata={"url": result["url"]})
Scrape multiple pages in a single API call with JavaScript rendering support.
Provide AI agents with the ability to read and understand web pages.
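For the agent use case, the API can back a simple "read page" tool. The sketch below is hypothetical (the function names and error handling are not part of an official SDK); it fetches one page as Markdown and surfaces per-URL crawl failures as exceptions:

```python
API_KEY = "YOUR_API_KEY"

def extract_markdown(payload):
    """Pull the first result's Markdown out of a /crawl response."""
    result = payload["results"][0]
    if result.get("error"):
        raise RuntimeError(f"Crawl failed: {result['error']}")
    return result["markdown"]

def read_page(url):
    """Fetch a single page as Markdown for an agent to read."""
    import requests  # third-party HTTP client, as in the RAG example above

    response = requests.get(
        "https://api.crawleo.dev/crawl",
        params={"urls": url, "output_format": "markdown"},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=60,
    )
    response.raise_for_status()
    return extract_markdown(response.json())
```

An agent framework would register read_page as a tool; Markdown output keeps the page content compact for the model's context window.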
Tips
For LLM applications, always use output_format=markdown to get clean, structured content that minimizes token usage.
Ensure you have permission to crawl the target URLs. Respect robots.txt and website terms of service.