Overview
The Crawler API performs direct crawling of specified URLs with JavaScript rendering support. It is ideal for extracting content from a single page or from multiple URLs in one request.
Endpoint
GET https://api.crawleo.dev/crawl
Parameters
x-api-key
Your Crawleo API key for authentication. Alternatively, use the Authorization: Bearer YOUR_API_KEY header. Example: x-api-key: YOUR_API_KEY or Authorization: Bearer YOUR_API_KEY
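The two header styles are interchangeable. As a minimal sketch (the helper below is illustrative, not part of any SDK), either set of headers can be built for use with the requests library:

```python
def auth_headers(api_key: str, use_bearer: bool = False) -> dict:
    """Build authentication headers for the Crawler API.

    The API accepts either the x-api-key header or an
    Authorization: Bearer header; both carry the same key.
    """
    if use_bearer:
        return {"Authorization": f"Bearer {api_key}"}
    return {"x-api-key": api_key}
```

Pass the returned dict as the headers argument of requests.get.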
Required Parameters
urls
URL(s) to crawl. Accepts a single URL or a comma-separated list. Example: https://example.com or https://example.com,https://example.org
Rendering Options
render_js
Enable browser rendering for JavaScript-heavy sites.
true - Browser rendering (10 credits/URL)
false - HTTP request (1 credit/URL)
geolocation
ISO 3166-1 alpha-2 country code for geolocation (e.g., us, gb, de). Example: geolocation=us
Output Format Options
raw_html
Return the raw HTML as-is from the page source.
enhanced_html
Return cleaned/sanitized HTML with ads, scripts, and tracking removed.
page_text
Return a plain-text extraction of the page content.
markdown
Return a Markdown conversion (recommended for RAG/LLM applications).
Screenshot Options
screenshot
Capture a screenshot of the page. Only available when render_js=true.
screenshot_full_page
Capture a full-page screenshot instead of the viewport only.
Screenshots are only available when render_js=true. Setting screenshot=true without JavaScript rendering will be ignored.
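Captured screenshots are returned base64-encoded in the result object, so they must be decoded before being written to disk. A minimal sketch (save_screenshot is an illustrative helper, not part of the API):

```python
import base64
from pathlib import Path


def save_screenshot(result: dict, dest: str) -> bool:
    """Decode the base64 screenshot field of a crawl result, if present.

    Returns False when no screenshot was captured, e.g. because
    render_js was not enabled and the screenshot parameter was ignored.
    """
    encoded = result.get("screenshot")
    if not encoded:
        return False
    Path(dest).write_bytes(base64.b64decode(encoded))
    return True
```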
Example Requests
Basic Crawl with Markdown Output
curl -X GET "https://api.crawleo.dev/crawl?urls=https://example.com&markdown=true" \
-H "x-api-key: YOUR_API_KEY"
Crawl Multiple URLs
curl -X GET "https://api.crawleo.dev/crawl?urls=https://example.com,https://example.org&markdown=true" \
-H "x-api-key: YOUR_API_KEY"
Crawl with JavaScript Rendering
curl -X GET "https://api.crawleo.dev/crawl?urls=https://example.com&render_js=true&markdown=true" \
-H "x-api-key: YOUR_API_KEY"
Crawl with Screenshot and Geolocation
curl -X GET "https://api.crawleo.dev/crawl?urls=https://example.com&render_js=true&screenshot=true&screenshot_full_page=true&geolocation=gb" \
-H "x-api-key: YOUR_API_KEY"
Request All Output Formats
curl -X GET "https://api.crawleo.dev/crawl?urls=https://example.com&raw_html=true&enhanced_html=true&markdown=true&page_text=true" \
-H "x-api-key: YOUR_API_KEY"
Response
A successful response returns crawled content for each URL:
{
  "success": true,
  "results": [
    {
      "url": "https://example.com",
      "status": 200,
      "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
      "raw_html": "<!DOCTYPE html><html>...</html>"
    }
  ],
  "credits_used": 1,
  "credits_remaining": 9999
}
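A response can mix successful and failed URLs, so content should be read defensively. As a sketch, a helper (extract_markdown is hypothetical, not part of any SDK) can map each successfully crawled URL to its Markdown content:

```python
def extract_markdown(payload: dict) -> dict:
    """Map each crawled URL to its Markdown content.

    `payload` is the parsed JSON response body; per-URL failures
    carry an `error` field instead of content, and are skipped.
    """
    docs = {}
    for result in payload.get("results", []):
        if result.get("error"):
            continue  # this URL failed; no content fields present
        if "markdown" in result:
            docs[result["url"]] = result["markdown"]
    return docs
```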
results
Array of crawl result objects. Each result object contains the following properties:
url
The URL that was crawled.
status
HTTP status code of the crawled page.
raw_html
Full HTML source (if raw_html=true).
enhanced_html
Cleaned/sanitized HTML (if enhanced_html=true).
markdown
Markdown content (if markdown=true).
page_text
Plain-text extraction (if page_text=true).
screenshot
Base64-encoded screenshot (if screenshot=true).
error
Error message if the crawl failed for this URL.
credits_used
Number of credits consumed by this request. Varies based on rendering:
HTTP request (render_js=false): 1 credit per URL
Browser rendering (render_js=true): 10 credits per URL
Failed requests: 0 credits
credits_remaining
Your remaining credit balance for the current billing period.
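Because per-URL costs are fixed, the cost of a request is predictable up front. A small sketch (estimate_credits is an illustrative helper):

```python
def estimate_credits(n_urls: int, render_js: bool = False) -> int:
    """Estimate the credit cost of a crawl request.

    Plain HTTP fetches cost 1 credit per URL; browser rendering
    costs 10 per URL. Failed URLs are not billed, so this is an
    upper bound on what a request will actually consume.
    """
    per_url = 10 if render_js else 1
    return n_urls * per_url
```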
Use Cases
Crawl documentation pages or knowledge bases and convert them to Markdown for vector database ingestion.

# Example: Crawl docs for RAG
import requests

response = requests.get(
    "https://api.crawleo.dev/crawl",
    params={
        "urls": "https://docs.example.com/guide,https://docs.example.com/api",
        "markdown": True,
    },
    headers={"x-api-key": "YOUR_API_KEY"},
)

for result in response.json()["results"]:
    # Add each page to the vector database (vector_db is your own client)
    vector_db.add(result["markdown"], metadata={"url": result["url"]})
Scrape multiple pages in a single API call with JavaScript rendering support.
Provide AI agents with the ability to read and understand web pages.
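For batch scraping, the API expects a single comma-separated urls value rather than repeated query parameters. A hypothetical helper to build those parameters:

```python
def batch_params(urls: list, render_js: bool = False) -> dict:
    """Build query parameters for a multi-URL crawl.

    Joins the URLs into the single comma-separated `urls` value the
    API expects, and requests Markdown output for LLM-friendly content.
    """
    params = {"urls": ",".join(urls), "markdown": "true"}
    if render_js:
        params["render_js"] = "true"
    return params
```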
Tips
For LLM applications, always use markdown=true to get clean, structured content that minimizes token usage; Markdown output is enabled by default.
Ensure you have permission to crawl the target URLs. Respect robots.txt and website terms of service.
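One way to honor robots.txt is to check each URL against the site's rules before submitting it to the crawler. A sketch using Python's standard urllib.robotparser (allowed_by_robots is a hypothetical helper and assumes you have already fetched the site's robots.txt text yourself):

```python
from urllib import robotparser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Check a URL against already-fetched robots.txt rules.

    Parses the rules locally rather than re-fetching robots.txt,
    so URL lists can be filtered before any crawl request is sent.
    """
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, url)
```

URLs that return False here should be dropped from the urls parameter before calling the API.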