crawl: Walk an entire site. Async, with webhooks and a run_id you can poll. 1 credit per page.
scrape: Fetch one URL. Get markdown, JSON, HTML, or a screenshot. One credit. Sync.
scrape fetches one URL and returns the page in any combination of markdown, JSON, HTML, and screenshot. The call is synchronous: it returns the full response within 60 seconds, or it fails with a structured error.
Pass actions to interact with the page before capture (click, type, scroll, wait). Pass a JSON Schema to run the LLM extractor and get validated structured fields. Pass use_session to route the fetch through your Chrome extension and scrape pages that need you to be logged in.
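As a sketch of how those fields fit together, the helper below assembles a request body from the documented parameters. The helper itself and the exact shape of the action objects (`{"type": ..., ...}`) are assumptions for illustration; only the field names come from the parameter table.

```python
def build_scrape_payload(url, formats=None, schema=None, actions=None,
                         use_session=None, persist=True):
    """Assemble a /v1/scrape request body from the documented fields."""
    payload = {"url": url, "formats": list(formats) if formats else ["markdown"]}
    if schema is not None:
        payload["schema"] = schema
        # Per the docs, setting a schema implies "json" in formats.
        if "json" not in payload["formats"]:
            payload["formats"].append("json")
    if actions:
        payload["actions"] = actions          # executed in order before capture
    if use_session:
        payload["use_session"] = use_session  # routes through the cookie bridge
    if not persist:
        payload["persist"] = False            # response still carries the data
    return payload

body = build_scrape_payload(
    "https://example.com/product/42",
    schema={"type": "object", "properties": {"price": {"type": "number"}}},
    actions=[{"type": "wait", "ms": 1000}, {"type": "click", "selector": "#more"}],
    use_session="example.com",
)
```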
Pick your language. Every snippet is a real, runnable example.
```bash
curl -X POST https://api.stekpad.com/v1/scrape \
  -H "Authorization: Bearer stkpd_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/product/42",
    "formats": ["markdown", "json"]
  }'
```

| Name | Type | Required | Description |
|---|---|---|---|
| url | string | required | The URL to fetch. Must be HTTP or HTTPS. |
| formats | string[] | optional | Any subset of markdown, json, html, screenshot. Defaults to ["markdown"]. |
| schema | JSON Schema | optional | If set, json is added to formats and the LLM extractor runs. |
| actions | object[] | optional | Ordered list of browser actions: wait, click, scroll, type, press_key, screenshot. |
| use_session | string | optional | Domain name. Triggers the cookie bridge for authenticated scraping. |
| persist | boolean | optional | Defaults to true. If false, nothing is stored — the response still has the data. |
| dataset | object | optional | { id, mode } to control which dataset receives the row. |
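The same request as the curl snippet, sketched in plain-stdlib Python. The `build_request`/`scrape` helpers are hypothetical wrappers, not part of any official SDK; the endpoint, headers, and body mirror the curl example above.

```python
import json
import urllib.request

API_URL = "https://api.stekpad.com/v1/scrape"

def build_request(api_key, url, formats=("markdown",)):
    """Build the POST request that mirrors the curl snippet."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps({"url": url, "formats": list(formats)}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def scrape(api_key, url, formats=("markdown",)):
    """Send the request synchronously; the call is capped at 60 seconds."""
    with urllib.request.urlopen(build_request(api_key, url, formats),
                                timeout=60) as resp:
        return json.load(resp)
```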
```json
{
  "run_id": "run_01HZ...",
  "url": "https://example.com/product/42",
  "status": "completed",
  "markdown": "# Product 42\n\n...",
  "json": null,
  "metadata": {
    "title": "Product 42 — Example",
    "language": "en",
    "canonical": "https://example.com/product/42"
  },
  "dataset_id": "ds_default",
  "row_id": "row_01HZ...",
  "credits_charged": 1
}
```

| Code | When |
|---|---|
| invalid_url | Not HTTP/HTTPS, or unparseable. |
| target_blocked | The site returned 403/451/captcha and stealth mode could not pass. |
| target_timeout | The page did not respond within 60 seconds. |
| schema_validation_failed | The LLM extractor produced JSON the schema rejected, twice in a row. |
| session_unavailable | use_session was set but the cookie bridge is not connected. |
| insufficient_credits | Workspace wallet at 0. |
Every error includes a code, a human message, and a guidance field with the exact action to take.
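A minimal client-side dispatch over these codes might look like this. Which codes are worth an automatic retry is this sketch's assumption, not a documented policy; in practice the `guidance` field tells you the exact action to take.

```python
# Codes from the error table above; treating these two as retryable
# is an illustrative choice, not a documented recommendation.
RETRYABLE = {"target_timeout", "target_blocked"}

def handle_error(error):
    """Decide what to do with a structured scrape error.

    `error` has the documented shape:
    {"code": ..., "message": ..., "guidance": ...}.
    Returns "retry", "fix_request", or "give_up".
    """
    code = error["code"]
    if code in RETRYABLE:
        return "retry"
    if code in {"invalid_url", "schema_validation_failed"}:
        return "fix_request"  # the request itself needs changing
    return "give_up"          # e.g. insufficient_credits, session_unavailable
```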
Get an API key, paste the curl, watch the row land in your dataset.