| Endpoint | Cost | Description |
|---|---|---|
| scrape | 1 credit | Fetch one URL. Markdown, JSON, HTML, or a screenshot. |
| crawl | 1 credit per page crawled | Walk a site. Async, with webhooks and a run id you can poll. |
crawl walks a site from a seed URL, follows links inside the include_paths globs, skips anything in exclude_paths, and lands every page as a row in a dataset. The call returns a run_id immediately — poll GET /v1/runs/:id or subscribe to webhooks.
Per-page rendering uses the same engine as scrape, so all the scrape options apply: formats, schema-driven extraction, actions, even use_session for authenticated crawls.
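The async flow above can be sketched in Python with only the standard library. A minimal polling loop against `GET /v1/runs/:id`; the API key and run id are placeholders, and treating `completed` and `failed` as the terminal statuses is an assumption inferred from the `run.completed` and `run.failed` webhook events:

```python
import json
import time
import urllib.request

API = "https://api.stekpad.com/v1"
TOKEN = "stkpd_live_..."  # placeholder key

def is_terminal(status: str) -> bool:
    # run.completed / run.failed are the documented webhook events, so we
    # treat the matching run statuses as terminal (assumption).
    return status in ("completed", "failed")

def wait_for_run(run_id: str, interval: float = 5.0) -> dict:
    """Poll GET /v1/runs/:id until the crawl reaches a terminal state."""
    req = urllib.request.Request(
        f"{API}/runs/{run_id}",
        headers={"Authorization": f"Bearer {TOKEN}"},
    )
    while True:
        with urllib.request.urlopen(req) as resp:
            run = json.load(resp)
        if is_terminal(run["status"]):
            return run
        time.sleep(interval)
```

Webhooks avoid the polling loop entirely; the loop is useful for scripts and CI jobs where opening an HTTPS endpoint is not practical.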
```shell
curl -X POST https://api.stekpad.com/v1/crawl \
  -H "Authorization: Bearer stkpd_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "limits": { "max_pages": 500, "max_depth": 5 },
    "include_paths": ["/blog/**", "/docs/**"],
    "scrape_options": { "formats": ["markdown"] },
    "dataset": { "type": "markdown_bundle", "name": "Example docs" }
  }'
```

| Name | Type | Required | Description |
|---|---|---|---|
| url | string | required | Seed URL. Crawl starts here. |
| limits.max_pages | int | optional | Hard cap on pages crawled. Default 100. |
| limits.max_depth | int | optional | Link depth from seed. Default 5. |
| limits.max_duration_seconds | int | optional | Wall-clock cap. Default 1800. |
| include_paths | string[] | optional | Glob patterns. Only matching paths are crawled. |
| exclude_paths | string[] | optional | Glob patterns. Matching paths are skipped. |
| respect_robots | boolean | optional | Default true. Override gated to Cloud Growth+. |
| deduplicate_by | string | optional | canonical_url, content_hash, or a column name. |
| scrape_options | object | optional | Same shape as /v1/scrape. Applied per page. |
| use_session | string | optional | Session to reuse. Triggers the cookie bridge for authenticated crawls. |
| dataset | object | required | { type, name } to create, or { id } to append. |
| webhook_url | string | optional | HTTPS URL for events. |
| webhook_events | string[] | optional | run.completed, row.added, run.failed. |
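The parameters above map one-to-one onto the JSON body. A sketch in Python (stdlib only) that assembles and sends the same request as the curl example; the field names come from the table, the key is a placeholder, and the helper names are illustrative:

```python
import json
import urllib.request

def build_crawl_body(url, max_pages=100, include_paths=None, dataset=None):
    """Assemble a /v1/crawl request body from the documented fields."""
    body = {
        "url": url,  # required: seed URL, crawl starts here
        "limits": {"max_pages": max_pages},  # default cap is 100
        # dataset is required: {type, name} creates one; pass {"id": ...}
        # instead to append to an existing dataset.
        "dataset": dataset or {"type": "markdown_bundle", "name": url},
    }
    if include_paths:
        body["include_paths"] = include_paths  # glob patterns, e.g. "/docs/**"
    return body

def start_crawl(body, token="stkpd_live_..."):  # placeholder key
    req = urllib.request.Request(
        "https://api.stekpad.com/v1/crawl",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # run_id, poll_url, estimated_credits, ...
```

Keeping body construction separate from the HTTP call makes the request easy to log or unit-test before spending credits.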
```json
{
  "run_id": "run_01HZ...",
  "status": "queued",
  "dataset_id": "ds_xyz",
  "poll_url": "https://api.stekpad.com/v1/runs/run_01HZ...",
  "estimated_credits": 500
}
```

| Code | When |
|---|---|
| invalid_seed_url | Seed URL is malformed. |
| crawl_quota_exceeded | max_pages reached — partial dataset is still saved. |
| crawl_duration_exceeded | max_duration_seconds reached. |
| target_blocked | Site blocked the crawler globally. |
| insufficient_credits | Wallet hit 0 mid-crawl. The run pauses; top up to resume. |
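If you register a `webhook_url`, the three documented events can be routed with a small dispatcher. A sketch, assuming the payload carries the event name in an `event` field and ids in top-level keys; the exact payload shape is not specified on this page, so adjust the field names to what you actually receive:

```python
def handle_webhook(payload: dict) -> str:
    # Documented events: run.completed, row.added, run.failed.
    # The "event", "run_id", and "code" field names are assumptions.
    event = payload.get("event")
    if event == "run.completed":
        return f"run {payload.get('run_id')} finished"
    if event == "row.added":
        return "new row landed in the dataset"
    if event == "run.failed":
        return f"run failed: {payload.get('code', 'unknown')}"
    return "ignored"
```

Returning a string keeps the dispatcher easy to test; in production you would trigger your pipeline here and answer the delivery with HTTP 200.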
Every error includes a code, a human message, and a guidance field with the exact action to take.
Get an API key, paste the curl, watch the row land in your dataset.