Feature · Enrichment

19 enrichers. Zero third-party vendors.

Your scraped data never leaves our stack. Not to Hunter, not to Clearbit, not to Proxycurl, not to NumVerify. We do it all in-house.

What it means

The promise.

Most “enrichment” tools are stitched together from a dozen vendor APIs. Your scraped data goes to your scraper, then to Hunter for emails, then to Clearbit for companies, then to NumVerify for phones — and now it lives in four databases you don’t control. That’s four privacy reviews, four legal contracts, four GDPR liabilities.

Stekpad ships 19 enrichers. We wrote all 19. They run on our infrastructure. Your data goes from scrape to enriched row without leaving Stekpad.

How it works

Four families.

  • LLM-native (6) — `ai_extract`, `ai_summary`, `ai_classify`, `ai_translate`, `ai_clean`, `ai_embed`. Powered by Cloudflare Workers AI.
  • Web-derived (6) — `find_emails`, `find_phones`, `find_socials`, `find_tech_stack`, `find_company_info`, `find_favicon`. Pure scraping and parsing.
  • Validation (3) — `email_verify`, `whois`, `dns`. In-house network ops.
  • Specialized (4) — `phone_enrich`, `company_enrich`, `email_finder`, `geocode`. The vendor-replacement set.

The full catalog with inputs and output columns lives at /docs/enrichers.

How we replaced the usual vendors

Honest substitutions.

| Vendor we don’t use | Stekpad enricher | What we use instead |
| --- | --- | --- |
| Hunter / Snov | `find_emails` + `email_finder` | Site scrape + permutation + SMTP verify |
| NumVerify / Twilio Lookup | `phone_enrich` | libphonenumber (Google, MIT) |
| Clearbit / Apollo | `company_enrich` + `find_company_info` | OpenCorporates + Companies House + SIRENE/INSEE + EDGAR + site scrape |
| Proxycurl | `linkedin_enrich` (premium, opt-in v1.5) | Still uses Proxycurl, explicitly labeled, not on by default |
| Google Maps Geocode | `geocode` | Nominatim (OpenStreetMap) |
| Wappalyzer SaaS | `find_tech_stack` | Wappalyzer fingerprints (MIT, in-process) |

We’re honest about Proxycurl: there’s no way to replicate LinkedIn data legally without a vendor like them. We keep it as an opt-in premium enricher with explicit labeling so you know exactly when your data leaves the stack.
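The “site scrape + permutation + SMTP verify” substitution in the table above is a well-known pattern, and the permutation step fits in a few lines. This is an illustrative Python sketch, not Stekpad’s actual implementation; the function name and pattern list are assumptions.

```python
def permute_emails(first, last, domain):
    """Generate common corporate address patterns for a person.
    Illustrative only -- a real enricher would weight patterns by what
    the site scrape already found, then confirm each guess over SMTP."""
    f, l = first.lower(), last.lower()
    patterns = [
        f"{f}.{l}",    # jane.doe
        f"{f}{l}",     # janedoe
        f"{f[0]}{l}",  # jdoe
        f"{f}",        # jane
        f"{f}_{l}",    # jane_doe
    ]
    return [f"{p}@{domain}" for p in patterns]

print(permute_emails("Jane", "Doe", "example.com"))
# ['jane.doe@example.com', 'janedoe@example.com', 'jdoe@example.com',
#  'jane@example.com', 'jane_doe@example.com']
```

SMTP verification then narrows the candidates, which is why `find_emails` chains naturally into `email_verify`.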

Examples

Pipelines.

Enrichers chain. Define a pipeline once, attach it to a dataset, and every new row gets the full treatment automatically.

scrape → find_emails → email_verify → find_socials → ai_classify(labels=["B2B","B2C"])

Pipelines are versioned. Re-run on existing rows with `force: true`. New rows added later inherit the latest version.
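The versioning semantics above — skip rows already on the latest version unless `force: true`, process everything else — can be sketched as follows. This is a hypothetical Python sketch of the behavior, not Stekpad’s code; the enricher work is a placeholder.

```python
def run_pipeline(rows, pipeline, version, force=False):
    """Apply each enricher to each row. Rows already enriched at the
    current pipeline version are skipped unless force=True."""
    enriched = 0
    for row in rows:
        if row.get("pipeline_version") == version and not force:
            continue  # already on the latest version; nothing to do
        for enricher in pipeline:
            row[enricher] = f"<{enricher} result>"  # placeholder work
        row["pipeline_version"] = version
        enriched += 1
    return enriched

rows = [{"url": "https://a.example"},
        {"url": "https://b.example", "pipeline_version": 2}]
# Version 2 is current: only the un-enriched row is processed...
print(run_pipeline(rows, ["find_emails", "email_verify"], version=2))  # 1
# ...unless force re-runs everything.
print(run_pipeline(rows, ["find_emails", "email_verify"], version=2, force=True))  # 2
```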

Run an enricher / define a pipeline
```bash
# Run an enricher on a whole dataset
curl -X POST https://api.stekpad.com/v1/datasets/ds_abc/enrich \
  -H "Authorization: Bearer stkpd_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "enricher": "find_emails",
    "force": false
  }'

# Or define a pipeline at dataset creation time
curl -X POST https://api.stekpad.com/v1/crawl \
  -H "Authorization: Bearer stkpd_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "dataset": { "type": "table", "name": "Companies" },
    "pipeline": [
      "find_company_info",
      "find_emails",
      "email_verify",
      "find_socials"
    ]
  }'
```

Cost

1 credit per row per enricher.

The premium `linkedin_enrich` is 12 credits per row.

A typical lead-enrichment pipeline on 1,000 rows:

  • `find_company_info` → 1,000 cr
  • `find_emails` → 1,000 cr
  • `email_verify` → 1,000 cr
  • `find_socials` → 1,000 cr

Total: 4,000 credits — about 12 € on a Pro pack.
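The arithmetic above generalizes to any pipeline. A minimal sketch, assuming only the pricing stated here (1 credit per row per enricher, 12 for the premium `linkedin_enrich`); the helper name is ours, not an API:

```python
PREMIUM = {"linkedin_enrich": 12}  # credits per row; every other enricher costs 1

def pipeline_cost(rows, pipeline):
    """Total credits to run a pipeline over a dataset of `rows` rows."""
    return sum(rows * PREMIUM.get(e, 1) for e in pipeline)

lead_pipeline = ["find_company_info", "find_emails", "email_verify", "find_socials"]
print(pipeline_cost(1000, lead_pipeline))                        # 4000
print(pipeline_cost(1000, lead_pipeline + ["linkedin_enrich"]))  # 16000
```

Adding the opt-in LinkedIn step quadruples the bill, which is one more reason it stays off by default.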

FAQ

Common questions.

How is `email_verify` accuracy benchmarked?

~85% on non-catch-all domains. Catch-alls are flagged honestly. We don’t quote 98%+ figures because nobody can deliver them on cold lists.

Can I bring my own enricher?

Not in v1. v1.5 will expose a Workers-based custom enricher SDK.

Do you sell enrichment data sourced from other customers?

No. Never. Your data is yours. We don’t aggregate, share, sell, or train on it.

Can I run enrichers on rows I imported from a CSV?

Yes — upload a CSV to a dataset and run any enricher that matches the row schema.

Enrich without four privacy reviews.

19 native enrichers. Sign up free. 300 credits a month.
