Enrichers catalog

19 native enrichers. Zero third-party vendors.

Every enricher runs in our stack. None of them call a third-party data vendor. One credit per row, except the v1.5 premium add-on.

How they work

A function that adds typed columns.

An enricher takes a row from a table dataset and writes one or more typed columns onto it. Run it on one row from the dashboard, on a whole dataset as an async run with a webhook, or chain enrichers in a pipeline that runs automatically on every new row added.

The catalog

Filter by category.

ai_extractLLMai_summaryLLMai_classifyLLMai_translateLLMai_cleanLLMai_embedLLMfind_emailsWebfind_phonesWebfind_socialsWebfind_tech_stackWebfind_company_infoWebfind_faviconWebemail_verifyValidationwhoisValidationdnsValidationphone_enrichSpecializedcompany_enrichSpecializedemail_finderSpecializedgeocodeSpecialized
ai_extract1 cr
LLM-native
Inputs
row, JSON schema, prompt
Outputs
as declared in schema

Default Gemma 4. Escalates to Haiku on schema fail.

ai_summary1 cr
LLM-native
Inputs
row, text column, length
Outputs
summary

Length: one_line, paragraph, bullet.

ai_classify1 cr
LLM-native
Inputs
row, text column, labels[]
Outputs
class, confidence

Multi-label variant via flag.

ai_translate1 cr
LLM-native
Inputs
row, text column, target_lang
Outputs
<col>_<lang>

ISO 639-1 targets.

ai_clean1 cr
LLM-native
Inputs
row, text column
Outputs
<col>_clean

Removes boilerplate, normalizes whitespace and casing.

ai_embed1 cr
LLM-native
Inputs
row, text column
Outputs
<col>_embedding

1024-dim vectors via Workers AI bge-m3.

find_emails1 cr
Web-derived
Inputs
row with url column
Outputs
emails[]

Scrapes /contact, /about, /legal, footer. Dedupes, filters disposable.

find_phones1 cr
Web-derived
Inputs
row with url column
Outputs
phones[]

Regex + schema.org + libphonenumber.

find_socials1 cr
Web-derived
Inputs
row with url column
Outputs
linkedin, twitter, facebook, instagram, github, youtube, tiktok

Parses <link rel="me">, footer links, schema.org.

find_tech_stack1 cr
Web-derived
Inputs
row with url column
Outputs
tech_stack[]

Wappalyzer fingerprints (MIT).

find_company_info1 cr
Web-derived
Inputs
row with url column
Outputs
name, description, logo_url, country, language

OG + schema.org + meta + TLD heuristic.

find_favicon1 cr
Web-derived
Inputs
row with url column
Outputs
favicon_url

Cached on Stekpad CDN.

email_verify1 cr
Validation
Inputs
row with email column
Outputs
valid, reason, catch_all, disposable

SMTP handshake (MX → RCPT TO → no DATA). ~85% precision, catch-all flagged.

whois1 cr
Validation
Inputs
row with domain column
Outputs
registrant, registrar, created_at, expires_at, country

TCP/43 WHOIS.

dns1 cr
Validation
Inputs
row with domain column
Outputs
a_records[], mx_records[], txt_records[], spf, dmarc, dkim_present

Node DNS resolver.

phone_enrich1 cr
Specialized
Inputs
row with phone column
Outputs
country, type, carrier_hint, e164, valid

libphonenumber (Google, MIT). Replaces NumVerify.

company_enrich1 cr
Specialized
Inputs
row with company name and/or domain
Outputs
legal_name, registration_number, status, address, directors[], founded, country

OpenCorporates + Companies House (UK) + SIRENE/INSEE (FR) + EDGAR (US) + scrape fallback.

email_finder1 cr
Specialized
Inputs
row with person name + domain
Outputs
email_guess, confidence, pattern_detected

Permutation + pattern detection. SMTP-verified. Honest confidence field.

geocode1 cr
Specialized
Inputs
row with address column
Outputs
lat, lng, formatted_address, country_code

Nominatim (OpenStreetMap, attribution required). Self-hostable in v1.5.

Pipelines

Chain enrichers on a dataset.

Pipeline definitions are versioned per dataset. They run automatically on every new row added. Re-run on existing rows with force: true.

pipeline.txt
bash
// Pipeline definition on a dataset
scrape
find_emails
email_verify
find_socials
ai_classify(labels=["B2B", "B2C", "SaaS", "eCom"])
 
// Cost: 5 enrichers × 1,000 rows = 5,000 credits
v1.5 premium

One opt-in vendor enricher.

The only enricher that calls a third-party vendor is linkedin_enrich, deferred to v1.5. It costs 12 credits per row and is explicitly labeled 'powered by Proxycurl' on every row it touches. Opt-in per call. Off by default.

Enrich a row. Keep the data in your stack.

One credit per row. No third-party vendors. No leaked data.

19 Enrichers — the full Stekpad catalog — Stekpad