ai_extract1 cr- Inputs
- row, JSON schema, prompt
- Outputs
- as declared in schema
Default Gemma 4. Escalates to Haiku on schema fail.
Every enricher runs in our stack. None of them call a third-party data vendor. One credit per row, except the v1.5 premium add-on.
An enricher takes a row from a table dataset and writes one or more typed columns onto it. Run it on one row from the dashboard, on a whole dataset as an async run with a webhook, or chain enrichers in a pipeline that runs automatically on every new row added.
ai_extractLLMai_summaryLLMai_classifyLLMai_translateLLMai_cleanLLMai_embedLLMfind_emailsWebfind_phonesWebfind_socialsWebfind_tech_stackWebfind_company_infoWebfind_faviconWebemail_verifyValidationwhoisValidationdnsValidationphone_enrichSpecializedcompany_enrichSpecializedemail_finderSpecializedgeocodeSpecializedai_extract1 crDefault Gemma 4. Escalates to Haiku on schema fail.
ai_summary1 crLength: one_line, paragraph, bullet.
ai_classify1 crMulti-label variant via flag.
ai_translate1 crISO 639-1 targets.
ai_clean1 crRemoves boilerplate, normalizes whitespace and casing.
ai_embed1 cr1024-dim vectors via Workers AI bge-m3.
find_emails1 crScrapes /contact, /about, /legal, footer. Dedupes, filters disposable.
find_phones1 crRegex + schema.org + libphonenumber.
find_socials1 crParses <link rel="me">, footer links, schema.org.
find_tech_stack1 crWappalyzer fingerprints (MIT).
find_company_info1 crOG + schema.org + meta + TLD heuristic.
find_favicon1 crCached on Stekpad CDN.
email_verify1 crSMTP handshake (MX → RCPT TO → no DATA). ~85% precision, catch-all flagged.
whois1 crTCP/43 WHOIS.
dns1 crNode DNS resolver.
phone_enrich1 crlibphonenumber (Google, MIT). Replaces NumVerify.
company_enrich1 crOpenCorporates + Companies House (UK) + SIRENE/INSEE (FR) + EDGAR (US) + scrape fallback.
email_finder1 crPermutation + pattern detection. SMTP-verified. Honest confidence field.
geocode1 crNominatim (OpenStreetMap, attribution required). Self-hostable in v1.5.
Pipeline definitions are versioned per dataset. They run automatically on every new row added. Re-run on existing rows with force: true.
// Pipeline definition on a datasetscrape → find_emails → email_verify → find_socials → ai_classify(labels=["B2B", "B2C", "SaaS", "eCom"]) // Cost: 5 enrichers × 1,000 rows = 5,000 creditsThe only enricher that calls a third-party vendor is linkedin_enrich, deferred to v1.5. It costs 12 credits per row and is explicitly labeled 'powered by Proxycurl' on every row it touches. Opt-in per call. Off by default.
One credit per row. No third-party vendors. No leaked data.