Website Email Extractor

you already have the list. what you don't have is the contacts.

Every team accumulates URL lists with no contact data attached: conference exhibitors, directory exports, portfolio collections, partner prospects, the "companies to reach out to someday" spreadsheet. Visiting each site by hand to find an email takes minutes per row.

This Actor does that visit with a real Chromium browser: homepage first, then up to three more pages (contact, about, team), extracting every email, phone number — US and international formats — and social profile it finds. If the URL turns out to be a Linktree, Beacons or any of 15+ bio-link aggregators, it follows through to the real website automatically.

At $0.02 per URL and roughly 10-20 seconds per site, a 500-row list that would take days of manual work comes back enriched for $10, typically in under an hour at the default concurrency of 3.
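The arithmetic behind that estimate can be sketched as follows. The price and per-URL timing are the figures quoted above, and the concurrency of 3 is the documented default for maxConcurrency. The function name `estimate_run` is hypothetical, and real runs will vary with how slow the target sites are.

```python
def estimate_run(url_count, price_per_url=0.02, secs_per_url=(10, 20), concurrency=3):
    """Rough cost and wall-clock estimate for one run (illustrative only)."""
    cost = round(url_count * price_per_url, 2)
    # URLs are processed in parallel, so total work is divided by the concurrency.
    fastest, slowest = (
        round(url_count * secs / concurrency / 60, 1) for secs in secs_per_url
    )
    return {"cost_usd": cost, "minutes_min": fastest, "minutes_max": slowest}


print(estimate_run(500))
```

For 500 URLs this gives a cost of $10 and a window of roughly 28 to 56 minutes, which is where the "under an hour" figure comes from.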

how to use it

how to enrich a URL list in 5 steps

URLs in, contacts out. runs on Apify, $0.02 per URL processed.

paste your URL list

the urls input takes any list — company websites, bio links, mixed sources. toggles: scrapeEmails, scrapePhones, scrapeSocials, and followLinkPages (auto-resolves Linktree-style aggregators). maxConcurrency defaults to 3. batches of 100-500 URLs are the recommended sweet spot.
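a minimal sketch of preparing that input, assuming `urls` accepts plain strings and that the four toggles default to on (neither is confirmed here, so check the Actor's input schema). the helper name `build_inputs` is hypothetical; the field names and the maxConcurrency default of 3 are the ones listed above.

```python
import json


def build_inputs(urls, batch_size=500, **toggles):
    """Split a URL list into run inputs of at most batch_size URLs each."""
    options = {
        "scrapeEmails": True,      # assumed default
        "scrapePhones": True,      # assumed default
        "scrapeSocials": True,     # assumed default
        "followLinkPages": True,   # assumed default
        "maxConcurrency": 3,       # documented default
    }
    options.update(toggles)
    # Drop blanks and duplicates while keeping the original order.
    cleaned = list(dict.fromkeys(u.strip() for u in urls if u.strip()))
    return [
        {"urls": cleaned[i:i + batch_size], **options}
        for i in range(0, len(cleaned), batch_size)
    ]


inputs = build_inputs(["https://example.com", "linktr.ee/somebrand", " ", "https://example.com"])
print(json.dumps(inputs[0], indent=2))
```

deduplicating before the run matters because pricing is per URL processed: a repeated row is paid for twice.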

a real browser opens each site

Playwright renders each page like a human visitor — JavaScript included, which matters because plenty of sites only inject contact info client-side. average processing is 10-20 seconds per URL.

bio-link pages get resolved to the real site

if a URL is a Linktree, Beacons, bio.link, linkin.bio, solo.to, stan.store, campsite.bio, carrd.co or one of 9+ other aggregators, the Actor detects it and follows through to the actual website before extracting — the contacts live there, not on the link page.
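one way to picture the detection step is a hostname check like the sketch below. it is not the Actor's actual logic: the host list covers only the eight services named above, and `is_link_aggregator` is a hypothetical name. it can still be handy for pre-sorting a list before a run.

```python
from urllib.parse import urlparse

# Hostnames for the services named above; the Actor's real list is longer.
AGGREGATOR_HOSTS = {
    "linktr.ee", "beacons.ai", "bio.link", "linkin.bio",
    "solo.to", "stan.store", "campsite.bio", "carrd.co",
}


def is_link_aggregator(url):
    """True when the URL's host is a known bio-link service or a subdomain of one."""
    # urlparse only fills in the host when a scheme or '//' is present.
    parsed = urlparse(url if "//" in url else "//" + url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return any(host == h or host.endswith("." + h) for h in AGGREGATOR_HOSTS)


print(is_link_aggregator("linktr.ee/somebrand"))   # bio-link page
print(is_link_aggregator("https://somebrand.com"))  # regular website
```

matching on the exact host or a dot-prefixed suffix avoids false positives on lookalike domains while still catching subdomain-style pages such as those on carrd.co.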

up to 4 pages crawled per site

homepage plus up to three internal contact / about / team pages, found automatically. each result reports what happened:

{
  "inputUrl": "linktr.ee/somebrand",
  "resolvedUrl": "somebrand.com",
  "wasLinkAggregator": true,
  "emails": ["[email protected]"],
  "phones": ["+1 813 ..."],
  "socials": { "instagram": "...", "linkedin": "..." },
  "pagesCrawled": 4
}

export the enriched list

CSV, JSON or Excel from the dashboard, or the Apify API for automation. natural next hop in the catalog: pipe the emails through the Email Verifier ($0.003/check) so only deliverable addresses reach your sequencer.
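if the JSON export is the starting point, a flattening step like this sketch turns the nested records into one spreadsheet row per site. the field names follow the sample result above; the helper name `results_to_csv` and the sample addresses are hypothetical.

```python
import csv
import io


def results_to_csv(results):
    """Flatten result records into one CSV row per input URL."""
    fields = ["inputUrl", "resolvedUrl", "wasLinkAggregator", "emails", "phones", "pagesCrawled"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for item in results:
        writer.writerow({
            "inputUrl": item.get("inputUrl", ""),
            "resolvedUrl": item.get("resolvedUrl", ""),
            "wasLinkAggregator": item.get("wasLinkAggregator", False),
            # Lists are joined so each site stays on a single row.
            "emails": "; ".join(item.get("emails", [])),
            "phones": "; ".join(item.get("phones", [])),
            "pagesCrawled": item.get("pagesCrawled", 0),
        })
    return buffer.getvalue()


sample = [{
    "inputUrl": "linktr.ee/somebrand",
    "resolvedUrl": "somebrand.com",
    "wasLinkAggregator": True,
    "emails": ["hello@example.com", "sales@example.com"],  # placeholder addresses
    "phones": [],
    "pagesCrawled": 4,
}]
print(results_to_csv(sample))
```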

what to expect

hit rates by site type

documented extraction rates from the Actor — honest numbers, not promises. sites that publish no contacts return none.

documented rates by category

business websites · 60-80% yield contact data — the best case, since businesses want to be reached.

SaaS companies · 50-70% — contact info often lives on about or legal pages, which the 4-page crawl covers.

portfolio sites · 50-70% — freelancers and studios usually publish an email.

e-commerce stores · 40-60% — support emails and phones, often in footers.

bio-link pages · 30-40% on the link page itself — which is why the Actor follows through to the real site when followLinkPages is on.

phone formats · US and international formats are both recognized.

socials covered · Facebook, Instagram, Twitter/X, LinkedIn, YouTube, TikTok, Pinterest and Threads.
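to turn those ranges into a planning number for a mixed list, a back-of-the-envelope calculation like this sketch works. the rates are the documented ranges above; the category keys, the `expected_yield` name and the example mix are made up for illustration.

```python
# Documented hit-rate ranges from the list above, as (low, high) fractions.
HIT_RATES = {
    "business": (0.60, 0.80),
    "saas": (0.50, 0.70),
    "portfolio": (0.50, 0.70),
    "ecommerce": (0.40, 0.60),
    "bio_link_unresolved": (0.30, 0.40),
}


def expected_yield(mix):
    """Estimate how many rows come back with contact data for a mixed list."""
    low = sum(count * HIT_RATES[kind][0] for kind, count in mix.items())
    high = sum(count * HIT_RATES[kind][1] for kind, count in mix.items())
    return round(low), round(high)


# Hypothetical 500-row list: 300 business sites, 100 SaaS, 100 stores.
print(expected_yield({"business": 300, "saas": 100, "ecommerce": 100}))
```

for that example mix the estimate is 270 to 370 rows with contact data out of 500.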

three ways operators use it

where the extractor pays for itself

lists · revival

turn a dead spreadsheet into a campaign

a sales team inherits 400 company URLs from a trade-show sponsor list — no emails, no phones. one run later ($8, under an hour) the list has contacts on the majority of business sites, each row tagged with how many pages were crawled and whether the URL was really a bio page. the rows that come back empty are honestly empty: the site publishes nothing.

run frequency: per list · 400 URLs ≈ $8

creators · bio links

resolve a pile of Linktrees into real contacts

creator outreach lists are full of linktr.ee and beacons.ai URLs that rarely publish contacts themselves. with followLinkPages on, the Actor hops through to each creator's actual website and extracts there, which is the difference between a 30-40% yield on the link page and the 50-70% of a real portfolio site.

run frequency: per campaign · resolution included in the $0.02

pipeline · enrichment

the universal second stage for any scraper

any catalog scraper that outputs websites chains into this one: Google Maps leads whose first crawl missed an email, Facebook pages with a site but no published contact, TikTok creators with personal domains. an n8n flow feeds the URLs in and the Email Verifier cleans what comes out.
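between the two stages it usually pays to deduplicate, since the verifier is priced per check. a minimal sketch, assuming the result shape shown earlier and the $0.003/check price quoted for the Email Verifier; `emails_for_verification` is a hypothetical helper, not part of either Actor.

```python
def emails_for_verification(results, price_per_check=0.003):
    """Collect unique, lowercased emails across all results and price the verification pass."""
    unique = set()
    for item in results:
        for email in item.get("emails", []):
            cleaned = email.strip().lower()
            if cleaned:
                unique.add(cleaned)
    cost = round(len(unique) * price_per_check, 3)
    return sorted(unique), cost


results = [
    {"inputUrl": "a.example", "emails": ["Info@A.example", "sales@a.example"]},
    {"inputUrl": "b.example", "emails": ["info@a.example "]},  # same address, different casing
    {"inputUrl": "c.example"},                                   # nothing published
]
print(emails_for_verification(results))
```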

run frequency: continuous via API · $0.02/URL

how it compares

extractor vs manual research vs guessing patterns

honest comparison against the two ways teams actually do this today.

Criterion	data-runner.dev	manual research	email-pattern guessing
Source of truth	the website itself, rendered	the website itself	name@domain templates
Bounce risk	low — published addresses	low	high — guesses bounce
JS-rendered contacts	✓ real browser	✓ human browser	✗
Bio-link resolution	✓ automatic, 15+ services	manual clicking	✗
Speed	10-20s per URL, parallel	minutes per URL	instant but unreliable
Phones + socials too	✓	✓ if noted down	✗ emails only
Cost at 500 URLs	$10	days of someone's time	free, paid for in bounces

honest read · this Actor extracts what websites publish — it does not find emails that aren't there, and it does not guess patterns (deliberately: guessed emails bounce and burn sender domains). if a row comes back empty, manual research would likely come back empty too. for guessing done responsibly — pattern inference plus verification before anything is sent — that's the Email Verifier & Enricher's enrichment mode, and the two Actors chain well.

got questions

FAQ

how it works, what it costs, what's legal, and how it handles edge cases.

What exactly does it crawl on each website?

The homepage plus up to three internal pages chosen automatically — contact, about and team pages are the priority targets. That 4-page budget covers where the overwhelming majority of published contact data lives without burning time on blog archives. Each result reports pagesCrawled so you can see what happened.
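How the three extra pages get chosen is not documented in detail, so the following is only a sketch of one plausible approach: rank same-site links by whether their path mentions contact, about or team, in that order, and keep the top three. The keyword list, the `pick_internal_pages` name and the ranking rule are all assumptions.

```python
from urllib.parse import urljoin, urlparse

PRIORITY_KEYWORDS = ["contact", "about", "team"]  # assumed priority order


def pick_internal_pages(homepage, links, limit=3):
    """Choose up to `limit` same-site links whose path mentions a priority keyword."""
    home_host = urlparse(homepage).hostname
    ranked = []
    for href in links:
        absolute = urljoin(homepage, href)
        parsed = urlparse(absolute)
        if parsed.hostname != home_host:
            continue  # skip external links
        path = parsed.path.lower()
        for rank, keyword in enumerate(PRIORITY_KEYWORDS):
            if keyword in path:
                ranked.append((rank, absolute))
                break
    # Stable sort by keyword priority, then drop duplicates while keeping order.
    ordered = dict.fromkeys(url for _, url in sorted(ranked, key=lambda pair: pair[0]))
    return list(ordered)[:limit]


links = ["/blog/2021", "/about-us", "https://other.example/contact", "/team", "/contact", "/about-us"]
print(pick_internal_pages("https://somebrand.com/", links))
```

Together with the homepage, the three selected pages make up the 4-page budget; blog archives and external links never enter the queue.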

What happens when I feed it a Linktree or bio-link URL?

With followLinkPages enabled, the Actor detects 17+ aggregator services (Linktree, Beacons, bio.link, linkin.bio, solo.to, stan.store, campsite.bio, carrd.co and more), follows through to the actual website behind the link page, and extracts there. The output marks wasLinkAggregator and includes the resolvedUrl so you keep both.

What hit rate should I expect?

Depends on the list. Documented rates: business websites 60-80%, SaaS and portfolio sites 50-70%, e-commerce 40-60%, raw bio-link pages 30-40% before resolution. A mixed B2B list typically lands in the middle. Rows that return nothing usually mean the site genuinely publishes no contact info.

Does it find phone numbers and socials too, or just emails?

All three, each behind its own toggle: emails, phones (US and international formats), and social links across Facebook, Instagram, Twitter/X, LinkedIn, YouTube, TikTok, Pinterest and Threads.
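For a sense of what extraction involves once a page has rendered, here is a deliberately simplified sketch covering emails and social links (phone parsing is left out because international formats need more care). The regular expression, the host table and the `extract_contacts` name are illustrative assumptions, not the Actor's implementation.

```python
import re
from urllib.parse import urlparse

# Simplified pattern; real-world email matching has more edge cases.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hostname -> platform label for the eight networks listed above.
SOCIAL_HOSTS = {
    "facebook.com": "facebook", "instagram.com": "instagram",
    "twitter.com": "twitter", "x.com": "twitter",
    "linkedin.com": "linkedin", "youtube.com": "youtube",
    "tiktok.com": "tiktok", "pinterest.com": "pinterest",
    "threads.net": "threads",
}


def extract_contacts(text, links):
    """Pull unique emails from page text and the first link seen per social platform."""
    emails = sorted({match.lower() for match in EMAIL_RE.findall(text)})
    socials = {}
    for link in links:
        host = (urlparse(link).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        platform = SOCIAL_HOSTS.get(host)
        if platform:
            socials.setdefault(platform, link)  # keep the first match per platform
    return {"emails": emails, "socials": socials}


page_text = "Email hello@somebrand.com or Sales@SomeBrand.com."
page_links = ["https://www.instagram.com/somebrand", "https://x.com/somebrand", "https://somebrand.com/blog"]
print(extract_contacts(page_text, page_links))
```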

How fast is a run?

10-20 seconds per URL on average, processed in parallel (maxConcurrency, default 3). A 500-URL batch typically completes well within the hour. The recommended batch size is 100-500 URLs per run for efficiency, with no hard cap documented.

Why $0.02 per URL instead of per email found?

Because the work — rendering the site, crawling 4 pages, resolving aggregators — happens whether or not the site publishes an email. Per-URL pricing keeps it simple and honest: a 500-URL run costs $10, period. The documented hit rates above tell you what yield to expect before you spend.
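To compare per-URL pricing with a per-result model, divide the price by the hit rate. The sketch below does that for the documented ranges; `cost_per_hit` is a hypothetical helper and the result is only as good as the hit-rate estimate for your list.

```python
def cost_per_hit(price_per_url=0.02, hit_rate_range=(0.60, 0.80)):
    """Effective price of each row that comes back with contact data, as (best, worst)."""
    low_rate, high_rate = hit_rate_range
    # A higher hit rate spreads the same per-URL price over more successful rows.
    return round(price_per_url / high_rate, 4), round(price_per_url / low_rate, 4)


print(cost_per_hit())                              # business websites, 60-80%
print(cost_per_hit(hit_rate_range=(0.40, 0.60)))   # e-commerce stores, 40-60%
```

On business websites that works out to roughly 2.5 to 3.3 cents per enriched row, and on e-commerce stores roughly 3.3 to 5 cents.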

Will it guess emails like name@domain if none are published?

No — deliberately. Guessed patterns bounce, and bounces above ~2-3% get your sending domain filtered. This Actor only extracts published addresses. If you want pattern inference done responsibly, the Email Verifier & Enricher guesses the pattern AND verifies deliverability before you ever send — that's its enrichment mode at $0.012 per found record.

Is scraping contact data from websites legal?

The Actor reads what websites publicly publish: the same pages any visitor sees. Scraping publicly accessible data is generally treated as lawful in many jurisdictions; in the US, the Ninth Circuit held in hiQ v. LinkedIn that scraping public pages likely does not violate the Computer Fraud and Abuse Act. Your obligations sit on the outreach side: CAN-SPAM, GDPR, and honoring unsubscribes. See the data-runner.dev disclaimer for the full policy.

Can it handle JavaScript-heavy sites?

Yes — that's why it uses Playwright with a real Chromium browser instead of plain HTTP requests. Contact info injected client-side (common on modern sites and almost universal on bio-link pages) renders before extraction runs.

Can I pipe the output into my CRM or automation stack?

Yes. Apify exports JSON, CSV, and Excel out of the box and exposes a REST API plus webhooks. Common patterns: push leads to Google Sheets via n8n or Zapier, sync to HubSpot or GoHighLevel, or chain with the catalog's Email Verifier before your sequencer. We also build custom n8n workflows if you want the integration done for you.
