
DEV Community

ByteHarvester

I Built a Google Maps Email Scraper That Finds 74% More Emails Than the Competition

I spent months scraping Google Maps for lead generation projects and kept running into the same wall: none of the existing tools actually found emails.

Sure, they scraped names, phone numbers, and addresses. But emails? The one piece of contact data that actually converts for cold outreach? Almost nobody bothered.

So I built my own. And in testing across 5,000+ businesses, it found valid emails for 74% of listings that had a website.

Here is how it works, why the architecture matters, and how you can use it today.

Why I Built This

Last year, a client asked me to build a prospect list of 2,000 dental practices in Germany. The brief was simple: name, address, phone, email, and any social media links. Easy, right?

I tried three different Google Maps scrapers on the Apify Store. They all gave me names, addresses, and phone numbers in minutes. But the email column? Empty. Every single row.

The problem is straightforward: Google Maps does not display email addresses. It shows website URLs. To get emails, you need to actually visit each business website and parse the HTML for email patterns. That is a second crawl on top of the Maps scrape, and most tools skip it because it is slower and more complex.

I also needed results in German -- the search queries, the Maps interface, the opening hours. Most scrapers hard-code English selectors and break completely when you switch languages.

After patching together two separate tools and writing my own email-parsing scripts, I decided to build a single actor that does the entire pipeline: Maps scrape, website visit, email extraction, social media extraction, and smart filtering. All in one run.

What Makes It Different

It visits every business website. This is the big one. Google Maps shows a website URL for most businesses, but it never shows their email. This scraper takes that URL, loads the page with a fast HTML parser, and scans for mailto: links plus regex-based email patterns. That is where the 74% hit rate comes from.

It works in any language. The actor supports 50+ language codes. Search for "Zahnarzt in München" in German, "restaurantes en Madrid" in Spanish, or "fogorvos Budapest" in Hungarian. The opening hours extraction is language-agnostic too -- it finds time patterns like 12:00-23:00 rather than looking for the English word "Hours."
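As a rough illustration of that language-agnostic approach, you can match time-range patterns instead of localized words (a sketch, not the actor's actual pattern):

```javascript
// Illustrative: detect opening hours by matching time ranges like
// "12:00-23:00", so detection works regardless of the UI language.
const TIME_RANGE = /([01]?\d|2[0-3]):[0-5]\d\s*[-–]\s*([01]?\d|2[0-3]):[0-5]\d/;

function looksLikeOpeningHours(text) {
  return TIME_RANGE.test(text);
}
```

Because the pattern keys on digits and separators, it matches German, Spanish, or Hungarian hours panels equally well.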

Smart email filtering. Raw regex extraction on HTML is noisy. You get image filenames that look like emails, Wixpress system addresses, font file references, noreply addresses, and other junk. The scraper runs every match through a multi-layer filter: blacklisted domains, image/font file extensions, generic system prefixes, and domain length validation. What you get back are real, contactable emails.
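A minimal sketch of what such a multi-layer filter can look like; the blacklist entries, prefixes, and thresholds below are hypothetical examples, not the actor's real lists:

```javascript
// Hypothetical multi-layer email filter. The lists below are illustrative,
// not the actor's actual blacklists.
const BLACKLISTED_DOMAINS = ['wixpress.com', 'sentry.io', 'example.com'];
const FILE_EXTENSION_RE = /\.(png|jpe?g|gif|svg|webp|woff2?|ttf|eot)$/i;
const GENERIC_PREFIXES = ['noreply', 'no-reply', 'donotreply', 'mailer-daemon'];

function isContactableEmail(email) {
  const [local, domain] = email.toLowerCase().split('@');
  if (!local || !domain) return false;
  if (FILE_EXTENSION_RE.test(domain)) return false;             // image/font filenames
  if (BLACKLISTED_DOMAINS.some((d) => domain.endsWith(d))) return false;
  if (GENERIC_PREFIXES.some((p) => local.startsWith(p))) return false;
  if (domain.length < 4 || !domain.includes('.')) return false; // domain sanity check
  return true;
}
```

Each match from the regex scan passes through every layer before it reaches the output dataset.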

Built-in social media extraction. It pulls Facebook, Instagram, LinkedIn, and Twitter/X profile links from business websites, filtering out share dialogs and tracking URLs to give you clean profile links.

Opening hours. Most scrapers skip this entirely because it requires expanding a collapsed UI element in Google Maps and parsing a table. This actor clicks the hours panel open, finds the hours table in the DOM, and returns structured day-by-day data.

Half the cost. At roughly $5 per 1,000 results, it is about half the price of the popular "Google Maps Email Extractor" by contacts-api, which charges around $10 per 1,000. And you pay for results, not runtime.

How It Works: The 3-Phase Architecture

The scraper runs in three distinct phases, each using the right tool for the job.

Phase 1: Google Maps Crawling (Playwright)

Google Maps is a heavy single-page application. There is no static HTML to parse -- everything renders via JavaScript. The results list uses infinite scrolling, so you cannot simply fetch the URL once and call it done.

This phase uses Playwright (a full, JavaScript-executing browser) through Crawlee's PlaywrightCrawler. Here is what happens:

  1. The crawler builds a Maps search URL from your query and language code.
  2. It waits for the results feed to appear.
  3. It scrolls the feed panel repeatedly, collecting place links after each scroll.
  4. It detects when scrolling stops producing new results (5 consecutive stable rounds) or when maxResults is reached.
  5. Each discovered place URL is enqueued as a separate PLACE request.

For each place page, the crawler extracts business name, category, address, phone, website URL, rating, review count, opening hours, Google Maps URL, and place ID.

The selectors use multiple fallback strategies. For example, opening hours extraction tries three different click targets, making it resilient to Google's frequent DOM changes.
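The stop condition in step 4 can be sketched as a pure function over the result counts seen after each scroll round (the names here are illustrative, not the actor's actual code):

```javascript
// Sketch of the Phase 1 stop condition: stop when the result count has not
// grown for `stableRounds` consecutive scroll rounds, or once maxResults
// is reached. `counts` holds the total results seen after each scroll.
function shouldStopScrolling(counts, maxResults, stableRounds = 5) {
  const latest = counts[counts.length - 1] ?? 0;
  if (latest >= maxResults) return true;
  if (counts.length < stableRounds + 1) return false;
  const recent = counts.slice(-(stableRounds + 1));
  return recent.every((c) => c === recent[0]); // no new results in 5 rounds
}
```

Requiring several consecutive stable rounds avoids quitting early when Google's lazy-loading briefly stalls.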

Phase 2: Website Email Scraping (Cheerio)

This is where the magic happens. Every business that has a website URL gets a second visit using Crawlee's CheerioCrawler.

Why Cheerio instead of Playwright? Business websites are mostly static HTML. You do not need a full browser to parse them. Cheerio loads the raw HTML and parses it into a DOM you can query with CSS selectors -- no browser overhead, no GPU, no rendering. It is roughly 10-20x faster and uses a fraction of the memory.

For each business website, the crawler extracts all mailto: link hrefs, runs a regex scan across the entire HTML body, combines both sources and deduplicates, filters through the blacklist, removes generic prefixes, validates minimum domain length, and extracts social media profile URLs.
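In simplified form, the two email sources and the deduplication step look roughly like this (plain regex on the raw HTML here; the actor itself queries the parsed Cheerio DOM):

```javascript
// Simplified Phase 2 sketch: collect emails from mailto: hrefs and from a
// regex scan of the page body, then merge, lowercase, and deduplicate.
const EMAIL_RE = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;

function extractEmails(html) {
  const fromMailto = [...html.matchAll(/href=["']mailto:([^"'?]+)/gi)]
    .map((m) => m[1]);
  const fromBody = html.match(EMAIL_RE) ?? [];
  return [...new Set([...fromMailto, ...fromBody].map((e) => e.toLowerCase()))];
}
```

Everything this function returns would then flow through the filtering layers described earlier.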

Phase 3: Dataset Assembly

Results are pushed to the Apify dataset in batches of 50 to avoid memory spikes. You pay only for actual results, not for idle browser time.
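Batched pushes along those lines might look like this sketch; the chunking helper is hypothetical, though `Actor.pushData` is the real Apify SDK call:

```javascript
// Hypothetical helper: split results into batches of 50 before pushing,
// so a large run never holds thousands of records in one push.
function toBatches(items, batchSize = 50) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// In the actor, each batch would then go to the dataset, e.g.:
// for (const batch of toBatches(results)) await Actor.pushData(batch);
```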

Example Output

Here is what a single result looks like:

{
  "name": "Borkonyha WineKitchen",
  "address": "Sas u. 3, Budapest, 1051 Hungary",
  "phone": "+36 1 266 0835",
  "website": "https://borkonyha.hu",
  "emails": ["info@borkonyha.hu", "reservation@borkonyha.hu"],
  "rating": 4.6,
  "reviewCount": 3842,
  "category": "Restaurant",
  "openingHours": {
    "Monday": "Closed",
    "Tuesday": "12:00-15:00, 18:00-23:00",
    "Wednesday": "12:00-15:00, 18:00-23:00",
    "Thursday": "12:00-15:00, 18:00-23:00",
    "Friday": "12:00-15:00, 18:00-23:00",
    "Saturday": "12:00-15:00, 18:00-23:00",
    "Sunday": "Closed"
  },
  "socialMedia": {
    "facebook": "https://facebook.com/borkonyha",
    "instagram": "https://instagram.com/borkonyhawine"
  },
  "googleMapsUrl": "https://www.google.com/maps/place/Borkonyha+WineKitchen",
  "searchQuery": "restaurants in Budapest"
}

Quick Start

The actor is live on the Apify Store: Google Maps Email Scraper

Minimal input to get started:

{
  "searchQueries": ["plumbers in Chicago"],
  "maxResults": 50,
  "scrapeEmails": true,
  "language": "en"
}

Tips for best results:

  • Use specific, localized queries. "Italian restaurants in Manhattan, NYC" will outperform "restaurants" every time.
  • Use residential proxies (the default Apify proxy setting). Google Maps blocks datacenter IPs aggressively.
  • Set language to match your target country. Searching for German businesses? Use "de".
  • Set scrapeEmails to false for a 3-5x speed boost if you only need Maps data.

The Numbers

| Metric | This Actor | Typical Competitor |
| --- | --- | --- |
| Email hit rate | 74% | 0-15% |
| Cost per 1,000 results | ~$5 | ~$10 |
| Visits business websites | Yes | Rarely |
| Social media extraction | Yes | Sometimes |
| Opening hours | Yes | Rarely |
| Multi-language support | 50+ languages | English only |
| Email filtering | Multi-layer | Basic or none |

Tech Stack

  • Playwright for Google Maps (JS-heavy SPA, infinite scroll, dynamic DOM)
  • Cheerio for business websites (fast, lightweight HTML parsing, no browser needed)
  • Crawlee as the orchestration layer (request queuing, retries, proxy rotation)
  • Apify SDK for input/output, dataset management, and pay-per-result charging
  • Node.js 18+ with ES modules

The dual-crawler architecture is the key design decision. Using Playwright everywhere would work but would be 10-20x slower and far more expensive for the website phase. Cheerio everywhere would fail on Google Maps because it cannot execute JavaScript. By splitting the pipeline, each phase uses the cheapest, fastest tool that actually works.

Try It Out

If you are building prospect lists, doing local SEO research, or just need business contact data from Google Maps, give it a spin. The free tier on Apify gives you enough credits to test a small batch.

Google Maps Email Scraper on Apify Store

Got questions or feature requests? Drop a comment below or open an issue on the actor page. I am actively maintaining this and shipping improvements based on user feedback.
