The basic &start= parameter method we covered in our previous guide works for simple scraping tasks, but it quickly falls apart when scraping at scale or dealing with modern Google Search’s dynamic content.
Today’s Google Search is not a static HTML page – it’s a complex web application that loads most of its content dynamically with JavaScript. It uses advanced AI-powered anti-bot systems that can detect even sophisticated scrapers based on hundreds of signals, including browser fingerprints, mouse movements, and typing patterns.
In this guide, we’ll show you how to build a reliable, production-grade Google SERP scraper that can handle modern Google’s dynamic content and strict anti-bot systems. We’ll cover headless browser automation, humanization techniques, dynamic content extraction, and the critical role of proxies in scaling your operations without CAPTCHAs.

Why Basic &start= Scraping Fails in 2026
The simple requests-based scraping method has three fatal flaws when used on modern Google:
1. No JavaScript support: Basic HTTP clients can’t execute JavaScript, so they miss all dynamic content that loads after the initial page load. This includes People Also Ask boxes, video results, Local Packs, and AI Overviews, which now make up 60% of the average SERP.
2. Easily detected: Simple HTTP clients have unique fingerprints that Google can identify instantly. Even if you rotate user agents, you’ll still get blocked within a few requests.
3. Inconsistent results: Google returns different results to scrapers than it does to real human users. Basic scrapers often get outdated or incomplete data that doesn’t match what actual users see.
To overcome these limitations, you need to use a headless browser – a real browser that runs without a graphical interface, allowing you to automate exactly the same actions a human user would take.
The Best Tool for Modern SERP Scraping: Playwright
There are several headless browser libraries available, but Playwright is by far the best for Google SERP scraping. Developed by Microsoft, Playwright is faster, more reliable and has better anti-detection capabilities than older tools like Selenium.
Playwright allows you to:
- Automate Chrome, Firefox and Safari with a single API
- Simulate realistic mouse movements, scrolling and typing
- Intercept and modify network requests
- Take screenshots and record videos
- Extract data from dynamic content that loads asynchronously
Complete Production-Grade Scraper Implementation
Below is a complete, production-ready Google SERP scraper using Playwright. This script implements all the best practices we’ll cover in this guide, including humanized delays, natural scrolling, and proxy integration.
```python
import random
import time
from playwright.sync_api import sync_playwright

def human_delay(min_ms=600, max_ms=2500):
    """Add a random delay to mimic human behavior."""
    time.sleep(random.uniform(min_ms / 1000, max_ms / 1000))

def human_scroll(page):
    """Simulate natural scrolling through the page."""
    scroll_height = page.evaluate("document.body.scrollHeight")
    current_position = 0
    while current_position < scroll_height:
        # Scroll a random distance
        scroll_step = random.randint(200, 600)
        current_position += scroll_step
        # Don't scroll past the end of the page
        if current_position > scroll_height:
            current_position = scroll_height
        page.mouse.wheel(0, scroll_step)
        human_delay(200, 700)

def extract_organic_results(page):
    """Extract all organic results from the page."""
    results = []
    result_items = page.locator("div#search div.g")
    for i in range(result_items.count()):
        item = result_items.nth(i)
        # Skip non-organic results (ads)
        if item.locator("div[data-ad-render]").count() > 0:
            continue
        title = item.locator("h3").first.inner_text(timeout=2000) if item.locator("h3").first.is_visible() else None
        url = item.locator("a").first.get_attribute("href", timeout=2000) if item.locator("a").first.is_visible() else None
        description = item.locator("div.VwiC3b").first.inner_text(timeout=2000) if item.locator("div.VwiC3b").first.is_visible() else None
        if title and url:
            results.append({
                "position": len(results) + 1,
                "title": title,
                "url": url,
                "description": description,
            })
    return results

def scrape_google_top_100(query, proxy=None):
    all_results = []
    with sync_playwright() as p:
        # Launch browser with anti-detection flags
        browser = p.chromium.launch(
            headless=True,
            args=[
                "--disable-blink-features=AutomationControlled",
                "--no-sandbox",
                "--disable-dev-shm-usage",
                "--disable-web-security",
                "--allow-running-insecure-content",
            ])
        # Create a new browser context with proxy if provided
        context_args = {
            "viewport": {"width": 1366, "height": 768},
            "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/143.0.0.0 Safari/537.36",
        }
        if proxy:
            context_args["proxy"] = {
                "server": proxy["server"],
                "username": proxy["username"],
                "password": proxy["password"],
            }
        context = browser.new_context(**context_args)
        page = context.new_page()
        # Navigate to Google
        page.goto("https://www.google.com", wait_until="domcontentloaded")
        human_delay(1500, 3000)
        # Accept cookies if the prompt appears
        if page.locator("button#L2AGLb").is_visible():
            page.locator("button#L2AGLb").click()
            human_delay(1000, 2000)
        # Type the search query naturally
        search_box = page.locator("textarea[name='q']")
        search_box.click()
        human_delay(500, 1000)
        for char in query:
            search_box.type(char, delay=random.randint(50, 150))
        human_delay(500, 1000)
        search_box.press("Enter")
        human_delay(2000, 4000)
        page_number = 1
        while len(all_results) < 100:
            print(f"Scraping page {page_number}")
            # Scroll naturally through the page to load all content
            human_scroll(page)
            human_delay(1000, 2000)
            # Extract results
            page_results = extract_organic_results(page)
            print(f"Found {len(page_results)} results on page {page_number}")
            for result in page_results:
                if len(all_results) >= 100:
                    break
                result["page"] = page_number
                all_results.append(result)
            # Check if there's a next page
            next_button = page.locator("a#pnnext")
            if not next_button.is_visible() or len(all_results) >= 100:
                break
            # Click the next page button naturally
            next_button.scroll_into_view_if_needed()
            human_delay(1000, 2000)
            next_button.click()
            page.wait_for_load_state("domcontentloaded")
            human_delay(2000, 4000)
            page_number += 1
        browser.close()
    return all_results

# Usage with IPFLY proxy
if __name__ == "__main__":
    # Replace with your IPFLY proxy credentials
    ipfly_proxy = {
        "server": "http://gate.ipfly.com:10000",
        "username": "your-ipfly-username",
        "password": "your-ipfly-password",
    }
    results = scrape_google_top_100("best wireless headphones 2026", proxy=ipfly_proxy)
    print(f"\nSuccessfully scraped {len(results)} results:")
    for result in results:
        print(f"{result['position']}. {result['title']} — {result['url']}")
```
Advanced Humanization Techniques
The script above implements basic humanization, but for maximum success rate, you should add these advanced techniques:
- Randomize browser fingerprints: Use different viewport sizes, user agents and browser settings for each session
- Vary session duration: Don’t spend exactly the same amount of time on each page
- Simulate mouse movements: Move the mouse around the page randomly before clicking links
- Add occasional mistakes: Type a wrong character and backspace it when entering search queries
- Randomize request order: Don’t always scrape pages in order from 1 to 10
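Two of these techniques can be sketched in plain Python. The viewport and user-agent pools and the typo rate below are illustrative assumptions, not recommended values, and `keystrokes_with_typos` is a hypothetical helper of our own: its output is meant to be replayed one key at a time through Playwright's keyboard API.

```python
import random

# Illustrative pools -- in practice, use larger lists of real-world values
VIEWPORTS = [(1366, 768), (1440, 900), (1536, 864), (1920, 1080)]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/143.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/143.0.0.0 Safari/537.36",
]

def random_fingerprint():
    """Build randomized context arguments for browser.new_context()."""
    width, height = random.choice(VIEWPORTS)
    return {
        "viewport": {"width": width, "height": height},
        "user_agent": random.choice(USER_AGENTS),
    }

def keystrokes_with_typos(query, typo_rate=0.1):
    """Return the key sequence for typing `query`, with occasional
    wrong-character-then-Backspace mistakes mixed in."""
    keys = []
    for ch in query:
        if ch.isalpha() and random.random() < typo_rate:
            keys.append(random.choice("abcdefghijklmnopqrstuvwxyz"))  # the "mistake"
            keys.append("Backspace")  # ...and its correction
        keys.append(ch)
    return keys
```

Replaying the key list (for example with `page.keyboard.press(key)` plus a `human_delay` between keys) still ends with the correct query in the search box, because every wrong character is immediately followed by a Backspace.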
The Critical Role of Proxies in Scaling
Even with the most sophisticated humanization, you will eventually get blocked if you send all your requests from the same IP address. This is especially true now that you need to send 10x more requests to collect the same amount of data.
For reliable, large-scale SERP scraping, you need to use high-quality residential proxies with automatic rotation. Residential proxies use IP addresses assigned to real homes, making your traffic indistinguishable from that of regular human users.
IPFLY’s residential proxy network is specifically optimized for Google SERP scraping. With over 10 million IPs in 190+ countries, you can distribute your requests across thousands of unique addresses, ensuring that no single IP sends more than one or two queries per day. Our automatic rotation feature switches your IP address for every request, drastically reducing CAPTCHA rates and allowing you to scale your scraping operations to millions of queries per day.
For the highest success rate, we recommend using mobile proxies for Google scraping. Mobile IPs have the lowest block rate of any proxy type, as Google is extremely hesitant to block them for fear of banning real mobile users.
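If your provider rotates at the gateway, one endpoint is enough; if it instead exposes several static gateways, you can rotate on the client side by giving each new browser context the next proxy from a pool. The sketch below assumes that setup – the `ProxyRotator` class is our own helper (not part of Playwright or any provider SDK), and the endpoints are placeholders.

```python
import itertools

class ProxyRotator:
    """Hand out proxies from a pool in round-robin order, one per session."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("proxy pool must not be empty")
        self._pool = itertools.cycle(proxies)

    def next_proxy(self):
        return next(self._pool)

# Placeholder endpoints -- substitute your provider's real gateways
rotator = ProxyRotator([
    {"server": "http://gate1.example.com:10000", "username": "user", "password": "pass"},
    {"server": "http://gate2.example.com:10000", "username": "user", "password": "pass"},
])
```

Each scraping session then picks up a fresh identity with something like `browser.new_context(proxy=rotator.next_proxy(), ...)`, so consecutive sessions never share an exit IP from the same gateway.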
Handling Dynamic SERP Elements
Modern SERPs contain much more than just organic results. To get a complete picture of the search results, you need to extract these dynamic elements as well:
- People Also Ask boxes: These contain common questions related to the search query
- Video results: YouTube and other video content that appears in the SERP
- Local Packs: Business listings for local search queries
- AI Overviews: Google’s AI-generated answers that appear at the top of many SERPs
- Shopping ads: Product listings for e-commerce queries
Playwright makes it easy to extract all these elements by simulating the same interactions a human user would take, such as clicking to expand People Also Ask boxes.
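As a sketch, People Also Ask extraction might look like the following. The `jsname` selector is an assumption you must verify against the live page (Google renames these attributes frequently), `page` is any Playwright page object, and `dedupe_questions` is a small pure-Python helper needed because expanding one question often loads duplicates of others.

```python
import random

def dedupe_questions(questions):
    """Normalize and de-duplicate question strings, preserving order."""
    seen, unique = set(), []
    for q in questions:
        key = q.strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(q.strip())
    return unique

def extract_people_also_ask(page, max_questions=8):
    """Expand each People Also Ask item and collect its question text.
    The selector below is a guess -- inspect the live SERP to confirm it."""
    questions = []
    items = page.locator('div[jsname="yEVEwb"]')  # assumed PAA container
    for i in range(min(items.count(), max_questions)):
        item = items.nth(i)
        item.click()  # expanding usually triggers Google to load more questions
        page.wait_for_timeout(random.randint(400, 900))  # humanized pause
        questions.append(item.inner_text())
    return dedupe_questions(questions)
```

The same click-then-read pattern works for other expandable elements, such as Local Pack entries, as long as you locate the right container for each.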

Conclusion
Scraping Google SERP in 2026 requires a much more sophisticated approach than it did just a year ago. The days of simple HTTP requests and the &num=100 parameter are gone forever.
Today, successful SERP scraping requires a combination of headless browser automation, advanced humanization techniques, and high-quality rotating proxies. By implementing the methods outlined in this guide and using IPFLY’s residential proxies, you can build a reliable, scalable scraping system that can handle even Google’s strictest anti-bot systems.