HTTP User-Agents in SEO

The User-Agent HTTP request header tells the server which client is making the request. Search crawlers use it to identify themselves. Servers can respond differently based on it, and those differences are not always intentional. Checking what your server returns to a crawler, rather than a browser, is one of the more direct auditing steps available to you. Sending a request with a user-agent string other than your client's real one is sometimes called user-agent switching or spoofing.

What is the User-Agent header?

Every HTTP request includes headers alongside the URL. The User-Agent header is a string the client sends to identify itself: its name, version, and sometimes additional context.

A browser might send:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36

Googlebot’s desktop crawler sends something like:

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/W.X.Y.Z Safari/537.36

The format follows a product/version token pattern, often with a comment in parentheses. User-agent strings are self-reported: any client can claim any identity. There is no authentication in the header itself. Actual identity verification requires checking the source IP address against published ranges or reverse DNS records.

How do search crawlers identify themselves?

Well-behaved crawlers publish their user-agent strings in official documentation and use them consistently. The string in a robots.txt User-agent: directive matches on a token, not the full string: Googlebot matches any user-agent containing that substring.

Crawler               Operator     Token
Googlebot (desktop)   Google       Googlebot/2.1
Googlebot-Smartphone  Google       Googlebot-Smartphone
Bingbot               Microsoft    bingbot/2.0
DuckDuckBot           DuckDuckGo   DuckDuckBot/1.1
Slurp                 Yahoo        Yahoo! Slurp
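
The token matching described above can be approximated with a small shell function. This is a simplified sketch, not how any particular crawler or robots.txt parser is implemented: the function name ua_matches_token is hypothetical, and comparing case-insensitively is an assumption about how tokens are usually treated.

```shell
# Succeeds (exit status 0) if the user-agent string contains the token.
# Assumption: the comparison ignores case. Both values are lowercased first.
ua_matches_token() {   # ua_matches_token USER_AGENT TOKEN
  local ua token
  ua="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  token="$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')"
  case "$ua" in
    *"$token"*) return 0 ;;   # token found somewhere in the string
    *)          return 1 ;;
  esac
}
```

Called with the Googlebot desktop string shown earlier and the token Googlebot, the function succeeds; called with the browser string, it fails.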

For AI crawlers (GPTBot, ClaudeBot, PerplexityBot and others), see AI Crawler User-Agents.

When analysing crawler activity by user-agent in server logs, see Log File Analysis.

When would you use user-agent switching?

  • Cloaking audit: check whether your site serves different HTML to crawlers than to users, which would violate Google’s spam policies
  • Access and blocking check: confirm whether a specific crawler is being blocked by a robots.txt rule, a misconfigured firewall, or an overly aggressive hosting provider; a 200 for a browser alongside a 403 or redirect for Googlebot is a clear signal
  • Redirect chain debugging: confirm crawlers and browsers are sent to the same final destination
  • CMP audit: some consent management platforms inject noindex via x-robots-tag for unrecognised clients, silently suppressing indexing without any trace in the page HTML
  • Dynamic serving verification: confirm the Vary: User-Agent header is present and the mobile and desktop variants are what you expect
  • A/B testing sanity check: confirm your testing framework is not accidentally cloaking by serving an unmodified page to bots while users see a variant

How do you check server responses with cURL?

cURL is a command-line tool available on macOS, Linux, and Windows. If you’re not comfortable with the terminal, the same checks can be done in Chrome DevTools or via a browser extension, covered below.

The -A flag sets a custom user-agent string. This lets you send requests that your server reads as coming from a specific crawler, then inspect what it returns.

Check response headers only (HEAD request):

curl -A "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" \
  -I https://example.com/page

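-I sends a HEAD request and returns only response headers: status code, Content-Type, x-robots-tag, redirect location, Vary, and cache directives. Some servers answer HEAD differently from GET, so if a result looks odd, repeat the check with a GET request and compare.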

Follow redirects and show each hop:

curl -A "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" \
  -I -L https://example.com/page

-L follows redirects. Each hop is printed in sequence, so you can see the full chain a Googlebot request would take and compare it to what a browser receives.
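
When a chain has several hops, the raw header output can be hard to scan. The sketch below condenses it to one line per redirect. The function name redirect_hops is hypothetical, and it assumes each hop's headers arrive in the order cURL prints them with -I -L.

```shell
# Reads raw response headers on stdin (for example from `curl -s -I -L`)
# and prints one line per redirect hop: status code, then Location target.
redirect_hops() {
  awk '
    { sub(/\r$/, "") }                          # strip the CR ending each header line
    /^HTTP\// { status = $2 }                   # remember the status of this response
    tolower($1) == "location:" { print status, $2 }
  '
}

# Hypothetical usage (needs network access):
#   curl -s -I -L -A "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" \
#     https://example.com/page | redirect_hops
```

Running the same pipeline with a browser user-agent string and comparing the two outputs shows whether crawlers and browsers follow the same chain.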

Fetch the full HTML response body:

curl -A "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" \
  https://example.com/page

Compare this to your browser’s “View Source.” They should be functionally identical. Significant differences are worth investigating.
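
One way to make that comparison less manual is to save the body returned for each user-agent and diff the two files. This is a rough sketch: the function names and file names are hypothetical, and pages that embed timestamps, nonces, or session tokens will show differences that have nothing to do with the user-agent.

```shell
# User-agent strings taken from the examples in this article.
BROWSER_UA="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
GOOGLEBOT_UA="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Fetch a URL with the given user-agent and save the body to a file.
fetch_as() {   # fetch_as USER_AGENT URL OUTFILE
  curl -s -A "$1" -o "$3" "$2"
}

# Print a unified diff of two saved bodies.
# Exit status 0 means the files are identical, 1 means they differ.
compare_bodies() {   # compare_bodies FILE_A FILE_B
  diff -u "$1" "$2"
}

# Hypothetical usage (needs network access):
#   fetch_as "$BROWSER_UA"   https://example.com/page browser.html
#   fetch_as "$GOOGLEBOT_UA" https://example.com/page googlebot.html
#   compare_bodies browser.html googlebot.html
```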

Check the mobile crawler:

curl -A "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" \
  -I https://example.com/page

Google publishes its current user-agent strings in Search Central documentation and recommends using them to test server behaviour.[1]

Things to check in the response:

  • Status code: is the crawler getting a 200, a redirect, or a 4xx/5xx when a browser gets a 200?
  • Redirect destination: does Googlebot get sent somewhere a browser does not?
  • x-robots-tag header: some consent management platforms inject noindex into response headers for unrecognised clients.[2] This would suppress indexing without any sign of it in the page HTML.
  • Vary header: a Vary: User-Agent response signals that content differs by client. Not inherently a problem, but worth understanding.
  • Content differences: compare the HTML body to what a browser receives.
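
The header checks in this list can be pulled into a short summary per user-agent. The following is a sketch under stated assumptions: header_value and summarise_response are hypothetical helper names, and only the first response is inspected (no -L), so a redirect shows up as a status line plus a location value.

```shell
# Prints the value of one response header, read from raw header text on stdin.
# Header names are compared case-insensitively.
header_value() {   # header_value HEADER_NAME
  awk -v name="$1" '
    BEGIN { name = tolower(name) ":" }
    { sub(/\r$/, "") }                          # strip the CR ending each header line
    tolower(substr($0, 1, length(name))) == name {
      value = substr($0, length(name) + 1)
      sub(/^[ \t]+/, "", value)                 # drop whitespace after the colon
      print value
    }
  '
}

# Prints the status line and the headers discussed above for one user-agent.
# Needs network access; the caller supplies the URL and user-agent string.
summarise_response() {   # summarise_response URL USER_AGENT
  local headers
  headers="$(curl -s -I -A "$2" "$1")"
  printf 'status:       %s\n' "$(printf '%s\n' "$headers" | head -n 1 | tr -d '\r')"
  printf 'location:     %s\n' "$(printf '%s\n' "$headers" | header_value location)"
  printf 'x-robots-tag: %s\n' "$(printf '%s\n' "$headers" | header_value x-robots-tag)"
  printf 'vary:         %s\n' "$(printf '%s\n' "$headers" | header_value vary)"
}
```

Running summarise_response once with a browser string and once with a Googlebot string gives two short blocks that can be compared line by line.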

Browser-based alternatives

If you prefer not to use the terminal, there are two options that require no extra software:

Chrome DevTools (built in): Open DevTools, open the three-dot menu, select More tools, then Network conditions. Under User agent, uncheck “Use browser default”, then select a preset or type a custom string. Reload the page. The override applies only to the current tab and only while DevTools stays open.

Browser extensions: Several extensions let you switch user-agents per tab from a toolbar button. Search the Chrome Web Store for “User-Agent” for current options. Note that extensions send the spoofed string from your browser; the request still originates from your IP address, not Google’s, so this tests server-side logic only, not IP-based filtering.

Reverse DNS lookup

Some servers, particularly those with bot protection or security plugins, perform a reverse DNS lookup to verify that a Googlebot request comes from a Google IP. When you spoof Googlebot’s user-agent from your own machine, that check fails and the server may block the request. This creates a false positive: you see a 403 or redirect and conclude Googlebot is blocked, when real Googlebot would pass the check. Use the URL Inspection tool in Google Search Console (Live Test) to confirm what Google’s crawler actually sees.

What is cloaking?

Cloaking is serving meaningfully different content to search crawlers than to users, with the intent of manipulating rankings. Google treats cloaking as a violation of its spam policies, and it can lead to a manual action.[3]

Accidental cloaking is more common and is generally not penalised once it has been identified and corrected. Common causes:

  • Login or paywall redirects: a logged-out user sees a gate; Googlebot gets redirected to a different URL entirely
  • A/B testing frameworks: some tools serve an unmodified page to bots and a variant to real users, meaning the indexed content does not match what most visitors see
  • JavaScript-heavy pages: the raw HTML fetched at crawl time may look different from the rendered page. This is not cloaking as long as the same content is available to both after rendering. See Crawl, render, and index for how Google’s rendering pipeline works.
  • Consent management scripts: some CMP implementations serve a different page structure to clients that do not present a consent cookie

To check: run the cURL commands above and compare the output to your browser’s View Source (unrendered HTML) and to the fully rendered DOM in DevTools. The rendered DOM approximates what Google sees after its rendering pass.

When is UA-based content variation acceptable?

Serving different content by user-agent is acceptable when it reflects a genuine difference in client capability, not an attempt to influence what a crawler indexes.

Acceptable:

  • Dynamic serving for mobile vs. desktop: different HTML for mobile user-agents. Google prefers responsive design but supports dynamic serving when the Vary: User-Agent header is present.[4]
  • Language or regional variants negotiated via Accept-Language.

Not acceptable:

  • Serving keyword-dense content to crawlers while showing a different experience to users
  • Hiding pop-ups, interstitials, or paywalls from crawlers while showing them to users
  • Serving a clean canonical to bots while users see a paginated or faceted version

Frequently asked questions

Can I rely on the user-agent string to confirm a request is from real Googlebot?
No. User-agent strings can be spoofed by any client. To confirm a request is genuinely from Google, perform a reverse DNS lookup on the source IP: the returned hostname must end in googlebot.com or google.com, and that hostname must then resolve back to the same IP (forward-confirmed reverse DNS). See Log File Analysis for detail on the verification method.
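
The two lookups can be sketched in shell. This is an illustration only: the function names are hypothetical, it assumes the host utility is installed and prints its usual "domain name pointer" and "has address" lines, and the accepted domains are the two named in the answer above.

```shell
# Succeeds if the hostname ends in .googlebot.com or .google.com.
is_google_hostname() {   # is_google_hostname HOSTNAME
  local name
  # Drop the trailing dot that PTR answers carry, then lowercase.
  name="$(printf '%s' "${1%.}" | tr '[:upper:]' '[:lower:]')"
  case "$name" in
    *.googlebot.com|*.google.com) return 0 ;;
    *)                            return 1 ;;
  esac
}

# Forward-confirmed reverse DNS for one IP address. Needs network access.
verify_googlebot_ip() {   # verify_googlebot_ip IP_ADDRESS
  local ip="$1" ptr
  # Step 1: reverse lookup. The hostname is the last field of the PTR line.
  ptr="$(host "$ip" | awk '/domain name pointer/ { print $NF; exit }')"
  [ -n "$ptr" ] || return 1
  # Step 2: the hostname must sit under an accepted Google domain.
  is_google_hostname "$ptr" || return 1
  # Step 3: forward lookup. One of the returned addresses must equal the IP.
  host "${ptr%.}" | awk '/has address|has IPv6 address/ { print $NF }' | grep -qxF "$ip"
}

# Hypothetical usage, with an IP taken from your own server logs:
#   verify_googlebot_ip "$ip_from_logs" && echo "verified" || echo "not verified"
```

The suffix check requires a leading dot before the domain, so a lookalike such as fakegooglebot.com is rejected. IPv6 addresses may need normalising before the final comparison, which this sketch does not attempt.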

Does Googlebot execute JavaScript?
Not at the crawl stage. Googlebot fetches raw HTML first, which is what cURL returns. It later processes the page through Google’s Web Rendering Service, which executes JavaScript. The cURL output shows what Googlebot sees before rendering. See Crawl, render, and index.

Is it safe to use Googlebot’s user-agent string for testing my own site?
Yes. Google recommends using its published user-agent strings to test how your server responds to Googlebot requests.[1]

Where is robots.txt syntax covered?
In Crawlability and robots.txt, which covers User-agent: directives, Disallow:, and how to apply rules to specific crawlers.

Footnotes

  1. Google crawlers and user agents, Google Search Central

  2. x-robots-tag HTTP header, Google Search Central

  3. Cloaking, Google Search Central spam policies

  4. Dynamic serving, Google Search Central