Technical SEO Audit Checklist

Run this checklist against an existing live site to identify technical SEO issues. For a new site or redesign going live, use the Go-Live Checklist instead. For the full audit reasoning and process behind each section, see the SEO Audit Guide.

Before starting: crawl the site with Screaming Frog, Sitebulb, or a similar tool. Export the crawl data, since most checks below reference it. Also open Google Search Console; you will need the Coverage (now labelled Page indexing), Core Web Vitals, and URL Inspection reports.


1. Crawlability

  • robots.txt is accessible at /robots.txt and returns a 200 status code
  • robots.txt does not block any pages that should be indexed (check key page types: homepage, category pages, key landing pages)
  • robots.txt does not block CSS or JavaScript files Googlebot needs to render pages
  • No accidental Disallow: / in production robots.txt
  • Crawl budget is not wasted on low-value URLs: parameterised URLs, faceted navigation, session IDs, internal search results
  • Orphan pages (pages with no internal links) are either removed or linked internally
  • Crawl depth: key pages are reachable within 3-4 clicks from the homepage
  • No crawler traps: infinite pagination, calendar archives generating unlimited URLs, or session-based URL parameters creating unique URLs per visit
  • XML sitemap is declared in robots.txt via Sitemap: directive
  • AI crawler access is intentional: check whether GPTBot, OAI-SearchBot, ClaudeBot, and PerplexityBot are allowed or blocked in robots.txt and confirm this reflects a deliberate decision rather than an inherited default. Note: GPTBot (training) and OAI-SearchBot (ChatGPT Search citations) are separate OpenAI agents — blocking one does not block the other
  • No Googlebot IP ranges blocked by geo-restriction, firewall, or WAF rules. Review hosting and CDN configuration if the site uses IP-based access controls

For more, see Crawlability and robots.txt, the Robots.txt reference, and Crawl budget.
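
The robots.txt checks above can be partly scripted. The following is a minimal sketch using Python's standard urllib.robotparser; the robots.txt content, the URLs, and the function names are hypothetical examples, not taken from any real site. Note that this parser applies rules in file order and does not implement Google's wildcard or longest-match behaviour, so treat its answers as a first pass and confirm edge cases in Search Console.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; in a real audit, paste the live file here.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /search
Disallow: /cart

User-agent: GPTBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""

# Hypothetical key URLs that should stay crawlable.
KEY_URLS = [
    "https://example.com/",
    "https://example.com/category/shoes",
    "https://example.com/search?q=shoes",
]


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that robots_txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


def declared_sitemaps(robots_txt):
    """Return the URLs listed in Sitemap: directives (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps() or []


if __name__ == "__main__":
    # Check search and AI crawlers separately: each agent has its own rules.
    for agent in ("Googlebot", "GPTBot", "OAI-SearchBot"):
        print(agent, "blocked from:", blocked_urls(SAMPLE_ROBOTS, KEY_URLS, agent))
    print("Sitemaps:", declared_sitemaps(SAMPLE_ROBOTS))
```

Running the check once per user agent makes the GPTBot versus OAI-SearchBot distinction visible: in this sample only GPTBot is blocked site-wide.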


2. Indexation

  • Google Search Console Coverage report reviewed: note all “Error”, “Valid with warning”, and “Excluded” pages
  • Key pages are indexed. Spot-check with URL Inspection in Search Console
  • No important pages in “Crawled - currently not indexed” state. This usually points to a quality or duplication issue
  • No important pages in “Discovered - currently not indexed” state. This may indicate crawl budget or internal linking issues
  • No important pages with noindex tag (check crawl export for noindex on pages that should rank)
  • No important pages blocked by robots.txt but still appearing in the index. URLs appear without description in SERPs if this occurs
  • Paginated pages handled correctly: paginated content is either indexable or canonicalised
  • Faceted navigation / filter URLs are either blocked, canonicalised, or have noindex as appropriate
  • No soft 404s: pages returning 200 with “not found” or empty content (check Search Console > Coverage > “Excluded” > “Soft 404”)
  • Index count is plausible relative to actual page count (site:example.com estimate as a rough check)

For more, see Indexing, When to use noindex, and Why isn’t my page indexed?.
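
The noindex check above can be sketched as a small function that inspects both the robots meta tag and the X-Robots-Tag response header. This is an illustration only: the function and class names are hypothetical, and the header handling is simplified (it ignores user-agent prefixes such as "googlebot: noindex").

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() in {"robots", "googlebot"}:
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def is_noindexed(html, headers=None):
    """Return True if the page carries noindex (or none) in meta or header."""
    parser = RobotsMetaParser()
    parser.feed(html)
    directives = set(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            directives.update(part.strip().lower() for part in value.split(","))
    # "none" is shorthand for "noindex, nofollow".
    return bool(directives & {"noindex", "none"})
```

Feed it the HTML and response headers of each page that should rank; any True result belongs in the critical bucket.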


3. Canonical tags

  • All key pages have a self-referencing canonical tag
  • Canonical tags use absolute URLs (not relative paths)
  • No canonical tags pointing to redirecting URLs
  • No canonical tags pointing to non-200 pages
  • Canonical tags are consistent with hreflang tags (hreflang must reference canonical URLs)
  • Paginated pages: page 2+ either have self-canonicals or canonical back to page 1 if consolidation is intended
  • www vs. non-www is consistent: one resolves to the other via 301, and canonicals reflect the chosen version
  • HTTPS vs. HTTP is consistent: all canonicals use HTTPS

For more, see Canonical tags and Duplicate content.
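
Several of the canonical checks above are mechanical and can be sketched in a few lines. The example below assumes you already have the page URL and its HTML; the function name and issue labels are hypothetical. It does not fetch the canonical target, so the redirect and non-200 checks still need the crawl export.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalParser(HTMLParser):
    """Collect the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        if "canonical" in attr.get("rel", "").lower().split():
            self.canonicals.append(attr.get("href", "").strip())


def canonical_issues(page_url, html):
    """Return a list of human-readable issues; an empty list means no findings."""
    parser = CanonicalParser()
    parser.feed(html)
    if not parser.canonicals:
        return ["missing canonical tag"]
    issues = []
    if len(parser.canonicals) > 1:
        issues.append("multiple canonical tags")
    href = parser.canonicals[0]
    parts = urlsplit(href)
    if not parts.scheme or not parts.netloc:
        issues.append("canonical is not an absolute URL")
    elif parts.scheme != "https":
        issues.append("canonical does not use HTTPS")
    if href != page_url:
        # Not always wrong: consolidation to another URL may be deliberate.
        issues.append("canonical points to a different URL (confirm this is intended)")
    return issues
```

The last check is deliberately worded as a prompt rather than an error, because a canonical pointing elsewhere is correct whenever consolidation is the goal.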


4. Redirects

  • No redirect chains: each redirect resolves in a single hop (A → B → C should be collapsed to A → C)
  • No redirect loops (A → B → A)
  • All redirects are 301 (permanent) unless intentionally temporary (in which case use 302)
  • 302s are reviewed. Confirm each is genuinely temporary, not a permanent move misconfigured as temporary
  • Internal links point directly to final destination URLs, not through intermediate redirects
  • Old URLs from previous migrations are still redirecting correctly
  • No broken redirect destinations. Redirects should not point to 404 pages

For more, see Redirects and HTTP status codes.
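
Chains and loops can be detected offline once the crawl export gives you a source-to-target redirect map. A minimal sketch, assuming a plain dictionary of redirects (the URLs and function names are hypothetical):

```python
def trace_redirects(start_url, redirect_map, max_hops=10):
    """Follow start_url through redirect_map.

    Returns (path, is_loop), where path lists every URL visited in order.
    """
    path = [start_url]
    seen = {start_url}
    current = start_url
    while current in redirect_map and len(path) <= max_hops:
        current = redirect_map[current]
        path.append(current)
        if current in seen:
            return path, True  # Came back to a URL already visited.
        seen.add(current)
    return path, False


def redirect_problems(redirect_map):
    """Map each redirecting URL to a description of its chain or loop."""
    problems = {}
    for url in redirect_map:
        path, is_loop = trace_redirects(url, redirect_map)
        hops = len(path) - 1
        if is_loop:
            problems[url] = "loop: " + " -> ".join(path)
        elif hops > 1:
            problems[url] = f"chain of {hops} hops: " + " -> ".join(path)
    return problems


# Hypothetical redirect map built from a crawl export.
REDIRECTS = {"/old": "/interim", "/interim": "/new", "/a": "/b", "/b": "/a"}
```

Each chain reported here can be fixed by pointing the first URL straight at the last one in the path.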


5. Status codes

  • No important pages returning 4xx. Check crawl export, filter by 4xx
  • No important pages returning 5xx
  • 404 pages return a true 404 status code. Soft 404s returning 200 must be avoided
  • No important resources (CSS, JS, images) returning 4xx. Check by filtering crawl by resource type
  • Search Console: Coverage > Error > “Not found (404)” reviewed and resolved or accepted
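
Filtering the crawl export by status code is easy to script. The sketch below assumes a CSV export with "Address" and "Status Code" columns; those column names are an assumption and vary by tool and export type, so adjust the parameters to match your file. The sample data is made up.

```python
import csv
import io

# Hypothetical crawl export; real exports have many more columns.
SAMPLE_EXPORT = """\
Address,Status Code,Content Type
https://example.com/,200,text/html
https://example.com/old-page,404,text/html
https://example.com/app.js,404,application/javascript
https://example.com/checkout,500,text/html
https://example.com/moved,301,text/html
"""


def error_rows(csv_text, url_column="Address", status_column="Status Code"):
    """Group URLs from a crawl export into 4xx and 5xx buckets."""
    groups = {"4xx": [], "5xx": []}
    for row in csv.DictReader(io.StringIO(csv_text)):
        status_text = (row.get(status_column) or "").strip()
        if not status_text.isdigit():
            continue  # Skip rows with no status, such as timeouts.
        status = int(status_text)
        if 400 <= status < 500:
            groups["4xx"].append(row[url_column])
        elif 500 <= status < 600:
            groups["5xx"].append(row[url_column])
    return groups
```

The 4xx bucket in this sample includes a JavaScript file, which is the kind of broken resource the checklist asks you to catch alongside broken pages.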

6. Site architecture and internal linking

  • Navigation links to all top-level sections
  • Important pages receive internal links from relevant, high-authority pages on the site
  • No key pages are orphaned. All key pages should have at least one internal link pointing to them
  • Breadcrumbs are present and correctly structured
  • Anchor text in internal links is descriptive and varied. Avoid “click here” or exact-match keyword repetition
  • No broken internal links. Filter crawl export by 4xx on internal links
  • Pagination: multi-page sections are linked sequentially
  • Site hierarchy is logical: category pages link to subcategories, subcategories link to content

For more, see Site architecture.
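
Crawl depth and orphan detection both reduce to a breadth-first search over the internal link graph. A minimal sketch, assuming the graph has already been extracted from the crawl export as a dictionary of page to linked pages (all names and paths are hypothetical). Here "orphan" means any known page that cannot be reached from the homepage, which is slightly broader than a page with zero inlinks.

```python
from collections import deque


def crawl_depths(links, homepage="/"):
    """Return the minimum click depth of every page reachable from homepage."""
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def audit_architecture(links, known_pages, homepage="/", max_depth=4):
    """Return (orphans, too_deep) as sorted lists of page paths."""
    depths = crawl_depths(links, homepage)
    orphans = sorted(page for page in known_pages if page not in depths)
    too_deep = sorted(page for page, depth in depths.items() if depth > max_depth)
    return orphans, too_deep


# Hypothetical internal link graph: page -> pages it links to.
LINKS = {
    "/": ["/shoes", "/about"],
    "/shoes": ["/shoes/trail"],
    "/shoes/trail": ["/shoes/trail/p2"],
    "/shoes/trail/p2": ["/shoes/trail/p3"],
    "/shoes/trail/p3": ["/shoes/trail/p4"],
}
# Known pages would normally come from the XML sitemap or analytics.
KNOWN_PAGES = {"/", "/shoes", "/about", "/shoes/trail", "/landing/spring-sale"}
```

Comparing against a list of known pages from the sitemap or analytics is what surfaces orphans, since a link-following crawler never discovers them on its own.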


7. HTTPS and security

  • SSL certificate is valid (not expired, not self-signed)
  • HTTP redirects to HTTPS via 301, not 302
  • www and non-www both redirect to the canonical version
  • No mixed content (HTTP resources loaded on HTTPS pages). Check browser console on key pages
  • HSTS header is set (Strict-Transport-Security)
  • No security warnings in Chrome for the domain

For more, see HTTPS and security.
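
As a complement to checking the browser console, mixed content can be spotted in the raw HTML. The sketch below is a simplified illustration with hypothetical names: it looks only at common resource-loading tags and will miss resources requested from CSS or JavaScript. Ordinary <a> links to HTTP pages are not mixed content and are ignored.

```python
from html.parser import HTMLParser


class MixedContentParser(HTMLParser):
    """Record resource tags on a page that load over plain HTTP."""

    # Tag -> attribute that holds the resource URL.
    RESOURCE_ATTRS = {
        "img": "src",
        "script": "src",
        "iframe": "src",
        "source": "src",
        "video": "src",
        "audio": "src",
        "link": "href",
    }

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        attr_name = self.RESOURCE_ATTRS.get(tag)
        if attr_name is None:
            return
        attr = {name: (value or "") for name, value in attrs}
        # Only stylesheet links load a resource; canonical and similar do not.
        if tag == "link" and "stylesheet" not in attr.get("rel", "").lower().split():
            return
        url = attr.get(attr_name, "").strip()
        if url.lower().startswith("http://"):
            self.insecure.append((tag, url))


def find_mixed_content(html):
    """Return (tag, url) pairs for resources requested over HTTP."""
    parser = MixedContentParser()
    parser.feed(html)
    return parser.insecure


# Hypothetical HTML from an HTTPS page.
PAGE = """
<link rel="stylesheet" href="http://example.com/site.css">
<link rel="canonical" href="http://example.com/page">
<img src="https://example.com/logo.png">
<script src="http://cdn.example.com/app.js"></script>
<a href="http://example.com/other">Other page</a>
"""
```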


8. Core Web Vitals

  • Search Console Core Web Vitals report reviewed. Note which URL groups fail on mobile and desktop
  • LCP is under 2.5 seconds (check PageSpeed Insights field data for key templates)
  • INP is under 200ms
  • CLS is under 0.1
  • No render-blocking resources identified in Lighthouse that have not been addressed
  • LCP image (if applicable) has fetchpriority="high". It should not be lazy-loaded
  • Images have explicit width and height attributes set
  • Web fonts are preloaded. Alternatively, use font-display: optional or font-display: swap

For more, see Core Web Vitals and the Core Web Vitals Optimisation Guide.
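
The three thresholds above can be applied to field data in bulk. A minimal sketch: the "good" limits are the ones listed in this checklist, and the "poor" boundaries (4.0 s for LCP, 500 ms for INP, 0.25 for CLS) are Google's published values. The function names and sample numbers are hypothetical, and values are assumed to be 75th-percentile field measurements.

```python
# Metric -> (upper limit for "good", upper limit for "needs improvement").
# LCP in seconds, INP in milliseconds, CLS unitless.
THRESHOLDS = {
    "LCP": (2.5, 4.0),
    "INP": (200, 500),
    "CLS": (0.1, 0.25),
}


def rate_metric(metric, value):
    """Classify a single field value as good, needs improvement, or poor."""
    good_max, needs_improvement_max = THRESHOLDS[metric]
    if value <= good_max:
        return "good"
    if value <= needs_improvement_max:
        return "needs improvement"
    return "poor"


def rate_template(field_data):
    """Rate every metric for one template, e.g. {"LCP": 3.1, "INP": 180}."""
    return {metric: rate_metric(metric, value) for metric, value in field_data.items()}


# Hypothetical field data for a product page template.
PRODUCT_TEMPLATE = {"LCP": 3.1, "INP": 180, "CLS": 0.31}
```

Rating per template rather than per URL mirrors how Search Console groups URLs, and keeps the fix list short.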


9. Mobile

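  • Key page templates pass a mobile usability check in Lighthouse or Chrome DevTools device emulation (Google retired its standalone Mobile-Friendly Test in December 2023)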
  • Viewport meta tag is present on all pages
  • Content is the same on mobile and desktop. Mobile-first indexing means Google indexes the mobile version
  • No content hidden behind expandable elements inaccessible to Googlebot
  • No intrusive interstitials that would trigger a mobile usability penalty (full-screen popups on page load)
  • Tap targets are appropriately sized (Google recommends 48x48px minimum, 8px spacing) [1]
  • Text is readable without zooming: no font sizes below 12px for body text

For more, see Mobile SEO.


10. Structured data

  • Structured data is present on appropriate page types: articles, products, local business, breadcrumbs. Note: FAQPage schema no longer produces rich results in Google Search; it retains value for AI extraction only. Sitelinks Search Box (SearchAction on WebSite) has been deprecated and no longer produces a visual result
  • No structured data errors in Search Console (Rich Results report)
  • Structured data validated with Google’s Rich Results Test for key templates
  • No structured data present for content not visible to users. Hidden structured data violates Google’s guidelines
  • Organization schema on homepage with correct name, url, logo, and sameAs social profiles
  • BreadcrumbList schema matches visible breadcrumb navigation
  • Article schema includes datePublished and dateModified

For more, see Structured data.
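
The property checks above (datePublished and dateModified on Article; name, url, logo, and sameAs on the organisation markup) can be sketched as a JSON-LD scan. This is an illustration with hypothetical names; it reads only JSON-LD, not Microdata or RDFa, and it is not a substitute for the Rich Results Test. The type is spelled "Organization" in code because that is the schema.org identifier.

```python
import json
from html.parser import HTMLParser

# Properties this checklist expects per schema.org type.
REQUIRED = {
    "Article": ["datePublished", "dateModified"],
    "Organization": ["name", "url", "logo", "sameAs"],
}


class JsonLdParser(HTMLParser):
    """Extract and decode every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []
        self.invalid_blocks = 0

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                self.invalid_blocks += 1


def missing_properties(html):
    """Return {type: [missing property, ...]} for the types in REQUIRED."""
    parser = JsonLdParser()
    parser.feed(html)
    report = {}
    for block in parser.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            item_type = item.get("@type") if isinstance(item, dict) else None
            if not isinstance(item_type, str):
                continue  # Multi-typed or untyped items are skipped here.
            missing = [p for p in REQUIRED.get(item_type, []) if p not in item]
            if missing:
                report[item_type] = missing
    return report


# Hypothetical article page markup.
ARTICLE_PAGE = """
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "headline": "Trail shoes", "datePublished": "2024-01-10"}
</script>
"""
```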


11. Page speed and performance

  • Server response time (TTFB) is under 800ms for key pages
  • Images are compressed and served in WebP or AVIF format
  • Images are correctly sized for their display dimensions. Avoid serving 2000px images in a 400px slot
  • Images have descriptive alt text. Empty alt on informational images prevents image search indexation and fails accessibility requirements
  • Lazy loading is applied to below-the-fold images (loading="lazy")
  • JavaScript is deferred or asynced where appropriate
  • Third-party scripts (analytics, chat, ads) are loaded asynchronously. They should not block rendering
  • A CDN is in use. It reduces latency for geographically distributed users
  • Browser caching headers are set appropriately. Static assets should be cached per your TTL strategy
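
Several of the image checks in this checklist (explicit width and height, alt text, lazy loading that does not conflict with fetchpriority) can be run against a template's HTML. A minimal sketch with hypothetical names; it flags only a missing alt attribute, because an empty alt is valid on decorative images and needs human judgement.

```python
from html.parser import HTMLParser


class ImageAuditParser(HTMLParser):
    """Collect (src, [problems]) for every <img> with at least one finding."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = {name: (value or "") for name, value in attrs}
        problems = []
        if "width" not in attr or "height" not in attr:
            problems.append("missing width/height")
        if "alt" not in attr:
            problems.append("missing alt attribute")
        if attr.get("loading") == "lazy" and attr.get("fetchpriority") == "high":
            problems.append("lazy-loaded despite fetchpriority=high")
        if problems:
            self.findings.append((attr.get("src") or "(no src)", problems))


def audit_images(html):
    """Return a list of (src, [problems]) tuples for the given HTML."""
    parser = ImageAuditParser()
    parser.feed(html)
    return parser.findings


# Hypothetical template markup.
TEMPLATE_HTML = (
    '<img src="/hero.jpg" width="1200" height="600" alt="Trail runner" fetchpriority="high">'
    '<img src="/thumb.jpg" loading="lazy">'
    '<img src="/banner.jpg" width="800" height="200" alt="" loading="lazy" fetchpriority="high">'
)
```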

12. International (if applicable)

  • hreflang tags are implemented on all language/region variants
  • All hreflang tags are reciprocated across all versions in the cluster
  • hreflang tags use correct ISO 639-1 language codes and ISO 3166-1 alpha-2 region codes
  • hreflang URLs resolve to 200 responses (not redirects or 404s)
  • x-default hreflang is set
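  • Geo-targeting signals are consistent: hreflang, URL structure (ccTLD, subdomain, or subfolder), and localised content all point to the same market. Search Console's International Targeting report has been retired, so country targeting can no longer be set there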
  • Content is genuinely localised. Machine-translated content must be reviewed before publishing

For more, see Hreflang and international SEO and the International SEO guide.
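
Reciprocity is the hreflang check most worth automating, because one missing return link invalidates the pair. A minimal sketch, assuming the hreflang annotations have been exported as a dictionary of page URL to its declared alternates (the data and function name are hypothetical):

```python
def missing_return_links(hreflang_map):
    """Return sorted (page, alternate) pairs where alternate does not link back.

    hreflang_map maps each page URL to {hreflang code: alternate URL}.
    """
    missing = []
    for page, alternates in hreflang_map.items():
        for alternate in set(alternates.values()):
            if alternate == page:
                continue  # Self-references need no return link.
            return_targets = hreflang_map.get(alternate, {}).values()
            if page not in return_targets:
                missing.append((page, alternate))
    return sorted(missing)


# Hypothetical cluster: the German page does not reference the English one.
HREFLANG = {
    "https://example.com/en/": {
        "en": "https://example.com/en/",
        "de": "https://example.com/de/",
        "x-default": "https://example.com/en/",
    },
    "https://example.com/de/": {
        "de": "https://example.com/de/",
    },
}
```

An alternate that is absent from the map altogether is also reported, which catches hreflang URLs that were never crawled or no longer exist.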


13. JavaScript rendering

  • Key content (headings, body text, links) is present in the HTML source. JavaScript should not be the only source of content
  • Compare View Source against the rendered DOM in DevTools. Content that exists only in the rendered DOM depends on Google executing JavaScript, so confirm it also appears in the rendered HTML that URL Inspection reports
  • Navigation links are crawlable <a href> elements. Avoid JavaScript onClick handlers without href
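  • URL Inspection (“View crawled page” or “Test live URL”) shows the correct rendered content for key pages. Google’s cached page view has been retired, so it can no longer be used for this check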
  • Internal links in JavaScript-rendered content use standard <a> tags

For more, see JavaScript SEO.
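
The source-versus-rendered comparison can be sketched for links specifically. The example assumes you have saved two HTML strings per page: the raw response (View Source) and the rendered DOM (copied from DevTools or a headless browser). The names and markup are hypothetical.

```python
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collect href values and count <a> elements that have no href."""

    def __init__(self):
        super().__init__()
        self.hrefs = set()
        self.anchors_without_href = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            self.hrefs.add(href)
        else:
            self.anchors_without_href += 1


def compare_links(source_html, rendered_html):
    """Report links that exist only after rendering, and uncrawlable anchors."""
    source, rendered = LinkParser(), LinkParser()
    source.feed(source_html)
    rendered.feed(rendered_html)
    return {
        "js_only_links": sorted(rendered.hrefs - source.hrefs),
        "anchors_without_href": rendered.anchors_without_href,
    }


# Hypothetical page: navigation is server-rendered, the app area is not.
SOURCE_HTML = '<nav><a href="/shoes">Shoes</a></nav><div id="app"></div>'
RENDERED_HTML = (
    '<nav><a href="/shoes">Shoes</a></nav>'
    '<div id="app"><a href="/sale">Sale</a>'
    '<a onclick="go()">About</a></div>'
)
```

Links listed under js_only_links are discoverable only if rendering succeeds; anchors without href are not followed as links at all.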


14. Log file analysis (if access available)

  • Googlebot is crawling key pages. Confirm in server logs filtered by Googlebot user agent
  • Googlebot is not crawling large volumes of low-value URLs. Parameterised, session-based, and internal search URLs should be blocked
  • Crawl frequency on key pages aligns with update frequency. High-priority pages should be crawled regularly
  • Logs reviewed for 5xx errors that crawl tools did not surface. Some are transient but still worth monitoring
  • No suspicious non-Google bots consuming significant crawl bandwidth

For more, see Log file analysis.
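
A first pass over the logs can be sketched with a regular expression. This example assumes the common "combined" access log format; your server's format may differ, and the sample lines, names, and IP addresses are made up. It filters on the user-agent string only, which can be spoofed, so verify suspicious traffic with a reverse DNS lookup before drawing conclusions.

```python
import re
from collections import Counter

# Combined log format: ip ident user [time] "method path protocol" status bytes "referer" "agent"
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_summary(log_lines):
    """Return (hits, errors): Counters of Googlebot requests and 5xx per path."""
    hits = Counter()
    errors = Counter()
    for line in log_lines:
        match = LOG_PATTERN.match(line)
        if not match or "Googlebot" not in match["agent"]:
            continue
        hits[match["path"]] += 1
        if match["status"].startswith("5"):
            errors[match["path"]] += 1
    return hits, errors


# Hypothetical log lines.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LOG = [
    f'66.249.66.1 - - [10/Jan/2024:10:00:00 +0000] "GET /shoes HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Jan/2024:10:00:05 +0000] "GET /search?q=red HTTP/1.1" 200 830 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Jan/2024:10:00:09 +0000] "GET /checkout HTTP/1.1" 503 0 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.7 - - [10/Jan/2024:10:00:11 +0000] "GET /shoes HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]
```

Sorting the hits counter by volume shows quickly whether Googlebot spends its time on key pages or on parameterised and internal search URLs.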


Prioritising findings

Not all technical issues are equal. Prioritise by impact:

Critical (fix immediately):

  • Crawl blocks on pages that should be indexed
  • noindex on pages that should rank
  • 5xx errors on key pages
  • Redirect loops

High (fix in current sprint):

  • Redirect chains of 3+ hops
  • Broken internal links to important pages
  • Canonical tags pointing to wrong URLs
  • Core Web Vitals failures on high-traffic templates

Medium (schedule):

  • Orphan pages with some value
  • Missing structured data on eligible page types
  • Mixed content warnings
  • Large image files not yet converted to WebP

Low (backlog):

  • Minor crawl inefficiencies on low-traffic sections
  • Structured data enhancements (not errors)
  • Crawl depth improvements beyond 4 clicks

Footnotes

  1. Tap targets are not sized appropriately — Chrome for Developers