Technical SEO Audit Checklist

Run this checklist against an existing live site to identify technical SEO issues. For a new site or redesign going live, use the Go-Live Checklist instead.

Before starting: crawl the site with Screaming Frog, Sitebulb, or a similar tool, and export the crawl data; most checks below reference it. Also open Google Search Console; you will need the Page indexing report (formerly Coverage), the Core Web Vitals report, and the URL Inspection tool.


1. Crawlability

  • robots.txt is accessible at /robots.txt and returns a 200 status code
  • robots.txt does not block any pages that should be indexed (check key page types: homepage, category pages, key landing pages)
  • robots.txt does not block CSS or JavaScript files Googlebot needs to render pages
  • No accidental Disallow: / in production robots.txt
  • Crawl budget is not wasted on low-value URLs: parameterised URLs, faceted navigation, session IDs, internal search results
  • Orphan pages (pages with no internal links) are either removed or linked internally
  • Crawl depth: key pages are reachable within 3–4 clicks from the homepage
  • No crawler traps: infinite pagination, calendar archives generating unlimited URLs, or session-based URL parameters creating unique URLs per visit
  • XML sitemap is declared in robots.txt via Sitemap: directive
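
The robots.txt checks above can be partly scripted. The following is a minimal sketch using Python’s standard urllib.robotparser; the robots.txt content, the example.com URLs, and the function names are hypothetical. Note that the standard-library parser uses simple prefix matching and does not replicate Googlebot’s wildcard handling, so treat its output as a first pass, not a verdict.

```python
from urllib.robotparser import RobotFileParser


def _parse(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that robots.txt disallows for the given user agent."""
    parser = _parse(robots_txt)
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


def declared_sitemaps(robots_txt):
    """Return the Sitemap: entries declared in robots.txt (empty if none)."""
    return _parse(robots_txt).site_maps() or []


# Hypothetical robots.txt and key page URLs, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /search
Disallow: /cart/
Sitemap: https://example.com/sitemap.xml
"""

KEY_URLS = [
    "https://example.com/",
    "https://example.com/category/shoes",
    "https://example.com/search?q=boots",
]

if __name__ == "__main__":
    print("Blocked:", blocked_urls(SAMPLE_ROBOTS, KEY_URLS))
    print("Sitemaps:", declared_sitemaps(SAMPLE_ROBOTS))
```

Feeding the list of key page types (homepage, category pages, landing pages) through blocked_urls gives a quick list of anything that should be indexed but is disallowed.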

2. Indexation

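  • Google Search Console Page indexing report (formerly Coverage) reviewed: note every “Not indexed” reason and how many URLs it affects (older exports group these as “Error”, “Valid with warning”, and “Excluded”)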
  • Key pages are indexed — spot-check with URL Inspection in Search Console
  • No important pages in “Crawled — currently not indexed” state (indicates a quality or duplication issue)
  • No important pages in “Discovered — currently not indexed” state (may indicate crawl budget or internal linking issue)
  • No important pages with noindex tag (check crawl export for noindex on pages that should rank)
  • No important pages blocked by robots.txt but still appearing in the index (URL without description in SERPs)
  • Paginated pages handled correctly — paginated content is either indexable or canonicalised
  • Faceted navigation / filter URLs are either blocked, canonicalised, or have noindex as appropriate
  • No soft 404s: pages returning 200 with “not found” or empty content (check the “Soft 404” reason in the Search Console Page indexing report)
  • Index count is plausible relative to actual page count (site:example.com estimate as a rough check)
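
If the crawl tool does not already report it, the noindex check can be sketched with the standard library alone. This example assumes the page HTML and any X-Robots-Tag response header value have already been fetched; the class and function names are illustrative.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in {"robots", "googlebot"}:
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html, x_robots_tag=""):
    """True if the page is excluded from indexing via meta tag or header.

    "none" is treated as noindex because it is shorthand for
    "noindex, nofollow".
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    header = {part.strip().lower() for part in x_robots_tag.split(",")}
    found = parser.directives | header
    return "noindex" in found or "none" in found


if __name__ == "__main__":
    page = '<head><meta name="robots" content="noindex, follow"></head>'
    print(has_noindex(page))
```

Run it over the pages that should rank and treat any True result as a critical finding.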

3. Canonical tags

  • All key pages have a self-referencing canonical tag
  • Canonical tags use absolute URLs (not relative paths)
  • No canonical tags pointing to redirecting URLs
  • No canonical tags pointing to non-200 pages
  • Canonical tags are consistent with hreflang tags (hreflang must reference canonical URLs)
  • Paginated pages: page 2+ either have self-canonicals or canonical back to page 1 (if consolidation is intended)
  • www vs. non-www is consistent — one resolves to the other via 301, and canonicals reflect the chosen version
  • HTTPS vs. HTTP is consistent — all canonicals use HTTPS
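
Several of the canonical checks can be applied to a single page’s HTML. The sketch below is one possible approach using only the standard library; the issue labels and function names are invented for this example, and it does not follow the canonical target to test for redirects or non-200 responses.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonicals.append(attr.get("href", "").strip())


def canonical_issues(page_url, html):
    """Return a list of human-readable problems with the page's canonical tag."""
    parser = CanonicalParser()
    parser.feed(html)
    if not parser.canonicals:
        return ["no canonical tag found"]

    issues = []
    if len(parser.canonicals) > 1:
        issues.append("multiple canonical tags")

    href = parser.canonicals[0]
    parts = urlsplit(href)
    if not parts.scheme or not parts.netloc:
        issues.append("canonical is not an absolute URL")
    elif parts.scheme != "https":
        issues.append("canonical does not use HTTPS")

    # Resolve relative hrefs before comparing, so "/a" on /a counts as self-referencing.
    if urljoin(page_url, href) != page_url:
        issues.append("canonical points to a different URL (confirm this is intended)")
    return issues


if __name__ == "__main__":
    html = '<link rel="canonical" href="http://example.com/a">'
    print(canonical_issues("https://example.com/a", html))
```

A canonical that points elsewhere is not always wrong (consolidation may be intended), which is why that label asks for confirmation rather than reporting an error.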

4. Redirects

  • No redirect chains of 2 or more hops (A → B → C should be collapsed to A → C)
  • No redirect loops (A → B → A)
  • All redirects are 301 (permanent) unless intentionally temporary
  • 302s are reviewed — confirm each is genuinely temporary, not a permanent move misconfigured as temporary
  • Internal links point directly to final destination URLs, not through redirects
  • Old URLs from previous migrations are still redirecting correctly
  • No broken redirect destinations (redirecting to a 404)
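
Chains and loops are easy to detect once the redirects are in a source-to-target map, for example built from the crawl export. The sketch below assumes such a map already exists; the paths and function names are hypothetical. It flags any path of two or more hops as a chain.

```python
def trace_redirect(start, redirects, max_hops=10):
    """Follow `start` through the redirect map.

    Returns (path, is_loop), where path lists every URL visited in order.
    """
    path = [start]
    seen = {start}
    current = start
    while current in redirects and len(path) <= max_hops:
        current = redirects[current]
        path.append(current)
        if current in seen:
            return path, True  # revisited a URL: this is a loop
        seen.add(current)
    return path, False


def audit_redirects(redirects):
    """Split redirect sources into chains (2+ hops) and loops."""
    chains, loops = [], []
    for source in redirects:
        path, is_loop = trace_redirect(source, redirects)
        if is_loop:
            loops.append(path)
        elif len(path) - 1 >= 2:
            chains.append(path)
    return chains, loops


# Hypothetical redirect map: source URL -> redirect target.
SAMPLE_REDIRECTS = {
    "/a": "/b",
    "/b": "/c",
    "/x": "/y",
    "/y": "/x",
    "/old": "/new",
}

if __name__ == "__main__":
    chains, loops = audit_redirects(SAMPLE_REDIRECTS)
    print("Chains:", chains)
    print("Loops:", loops)
```

The last element of each chain is the final destination, which is the URL internal links and the first redirect should point to directly.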

5. Status codes

  • No important pages returning 4xx (check crawl export, filter by 4xx)
  • No important pages returning 5xx
  • 404 pages return a true 404 status code (not a soft 404 returning 200)
  • No important resources (CSS, JS, images) returning 4xx — check by filtering crawl by resource type
  • Search Console Page indexing report: URLs listed under “Not found (404)” reviewed and resolved or accepted
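
Filtering the crawl export by status class can be done in a spreadsheet, or with a short script like the sketch below. The column names "Address" and "Status Code" are assumptions about the export format; adjust them to match the tool in use.

```python
import csv
import io
from collections import defaultdict


def group_errors(crawl_csv, url_col="Address", status_col="Status Code"):
    """Group 4xx and 5xx rows from a crawl export by status class."""
    errors = defaultdict(list)
    for row in csv.DictReader(io.StringIO(crawl_csv)):
        raw = (row.get(status_col) or "").strip()
        if not raw.isdigit():
            continue  # skip rows with no numeric status (timeouts, blocked URLs)
        status = int(raw)
        if 400 <= status < 600:
            errors[f"{status // 100}xx"].append((row[url_col], status))
    return dict(errors)


# Hypothetical export contents, for illustration only.
SAMPLE_EXPORT = """\
Address,Status Code
https://example.com/,200
https://example.com/old-page,404
https://example.com/api/report,500
https://example.com/gone,410
"""

if __name__ == "__main__":
    for status_class, rows in group_errors(SAMPLE_EXPORT).items():
        print(status_class, rows)
```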

6. Site architecture and internal linking

  • Navigation links to all top-level sections
  • Important pages receive internal links from relevant, high-authority pages on the site
  • No key pages are orphaned (zero internal links pointing to them)
  • Breadcrumbs are present and correctly structured
  • Anchor text in internal links is descriptive and varied — not all “click here” or exact-match keyword repetition
  • No broken internal links (crawl export filtered by 4xx on internal links)
  • Pagination: multi-page sections are linked sequentially
  • Site hierarchy is logical — category pages link to subcategories, subcategories link to content
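
Orphan detection and click depth both come down to a breadth-first walk of the internal link graph. The sketch below assumes the graph has been built from the crawl export and that a list of known URLs (for example from the XML sitemap) is available; all names and paths are hypothetical. It treats any known URL that cannot be reached from the homepage as an orphan, which is slightly broader than "zero internal links".

```python
from collections import deque


def crawl_depths(links, home):
    """Breadth-first search: minimum number of clicks from `home` to each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphans_and_deep_pages(links, known_urls, home, max_depth=4):
    """Return (unreachable known URLs, pages deeper than max_depth)."""
    depths = crawl_depths(links, home)
    orphans = sorted(url for url in known_urls if url not in depths)
    deep = sorted(url for url, depth in depths.items() if depth > max_depth)
    return orphans, deep


# Hypothetical link graph: page -> pages it links to.
SAMPLE_LINKS = {
    "/": ["/cat"],
    "/cat": ["/cat/a"],
    "/cat/a": ["/cat/a/1"],
    "/island": ["/"],
}
SAMPLE_KNOWN = {"/", "/cat", "/cat/a", "/cat/a/1", "/island"}

if __name__ == "__main__":
    print(orphans_and_deep_pages(SAMPLE_LINKS, SAMPLE_KNOWN, "/", max_depth=2))
```

In the sample, /island links out but nothing links to it, so it is reported as an orphan, and /cat/a/1 sits three clicks deep.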

7. HTTPS and security

  • SSL certificate is valid (not expired, not self-signed)
  • HTTP redirects to HTTPS via 301 (not 302)
  • www and non-www both redirect to the canonical version
  • No mixed content (HTTP resources loaded on HTTPS pages) — check browser console on key pages
  • HSTS header is set (Strict-Transport-Security)
  • No security warnings in Chrome for the domain
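
As a supplement to checking the browser console, mixed content in static HTML can be flagged with a small parser. This is a rough sketch under clear limits: it looks only at a handful of tag and attribute pairs chosen for this example, and it does not inspect srcset, inline CSS, or resources requested by scripts.

```python
from html.parser import HTMLParser

# Tag -> attribute that triggers a resource load (an illustrative subset).
RESOURCE_ATTRS = {
    "img": "src",
    "script": "src",
    "iframe": "src",
    "source": "src",
    "video": "src",
    "audio": "src",
    "link": "href",
}
# <link> only loads a resource for certain rel values; canonical and
# alternate links, for example, are references and not mixed content.
LOADED_LINK_RELS = {"stylesheet", "icon", "preload", "modulepreload"}


class MixedContentParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        attr_name = RESOURCE_ATTRS.get(tag)
        if attr_name is None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if tag == "link":
            rels = set(attr.get("rel", "").lower().split())
            if not rels & LOADED_LINK_RELS:
                return
        url = attr.get(attr_name, "").strip()
        if url.lower().startswith("http://"):
            self.insecure.append((tag, url))


def find_mixed_content(html):
    """Return (tag, url) pairs for resources loaded over plain HTTP."""
    parser = MixedContentParser()
    parser.feed(html)
    return parser.insecure


# Hypothetical page fragment, for illustration only.
SAMPLE_PAGE = """
<link rel="canonical" href="http://example.com/page">
<link rel="stylesheet" href="http://cdn.example.com/site.css">
<img src="http://example.com/hero.jpg" alt="">
<img src="https://example.com/logo.png" alt="">
<a href="http://example.org/">External link</a>
"""

if __name__ == "__main__":
    print(find_mixed_content(SAMPLE_PAGE))
```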

8. Core Web Vitals

  • Search Console Core Web Vitals report reviewed — note which URL groups fail on mobile and desktop
  • LCP is 2.5 seconds or less at the 75th percentile (check PageSpeed Insights field data for key templates)
  • INP is 200ms or less at the 75th percentile
  • CLS is 0.1 or less at the 75th percentile
  • No render-blocking resources identified in Lighthouse that have not been addressed
  • LCP image (if applicable) has fetchpriority="high" and is not lazy-loaded
  • Images have explicit width and height attributes set
  • Web fonts are preloaded or served with font-display: optional or font-display: swap

For specific fixes, see the Core Web Vitals Optimisation Guide.
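
When reviewing field data for many templates, it can help to classify each 75th-percentile value against Google’s published thresholds. The sketch below is one way to do that; the metric keys and function names are made up for this example.

```python
# (good upper bound, needs-improvement upper bound) per metric.
THRESHOLDS = {
    "lcp_ms": (2500, 4000),
    "inp_ms": (200, 500),
    "cls": (0.1, 0.25),
}


def rate(metric, p75):
    """Classify a 75th-percentile field value as good / needs improvement / poor."""
    good, needs_improvement = THRESHOLDS[metric]
    if p75 <= good:
        return "good"
    if p75 <= needs_improvement:
        return "needs improvement"
    return "poor"


def failing_metrics(field_data):
    """Return the metrics in a template's field data that are not rated good."""
    return {m: rate(m, v) for m, v in field_data.items() if rate(m, v) != "good"}


if __name__ == "__main__":
    # Hypothetical p75 values for one page template.
    template = {"lcp_ms": 3100, "inp_ms": 180, "cls": 0.31}
    print(failing_metrics(template))
```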


9. Mobile

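  • Key page templates pass a mobile usability review in Lighthouse or Chrome DevTools device emulation (Google retired its standalone Mobile-Friendly Test in December 2023)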
  • Viewport meta tag is present on all pages
  • Content is the same on mobile and desktop (mobile-first indexing — Google indexes the mobile version)
  • No content hidden behind expandable elements that Googlebot cannot access
  • No intrusive interstitials that would trigger a mobile usability penalty (full-screen popups on page load)
  • Tap targets are appropriately sized (Google recommends 48x48px minimum, 8px spacing)
  • Text is readable without zooming — no font sizes below 12px for body text

10. Structured data

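  • Structured data is present on appropriate page types: articles, products, FAQs, local business, breadcrumbs (Google has retired the sitelinks search box and shows FAQ rich results only for a limited set of sites, so confirm current eligibility in Google’s documentation before prioritising these)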
  • No structured data errors in Search Console (Rich Results report)
  • Structured data validated with Google’s Rich Results Test for key templates
  • No structured data present for content not visible to users (hidden structured data is against Google’s guidelines)
  • Organization schema (schema.org uses the US spelling for the type name) on homepage with correct name, url, logo, and sameAs social profiles
  • BreadcrumbList schema matches visible breadcrumb navigation
  • Article schema includes datePublished and dateModified
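
The Article date check can be spot-checked offline by extracting JSON-LD from a page. The following is a simplified sketch: it handles top-level objects and arrays only, ignores @graph nesting and list-valued @type, and the set of article types is an assumption made for this example. It does not replace the Rich Results Test.

```python
import json
from html.parser import HTMLParser

ARTICLE_TYPES = {"Article", "NewsArticle", "BlogPosting"}
EXPECTED_DATES = ("datePublished", "dateModified")


class JsonLdParser(HTMLParser):
    """Collects parsed JSON-LD blocks from <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._buffer = []
        self.blocks = []
        self.invalid = 0  # count of blocks that failed to parse as JSON

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._inside = True
            self._buffer = []

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._inside:
            self._inside = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                self.invalid += 1


def missing_article_dates(html):
    """Return the date properties missing from Article-type JSON-LD items."""
    parser = JsonLdParser()
    parser.feed(html)
    missing = []
    for block in parser.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            item_type = item.get("@type") if isinstance(item, dict) else None
            if isinstance(item_type, str) and item_type in ARTICLE_TYPES:
                missing.extend(f for f in EXPECTED_DATES if f not in item)
    return missing


# Hypothetical page fragment, for illustration only.
SAMPLE_HTML = """
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "headline": "Example", "datePublished": "2024-01-15"}
</script>
"""

if __name__ == "__main__":
    print(missing_article_dates(SAMPLE_HTML))
```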

11. Page speed and performance

  • Server response time (TTFB) is under 800ms for key pages
  • Images are compressed and served in WebP or AVIF format
  • Images are correctly sized for their display dimensions (not serving 2000px images in a 400px slot)
  • Lazy loading is applied to below-the-fold images (loading="lazy")
  • Non-critical JavaScript is loaded with defer or async where appropriate
  • Third-party scripts (analytics, chat, ads) are loaded asynchronously and do not block rendering
  • A CDN is in use to reduce latency for geographically distributed users
  • Browser caching headers are set appropriately for static assets
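
For the caching check, a script can flag static assets whose Cache-Control header is missing or short-lived. The sketch below assumes the header values have already been collected, for example from the crawl export. The seven-day minimum is an arbitrary threshold chosen for this example, not a figure from this checklist.

```python
import re

MAX_AGE = re.compile(r"(?:^|,)\s*max-age\s*=\s*(\d+)", re.IGNORECASE)
ONE_WEEK = 7 * 24 * 3600  # assumed minimum lifetime for static assets


def max_age_seconds(cache_control):
    """Return the max-age value in seconds, or None if the directive is absent."""
    match = MAX_AGE.search(cache_control)
    return int(match.group(1)) if match else None


def weakly_cached(assets, min_seconds=ONE_WEEK):
    """Return asset URLs with no-store, no max-age, or a max-age below the minimum.

    `assets` maps each URL to its Cache-Control header value ("" if missing).
    """
    flagged = []
    for url, header in assets.items():
        directives = {part.strip().lower() for part in header.split(",")}
        age = max_age_seconds(header)
        if "no-store" in directives or age is None or age < min_seconds:
            flagged.append(url)
    return flagged


# Hypothetical assets and header values, for illustration only.
SAMPLE_ASSETS = {
    "/static/app.9f2c.js": "public, max-age=31536000, immutable",
    "/static/site.css": "max-age=3600",
    "/img/logo.svg": "",
    "/static/data.json": "no-store",
}

if __name__ == "__main__":
    print(weakly_cached(SAMPLE_ASSETS))
```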

12. International (if applicable)

  • hreflang tags are implemented on all language/region variants
  • All hreflang tags are reciprocated across all versions in the cluster
  • hreflang tags use correct ISO 639-1 language codes and ISO 3166-1 alpha-2 region codes
  • hreflang URLs resolve to 200 responses (not redirects or 404s)
  • x-default hreflang is set
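  • Geo-targeting signals are consistent: Search Console’s International Targeting report has been retired, so rely on hreflang, ccTLDs or clearly separated subfolders/subdomains, and localised content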
  • Content is genuinely localised, not machine-translated without review
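
Reciprocity is the hreflang check most worth scripting, because a missing return link is hard to spot by eye. The sketch below assumes the hreflang annotations have been extracted into a mapping of page URL to its declared alternates; the URLs and function name are hypothetical.

```python
def missing_return_links(hreflang_map):
    """Find alternates that do not link back to the page that references them.

    `hreflang_map` maps each page URL to {hreflang code: alternate URL}.
    Returns (page, alternate) pairs where `alternate` lacks a return link
    to `page`, or was not crawled at all.
    """
    missing = []
    for page, alternates in hreflang_map.items():
        for alt_url in alternates.values():
            if alt_url == page:
                continue  # self-reference needs no return link
            back_links = hreflang_map.get(alt_url, {})
            if page not in back_links.values():
                missing.append((page, alt_url))
    return missing


# Hypothetical cluster: the German page does not reference the English one.
SAMPLE_CLUSTER = {
    "https://example.com/en/": {
        "en": "https://example.com/en/",
        "de": "https://example.com/de/",
    },
    "https://example.com/de/": {
        "de": "https://example.com/de/",
    },
}

if __name__ == "__main__":
    print(missing_return_links(SAMPLE_CLUSTER))
```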

13. JavaScript rendering

  • Key content (headings, body text, links) is present in the HTML source, not only injected by JavaScript
  • Compare View Source against the rendered DOM in DevTools: note any key content that appears only in the rendered DOM, since it depends on Google’s rendering step to be indexed
  • Navigation links are crawlable <a href> elements, not JavaScript onClick handlers without href
  • Google’s rendered view of key pages (URL Inspection > View crawled page; Google’s public cache has been retired) reflects the correct rendered content
  • Internal links in JavaScript-rendered content use standard <a> tags
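
The source-versus-rendered comparison can be scripted for links. The sketch below assumes two HTML strings are already available: the raw server response and the rendered DOM saved from a headless browser or DevTools. The helper names are illustrative only.

```python
from html.parser import HTMLParser


class AnchorParser(HTMLParser):
    """Collects crawlable hrefs and counts <a> elements without a usable href."""

    def __init__(self):
        super().__init__()
        self.hrefs = set()
        self.uncrawlable = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = (dict(attrs).get("href") or "").strip()
        if not href or href == "#" or href.lower().startswith("javascript:"):
            self.uncrawlable += 1
        else:
            self.hrefs.add(href)


def parse_anchors(html):
    parser = AnchorParser()
    parser.feed(html)
    return parser


def links_only_after_rendering(source_html, rendered_html):
    """Return hrefs present in the rendered DOM but absent from the raw source."""
    rendered = parse_anchors(rendered_html).hrefs
    source = parse_anchors(source_html).hrefs
    return sorted(rendered - source)


# Hypothetical raw and rendered HTML for the same page.
SOURCE_HTML = '<nav><a href="/about">About</a><a onclick="go()">Shop</a></nav>'
RENDERED_HTML = (
    '<nav><a href="/about">About</a><a onclick="go()">Shop</a></nav>'
    '<main><a href="/products/shoes">Shoes</a></main>'
)

if __name__ == "__main__":
    print(links_only_after_rendering(SOURCE_HTML, RENDERED_HTML))
    print(parse_anchors(SOURCE_HTML).uncrawlable)
```

Links that appear only after rendering are not necessarily a problem, but they rely on Google executing the page’s JavaScript, so key navigation should not depend on them.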

14. Log file analysis (if access available)

  • Googlebot is crawling key pages (confirm in server logs filtered by Googlebot user agent)
  • Googlebot is not crawling large volumes of low-value URLs (parameterised, session-based, internal search)
  • Crawl frequency on key pages aligns with update frequency — high-priority pages being crawled regularly
  • Logs checked for 5xx errors that crawl tools did not surface (some are transient, but recurring ones are worth monitoring)
  • No suspicious non-Google bots consuming significant crawl bandwidth
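
A first pass over the logs can be sketched as follows, assuming the server writes the common "combined" access log format; adjust the pattern if the format differs. The sample lines are invented. Matching on the user agent string alone is not proof of identity, since it can be spoofed, so confirm suspicious traffic with a reverse DNS lookup before drawing conclusions.

```python
import re
from collections import Counter

# Matches the "combined" access log format used by default in Apache and nginx.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests per path and per status for lines claiming to be Googlebot."""
    paths, statuses = Counter(), Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match or "Googlebot" not in match["agent"]:
            continue
        paths[match["path"]] += 1
        statuses[match["status"]] += 1
    return paths, statuses


# Invented log lines, for illustration only.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LOG = [
    f'66.249.66.1 - - [10/Oct/2024:13:55:36 +0000] "GET /products/shoes HTTP/1.1" 200 5123 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Oct/2024:14:02:11 +0000] "GET /products/shoes HTTP/1.1" 200 5123 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Oct/2024:14:05:40 +0000] "GET /search?q=boots HTTP/1.1" 503 0 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.9 - - [10/Oct/2024:14:06:02 +0000] "GET / HTTP/1.1" 200 812 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

if __name__ == "__main__":
    paths, statuses = googlebot_hits(SAMPLE_LOG)
    print(paths.most_common(5))
    print(statuses)
```

Sorting the path counter shows where Googlebot spends its requests, and the status counter surfaces 5xx responses that a one-off crawl may have missed.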

Prioritising findings

Not all technical issues are equal. Prioritise by impact:

Critical (fix immediately):

  • Crawl blocks on pages that should be indexed
  • noindex on pages that should rank
  • 5xx errors on key pages
  • Redirect loops

High (fix in current sprint):

  • Redirect chains of 3+ hops
  • Broken internal links to important pages
  • Canonical tags pointing to wrong URLs
  • Core Web Vitals failures on high-traffic templates

Medium (schedule):

  • Orphan pages with some value
  • Missing structured data on eligible page types
  • Mixed content warnings
  • Large image files not yet converted to WebP or AVIF

Low (backlog):

  • Minor crawl inefficiencies on low-traffic sections
  • Structured data enhancements (not errors)
  • Reducing crawl depth for non-key pages that sit more than 4 clicks from the homepage