Technical SEO Audit Checklist
Last updated
Run this checklist against an existing live site to identify technical SEO issues. For a new site or redesign going live, use the Go-Live Checklist instead.
Before starting: crawl the site with Screaming Frog, Sitebulb, or a similar tool. Export the crawl data — most checks below reference it. Also open Google Search Console; you will need the Coverage, Core Web Vitals, and URL Inspection reports.
1. Crawlability
- robots.txt is accessible at `/robots.txt` and returns a 200 status code
- robots.txt does not block any pages that should be indexed (check key page types: homepage, category pages, key landing pages)
- robots.txt does not block CSS or JavaScript files Googlebot needs to render pages
- No accidental `Disallow: /` in production robots.txt
- Crawl budget is not wasted on low-value URLs: parameterised URLs, faceted navigation, session IDs, internal search results
- Orphan pages (pages with no internal links) are either removed or linked internally
- Crawl depth: key pages are reachable within 3–4 clicks from the homepage
- No crawler traps: infinite pagination, calendar archives generating unlimited URLs, or session-based URL parameters creating unique URLs per visit
- XML sitemap is declared in robots.txt via the `Sitemap:` directive
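Several of the robots.txt checks above can be scripted against the raw file body. A minimal sketch (the function name and the simplified group handling are my own — a full parser should follow Google's robots.txt grouping rules exactly):

```python
def audit_robots_txt(body: str) -> list[str]:
    """Flag a blanket Disallow and a missing Sitemap: directive
    in a robots.txt body. Pure string handling — fetch the file
    however you normally would and pass the text in."""
    issues = []
    agents = []        # user-agents of the current group
    in_rules = False   # a rule line has started this group's rules
    saw_sitemap = False
    for raw in body.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if in_rules:                     # a new group starts
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field == "disallow":
            in_rules = True
            if value == "/" and ("*" in agents or "googlebot" in agents):
                issues.append("blanket 'Disallow: /' for " + ", ".join(agents))
        elif field in ("allow", "crawl-delay"):
            in_rules = True
        elif field == "sitemap":
            saw_sitemap = True
    if not saw_sitemap:
        issues.append("no Sitemap: directive")
    return issues
```

Running it against a production robots.txt should return an empty list; anything else is worth a manual look.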
2. Indexation
- Google Search Console Coverage report reviewed — note all “Error”, “Valid with warning”, and “Excluded” pages
- Key pages are indexed — spot-check with URL Inspection in Search Console
- No important pages in “Crawled — currently not indexed” state (indicates a quality or duplication issue)
- No important pages in “Discovered — currently not indexed” state (may indicate crawl budget or internal linking issue)
- No important pages with noindex tag (check crawl export for noindex on pages that should rank)
- No important pages blocked by robots.txt but still appearing in the index (URL without description in SERPs)
- Paginated pages handled correctly — paginated content is either indexable or canonicalised
- Faceted navigation / filter URLs are either blocked, canonicalised, or have noindex as appropriate
- No soft 404s — pages returning 200 with “not found” or empty content (check Search Console > Coverage > “Excluded” > “Soft 404”)
- Index count is plausible relative to actual page count (a `site:example.com` estimate as a rough check)
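The noindex check above can be automated over a crawl export of page HTML. A sketch using the standard library (class and function names are mine):

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect content values of <meta name="robots"/"googlebot"> tags."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = dict(attrs)
        if (a.get("name") or "").lower() in ("robots", "googlebot"):
            self.directives.append((a.get("content") or "").lower())

def is_noindexed(html: str) -> bool:
    """True if any robots meta tag in the page carries noindex."""
    p = RobotsMetaParser()
    p.feed(html)
    return any("noindex" in d for d in p.directives)
```

Note this only catches the meta tag — a noindex delivered via the `X-Robots-Tag` HTTP header needs a separate check on response headers.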
3. Canonical tags
- All key pages have a self-referencing canonical tag
- Canonical tags use absolute URLs (not relative paths)
- No canonical tags pointing to redirecting URLs
- No canonical tags pointing to non-200 pages
- Canonical tags are consistent with hreflang tags (hreflang must reference canonical URLs)
- Paginated pages: page 2+ either have self-canonicals or canonical back to page 1 (if consolidation is intended)
- www vs. non-www is consistent — one resolves to the other via 301, and canonicals reflect the chosen version
- HTTPS vs. HTTP is consistent — all canonicals use HTTPS
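The self-referencing and absolute-URL checks can be run per page from a crawl export. A sketch under the assumption that trailing-slash variants count as the same URL (names are mine):

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class CanonicalParser(HTMLParser):
    """Capture the href of <link rel="canonical">."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href")

def check_canonical(page_url: str, html: str) -> list[str]:
    """Flag missing, relative, or non-self-referencing canonicals."""
    p = CanonicalParser()
    p.feed(html)
    if p.canonical is None:
        return ["no canonical tag"]
    issues = []
    if not urlparse(p.canonical).scheme:
        issues.append(f"canonical is not absolute: {p.canonical}")
    elif p.canonical.rstrip("/") != page_url.rstrip("/"):
        issues.append(f"canonical differs from page URL: {p.canonical}")
    return issues
```

A canonical that differs from the page URL is not automatically wrong (duplicates should point elsewhere), so treat the second issue as a prompt for review rather than a failure.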
4. Redirects
- No multi-hop redirect chains — A → B → C should be collapsed to A → C
- No redirect loops (A → B → A)
- All redirects are 301 (permanent) unless intentionally temporary
- 302s are reviewed — confirm each is genuinely temporary, not a permanent move misconfigured as temporary
- Internal links point directly to final destination URLs, not through redirects
- Old URLs from previous migrations are still redirecting correctly
- No broken redirect destinations (redirecting to a 404)
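Chains and loops can both be detected from a crawl export's redirect pairs. A minimal sketch, assuming the export is reduced to a `{source: target}` mapping (the function name is mine):

```python
def flatten_redirects(redirects: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Given {source: target} redirect pairs, return the collapsed
    {source: final_target} map plus a list of sources caught in loops."""
    final, loops = {}, []
    for start in redirects:
        seen, url = {start}, redirects[start]
        while url in redirects:          # keep following while target also redirects
            if url in seen:              # revisited a URL: loop
                loops.append(start)
                break
            seen.add(url)
            url = redirects[url]
        else:
            final[start] = url           # no loop: record final destination
    return final, loops
```

The collapsed map doubles as the rewrite table for fixing internal links that currently point through redirects.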
5. Status codes
- No important pages returning 4xx (check crawl export, filter by 4xx)
- No important pages returning 5xx
- 404 pages return a true 404 status code (not a soft 404 returning 200)
- No important resources (CSS, JS, images) returning 4xx — check by filtering crawl by resource type
- Search Console > Coverage > Error > “Not found (404)” reviewed and resolved or accepted
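Soft 404s (item above) can be triaged in bulk with a crude content heuristic. The marker list and thresholds below are my own illustrative choices, not a Google-defined rule — expect false positives and review matches manually:

```python
# Phrases that often appear on error pages served with a 200 status.
SOFT_404_MARKERS = ("page not found", "nothing found", "no longer exists")

def looks_like_soft_404(status: int, body_text: str) -> bool:
    """Heuristic: a 200 response whose body is near-empty or reads
    like an error page is probably a soft 404."""
    if status != 200:
        return False
    text = body_text.strip().lower()
    return len(text) < 50 or any(m in text for m in SOFT_404_MARKERS)
```

Search Console's own Soft 404 report remains the authoritative source; this just helps find candidates Search Console has not crawled yet.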
6. Site architecture and internal linking
- Navigation links to all top-level sections
- Important pages receive internal links from relevant, high-authority pages on the site
- No key pages are orphaned (zero internal links pointing to them)
- Breadcrumbs are present and correctly structured
- Anchor text in internal links is descriptive and varied — not all “click here” or exact-match keyword repetition
- No broken internal links (crawl export filtered by 4xx on internal links)
- Pagination: multi-page sections are linked sequentially
- Site hierarchy is logical — category pages link to subcategories, subcategories link to content
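Orphan detection (item above) is a set operation over the crawl export's link graph. A sketch, assuming URLs are path-normalised and the homepage is `/` (both my assumptions):

```python
def find_orphans(pages: set[str], links: list[tuple[str, str]]) -> set[str]:
    """Return pages with zero inbound internal links.
    `links` is (source, destination) pairs from the crawl export;
    self-links don't count, and the homepage is excluded."""
    linked = {dst for src, dst in links if src != dst}
    return {p for p in pages if p not in linked and p != "/"}
```

True orphans won't appear in a crawl at all (the crawler can't reach them), so build `pages` from the XML sitemap or CMS export, not from the crawl alone.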
7. HTTPS and security
- SSL certificate is valid (not expired, not self-signed)
- HTTP redirects to HTTPS via 301 (not 302)
- www and non-www both redirect to the canonical version
- No mixed content (HTTP resources loaded on HTTPS pages) — check browser console on key pages
- HSTS header is set (`Strict-Transport-Security`)
- No security warnings in Chrome for the domain
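The HSTS check reduces to inspecting response headers, which most crawl tools export. A sketch over a plain header dict (function name is mine):

```python
def audit_hsts(headers: dict[str, str]) -> list[str]:
    """Check response headers, case-insensitively, for a usable
    Strict-Transport-Security header."""
    h = {k.lower(): v for k, v in headers.items()}
    issues = []
    hsts = h.get("strict-transport-security")
    if hsts is None:
        issues.append("missing Strict-Transport-Security header")
    elif "max-age" not in hsts.lower():
        issues.append("HSTS header has no max-age directive")
    return issues
```

Only set HSTS once you are certain every subdomain serves HTTPS — the header is remembered by browsers and is awkward to walk back.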
8. Core Web Vitals
- Search Console Core Web Vitals report reviewed — note which URL groups fail on mobile and desktop
- LCP is under 2.5 seconds (check PageSpeed Insights field data for key templates)
- INP is under 200ms
- CLS is under 0.1
- No render-blocking resources identified in Lighthouse that have not been addressed
- LCP image (if applicable) has
fetchpriority="high"and is not lazy-loaded - Images have explicit
widthandheightattributes set - Web fonts are preloaded or served with
font-display: optionalorfont-display: swap
For specific fixes, see the Core Web Vitals Optimisation Guide.
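The three thresholds above can be applied programmatically to field data pulled from PageSpeed Insights or the CrUX API. A sketch, with metric key names of my own choosing (map them to whatever your data source uses):

```python
# "Good" thresholds per the checklist: LCP 2.5 s, INP 200 ms, CLS 0.1.
THRESHOLDS = {"lcp_ms": 2500, "inp_ms": 200, "cls": 0.1}

def failing_vitals(field_data: dict[str, float]) -> list[str]:
    """Return the metrics whose 75th-percentile field value exceeds
    its 'good' threshold. Missing metrics are treated as passing."""
    return [metric for metric, limit in THRESHOLDS.items()
            if field_data.get(metric, 0) > limit]
```

Run it per URL group from the Search Console report, on mobile and desktop separately, since they are assessed independently.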
9. Mobile
- Key page templates render correctly on mobile — check with Lighthouse or URL Inspection in Search Console (Google retired the standalone Mobile-Friendly Test in late 2023)
- Viewport meta tag is present on all pages
- Content is the same on mobile and desktop (mobile-first indexing — Google indexes the mobile version)
- No content hidden behind expandable elements that Googlebot cannot access
- No intrusive interstitials that would trigger a mobile usability penalty (full-screen popups on page load)
- Tap targets are appropriately sized (Google recommends 48x48px minimum, 8px spacing)
- Text is readable without zooming — no font sizes below 12px for body text
10. Structured data
- Structured data is present on appropriate page types: articles, products, FAQs, local business, breadcrumbs, sitelinks searchbox
- No structured data errors in Search Console (Rich Results report)
- Structured data validated with Google’s Rich Results Test for key templates
- No structured data present for content not visible to users (hidden structured data is against Google’s guidelines)
- Organisation schema on homepage with correct `name`, `url`, `logo`, and `sameAs` social profiles
- BreadcrumbList schema matches visible breadcrumb navigation
- Article schema includes `datePublished` and `dateModified`
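The Article date check can be scripted by extracting JSON-LD blocks from page HTML. A sketch (the regex-based extraction and function name are mine; it handles plain objects and top-level arrays but not `@graph` wrappers):

```python
import json
import re

JSONLD_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE)

def article_schema_issues(html: str) -> list[str]:
    """Check every Article-type JSON-LD item for datePublished
    and dateModified."""
    issues = []
    for m in JSONLD_RE.finditer(html):
        try:
            data = json.loads(m.group(1))
        except json.JSONDecodeError:
            issues.append("invalid JSON-LD block")
            continue
        items = data if isinstance(data, list) else [data]
        for item in items:
            if item.get("@type") in ("Article", "NewsArticle", "BlogPosting"):
                for key in ("datePublished", "dateModified"):
                    if key not in item:
                        issues.append(f"Article missing {key}")
    return issues
```

Google's Rich Results Test remains the authority on eligibility; this just catches regressions in bulk between manual validations.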
11. Page speed and performance
- Server response time (TTFB) is under 800ms for key pages
- Images are compressed and served in WebP or AVIF format
- Images are correctly sized for their display dimensions (not serving 2000px images in a 400px slot)
- Lazy loading is applied to below-the-fold images (
loading="lazy") - JavaScript is deferred or asynced where appropriate
- Third-party scripts (analytics, chat, ads) are loaded asynchronously and do not block rendering
- A CDN is in use to reduce latency for geographically distributed users
- Browser caching headers are set appropriately for static assets
12. International (if applicable)
- hreflang tags are implemented on all language/region variants
- All hreflang tags are reciprocated across all versions in the cluster
- hreflang tags use correct ISO 639-1 language codes and ISO 3166-1 alpha-2 region codes
- hreflang URLs resolve to 200 responses (not redirects or 404s)
- x-default hreflang is set
- Geo-targeting is configured in Search Console (for subfolder/subdomain structures)
- Content is genuinely localised, not machine-translated without review
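The reciprocity check above is the one most often failed, and it is mechanical: every URL in a cluster must be annotated identically on every other page. A sketch over crawl-export data (the data shape — page URL mapped to its hreflang annotations — and the function name are mine):

```python
def hreflang_reciprocity_issues(cluster: dict[str, dict[str, str]]) -> list[str]:
    """`cluster` maps each page URL to its hreflang annotations
    {lang_code: target_url}. Flags targets outside the cluster and
    targets that don't link back."""
    issues = []
    for url, annotations in cluster.items():
        for lang, target in annotations.items():
            if target not in cluster:
                issues.append(
                    f"{url} references {target} ({lang}) which has no hreflang set")
            elif target != url and url not in cluster[target].values():
                issues.append(f"{target} does not link back to {url}")
    return issues
```

The same structure works whether the annotations come from `<link rel="alternate" hreflang>` tags, HTTP headers, or the XML sitemap — normalise to one source before comparing.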
13. JavaScript rendering
- Key content (headings, body text, links) is present in the HTML source, not only injected by JavaScript
- Compare `View Source` against the rendered DOM in DevTools — content visible in the browser should appear in the source or be present after rendering
- Navigation links are crawlable `<a href>` elements, not JavaScript `onClick` handlers without an href
- Rendered HTML for key pages (URL Inspection > View crawled page in Search Console) reflects the correct content — Google no longer exposes cached page versions
- Internal links in JavaScript-rendered content use standard `<a>` tags
14. Log file analysis (if access available)
- Googlebot is crawling key pages (confirm in server logs filtered by Googlebot user agent)
- Googlebot is not crawling large volumes of low-value URLs (parameterised, session-based, internal search)
- Crawl frequency on key pages aligns with update frequency — high-priority pages being crawled regularly
- Logs checked for 5xx errors that crawl tools missed (some errors are transient but still worth monitoring)
- No suspicious non-Google bots consuming significant crawl bandwidth
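The Googlebot checks above start with filtering the access log by user agent. A sketch for the common Apache/Nginx "combined" format (the regex is my simplification and the format varies by server config):

```python
import re
from collections import Counter

# Rough match for combined-format lines such as:
# 66.249.66.1 - - [10/Jan/2025:10:00:00 +0000] "GET /page HTTP/1.1" 200 1234 "-" "Googlebot/2.1 ..."
LOG_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) [^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"')

def googlebot_url_counts(lines: list[str]) -> Counter:
    """Count URLs requested by anything claiming to be Googlebot.
    The user-agent string is trivially spoofable, so for a real audit
    also verify requester IPs via reverse DNS against googlebot.com."""
    hits = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if m and "googlebot" in m.group(3).lower():
            hits[m.group(1)] += 1
    return hits
```

Sorting the resulting counter surfaces both sides of the crawl-budget question at once: are key pages near the top, and are parameterised or search URLs soaking up requests?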
Prioritising findings
Not all technical issues are equal. Prioritise by impact:
Critical (fix immediately):
- Crawl blocks on pages that should be indexed
- noindex on pages that should rank
- 5xx errors on key pages
- Redirect loops
High (fix in current sprint):
- Redirect chains of 3+ hops
- Broken internal links to important pages
- Canonical tags pointing to wrong URLs
- Core Web Vitals failures on high-traffic templates
Medium (schedule):
- Orphan pages with some value
- Missing structured data on eligible page types
- Mixed content warnings
- Large image files not yet converted to WebP
Low (backlog):
- Minor crawl inefficiencies on low-traffic sections
- Structured data enhancements (not errors)
- Crawl depth improvements beyond 4 clicks