Technical SEO Audit Checklist
Run this checklist against an existing live site to identify technical SEO issues. For a new site or redesign going live, use the Go-Live Checklist instead. For the full audit reasoning and process behind each section, see the SEO Audit Guide.
Before starting: crawl the site with Screaming Frog, Sitebulb, or a similar tool. Export the crawl data: most checks below reference it. Also open Google Search Console; you will need the Coverage, Core Web Vitals, and URL Inspection reports.
1. Crawlability
- robots.txt is accessible at `/robots.txt` and returns a 200 status code
- robots.txt does not block any pages that should be indexed (check key page types: homepage, category pages, key landing pages)
- robots.txt does not block CSS or JavaScript files Googlebot needs to render pages
- No accidental `Disallow: /` in production robots.txt
- Crawl budget is not wasted on low-value URLs: parameterised URLs, faceted navigation, session IDs, internal search results
- Orphan pages (pages with no internal links) are either removed or linked internally
- Crawl depth: key pages are reachable within 3-4 clicks from the homepage
- No crawler traps: infinite pagination, calendar archives generating unlimited URLs, or session-based URL parameters creating unique URLs per visit
- XML sitemap is declared in robots.txt via `Sitemap:` directive
- AI crawler access is intentional: check whether GPTBot, OAI-SearchBot, ClaudeBot, and PerplexityBot are allowed or blocked in robots.txt and confirm this reflects a deliberate decision rather than an inherited default. Note: GPTBot (training) and OAI-SearchBot (ChatGPT Search citations) are separate OpenAI agents; blocking one does not block the other
- No Googlebot IP ranges blocked by geo-restriction, firewall, or WAF rules. Review hosting and CDN configuration if the site uses IP-based access controls
For more, see Crawlability and robots.txt, the Robots.txt reference, and Crawl budget.
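The robots.txt checks above can be spot-checked offline. The following is a minimal sketch using Python's standard-library `urllib.robotparser`; the sample robots.txt, the `example.com` URLs, and the helper names `check_access` and `declared_sitemaps` are illustrative assumptions. The standard-library parser does not implement Google's wildcard matching (`*`, `$`), so treat its answers as approximate and confirm anything surprising in Search Console.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, used only for illustration.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /search
Sitemap: https://example.com/sitemap.xml
"""


def check_access(robots_txt, agents, urls):
    """Return {(agent, url): allowed} for every agent/URL pair."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {(a, u): parser.can_fetch(a, u) for a in agents for u in urls}


def declared_sitemaps(robots_txt):
    """Return the Sitemap: URLs declared in robots.txt (Python 3.8+)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps() or []


if __name__ == "__main__":
    agents = ["Googlebot", "GPTBot", "OAI-SearchBot"]
    urls = ["https://example.com/", "https://example.com/search?q=x"]
    for (agent, url), allowed in check_access(SAMPLE_ROBOTS, agents, urls).items():
        print(f"{agent:15} {url:35} {'allowed' if allowed else 'BLOCKED'}")
    print("Sitemaps:", declared_sitemaps(SAMPLE_ROBOTS))
```

In this sample, GPTBot is blocked everywhere while OAI-SearchBot falls through to the `*` group, which illustrates why the two agents need to be checked separately.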
2. Indexation
- Google Search Console Coverage report reviewed: note all “Error”, “Valid with warning”, and “Excluded” pages
- Key pages are indexed. Spot-check with URL Inspection in Search Console
- No important pages in “Crawled - currently not indexed” state. This usually indicates a quality or duplication issue
- No important pages in “Discovered - currently not indexed” state. This may indicate crawl budget or internal linking issues
- No important pages with noindex tag (check crawl export for noindex on pages that should rank)
- No important pages blocked by robots.txt but still appearing in the index. URLs appear without description in SERPs if this occurs
- Paginated pages handled correctly: paginated content is either indexable or canonicalised
- Faceted navigation / filter URLs are either blocked, canonicalised, or have noindex as appropriate
- No soft 404s: pages returning 200 with “not found” or empty content (check Search Console > Coverage > “Excluded” > “Soft 404”)
- Index count is plausible relative to actual page count (`site:example.com` estimate as a rough check)
For more, see Indexing, When to use noindex, and Why isn’t my page indexed?.
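If the crawl export does not already flag noindex, the check can be reproduced on saved HTML. This is a rough sketch with the standard-library `html.parser`; the class and function names are hypothetical, and it only looks at the `robots` and `googlebot` meta tags plus an optional `X-Robots-Tag` header value passed in by the caller.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in {"robots", "googlebot"}:
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def is_noindex(html, x_robots_tag=""):
    """True if the page opts out of indexing via meta tag or header."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header = {d.strip().lower() for d in x_robots_tag.split(",")}
    found = parser.directives | header
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in found or "none" in found


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(is_noindex(page))
```

Running this over every URL that should rank, and listing the ones that return `True`, gives a direct input for the noindex item above.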
3. Canonical tags
- All key pages have a self-referencing canonical tag
- Canonical tags use absolute URLs (not relative paths)
- No canonical tags pointing to redirecting URLs
- No canonical tags pointing to non-200 pages
- Canonical tags are consistent with hreflang tags (hreflang must reference canonical URLs)
- Paginated pages: page 2+ either have self-canonicals or canonical back to page 1 if consolidation is intended
- www vs. non-www is consistent: one resolves to the other via 301, and canonicals reflect the chosen version
- HTTPS vs. HTTP is consistent: all canonicals use HTTPS
For more, see Canonical tags and Duplicate content.
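Several of the canonical checks above are mechanical and can be sketched as a small function. The function name and issue labels below are assumptions for illustration; a "not self-referencing" result is not automatically an error, since consolidation (for example, page 2 pointing to page 1) can be intentional.

```python
from urllib.parse import urlsplit


def canonical_issues(page_url, canonical):
    """Return a list of issue labels for one page's canonical tag."""
    if not canonical:
        return ["missing canonical"]
    issues = []
    parts = urlsplit(canonical)
    if not parts.scheme or not parts.netloc:
        issues.append("relative URL")
    elif parts.scheme != "https":
        issues.append("not HTTPS")
    if canonical != page_url:
        issues.append("not self-referencing")
    return issues


if __name__ == "__main__":
    print(canonical_issues("https://example.com/a", "https://example.com/a"))
    print(canonical_issues("https://example.com/a", "/a"))
    print(canonical_issues("https://example.com/a", "http://example.com/a"))
```

Checking whether the canonical target redirects or returns a non-200 status still needs a request to that URL, or a lookup against the crawl export.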
4. Redirects
- No redirect chains: every redirect resolves in a single hop (A → B → C should be collapsed to A → C)
- No redirect loops (A → B → A)
- All redirects are 301 (permanent) unless intentionally temporary (in which case use 302)
- 302s are reviewed. Confirm each is genuinely temporary, not a permanent move misconfigured as temporary
- Internal links point directly to final destination URLs, not through intermediate redirects
- Old URLs from previous migrations are still redirecting correctly
- No broken redirect destinations. Redirects should not point to 404 pages
For more, see Redirects and HTTP status codes.
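Chains and loops can be found from the crawl export without re-requesting anything. The sketch below assumes the export has been reduced to a plain mapping of source URL to redirect target (the `redirects` dict and the `trace` helper are hypothetical), and it labels any path of two or more hops as a chain, matching the A → B → C example.

```python
def trace(start, redirects, max_hops=10):
    """Follow redirects from `start`; return (path, verdict)."""
    path = [start]
    seen = {start}
    while path[-1] in redirects:
        target = redirects[path[-1]]
        if target in seen:
            return path + [target], "loop"
        path.append(target)
        seen.add(target)
        if len(path) - 1 >= max_hops:
            return path, "too long"
    hops = len(path) - 1
    return path, ("chain" if hops >= 2 else "ok")


if __name__ == "__main__":
    # Hypothetical redirect map: source URL -> Location target.
    redirects = {"/a": "/b", "/b": "/c", "/x": "/y", "/y": "/x"}
    for url in ["/a", "/b", "/x"]:
        print(url, *trace(url, redirects))
```

For each chain, the fix is to point the first URL (and any internal links to it) straight at the last element of the returned path.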
5. Status codes
- No important pages returning 4xx. Check crawl export, filter by 4xx
- No important pages returning 5xx
- 404 pages return a true 404 status code. Soft 404s returning 200 must be avoided
- No important resources (CSS, JS, images) returning 4xx. Check by filtering crawl by resource type
- Search Console: Coverage > Error > “Not found (404)” reviewed and resolved or accepted
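Filtering the crawl export by status class is easy to script when the export is a CSV. This sketch assumes columns named `Address` and `Status Code`; these names are an assumption and should be adjusted to whatever your crawler actually exports.

```python
import csv
import io
from collections import defaultdict


def group_errors(csv_text, url_col="Address", status_col="Status Code"):
    """Group URLs returning 4xx or 5xx by status class."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        status = row[status_col].strip()
        if status[:1] in {"4", "5"}:
            groups[status[0] + "xx"].append(row[url_col])
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical export rows.
    sample = (
        "Address,Status Code\n"
        "https://example.com/,200\n"
        "https://example.com/old,404\n"
        "https://example.com/api,503\n"
    )
    print(group_errors(sample))
```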
6. Site architecture and internal linking
- Navigation links to all top-level sections
- Important pages receive internal links from relevant, high-authority pages on the site
- No key pages are orphaned. All key pages should have at least one internal link pointing to them
- Breadcrumbs are present and correctly structured
- Anchor text in internal links is descriptive and varied. Avoid “click here” or exact-match keyword repetition
- No broken internal links. Filter crawl export by 4xx on internal links
- Pagination: multi-page sections are linked sequentially
- Site hierarchy is logical: category pages link to subcategories, subcategories link to content
For more, see Site architecture.
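Click depth and orphan detection both fall out of the internal link graph. The sketch below assumes the crawl has been reduced to a mapping of page to the pages it links to (the `links` dict and both helper names are hypothetical) and uses a breadth-first search from the homepage.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search: minimum number of clicks from `home` to each page."""
    depth = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth


def orphans(links, all_pages, home):
    """Pages that no other page links to (the homepage is excluded)."""
    linked = {target for targets in links.values() for target in targets}
    return set(all_pages) - linked - {home}


if __name__ == "__main__":
    # Hypothetical link graph: page -> pages it links to.
    links = {"/": ["/shoes", "/about"], "/shoes": ["/shoes/red"], "/old-promo": ["/"]}
    pages = ["/", "/shoes", "/about", "/shoes/red", "/old-promo"]
    print(click_depths(links, "/"))
    print(orphans(links, pages, "/"))
```

The list of all pages should come from a source independent of the crawl, such as the XML sitemap or analytics, since a crawler that only follows links cannot discover orphans on its own.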
7. HTTPS and security
- SSL certificate is valid (not expired, not self-signed)
- HTTP redirects to HTTPS via 301, not 302
- www and non-www both redirect to the canonical version
- No mixed content (HTTP resources loaded on HTTPS pages). Check browser console on key pages
- HSTS header is set (`Strict-Transport-Security`)
- No security warnings in Chrome for the domain
For more, see HTTPS and security.
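As a supplement to checking the browser console, mixed content can be approximated from the HTML source. This is a simplified sketch (the class name is hypothetical): it flags `src` attributes and stylesheet `<link>` targets that start with `http://`, and deliberately ignores `<a href>` because ordinary links are not subresources. It will miss resources requested from CSS or JavaScript.

```python
from html.parser import HTMLParser


class MixedContentFinder(HTMLParser):
    """Collect (tag, url) pairs for subresources loaded over plain HTTP."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        url = attr.get("src")
        if tag == "link" and "stylesheet" in (attr.get("rel") or "").lower():
            url = attr.get("href")
        if url and url.lower().startswith("http://"):
            self.insecure.append((tag, url))


if __name__ == "__main__":
    finder = MixedContentFinder()
    finder.feed(
        '<img src="http://cdn.example.com/a.png">'
        '<a href="http://example.com/">plain link</a>'
        '<script src="https://example.com/app.js"></script>'
    )
    print(finder.insecure)
```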
8. Core Web Vitals
- Search Console Core Web Vitals report reviewed. Note which URL groups fail on mobile and desktop
- LCP is under 2.5 seconds (check PageSpeed Insights field data for key templates)
- INP is under 200ms
- CLS is under 0.1
- No render-blocking resources identified in Lighthouse that have not been addressed
- LCP image (if applicable) has `fetchpriority="high"`. It should not be lazy-loaded
- Images have explicit `width` and `height` attributes set
- Web fonts are preloaded. Alternatively, use `font-display: optional` or `font-display: swap`
For more, see Core Web Vitals and the Core Web Vitals Optimisation Guide.
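The three thresholds above can be applied to exported field data in bulk. In this sketch the "good" cut-offs are the ones listed in the checklist; the "poor" cut-offs (4.0 s, 500 ms, 0.25) are Google's published boundaries rather than something stated above, and the function name is hypothetical. Values are expected as 75th-percentile field measurements.

```python
# (good, poor) boundaries: LCP in seconds, INP in milliseconds, CLS unitless.
THRESHOLDS = {"LCP": (2.5, 4.0), "INP": (200, 500), "CLS": (0.1, 0.25)}


def rate(metric, p75):
    """Classify a 75th-percentile value as good / needs improvement / poor."""
    good, poor = THRESHOLDS[metric]
    if p75 <= good:
        return "good"
    if p75 <= poor:
        return "needs improvement"
    return "poor"


if __name__ == "__main__":
    # Hypothetical field data for one template.
    for metric, value in {"LCP": 3.1, "INP": 180, "CLS": 0.31}.items():
        print(metric, value, rate(metric, value))
```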
9. Mobile
- Site passes Google’s Mobile-Friendly Test for key page templates
- Viewport meta tag is present on all pages
- Content is the same on mobile and desktop. Mobile-first indexing means Google indexes the mobile version
- No content hidden behind expandable elements inaccessible to Googlebot
- No intrusive interstitials that would trigger a mobile usability penalty (full-screen popups on page load)
- Tap targets are appropriately sized (Google recommends 48x48px minimum, 8px spacing)
- Text is readable without zooming: no font sizes below 12px for body text
For more, see Mobile SEO.
10. Structured data
- Structured data is present on appropriate page types: articles, products, local business, breadcrumbs. Note: FAQPage schema no longer produces rich results in Google Search; it retains value for AI extraction only. Sitelinks Search Box (SearchAction on WebSite) has been deprecated and no longer produces a visual result
- No structured data errors in Search Console (Rich Results report)
- Structured data validated with Google’s Rich Results Test for key templates
- No structured data present for content not visible to users. Hidden structured data violates Google’s guidelines
- Organisation schema on homepage with correct `name`, `url`, `logo`, and `sameAs` social profiles
- BreadcrumbList schema matches visible breadcrumb navigation
- Article schema includes `datePublished` and `dateModified`
For more, see Structured data.
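The property checks for Organisation and Article markup can be scripted against extracted JSON-LD. This is a minimal sketch: the `EXPECTED` table only covers the properties named in this checklist, the function name is hypothetical, and it skips items whose `@type` is a list or that are nested inside `@graph`. It does not replace the Rich Results Test.

```python
import json

# Properties this checklist expects, keyed by schema.org type.
EXPECTED = {
    "Organization": {"name", "url", "logo", "sameAs"},
    "Article": {"datePublished", "dateModified"},
}


def missing_properties(jsonld_text):
    """Return {type: [missing property names]} for known top-level types."""
    data = json.loads(jsonld_text)
    items = data if isinstance(data, list) else [data]
    report = {}
    for item in items:
        item_type = item.get("@type")
        if isinstance(item_type, str) and item_type in EXPECTED:
            missing = EXPECTED[item_type] - set(item)
            if missing:
                report[item_type] = sorted(missing)
    return report


if __name__ == "__main__":
    # Hypothetical Article markup with no dateModified.
    sample = json.dumps({
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": "Example",
        "datePublished": "2024-01-15",
    })
    print(missing_properties(sample))
```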
11. Page speed and performance
- Server response time (TTFB) is under 800ms for key pages
- Images are compressed and served in WebP or AVIF format
- Images are correctly sized for their display dimensions. Avoid serving 2000px images in a 400px slot
- Images have descriptive alt text. Empty alt on informational images prevents image search indexation and fails accessibility requirements
- Lazy loading is applied to below-the-fold images (`loading="lazy"`)
- JavaScript is deferred or asynced where appropriate
- Third-party scripts (analytics, chat, ads) are loaded asynchronously. They should not block rendering
- A CDN is in use. It reduces latency for geographically distributed users
- Browser caching headers are set appropriately. Static assets should be cached per your TTL strategy
12. International (if applicable)
- hreflang tags are implemented on all language/region variants
- All hreflang tags are reciprocated across all versions in the cluster
- hreflang tags use correct ISO 639-1 language codes and ISO 3166-1 alpha-2 region codes
- hreflang URLs resolve to 200 responses (not redirects or 404s)
- x-default hreflang is set
- Geo-targeting is configured in Search Console (for subfolder/subdomain structures)
- Content is genuinely localised. Machine-translated content must be reviewed before publishing
For more, see Hreflang and international SEO and the International SEO guide.
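Reciprocity is the hreflang check that is hardest to eyeball and easiest to script. The sketch below assumes the crawl export has been reshaped into a mapping of page URL to its `{hreflang code: alternate URL}` annotations; the data shape, the `example.com` URLs, and the function name are illustrative assumptions.

```python
def missing_return_links(hreflang):
    """Return (source, target) pairs where target does not link back to source."""
    missing = []
    for source, alternates in hreflang.items():
        for target in alternates.values():
            if target == source:
                continue  # self-reference, nothing to reciprocate
            if source not in hreflang.get(target, {}).values():
                missing.append((source, target))
    return missing


if __name__ == "__main__":
    # Hypothetical cluster: the German page omits its link back to the English page.
    cluster = {
        "https://example.com/en/": {
            "en": "https://example.com/en/",
            "de": "https://example.com/de/",
        },
        "https://example.com/de/": {
            "de": "https://example.com/de/",
        },
    }
    print(missing_return_links(cluster))
```

A target that is absent from the mapping altogether (for example because it redirects or was not crawled) is also reported, which overlaps usefully with the 200-response check above.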
13. JavaScript rendering
- Key content (headings, body text, links) is present in the HTML source. JavaScript should not be the only source of content
- Compare `View Source` against rendered DOM in DevTools. Content visible in browser should appear in source or be present after rendering
- Navigation links are crawlable `<a href>` elements. Avoid JavaScript `onClick` handlers without href
- Google’s cached version of key pages shows the correct rendered content
- Internal links in JavaScript-rendered content use standard `<a>` tags
For more, see JavaScript SEO.
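The `<a href>` item can be checked against saved HTML (either raw source or a rendered DOM snapshot). This is a rough sketch with a hypothetical class name: it flags anchors whose `href` is missing, empty, a bare `#`, or a `javascript:` URL, which are the common patterns behind click-handler navigation.

```python
from html.parser import HTMLParser


class AnchorAudit(HTMLParser):
    """Collect the attributes of <a> elements that crawlers cannot follow."""

    def __init__(self):
        super().__init__()
        self.uncrawlable = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = (attr.get("href") or "").strip()
        if href in {"", "#"} or href.lower().startswith("javascript:"):
            self.uncrawlable.append(attr)


if __name__ == "__main__":
    audit = AnchorAudit()
    audit.feed(
        '<a href="/pricing">Pricing</a>'
        '<a onclick="go(\'/docs\')">Docs</a>'
        '<a href="javascript:void(0)" onclick="go(\'/blog\')">Blog</a>'
    )
    print(len(audit.uncrawlable), "uncrawlable anchors")
```

Navigation built from `<div>` or `<button>` elements with click handlers will not appear here at all, so the comparison of source against rendered DOM is still needed.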
14. Log file analysis (if access available)
- Googlebot is crawling key pages. Confirm in server logs filtered by Googlebot user agent
- Googlebot is not crawling large volumes of low-value URLs. Parameterised, session-based, and internal search URLs should be blocked
- Crawl frequency on key pages aligns with update frequency. High-priority pages should be crawled regularly
- Logs checked for 5xx errors that do not show up in crawl tools. Some errors are transient but worth monitoring
- No suspicious non-Google bots consuming significant crawl bandwidth
For more, see Log file analysis.
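A first pass over the logs can be done with a short script. The sketch below assumes the server writes the common "combined" access log format and filters on the user-agent string only; the regular expression, function name, and sample lines are illustrative. User agents can be spoofed, so verify genuine Googlebot traffic (for example by reverse DNS) before drawing conclusions.

```python
import re
from collections import Counter

# Assumes combined log format: ip - - [time] "METHOD path proto" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_summary(lines):
    """Count Googlebot hits per path and per status class."""
    paths, statuses = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or "googlebot" not in match["agent"].lower():
            continue
        paths[match["path"]] += 1
        statuses[match["status"][0] + "xx"] += 1
    return paths, statuses


if __name__ == "__main__":
    ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    sample = [
        f'66.249.66.1 - - [10/Oct/2024:13:55:36 +0000] "GET /products/widget HTTP/1.1" 200 2326 "-" "{ua}"',
        f'66.249.66.1 - - [10/Oct/2024:13:55:40 +0000] "GET /search?q=x HTTP/1.1" 503 0 "-" "{ua}"',
    ]
    paths, statuses = googlebot_summary(sample)
    print(paths.most_common(10))
    print(dict(statuses))
```

Sorting the path counter shows where crawl activity is concentrated, and the status counter surfaces 5xx responses served to Googlebot.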
Prioritising findings
Not all technical issues are equal. Prioritise by impact:
Critical (fix immediately):
- Crawl blocks on pages that should be indexed
- noindex on pages that should rank
- 5xx errors on key pages
- Redirect loops
High (fix in current sprint):
- Redirect chains of 3+ hops
- Broken internal links to important pages
- Canonical tags pointing to wrong URLs
- Core Web Vitals failures on high-traffic templates
Medium (schedule):
- Orphan pages with some value
- Missing structured data on eligible page types
- Mixed content warnings
- Large image files not yet converted to WebP
Low (backlog):
- Minor crawl inefficiencies on low-traffic sections
- Structured data enhancements (not errors)
- Crawl depth improvements beyond 4 clicks