Site Architecture
Site architecture refers to how the pages of a website are structured, organised, and linked together. It determines how search engines discover and crawl content, how PageRank distributes across pages, and how users navigate between sections.
A good site architecture serves both crawlers and users: it ensures every important page is reachable efficiently, and it communicates which pages are most significant through their position in the hierarchy and the volume of internal links pointing at them.
Why site architecture matters for SEO
Search engines have limited crawl resources for any given site. Architecture shapes how those resources are spent. A page buried six clicks from the homepage will be crawled less frequently than one two clicks from the root, simply because the discovery path is longer and there are fewer internal links passing authority to it.
Architecture also distributes PageRank. Each internal link passes a share of the linking page’s equity to the destination. Pages near the top of a hierarchy, linked from the homepage or major section pages, accumulate more equity than pages deep in the tree. This makes architectural decisions inseparable from keyword targeting: the pages that deserve the highest organic visibility should also sit highest in the architecture.
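To make the equity-distribution point concrete, the sketch below runs the standard PageRank iteration over a small, invented internal link graph. The page names, link structure, damping factor, and iteration count are illustrative assumptions, not a model of any real crawler.

```python
# Toy PageRank iteration over a hypothetical internal link graph.
# Pages and links are invented purely to illustrate equity distribution.
links = {
    "home":         ["hub-a", "hub-b"],
    "hub-a":        ["home", "shallow-page", "deep-page"],
    "hub-b":        ["home", "shallow-page"],
    "shallow-page": ["home"],
    "deep-page":    ["hub-a"],
}

damping = 0.85
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}

for _ in range(50):  # power iteration; 50 rounds is plenty for a graph this small
    new_rank = {p: (1 - damping) / len(pages) for p in pages}
    for page, outlinks in links.items():
        share = damping * rank[page] / len(outlinks)
        for target in outlinks:
            new_rank[target] += share
    rank = new_rank

for page, score in sorted(rank.items(), key=lambda kv: -kv[1]):
    print(f"{page:13} {score:.3f}")
```

The page linked from both hubs ends up with a noticeably higher score than the page reachable through only one hub, which is the effect described above.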
Flat versus deep architectures
A flat architecture minimises the number of clicks required to reach any page from the homepage, typically aiming for three clicks or fewer. Every important page sits close to the root, receives links from section indexes and the homepage, and is regularly crawled.
A deep architecture layers pages behind multiple levels of subcategories. A product might sit at /shop/clothing/mens/casual/t-shirts/navy-stripe/: six levels deep. Googlebot can reach this page, but it requires many hops, receives fewer internal links, and carries less accumulated authority than the same page at /shop/t-shirts/navy-stripe/.
Neither structure is inherently wrong. A large e-commerce site with tens of thousands of products needs hierarchy. The question is whether important category and product pages are as close to the root as the taxonomy allows, or whether they are unnecessarily deep because the navigation grew without architectural planning.
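Click depth is straightforward to measure once a crawl has produced a map of internal links. A minimal sketch, assuming a page-to-outlinks mapping is already available (the structure and paths below are hypothetical):

```python
from collections import deque

# Hypothetical internal link map from a crawl: page -> pages it links to.
links = {
    "/": ["/shop/", "/blog/"],
    "/shop/": ["/shop/t-shirts/", "/shop/clothing/"],
    "/shop/t-shirts/": ["/shop/t-shirts/navy-stripe/"],
    "/shop/clothing/": ["/shop/clothing/mens/"],
    "/shop/clothing/mens/": ["/shop/clothing/mens/casual/"],
    "/shop/clothing/mens/casual/": ["/shop/clothing/mens/casual/t-shirts/navy-stripe/"],
}

def click_depth(links, root="/"):
    """Breadth-first search from the homepage; depth is the fewest clicks to reach each page."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

for page, d in sorted(click_depth(links).items(), key=lambda kv: kv[1]):
    print(d, page)
```

Pages that come back deeper than three clicks are the candidates for extra links from section hubs or the homepage.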
URL structure
URL structure reflects architecture directly. Consistent, predictable URLs make the hierarchy legible to both crawlers and users.
Principles for well-structured URLs (see the sketch after this list):
- Use hyphens to separate words, not underscores or spaces
- Keep paths lowercase and consistent
- Avoid session IDs, tracking parameters, and dates in URLs for evergreen pages
- Match URL depth to content hierarchy: section pages one level deep, cluster pages two levels, individual items three
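A small sketch of what enforcing those principles can look like in code. The function name and the set of tracking parameters are assumptions to adapt, not a standard:

```python
import re
from urllib.parse import urlsplit, parse_qsl, urlencode, urlunsplit

# Query parameters treated as tracking noise here; adjust the set for your own stack.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "gclid", "fbclid", "sessionid"}

def normalise_url(url: str) -> str:
    """Lowercase the path, replace spaces and underscores with hyphens, drop tracking parameters."""
    scheme, netloc, path, query, _ = urlsplit(url)
    path = re.sub(r"[ _]+", "-", path.lower())
    kept = [(k, v) for k, v in parse_qsl(query) if k.lower() not in TRACKING_PARAMS]
    return urlunsplit((scheme, netloc.lower(), path, urlencode(kept), ""))

print(normalise_url("https://Example.com/Shop/T_Shirts/Navy Stripe/?utm_source=newsletter"))
# https://example.com/shop/t-shirts/navy-stripe/
```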
Changing URL structure on an established site requires 301 redirects for every affected path. URL restructuring without redirects destroys accumulated link equity and drops pages from the index.
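When a restructure does go ahead, the redirect map is worth verifying automatically. A minimal check, assuming the `requests` library and an old-to-new URL mapping supplied by you (the URLs below are placeholders):

```python
import requests

# Hypothetical old -> new URL mapping for a restructure; replace with your own.
REDIRECTS = {
    "https://example.com/shop/clothing/mens/casual/t-shirts/navy-stripe/":
        "https://example.com/shop/t-shirts/navy-stripe/",
}

def check_redirects(mapping):
    """Confirm each old URL answers with a single 301 pointing at the expected new URL."""
    for old, expected in mapping.items():
        resp = requests.get(old, allow_redirects=False, timeout=10)
        location = resp.headers.get("Location", "")
        status = "OK" if resp.status_code == 301 and location == expected else "CHECK"
        print(f"{status}  {old} -> {resp.status_code} {location}")

check_redirects(REDIRECTS)
```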
Subdomains versus subdirectories
The choice between hosting content at blog.example.com versus example.com/blog/ has been debated for years. Google’s documented position is that it can treat either as part of the same site, and that there is no inherent SEO advantage to one over the other.
In practice, subdirectories tend to consolidate authority more reliably. Links pointing to example.com/blog/article/ contribute to the authority of example.com as a whole. Links pointing to blog.example.com/article/ build authority on the subdomain, which is a separate entity from the main domain in most third-party tools and in some internal Google systems.
Use a subdomain when:
- The content is genuinely distinct from the main site and serves a different audience (e.g., a developer API portal, a support knowledge base, a regional site)
- There are technical or platform reasons that make subdirectory hosting impractical
Use a subdirectory when:
- The content is on the same topic area and should build authority for the main domain
- You want link equity from that section to compound with the rest of the site
Siloed architecture
A silo organises content into topically coherent groups. Pages within a silo link to each other and to the top-level section page, but have minimal links to unrelated silos. This concentrates topical relevance signals on section pages and category hubs.
In practice, most well-run content sites implement a looser version: a pillar page covering a broad topic, supported by cluster pages on subtopics, with internal links flowing primarily within each cluster. This pattern works for both topical relevance and crawl efficiency without requiring rigid silo enforcement that blocks natural editorial cross-linking.
The topic cluster model and its planning implications are covered in topic clusters and pillar pages.
Internal link architecture
Internal links are the primary mechanism through which architecture transmits authority. A page with many internal links from high-authority pages accumulates more equity than one with few or weak links, regardless of its URL depth.
Key architectural link patterns:
- Hub pages (category indexes, pillar pages) should link to every important child page within their section
- Homepage should link directly to the highest-priority section hubs
- Breadcrumb navigation provides consistent structural links that help crawlers and users understand the hierarchy
- Footer links to major sections distribute crawl priority globally, but are discounted compared to editorial in-body links
Orphan pages, those with no internal links pointing to them, receive no crawl priority signal and no internal equity. Even high-quality content on an orphaned page will be crawled infrequently and rank below its potential.
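One way to surface orphans is to compare the pages that should exist (for example, everything in the XML sitemap) against the pages that receive at least one internal link in a crawl export. A sketch under those assumptions; the input structures are placeholders for whatever your crawler actually emits:

```python
from collections import Counter

# Hypothetical inputs: URLs from the sitemap, and (source, target) internal link pairs
# from a crawl export. Replace with your own parsing of the real files.
sitemap_urls = {"/", "/shop/", "/shop/t-shirts/", "/shop/t-shirts/navy-stripe/", "/guides/care/"}
internal_links = [
    ("/", "/shop/"),
    ("/shop/", "/shop/t-shirts/"),
    ("/shop/t-shirts/", "/shop/t-shirts/navy-stripe/"),
]

linked_to = {target for _, target in internal_links}
orphans = sitemap_urls - linked_to - {"/"}  # the homepage has no parent by definition

# Inbound counts are worth keeping too: a page with a single weak link behaves
# much like an orphan in practice.
inbound = Counter(target for _, target in internal_links)

print("Orphan pages:", sorted(orphans))
print("Inbound link counts:", dict(inbound))
```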
Auditing site architecture
Signs that architecture needs attention:
- Important pages require more than three clicks from the homepage
- Category or section pages are rarely crawled according to server logs
- URL structure is inconsistent across sections, with mixed depth and naming conventions
- Orphan pages appear in log analysis (crawled by Googlebot but not linked from any other page)
- Internal link equity concentrates on low-value pages (e.g., the contact page or 404 handler)
A crawl tool such as Screaming Frog can map click depth, identify orphaned pages, and flag structural inconsistencies across a site. Log file analysis shows which pages Googlebot actually visits and how frequently, which reveals real crawl prioritisation regardless of how the theoretical architecture looks.
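As a rough illustration of the log-file side, the sketch below counts requests per path from a user agent identifying as Googlebot, assuming an access log in the common combined format. The file name and the user-agent match are assumptions; confirming that hits genuinely come from Google's published IP ranges is a separate verification step:

```python
import re
from collections import Counter

# Combined log format: IP - - [date] "METHOD /path HTTP/x.x" status size "referer" "user-agent"
LINE = re.compile(r'"\w+ (?P<path>\S+) HTTP/[^"]+" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"')

def googlebot_hits(log_path):
    """Count how often a user agent claiming to be Googlebot requested each path."""
    hits = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            match = LINE.search(line)
            if match and "Googlebot" in match.group("agent"):
                hits[match.group("path")] += 1
    return hits

for path, count in googlebot_hits("access.log").most_common(20):
    print(f"{count:6d}  {path}")
```

Paths that the architecture treats as important but that rarely appear in this output are the ones to pull closer to the root or link more heavily.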