How Search Engines Work

Search engines do three things: they discover web pages, they store and process those pages, and they decide which pages to show when someone searches. Understanding how each stage works makes it much easier to diagnose why a page might not be ranking.

Stage one: crawling

Crawling is how search engines discover pages. A search engine sends out automated programs called crawlers (Google’s is called Googlebot) that visit web pages, read their content, and follow any links they find. Those links lead to more pages, which contain more links, and so on.
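The mechanics are easy to see in miniature. The sketch below follows the same discover-and-queue loop a real crawler runs, minus the politeness rules, JavaScript rendering, and scale; the seed URL is whatever page you hand it.

    from collections import deque
    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen

    class LinkParser(HTMLParser):
        """Collects the href of every <a> tag on a page."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            href = dict(attrs).get("href")
            if tag == "a" and href:
                self.links.append(href)

    def crawl(seed_url, max_pages=50):
        """Breadth-first crawl: fetch a page, then queue every link found on it.
        Stops once max_pages URLs have been discovered."""
        queue, seen = deque([seed_url]), {seed_url}
        while queue and len(seen) <= max_pages:
            url = queue.popleft()
            try:
                html = urlopen(url, timeout=10).read().decode("utf-8", errors="replace")
            except (OSError, ValueError):
                continue  # unreachable or non-HTTP URLs are simply skipped
            parser = LinkParser()
            parser.feed(html)
            for href in parser.links:
                absolute = urljoin(url, href)  # resolve relative links
                if absolute not in seen:
                    seen.add(absolute)
                    queue.append(absolute)  # discovered, waiting to be crawled
        return seen

A page with no inbound links never enters the queue, which is the point the next paragraph makes.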

This is why internal linking matters. If a page has no links pointing to it from anywhere on the web or within your own site, crawlers are unlikely to find it. A page that is never crawled cannot be indexed or ranked.

Crawlers do not visit every page on the internet every day. How frequently they return to a site depends on how often it publishes new content, how authoritative it is, and technical signals such as the sitemap. Together these shape the site's crawl budget: the number of pages the engine is willing and able to fetch in a given period.
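Some of those technical signals live in a site's robots.txt file, which Python's standard library can read directly. A minimal sketch; the domain is a placeholder, and a site that declares no crawl delay or sitemap simply returns None for those calls:

    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser("https://example.com/robots.txt")
    rp.read()

    # May Googlebot fetch this path at all?
    print(rp.can_fetch("Googlebot", "https://example.com/private/page"))
    print(rp.crawl_delay("Googlebot"))  # requested pacing between fetches, if declared
    print(rp.site_maps())               # sitemap URLs listed in robots.txt, if any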

Stage two: indexing

Once a crawler visits a page, the content is sent back to the search engine’s servers to be processed and stored in the index. The index is a vast database of pages and their content, organised so that relevant pages can be retrieved quickly when a query arrives.
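The classic data structure behind this organisation is the inverted index: rather than storing each page as a document to be scanned at query time, it maps every term to the set of pages containing it, which is what makes retrieval fast. A minimal sketch, with invented pages:

    from collections import defaultdict

    def build_index(pages):
        """Maps each word to the set of page URLs containing it."""
        index = defaultdict(set)
        for url, text in pages.items():
            for word in text.lower().split():
                index[word].add(url)
        return index

    pages = {
        "/a": "crawling discovers pages",
        "/b": "indexing stores pages for retrieval",
    }
    index = build_index(pages)
    print(index["pages"])  # {'/a', '/b'}: both pages contain the term

A real index also stores term positions, link data, and quality signals per entry, but the shape is the same.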

During indexing, the search engine analyses the text, images, links, and structured data on a page. It tries to understand what the page is about, who wrote it, how it relates to other pages, and how trustworthy the source appears.
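Structured data is the most machine-readable of those inputs. The sketch below extracts JSON-LD blocks the way an indexer might when working out who wrote a page; the markup shown is invented for illustration:

    import json
    from html.parser import HTMLParser

    class JsonLdParser(HTMLParser):
        """Collects JSON-LD structured-data blocks embedded in a page."""
        def __init__(self):
            super().__init__()
            self.in_jsonld = False
            self.buffer = []
            self.blocks = []

        def handle_starttag(self, tag, attrs):
            if tag == "script" and dict(attrs).get("type") == "application/ld+json":
                self.in_jsonld = True

        def handle_data(self, data):
            if self.in_jsonld:
                self.buffer.append(data)

        def handle_endtag(self, tag):
            if tag == "script" and self.in_jsonld:
                self.blocks.append(json.loads("".join(self.buffer)))
                self.buffer, self.in_jsonld = [], False

    page = '''<script type="application/ld+json">
    {"@type": "Article", "headline": "How Search Engines Work", "author": "Jane Doe"}
    </script>'''
    parser = JsonLdParser()
    parser.feed(page)
    print(parser.blocks[0]["author"])  # Jane Doe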

Not every page that gets crawled makes it into the index. Pages can be excluded by directives in the site’s code (a noindex tag, for example), or because the search engine judges the content to be too thin, duplicate, or low quality to be worth storing.
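Checking for the meta-tag version of that directive is mechanical. A minimal sketch (a real indexer would also honour the X-Robots-Tag response header):

    from html.parser import HTMLParser

    class RobotsMetaParser(HTMLParser):
        """Collects the content of every <meta name="robots"> tag."""
        def __init__(self):
            super().__init__()
            self.directives = []

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
                self.directives.append((attrs.get("content") or "").lower())

    def is_indexable(html):
        parser = RobotsMetaParser()
        parser.feed(html)
        return not any("noindex" in d for d in parser.directives)

    print(is_indexable('<meta name="robots" content="noindex, follow">'))  # False
    print(is_indexable('<meta name="robots" content="index, follow">'))    # True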

Stage three: ranking

When someone types a query, the search engine retrieves pages from the index and ranks them in order of relevance and quality. This is where most SEO work has its effect.

The ranking algorithm weighs hundreds of signals, grouped broadly into three families (a toy scoring sketch follows the list):

  • Relevance: does the page content match what the searcher is looking for?
  • Authority: do reputable sites link to this page, and does the broader site have a track record of producing useful content?
  • Experience: does the page load quickly, work on mobile, and deliver what it promises?
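Real ranking systems are vastly more elaborate, but a toy scorer shows how the signal groups combine into a single ordering. The field names, weights, and pages below are all invented for illustration:

    def score(page, query_terms):
        """Toy ranking score; the weights are made up, not Google's."""
        words = set(page["text"].lower().split())
        relevance = len(query_terms & words) / len(query_terms)  # query-term overlap
        authority = min(page["inbound_links"] / 100, 1.0)        # crude link-count proxy
        experience = 1.0 if page["loads_fast"] else 0.5          # page-experience proxy
        return 0.6 * relevance + 0.3 * authority + 0.1 * experience

    pages = [
        {"url": "/guide", "text": "a complete guide to crawling and indexing",
         "inbound_links": 80, "loads_fast": True},
        {"url": "/stub", "text": "crawling", "inbound_links": 5, "loads_fast": False},
    ]
    query = {"crawling", "indexing"}
    for page in sorted(pages, key=lambda p: score(p, query), reverse=True):
        print(page["url"], round(score(page, query), 2))

The point is the structure, not the numbers: a page strong on relevance can outrank a better-linked page that barely matches the query.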

The algorithm is not static. Google updates it thousands of times per year, with periodic larger updates that can shift rankings significantly. Chasing algorithm changes is rarely productive; pages that genuinely serve a query tend to hold their rankings through updates.

What this means for SEO

Each stage is a potential point of failure (a short triage sketch follows the list):

  • A page that cannot be crawled will never appear in results.
  • A page that is crawled but not indexed cannot rank.
  • A page that is indexed but does not match the signals the algorithm rewards will rank poorly.
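Those failure points suggest a first-pass triage. The sketch below checks them in order using only the standard library; a real diagnosis would lean on server logs and Google Search Console, and the User-Agent string here is invented:

    from urllib.error import URLError
    from urllib.request import Request, urlopen

    def triage(url):
        """Attributes a ranking problem to one of the three stages, roughly."""
        try:
            req = Request(url, headers={"User-Agent": "triage-sketch"})
            resp = urlopen(req, timeout=10)
        except (URLError, ValueError) as err:
            return f"crawling: page not fetchable ({err})"
        if "noindex" in (resp.headers.get("X-Robots-Tag") or "").lower():
            return "indexing: X-Robots-Tag response header forbids indexing"
        html = resp.read().decode("utf-8", errors="replace").lower()
        if 'name="robots"' in html and "noindex" in html:  # crude meta-tag check
            return "indexing: page appears to carry a meta robots noindex"
        return "ranking: reachable and indexable; audit relevance and authority signals"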

Technical SEO addresses the crawling and indexing stages. On-page SEO addresses the relevance signals the ranking algorithm reads. Off-page SEO addresses the authority signals. The disciplines are distinct, but they work on different stages of the same pipeline, so a failure at an earlier stage caps what the later ones can achieve.