Building a Keyword Universe

A keyword universe is the complete inventory of search queries relevant to your business, assembled before any filtering or prioritisation takes place. It is the raw material of keyword research: a multi-source collection that covers the full territory of what your audience searches for.

Most keyword research workflows skip this step. They start with a handful of terms, run them through one tool, filter by volume and difficulty, and act on the short list. The result is a filtered output from a single source, structurally blind to demand it never surfaced. A keyword universe fixes this by separating discovery from evaluation. Build the complete picture first, then filter.

What is a keyword universe?

A keyword universe is the total collection of search queries that could drive relevant traffic to your site, drawn from every available source, before prioritisation decisions are applied.

Unlike a keyword list (a filtered output built for a specific campaign or content plan), a keyword universe is deliberately unfiltered. It includes obvious target queries, long-tail variants, question-format searches, competitor coverage, and queries at every stage of the customer journey. The universe is complete by design; filtering comes later.

Size varies by site type and scope. What matters is not the number of terms but the coverage: the universe should reflect the full shape of your audience’s demand before you decide which parts of it to pursue.

Treated as a persistent business asset, a keyword universe is updated as new queries surface, as competitor coverage changes, and as the business expands into new topics. It is not a one-off deliverable built for a single project.

Where do keyword ideas come from?

Building a complete universe requires pulling from multiple sources. Each source surfaces different kinds of demand; no single source covers everything.

Seed term expansion. Start with the terms that describe your product, service, or topic: one- or two-word head terms like “SEO”, “keyword research”, “running shoes”. Enter each into a keyword tool to generate the broader set of related queries. Seed terms are starting points, not targets; their value is as inputs to the discovery process.

Keyword tool suggestions. Tools like Ahrefs Keyword Explorer, Semrush Keyword Magic Tool, and Google Keyword Planner generate keyword ideas from a seed term, grouped by topic and scored by volume and difficulty. Each tool draws on its own database, so running the same seed through multiple tools surfaces queries that one tool’s index may miss.

Google Autocomplete and related searches. Autocomplete suggestions reflect real queries: Google generates them from actual search behaviour. Entering a seed term and collecting the autocomplete variants (including using operators like keyword _ or appending letters a through z) surfaces phrasing, modifiers, and intent variants that tools may undercount. “People also search for” and “Searches related to” at the bottom of a results page serve the same function.
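The letter-appending technique is easy to script so the probe queries are ready to type or paste into a search box. The sketch below only builds the query strings; it does not contact Google, and the function name autocomplete_probes is hypothetical.

```python
import string


def autocomplete_probes(seed: str) -> list[str]:
    """Return the seed, one probe per letter a-z, and the underscore form."""
    seed = seed.strip().lower()
    probes = [seed]
    # "seed a", "seed b", ... so autocomplete completes each letter in turn.
    probes += [f"{seed} {letter}" for letter in string.ascii_lowercase]
    # Underscore placeholder after the seed, as in "keyword _".
    probes.append(f"{seed} _")
    return probes


for probe in autocomplete_probes("running shoes")[:4]:
    print(probe)
```

Each seed yields 28 probes, so a list of 20 seed terms produces 560 queries to check. Record the suggestions each probe returns along with the seed that produced them.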

People Also Ask. PAA boxes surface question-format queries directly connected to a topic. They are particularly useful for building the informational layer of a keyword universe: questions that signal research intent and map to FAQ content, comparison content, or definition pages. PAA questions expand dynamically, making them a deep source of long-tail question variants.

Google Search Console. GSC’s Performance report shows the queries your pages already appear for, including low-impression queries invisible in external tools. These are confirmed demand signals: real searches where you have some presence. Filtering for queries where you rank between positions 8 and 20 surfaces near-miss opportunities where a content improvement could move a page into meaningful click-earning positions.
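The near-miss filter can be applied to an exported query report in a few lines. This is a minimal sketch that assumes each row has already been loaded as a dictionary with the keys query, position, and impressions; those key names are illustrative, so rename them to match your export. The sample rows are made up.

```python
def near_miss_queries(rows, low=8.0, high=20.0):
    """Return rows whose average position is within [low, high],
    sorted so the queries with the most impressions come first."""
    hits = [row for row in rows if low <= row["position"] <= high]
    return sorted(hits, key=lambda row: row["impressions"], reverse=True)


# Made-up sample rows for illustration only.
sample_rows = [
    {"query": "keyword universe", "position": 3.2, "impressions": 900},
    {"query": "keyword research process", "position": 11.4, "impressions": 450},
    {"query": "seed keywords", "position": 18.9, "impressions": 1200},
    {"query": "seo glossary", "position": 42.0, "impressions": 60},
]

for row in near_miss_queries(sample_rows):
    print(row["query"], row["position"])
```

Sorting by impressions puts the near-miss queries with the most visible demand at the top of the review list.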

Competitor rankings. A query that competitors rank for and you do not signals demand your audience has and content you have not yet produced. Keyword gap tools (Ahrefs Content Gap, Semrush Keyword Gap) allow bulk export of these queries across multiple competitors simultaneously. Queries where two or more competitors rank, but you do not, are the highest-confidence targets: multiple ranking sites confirm the demand is real.
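The “two or more competitors” rule reduces to set arithmetic once the ranking queries for each site have been exported. The sketch below assumes those exports are already loaded as Python sets of query strings; the function name gap_queries and the example domains are hypothetical.

```python
from collections import Counter


def gap_queries(own, competitors, min_competitors=2):
    """Map each query you do not rank for to the number of competitors
    that do, keeping only queries with at least min_competitors."""
    counts = Counter()
    for ranked in competitors.values():
        # Count only the queries missing from your own ranking set.
        counts.update(ranked - own)
    return {q: n for q, n in counts.items() if n >= min_competitors}


own = {"keyword research"}
competitors = {
    "a.example": {"keyword research", "keyword universe", "seed keywords"},
    "b.example": {"keyword universe", "keyword gap"},
    "c.example": {"keyword universe", "seed keywords"},
}
print(gap_queries(own, competitors))
```

Queries below the threshold still belong in the universe. The count is a confidence signal to store alongside the query, not a reason to discard the rest.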

Customer language. The vocabulary your audience uses in reviews, forums, support tickets, and community discussions often differs from the language used in keyword tools. Mining sources like Reddit, G2, Trustpilot, and industry forums surfaces phrasing that reflects how real users describe their problems, which tends to match the long-tail searches they actually perform. This is particularly useful for commercial-intent and comparison queries.

How do you organise a keyword universe?

The raw output from seven sources will contain duplicates, near-duplicates, and clearly irrelevant terms. Initial organisation has three stages.

Deduplication. Merge exact duplicates and consolidate near-duplicates (singular/plural, minor phrasing variants) into single entries. Tools handle some of this automatically; spreadsheet deduplication handles the rest.
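One way to approach the spreadsheet step is to group queries under a normalised key. The sketch below lowercases, strips punctuation, collapses whitespace, and removes a trailing “s” from longer words. That plural rule is deliberately crude (it will mangle words such as “analysis”), so treat the key as a grouping aid only and review the groups before merging. The function names are hypothetical.

```python
import re


def normalise(query: str) -> str:
    """Build a grouping key: lowercase, no punctuation, crude singular forms."""
    words = re.sub(r"[^\w\s]", " ", query.lower()).split()
    # Naive plural handling: drop a trailing "s" from words longer than
    # three characters, leaving "ss" endings such as "business" alone.
    words = [
        w[:-1] if len(w) > 3 and w.endswith("s") and not w.endswith("ss") else w
        for w in words
    ]
    return " ".join(words)


def deduplicate(queries):
    """Group raw queries by normalised key, keeping first-seen order."""
    groups = {}
    for query in queries:
        groups.setdefault(normalise(query), []).append(query)
    return groups


raw = ["Running Shoes", "running shoe", "running  shoes!", "trail running shoes"]
for key, variants in deduplicate(raw).items():
    print(key, "<-", variants)
```

Keeping the original variants under each key preserves the phrasing each source surfaced, which is useful later when choosing the wording for a page title or heading.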

Add metadata. For each query, record: search volume (from one tool, consistently), keyword difficulty (KD) score, source (which channel surfaced it), and a brief intent classification (informational, commercial, navigational, transactional). Volume and KD need not be precise at this stage; they are filters to apply later, not decisions to make now.
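If the universe lives in code rather than a spreadsheet, the four metadata fields map naturally onto a small record type. The sketch below is one possible shape, not a standard schema; the class name KeywordRecord and the field names are assumptions. Volume and difficulty are optional so a query can be recorded before its metrics are looked up.

```python
from dataclasses import dataclass
from typing import Optional

INTENTS = {"informational", "commercial", "navigational", "transactional"}


@dataclass
class KeywordRecord:
    query: str
    source: str                       # channel that surfaced it, e.g. "gsc" or "paa"
    intent: str                       # must be one of INTENTS
    volume: Optional[int] = None      # monthly searches, always from the same tool
    difficulty: Optional[int] = None  # KD score; may be filled in later

    def __post_init__(self):
        # Reject misspelt or ad hoc intent labels so later filtering stays reliable.
        if self.intent not in INTENTS:
            raise ValueError(f"unknown intent: {self.intent!r}")


record = KeywordRecord(
    query="what is a keyword universe",
    source="paa",
    intent="informational",
)
print(record)
```

Validating the intent label at creation time prevents inconsistent spellings from splitting one intent category into several when the universe is filtered downstream.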

Remove obvious exclusions. Drop queries that are clearly out of scope: purely navigational queries pointing to specific external products or brands (a competitor’s login page, for example) and queries with intent that cannot match any content you could produce. Competitor brand terms that carry comparison or alternative intent are better kept and flagged as branded than deleted. Apply exclusions conservatively. If you are uncertain whether a query belongs, keep it.

Do not filter by difficulty or volume during this stage. That happens downstream, when building specific content plans or prioritising a backlog. The universe exists to preserve optionality; filtering early discards demand before you have had the chance to evaluate it.

How a keyword universe connects to the rest of keyword research

The keyword universe is the input to every downstream research activity.

Keyword difficulty filtering and intent filtering are applied to a subset of the universe when building a content plan or deciding which pages to create or improve next. The keyword difficulty article covers how to use KD scores as a coarse first-pass filter within that process.

Keyword mapping assigns universe queries to specific pages, ensuring each page has a clear primary target and that queries are not split across multiple pages in ways that create cannibalisation. Keyword cannibalisation typically surfaces when similar queries from the universe have been mapped to different pages without checking for overlap.
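A basic overlap check on a keyword map can be sketched as follows. It assumes the map is a list of (query, page) pairs and flags only queries that match exactly after lowercasing and trimming; similar but non-identical queries need a looser grouping key first. The function name find_overlaps and the example paths are hypothetical.

```python
from collections import defaultdict


def find_overlaps(mapping):
    """Return each query that is assigned to more than one page,
    together with the set of pages it is assigned to."""
    pages_by_query = defaultdict(set)
    for query, page in mapping:
        pages_by_query[query.strip().lower()].add(page)
    return {q: pages for q, pages in pages_by_query.items() if len(pages) > 1}


keyword_map = [
    ("keyword universe", "/guides/keyword-universe"),
    ("Keyword Universe", "/blog/keyword-research-basics"),
    ("seed keywords", "/guides/seed-keywords"),
]
print(find_overlaps(keyword_map))
```

Any query the check returns needs a decision: pick one primary page for it and remove or retarget the other assignment.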

Competitor gap analysis is one input source to the universe, but revisiting competitor rankings on a regular cadence is also how the universe stays current as competitors expand their coverage.

The pillar-and-cluster model uses the universe to define which topics have enough query volume and depth to justify a full content cluster, versus which topics are single-page targets.

Frequently asked questions

How big does a keyword universe need to be?
There is no minimum. The goal is coverage, not size. A universe of 200 highly relevant queries for a focused niche site is more useful than a universe of 10,000 loosely relevant queries for the same site. Start with the seven sources, remove obvious exclusions, and let the size reflect the actual scope of your audience’s demand.

How often should a keyword universe be updated?
At minimum, when the business changes (new products, new markets, new topics). In practice, a quarterly pass through competitor rankings and GSC queries keeps the universe current without requiring a full rebuild. New queries surface continuously; the universe should reflect that.

Do you need paid tools to build a keyword universe?
No, but they accelerate the process considerably. Google Autocomplete, PAA, GSC, and competitor SERP research are all available without paid tools. Ahrefs and Semrush add bulk keyword suggestions, competitor gap exports, and volume data that would otherwise require manual assembly from free sources.

Should branded queries be included?
Yes, as a record of the demand landscape, but flag them clearly. Branded queries for competitors reveal intent you may be able to capture with comparison or alternative content. Your own brand queries reveal whether navigational demand for your site is growing. Neither type belongs in the same priority queue as generic target queries, but both are worth tracking.