Agent-readiness

Last updated 14 July 2026

Traditional SEO has two layers: a content layer (what a page says) and a technical layer (whether search engines can access it). The same split is forming for AI agents. Generative Engine Optimisation and agentic search address the content layer: what agents retrieve and how well individual passages are selected for citation. Agent-readiness addresses the infrastructure layer: whether agents can discover the site, read its content in formats suited to machine consumption, and take actions on it. A well-structured page agents cannot access is no more useful to an agent than a well-ranked page a search engine cannot crawl.

What does agent-readiness mean?

Two distinct requirements sit under the umbrella, and they are often conflated.

Can agents find and read your content? This covers discovery signals (robots.txt, sitemaps, response headers), machine-readable content formats (structured data, content negotiation), and access declarations (what agents are permitted to do with content they retrieve).

Can agents take actions on your site? This covers protocol endpoints, callable tool definitions, and transaction interfaces: the infrastructure that lets an agent complete a task on behalf of a user rather than simply read what is on the page.

Most content and editorial sites need only address the first. E-commerce, booking, SaaS, and tool sites need both. The action layer is covered in depth in WebMCP. This article covers the full infrastructure picture across both requirements.

How are major platforms defining agent-readiness?

No single authority owns the definition. Four platforms are converging on the concept simultaneously, each from a different vantage point.

Cloudflare launched isitagentready.com, a public scanner that runs 13 checks across five categories: discoverability, content format, bot access control, protocol and MCP discovery, and commerce.¹ The scanner is designed for the full range of Cloudflare’s platform, from content sites to API services to commerce. Its Commerce category is explicitly marked optional and non-scoring for non-commerce sites, reflecting that the checks span the spectrum of what Cloudflare hosts, not what every site needs.

Cloudflare’s scan of 200,000 top domains (filtered to exclude redirects, ad servers, and tunnelling services) shows how far the web is from agent-ready by default: 78% of sites have a robots.txt, but the vast majority were written for search engine crawlers, not AI agents. Only 4% have declared AI usage preferences via Content Signals. Only 3.9% support Markdown content negotiation. Fewer than 15 sites in the entire dataset have MCP Server Cards or API Catalogs.¹

Shopify is making every store agent-ready by default through its Agentic Storefronts initiative.² Its own scanner runs 31 checks across 41 distinct URLs, checking for /llms.txt, /agents.md, and /.well-known/ucp among other endpoints. The emphasis is on machine-readable product data: price, title, dimensions, and availability must appear in structured fields, not embedded in Liquid templates or marketing copy. Shopify reports that AI-driven traffic to its stores has grown eight times year-on-year since January 2025, and that orders from AI-powered search have increased fifteen times over the same period.³ Both figures are the platform’s own, published in support of the initiative, so read them as directional rather than independently established.

WordPress introduced an MCP Adapter as part of its AI Building Blocks initiative, announced in early 2026.⁴ The Adapter exposes every registered WordPress ability as a callable MCP tool, allowing AI agents to discover and execute WordPress functionality directly. The Abilities API provides a standardised way for plugins to register callable actions; once a plugin registers an ability, an AI agent can discover and invoke it. For WordPress-hosted sites, agent-readiness at the action layer is becoming a platform default rather than a custom implementation.

Google and Chrome are moving WebMCP from an experimental Canary flag to a formal origin trial in Chrome 149.⁵ Sites declare callable tools (JavaScript functions and HTML forms) that in-browser agents can invoke without simulating clicks. Early participants include Expedia, Booking.com, Shopify, Etsy, Instacart, and Target. WebMCP is covered in full at WebMCP.

Chrome’s Lighthouse also formalised agent-readiness auditing in version 13.3 (May 2026), moving an Agentic Browsing category from experimental to its default configuration.⁶ It runs alongside Performance, Accessibility, Best Practices, and SEO on every Lighthouse run. The category checks for llms.txt presence, WebMCP tool declarations, accessibility tree quality, and visual stability. Unlike other Lighthouse categories it does not produce a single 0–100 score; the spec team’s stated rationale is that agentic web standards are still emerging and the priority is actionable signals rather than a definitive ranking.

The practical implication of these four converging definitions: agent-readiness is becoming platform-level infrastructure rather than a DIY checklist. The common infrastructure threads across all four approaches are what matter for most publishers.

The five infrastructure layers

Synthesised from the common threads across platform approaches, five layers describe what agent-readiness requires in practice.

1. Discovery

How agents learn a site exists and navigate its structure. The basic signals are the same as those for search: a valid robots.txt with named crawler rules, a current XML sitemap, and structured markup that identifies what content exists. HTTP Link headers (RFC 8288) add a machine-readable pointer to the sitemap and key resources directly in the HTTP response, without requiring agents to parse robots.txt first. llms.txt, placed at the domain root, provides a curated Markdown index of site structure that in-browser agents can read without crawling the full site; Chrome’s Lighthouse Agentic Browsing audit checks for its presence in the default configuration. For sites running MCP servers or agent skills, /.well-known/ endpoints advertise those capabilities at a known, discoverable location.
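As a rough illustration of the robots.txt part of this layer, the sketch below reads a robots.txt body with Python's standard-library parser and reports whether a named crawler may fetch a URL and which sitemaps are declared. The robots.txt content, the example.com URLs, and the helper name read_discovery_signals are assumptions made up for the example; Python 3.8 or later is assumed for site_maps().

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt with one named crawler rule and a sitemap pointer.
# The rules and URLs are hypothetical, not a recommended policy.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /drafts/

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap-index.xml
"""


def read_discovery_signals(robots_txt: str, user_agent: str, url: str) -> dict:
    """Return whether user_agent may fetch url, plus any declared sitemaps."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        "allowed": parser.can_fetch(user_agent, url),
        # site_maps() returns None when no Sitemap line is present.
        "sitemaps": parser.site_maps() or [],
    }


if __name__ == "__main__":
    print(read_discovery_signals(ROBOTS_TXT, "GPTBot", "https://example.com/drafts/a"))
```

The same check can be repeated per crawler name to confirm that named rules behave as intended before relying on them.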

2. Content format

Whether agents can receive content in efficient, machine-readable form. Structured data (schema.org Article, Product, Organization, FAQPage) is the primary signal: it tells agents what a page contains and how to interpret its entities. Markdown content negotiation serves .md responses when a request includes Accept: text/markdown, reducing token overhead for agent retrieval without affecting how browsers receive HTML; Markdown for Agents covers the implementation in full, including its limits. Microsoft’s Web IQ, a grounding API built on Bing’s index, evaluates retrieved passages against GDSAT (grounding satisfaction), a metric covering completeness, freshness, and authority.⁷ Microsoft’s President of Search and AI has said Web IQ is used directly in Copilot and by ChatGPT “for some of its web answers”,⁸ so it accounts for part of ChatGPT’s web retrieval, not the whole of it. On this retrieval path, freshness and authoritative sourcing are assessed passage by passage rather than at page or domain level, which is why structured data and clear attribution matter at section granularity. For commerce sites, the equivalent is machine-readable product data: structured fields rather than marketing copy.
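To make the structured-data signal concrete, here is a minimal sketch that builds schema.org Article markup as JSON-LD. The property names are standard schema.org vocabulary; the helper name article_jsonld and every value passed to it are placeholders invented for illustration.

```python
import json


def article_jsonld(headline: str, author: str, publisher: str,
                   published: str, modified: str) -> str:
    """Serialise a minimal schema.org Article as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "publisher": {"@type": "Organization", "name": publisher},
        "datePublished": published,
        # dateModified gives a machine-readable freshness signal.
        "dateModified": modified,
    }
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    # Placeholder values; substitute the page's real metadata.
    markup = article_jsonld("Agent-readiness", "Jane Doe", "Example Publisher",
                            "2026-01-10", "2026-07-14")
    print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

The output is intended to sit in a script element of type application/ld+json in the page head, which is the usual embedding for JSON-LD.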

A related convention is emerging for curated knowledge rather than page content. In June 2026 Google Cloud published the Open Knowledge Format (OKF), a draft spec that represents knowledge as a directory of Markdown files with YAML frontmatter for AI agents to consume. Google scopes it to internal organisational data, not web search, but it is the same Markdown-with-frontmatter shape as llms.txt and agents.md, and some practitioners are asking whether sites could publish OKF bundles externally. No AI platform has confirmed it reads external bundles at inference time, so for now it sits in the watch column alongside the other draft conventions of the agentic web.

3. Access declarations

What agents are permitted to do with content they access. Per-crawler rules in robots.txt handle allow and disallow at the access layer; AI crawlers covers the major user-agents and compliance behaviour.

The layer above robots.txt is becoming the decisive one. From 15 September 2026, Cloudflare will block Training and Agent crawlers by default on ad-supported pages for new sites and new customers. It sorts AI crawlers into Search, Agent and Training categories and enforces the policy at the network layer rather than through the file.⁹ Two things follow for agent-readiness. First, your access declarations may no longer be the operative control if your CDN is enforcing its own. Second, blocking the Training category there can also block Googlebot, Bingbot and Applebot, which crawl for both search and training. Any agent-readiness audit that stops at robots.txt will miss the layer that actually decides what reaches your server.

Content Signals, a draft IETF specification originally published in October 2025 as draft-romm-aipref-contentsignals-00 and documented at contentsignals.org, adds a separate layer of semantic intent. That draft expired in April 2026 under the IETF’s standard six-month lifecycle; the underlying work continues through the IETF aipref working group.¹⁰ A Content-Signal: directive in robots.txt declares preferences for three specific uses:

ai-train: permission to use content for training AI model weights
search: permission to include content in search indexing and results
ai-input: permission to use content as input to AI at inference time (retrieval, RAG)

The distinction between ai-train and ai-input is not obvious but matters: a site can permit retrieval citation (ai-input=yes) while declining to have its content absorbed into model training data (ai-train=no). The spec is a draft with no confirmed adoption by major platforms yet. The implementation cost is one line in robots.txt.
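The three fields can be read back out of a robots.txt file with a few lines of parsing. The sketch below is a simplified reader, not a reference implementation of the draft: it assumes comma-separated key=value pairs with yes or no values, ignores which User-agent group the directive sits under, and lets a later line override an earlier one. The function name parse_content_signals is hypothetical.

```python
def parse_content_signals(robots_txt: str) -> dict[str, bool]:
    """Collect Content-Signal preferences as {signal name: permitted}."""
    signals: dict[str, bool] = {}
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "content-signal":
            continue
        for pair in value.split(","):
            key, eq, val = pair.strip().partition("=")
            val = val.strip().lower()
            if eq and val in ("yes", "no"):
                signals[key.strip().lower()] = (val == "yes")
    return signals


if __name__ == "__main__":
    example = (
        "User-agent: *\n"
        "Content-Signal: ai-train=no, search=yes, ai-input=yes\n"
        "Allow: /\n"
    )
    # Retrieval permitted, training declined.
    print(parse_content_signals(example))
```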

Microsoft is separately engaging with IETF on publisher controls for AI infrastructure through its Web IQ platform,⁷ part of a broader industry movement toward formalised standards in this space. A related strand is request-time verification rather than declared preferences: in June 2026 Cloudflare, Google, Microsoft, Mozilla and Shopify announced PACT, a draft protocol for cryptographically proving a human, or an agent authorised by one, is behind a request. It is in development with no deployment timeline, but signals where the access layer is heading: from self-reported user-agents towards proof of who a client is and on whose behalf it acts.

agents.md, introduced in the Shopify agentic commerce context, is a plain-language description of what agents can and cannot do on a site, including usage terms for AI interaction. It is specific to commerce at present but the convention may broaden.

4. Protocol and action layer

Whether agents can do things on the site beyond reading. MCP Server Cards (/.well-known/mcp/server-card.json) advertise an MCP server’s endpoint, capabilities, and transport method. The spec is an open draft (PR-2127) not yet merged into the core MCP specification, but Server Cards are a named priority on the official MCP roadmap under an active Server Card Working Group,¹¹ ¹² and real-world implementations already exist. Agent Skills indices (/.well-known/agent-skills/index.json) list callable actions with name, type, description, URL, and a SHA-256 digest. Both are relevant only for sites that run MCP servers or expose agent-callable actions. WebMCP handles the browser-native variant. NLWeb takes a server-side approach: it converts existing schema.org structured data into queryable endpoints for MCP-compatible agents, making a site’s catalogue accessible without custom JavaScript. It is most relevant for commerce, media, and documentation sites with deep content catalogues.
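The digest in an Agent Skills index lets a client confirm that the skill file it downloaded is the one the index describes. The sketch below shows that check under stated assumptions: the JSON key names ("url", "sha256") and the lowercase hex encoding are guesses for illustration, since the text above names the fields but not their exact serialisation, and the example entry is entirely hypothetical.

```python
import hashlib
import hmac

# Hypothetical index entry. Key names and hex encoding are assumptions,
# not taken from the Agent Skills specification.
EXAMPLE_ENTRY = {
    "name": "search-catalogue",
    "type": "skill",
    "description": "Search the product catalogue",
    "url": "https://example.com/skills/search.md",
    "sha256": hashlib.sha256(b"# Search skill\n").hexdigest(),
}


def digest_matches(payload: bytes, expected_hex: str) -> bool:
    """Compare the SHA-256 of the fetched bytes with the advertised digest."""
    actual = hashlib.sha256(payload).hexdigest()
    # compare_digest avoids leaking match position through timing.
    return hmac.compare_digest(actual, expected_hex.strip().lower())


if __name__ == "__main__":
    fetched = b"# Search skill\n"  # stands in for the downloaded skill file
    print(digest_matches(fetched, EXAMPLE_ENTRY["sha256"]))
```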

Agentic Resource Discovery (ARD) sits above these per-capability artifacts as a unifying discovery layer. Announced in June 2026 by Google, Microsoft, and Hugging Face and built on the Linux Foundation’s AI Catalog data model, ARD defines an ai-catalog.json manifest at /.well-known/ai-catalog.json that lists a site’s agentic resources (MCP servers, A2A agents, skills, APIs) in one place, plus a registry API that crawls and indexes those catalogs so agents can find capabilities by natural-language query. A catalog can also be advertised through an Agentmap directive in robots.txt or a <link rel="ai-catalog"> tag. The spec is an early v0.9 draft (Apache 2.0) with a broad backing coalition but no confirmed registry or agent adoption at scale yet; like Server Cards, it is relevant only to sites that expose agent-callable capabilities. See the ARD announcement for detail.
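A client looking for a catalog might check the page for the link tag first and fall back to the well-known path. The sketch below covers only that lookup and says nothing about the manifest's contents; the class and function names are hypothetical, the fallback order is an assumption rather than something the draft is known to specify, and the Agentmap directive is left out because its syntax is not given here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CatalogLinkFinder(HTMLParser):
    """Collect href values from <link rel="ai-catalog"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "ai-catalog" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_catalog_urls(html: str, page_url: str) -> list[str]:
    """Return advertised catalog URLs, else the well-known default location."""
    finder = CatalogLinkFinder()
    finder.feed(html)
    found = [urljoin(page_url, href) for href in finder.hrefs]
    return found or [urljoin(page_url, "/.well-known/ai-catalog.json")]


if __name__ == "__main__":
    page = '<html><head><link rel="ai-catalog" href="/catalog/ai.json"></head></html>'
    print(find_catalog_urls(page, "https://example.com/docs/"))
```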

A2A (Agent-to-Agent protocol), originally developed by Google and now governed by the Linux Foundation’s Agentic AI Foundation alongside MCP, sits at a different layer again. Where MCP connects an agent to tools and data, A2A defines how agents communicate with and delegate tasks to other agents across organisational boundaries. As of April 2026, more than 150 organisations support A2A in production, including Google, Microsoft, AWS, Salesforce, and SAP; Azure AI Foundry, Amazon Bedrock AgentCore, and Google Cloud have all integrated it natively.¹³ Publishers do not implement A2A: it is infrastructure for agent networks, not for individual sites. It is relevant here because it explains why the action layer is consolidating at the platform level rather than requiring site-specific implementation: the agent coordination infrastructure is already built into the cloud platforms that power the agents visiting your site.

For most content and editorial sites this layer is a watch item, not a current implementation priority.

5. Commerce and transaction layer

Whether agents can complete transactions on behalf of users. The Universal Commerce Protocol (UCP, at /.well-known/ucp) is an open standard co-developed by Shopify and Google, announced at NRF in January 2026 with backing from more than 20 retailers and payment networks including Target, Walmart, and Wayfair.¹⁴ It covers four stages: product discovery, capability negotiation, checkout, and post-purchase handoff, and is already wired into Google AI Mode and the Gemini app. AP2 (Agent Payments Protocol) is Google’s companion standard for agentic payment authorisation; UCP explicitly cites AP2 as its recommended payments layer.¹⁵ ACP (Agentic Commerce Protocol, from Stripe and OpenAI) and x402 (Coinbase) are the other significant protocols in this space; these protocols are designed to compose rather than compete. This layer applies to commerce sites only; informational and editorial sites skip it entirely.

Infrastructure requirements by site type

Site type	Discovery	Content format	Access declarations	Protocol
Content/editorial	robots.txt, sitemap, Link headers	Markdown, schema	Content Signals	Monitor only
E-commerce	+ product sitemaps, agents.md	+ machine-readable product data	ai-input=yes	WebMCP, UCP
CMS (WordPress)	Standard + /.well-known/mcp	Standard	Standard	Abilities API, MCP Adapter
SaaS/tools	+ API catalog, OAuth discovery	API schemas	Standard	MCP server, agent skills

Commerce layer protocols (UCP, AP2, ACP, x402) apply to e-commerce sites only; the other site types skip that layer entirely.

“Monitor only” for the protocol layer means understanding the protocols without implementing them. As platforms mature, the action layer will often be handled at the hosting or CMS level without site-specific configuration.

What to implement now

For content and editorial sites, the following steps are actionable today.

Content Signals (minutes): add one line to robots.txt:

Content-Signal: ai-train=yes, search=yes, ai-input=yes

Set ai-train to no to decline training use while permitting retrieval. The original draft (-00) expired April 2026; the specification work continues in the IETF aipref working group. Adoption by major platforms is unconfirmed, and the cost is near zero.

If Cloudflare serves your site it may already have added the directive for you, so check what your robots.txt currently says before writing it. Note also that Google’s John Mueller has said that, as far as he is aware, no crawler or LLM acts on the directive (see the FAQ below): this is a low-cost bet on future adoption, not a control that does anything today.

HTTP Link headers (minutes): add one response header pointing to the sitemap. In Cloudflare’s _headers file, header lines sit indented under a URL pattern:

/*
  Link: </sitemap-index.xml>; rel="sitemap"

RFC 8288 is an established standard. This adds a machine-readable pointer to site structure that agents can read without parsing robots.txt.
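On the consuming side, an agent has to pull the target back out of the header. The sketch below is a deliberately simplified Link header reader: it handles the common comma-separated form with quoted or bare rel values, but not every construct RFC 8288 allows (a comma inside a quoted parameter would confuse it). The function name links_with_rel and the example URLs are made up for illustration.

```python
import re
from urllib.parse import urljoin

# One link-value: <target> followed by its parameters, up to the next comma.
LINK_RE = re.compile(r"<([^>]*)>([^,]*)")
# One parameter: ; name="quoted value"  or  ; name=token
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,]+))')


def links_with_rel(header_value: str, rel: str, base_url: str) -> list[str]:
    """Return absolute URLs of links whose rel list contains `rel`."""
    wanted = rel.lower()
    targets = []
    for target, params in LINK_RE.findall(header_value):
        for name, quoted, bare in PARAM_RE.findall(params):
            # rel may hold several space-separated relation types.
            if name.lower() == "rel" and wanted in (quoted or bare).lower().split():
                targets.append(urljoin(base_url, target))
    return targets


if __name__ == "__main__":
    header = '</sitemap-index.xml>; rel="sitemap", </feed.xml>; rel="alternate"'
    print(links_with_rel(header, "sitemap", "https://example.com/"))
```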

Markdown content negotiation (hours to days): configure the server to return .md responses when requests include Accept: text/markdown. Implementation options, limits, and risks are in Markdown for Agents.
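The core decision is small enough to sketch: read the Accept header and serve Markdown only when the client asks for it explicitly. The version below is one possible policy, not a prescribed one. It assumes that a wildcard such as */* should still receive HTML so browsers are unaffected, and that text/markdown wins only when its q-value is at least that of text/html. The function names are hypothetical.

```python
from typing import Optional


def _quality(accept: str, media_type: str) -> Optional[float]:
    """Return the q-value for an exactly named media type, or None if absent."""
    for part in accept.split(","):
        fields = [f.strip() for f in part.split(";")]
        if fields[0].lower() != media_type:
            continue
        q = 1.0  # default weight when no q parameter is given
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        return q
    return None


def choose_representation(accept: str) -> str:
    """Pick text/markdown only on an explicit request; otherwise text/html."""
    md = _quality(accept, "text/markdown")
    html = _quality(accept, "text/html")
    if md and (html is None or md >= html):
        return "text/markdown"
    return "text/html"


if __name__ == "__main__":
    print(choose_representation("text/markdown"))
    print(choose_representation("text/html,application/xhtml+xml,*/*;q=0.8"))
```

Responses that vary by Accept header should normally carry Vary: Accept so that caches do not hand the Markdown variant to a browser or the HTML variant to an agent.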

llms.txt (minutes): a Markdown file at the domain root listing the site’s key pages with short descriptions. Chrome’s Lighthouse Agentic Browsing audit checks for its presence; without it, agents spend more time crawling the site to understand its structure. No SEO or citation benefit, but low cost and now audited by default. See llms.txt for syntax and implementation.
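A file of this kind can be generated from a page list rather than written by hand. The sketch below follows the commonly described llms.txt layout (a title heading, a one-line blockquote summary, then sections of linked pages with descriptions); the function name build_llms_txt and all titles, URLs, and descriptions are placeholders.

```python
def build_llms_txt(title: str, summary: str,
                   sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render an llms.txt body from {section: [(name, url, description)]}."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, description in pages:
            lines.append(f"- [{name}]({url}): {description}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Placeholder site structure for illustration only.
    text = build_llms_txt(
        "Example Site",
        "Guides to SEO and agent-readiness.",
        {
            "Guides": [
                ("Agent-readiness", "https://example.com/agent-readiness",
                 "Infrastructure layers for AI agents"),
                ("llms.txt", "https://example.com/llms-txt",
                 "Syntax and implementation"),
            ],
        },
    )
    print(text)
```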

Strong structured data (Article, Organization, FAQPage) underlies all of these and remains the most load-bearing agent-facing investment for a content site. It is not new, but it is the primary signal agents use to identify what a page contains and who produced it.

What not to implement: API catalog, OAuth discovery, MCP Server Card, Agent Skills, WebMCP, and commerce layer protocols are irrelevant for informational sites. Publishing /.well-known/openid-configuration or similar OAuth discovery files without actual protected APIs creates a worse agent experience than omitting them: it signals capabilities the site does not have.

Current state and roadmap

Agent-readiness is moving from opt-in to default at the platform level. Shopify is making every store agent-ready without merchant action. WordPress is building the action layer into the core CMS. WebMCP is moving from an experimental flag to a formal browser origin trial with major commerce and travel brands already participating. Cloudflare’s scanner formalises checks that may appear in CDN-level defaults.

For most content publishers the near-term implication is limited: the action layer is handled by platforms, not individual sites. The decisions that remain are about access declarations (what Content Signals values to publish) and content formats (whether to serve Markdown). The infrastructure layer is developing quickly enough that checks which require monitoring in 2026 may require implementation in 2027.

As specific platforms (Shopify, WordPress) publish their own agent-readiness requirements, those will be covered in platform-specific articles. This article covers the common infrastructure layer that applies across site types.

Frequently asked questions

Is agent-readiness the same as technical SEO? Related but distinct. Technical SEO ensures search engines can crawl, index, and understand a site. Agent-readiness ensures AI agents can access, read, and act on it. The two overlap (structured data, crawlability, response codes) but agent-readiness extends into areas technical SEO does not cover: access declarations for different AI uses, machine-readable content formats, and action protocols.

Does improving agent-readiness affect search rankings? No confirmed mechanism. The infrastructure layer affects whether agents can access content, not how search algorithms rank it.

Should I implement WebMCP for a content site? No. WebMCP is the action layer, relevant when the site’s primary value involves completing tasks on behalf of users. For informational sites the relevant investment is the discovery and content format layers.

What is Content Signals and does it work? Content Signals is a draft IETF specification adding semantic intent to robots.txt: three fields declaring preferences for AI training, search indexing, and inference input. The original draft (draft-romm-aipref-contentsignals-00) expired in April 2026; the work continues through the IETF aipref working group. No major platform has publicly confirmed it reads these directives, and in July 2026 Google’s John Mueller said that, as far as he was aware, no crawler or LLM acts on the directive, calling it something “made up by a CDN” that “just adds bloat” to robots.txt.¹⁶ The implementation cost is one line, so treat it as a low-risk optional bet on future adoption rather than a change with confirmed impact today.

How do I audit my site’s agent-readiness? isitagentready.com is a useful starting point. Use the Content Site preset rather than All Checks; the default preset penalises informational sites for missing API and commerce features they have no reason to implement. Of the 13 checks the scanner runs, seven cover API, authentication, and commerce infrastructure that does not apply to content sites. A low score on those checks reflects the scanner’s broad scope, not a site deficiency.¹

Footnotes

See also

WebMCP (Web Model Context Protocol)

llms.txt

robots.txt and crawlability
