Agent-readiness
Traditional SEO has two layers: a content layer (what a page says) and a technical layer (whether search engines can access it). The same split is forming for AI agents. Generative Engine Optimisation and agentic SEO address the content layer: what agents retrieve and how well individual passages are selected for citation. Agent-readiness addresses the infrastructure layer: whether agents can discover the site, read its content in formats suited to machine consumption, and take actions on it. A well-structured page that agents cannot access is as useless to them as a well-written page that a search engine cannot crawl.
What does agent-readiness mean?
Two distinct requirements sit under the umbrella, and they are often conflated.
Can agents find and read your content? This covers discovery signals (robots.txt, sitemaps, response headers), machine-readable content formats (structured data, content negotiation), and access declarations (what agents are permitted to do with content they retrieve).
Can agents take actions on your site? This covers protocol endpoints, callable tool definitions, and transaction interfaces: the infrastructure that lets an agent complete a task on behalf of a user rather than simply read what is on the page.
Most content and editorial sites need only address the first. E-commerce, booking, SaaS, and tool sites need both. The action layer is covered in depth in WebMCP. This article covers the full infrastructure picture across both requirements.
How are major platforms defining agent-readiness?
No single authority owns the definition. Four platforms are converging on the concept simultaneously, each from a different vantage point.
Cloudflare launched isitagentready.com, a public scanner that runs 13 checks across five categories: discoverability, content format, bot access control, protocol and MCP discovery, and commerce.[1] The scanner is designed for the full range of Cloudflare’s platform, from content sites to API services to commerce. Its Commerce category is explicitly marked optional and non-scoring for non-commerce sites, reflecting that the checks span the spectrum of what Cloudflare hosts, not what every site needs.
Cloudflare’s scan of 200,000 top domains (filtered to exclude redirects, ad servers, and tunnelling services) shows how far the web is from agent-ready by default: 78% of sites have a robots.txt, but the vast majority were written for search engine crawlers, not AI agents. Only 4% have declared AI usage preferences via Content Signals. Only 3.9% support Markdown content negotiation. Fewer than 15 sites in the entire dataset have MCP Server Cards or API Catalogs.[1]
Shopify is making every store agent-ready by default through its Agentic Storefronts initiative.[2] Its own scanner runs 31 checks across 41 distinct URLs, checking for /llms.txt, /agents.md, and /.well-known/ucp among other endpoints. The emphasis is on machine-readable product data: price, title, dimensions, and availability must appear in structured fields, not embedded in Liquid templates or marketing copy. AI-driven traffic to Shopify stores has grown eightfold year-on-year since January 2025; orders from AI-powered search have increased fifteenfold over the same period.[3]
WordPress introduced an MCP Adapter as part of its AI Building Blocks initiative, announced in early 2026.[4] The Adapter exposes every registered WordPress ability as a callable MCP tool, allowing AI agents to discover and execute WordPress functionality directly. The Abilities API provides a standardised way for plugins to register callable actions; once a plugin registers an ability, an AI agent can discover and invoke it. For WordPress-hosted sites, agent-readiness at the action layer is becoming a platform default rather than a custom implementation.
Google and Chrome are moving WebMCP from an experimental Canary flag to a formal origin trial in Chrome 149.[5] Sites declare callable tools (JavaScript functions and HTML forms) that in-browser agents can invoke without simulating clicks. Early participants include Expedia, Booking.com, Shopify, Etsy, Instacart, and Target. WebMCP is covered in full at WebMCP.
The practical implication of these four converging definitions: agent-readiness is becoming platform-level infrastructure rather than a DIY checklist. The common infrastructure threads across all four approaches are what matter for most publishers.
The five infrastructure layers
Synthesised from the common threads across platform approaches, five layers describe what agent-readiness requires in practice.
1. Discovery
How agents learn a site exists and navigate its structure. The basic signals are the same as those for search: a valid robots.txt with named crawler rules, a current XML sitemap, and structured markup that identifies what content exists. HTTP Link headers (RFC 8288) add a machine-readable pointer to the sitemap and key resources directly in the HTTP response, without requiring agents to parse robots.txt first. For sites running MCP servers or agent skills, /.well-known/ endpoints advertise those capabilities at a known, discoverable location.
2. Content format
Whether agents can receive content in efficient, machine-readable form. Structured data (schema.org Article, Product, Organization, FAQPage) is the primary signal: it tells agents what a page contains and how to interpret its entities. Markdown content negotiation serves .md responses when a request includes Accept: text/markdown, reducing token overhead for agent retrieval without affecting how browsers receive HTML. For commerce sites, machine-readable product data (structured fields rather than marketing copy) is the commerce-layer equivalent. Markdown for Agents covers the implementation in full, including its limits.
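The negotiation decision itself can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names are hypothetical, and wildcards such as */* are deliberately ignored so that browsers, which do not name text/markdown explicitly, keep receiving HTML.

```python
def parse_accept(header: str) -> dict[str, float]:
    """Map each media type in an Accept header to its q-value (default 1.0)."""
    weights: dict[str, float] = {}
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        weights[media_type] = q
    return weights


def choose_representation(accept_header: str) -> str:
    """Pick Markdown only when the client names it and ranks it at least as high as HTML."""
    weights = parse_accept(accept_header)
    markdown_q = weights.get("text/markdown", 0.0)
    html_q = weights.get("text/html", 0.0)
    if markdown_q > 0 and markdown_q >= html_q:
        return "text/markdown"
    return "text/html"


agent_choice = choose_representation("text/markdown, text/html;q=0.5")
browser_choice = choose_representation(
    "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
)
```

Because the same URL now returns different bodies depending on the request, responses would normally also carry a Vary: Accept header so that caches keep the two representations apart.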
3. Access declarations
What agents are permitted to do with content they access. Per-crawler rules in robots.txt handle allow and disallow at the access layer; AI crawlers covers the major user-agents and compliance behaviour.
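As a rough illustration of per-crawler rules, the sketch below feeds a small robots.txt to Python's standard-library parser and asks what two crawlers may fetch. The user-agent names and the example.com URL are made up for the example; real crawler names should come from each operator's documentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is blocked, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/article"
training_bot_allowed = parser.can_fetch("ExampleTrainingBot", url)
other_bot_allowed = parser.can_fetch("SomeOtherBot", url)

print("ExampleTrainingBot allowed:", training_bot_allowed)
print("SomeOtherBot allowed:", other_bot_allowed)
```

The named group takes precedence over the wildcard group for the crawler it names, which is what makes per-crawler rules possible. Compliance is voluntary, so this models what a well-behaved crawler would conclude rather than what any given crawler does.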
Content Signals, a draft IETF specification (draft-romm-aipref-contentsignals, published at contentsignals.org), adds a separate layer of semantic intent. A Content-Signal: directive in robots.txt declares preferences for three specific uses:
- ai-train: permission to use content for training AI model weights
- search: permission to include content in search indexing and results
- ai-input: permission to use content as input to AI at inference time (retrieval, RAG)
The distinction between ai-train and ai-input is not obvious but matters: a site can permit retrieval citation (ai-input=yes) while declining to have its content absorbed into model training data (ai-train=no). The spec is a draft with no confirmed adoption by major platforms yet. The implementation cost is one line in robots.txt.
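To make the three fields concrete, here is a minimal Python sketch that reads a single Content-Signal line into a dictionary of booleans. It assumes the simple comma-separated key=yes/no form used in this article; the draft's full grammar may allow more than this, and the function name is hypothetical.

```python
def parse_content_signal(line: str) -> dict[str, bool]:
    """Parse 'Content-Signal: key=yes, key=no' into {key: bool}.

    Assumes the one-line form shown in this article; pairs with any value
    other than yes/no are skipped rather than guessed at.
    """
    name, _, value = line.partition(":")
    if name.strip().lower() != "content-signal":
        raise ValueError("not a Content-Signal directive")
    signals: dict[str, bool] = {}
    for pair in value.split(","):
        key, separator, setting = pair.partition("=")
        key = key.strip().lower()
        setting = setting.strip().lower()
        if separator and key and setting in ("yes", "no"):
            signals[key] = setting == "yes"
    return signals


# A site that permits search and retrieval but declines training use.
signals = parse_content_signal("Content-Signal: ai-train=no, search=yes, ai-input=yes")
print(signals)
```

Leaving unknown or malformed pairs out of the result, instead of defaulting them to yes or no, keeps the sketch from inventing a preference the site never stated.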
agents.md, introduced in the Shopify agentic commerce context, is a plain-language description of what agents can and cannot do on a site, including usage terms for AI interaction. It is specific to commerce at present but the convention may broaden.
4. Protocol and action layer
Whether agents can do things on the site beyond reading. MCP Server Cards (/.well-known/mcp/server-card.json) advertise an MCP server’s endpoint, capabilities, and transport method. The spec is an open draft (PR-2127) not yet merged into the core MCP specification, but Server Cards are a named priority on the official MCP roadmap under an active Server Card Working Group,[6][7] and real-world implementations already exist. Agent Skills indices (/.well-known/agent-skills/index.json) list callable actions with name, type, description, URL, and a SHA256 digest. Both are relevant only for sites that run MCP servers or expose agent-callable actions. WebMCP handles the browser-native variant. For most content and editorial sites this layer is a watch item, not a current implementation priority.
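The point of publishing a SHA256 digest alongside each skill is that a client can check the retrieved file has not changed. The Python sketch below shows that check in isolation. The entry's key names (name, type, description, url, sha256), the hex encoding of the digest, and the example.com URL are all assumptions made for illustration; the actual index schema should be taken from the Agent Skills specification.

```python
import hashlib


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 digest of payload as lowercase hex."""
    return hashlib.sha256(payload).hexdigest()


def verify_skill(entry: dict, payload: bytes) -> bool:
    """Check fetched skill bytes against the digest listed in the index entry."""
    expected = entry["sha256"].strip().lower()  # hypothetical key name
    return sha256_hex(payload) == expected


# Stand-in for the bytes a client would download from entry["url"].
skill_body = b"# Search docs\nDescribe how to query the documentation search.\n"

# Hypothetical index entry; field names are illustrative, not from the spec.
entry = {
    "name": "search-docs",
    "type": "skill",
    "description": "Search the site's documentation",
    "url": "https://example.com/skills/search-docs.md",
    "sha256": sha256_hex(skill_body),
}

untouched_ok = verify_skill(entry, skill_body)
tampered_ok = verify_skill(entry, skill_body + b"extra instructions")
```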
5. Commerce and transaction layer
Whether agents can complete transactions on behalf of users. The Universal Commerce Protocol (UCP, at /.well-known/ucp) is Shopify’s primary agentic commerce endpoint. x402, MPP, and ACP are other emerging payment and transaction protocols in this space. This layer applies to commerce sites only; informational and editorial sites skip it entirely.
Infrastructure requirements by site type
| Site type | Discovery | Content format | Access declarations | Protocol |
|---|---|---|---|---|
| Content/editorial | robots.txt, sitemap, Link headers | Markdown, schema | Content Signals | Monitor only |
| E-commerce | + product sitemaps, agents.md | + machine-readable product data | ai-input=yes | WebMCP, UCP |
| CMS (WordPress) | Standard + /.well-known/mcp | Standard | Standard | Abilities API, MCP Adapter |
| SaaS/tools | + API catalog, OAuth discovery | API schemas | Standard | MCP server, agent skills |
Commerce layer protocols (UCP, x402, MPP, ACP) apply to e-commerce sites only; the other site types skip that layer entirely.
“Monitor only” for the protocol layer means understanding the protocols without implementing them. As platforms mature, the action layer will often be handled at the hosting or CMS level without site-specific configuration.
What to implement now
For content and editorial sites, three things are actionable today.
Content Signals (minutes): add one line to robots.txt:
Content-Signal: ai-train=yes, search=yes, ai-input=yes
Set ai-train to no to decline training use while permitting retrieval. The spec is a draft, adoption by major platforms is unconfirmed, and the cost is near zero.
HTTP Link headers (minutes): add one response header pointing to the sitemap. In Cloudflare’s _headers file, place the header under the URL pattern it should apply to (for example /*):
Link: </sitemap-index.xml>; rel="sitemap"
RFC 8288 is an established standard. This adds a machine-readable pointer to site structure that agents can read without parsing robots.txt.
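From the agent's side, reading that pointer might look like the following Python sketch, which pulls targets with a given rel value out of a Link header and resolves them against the page URL. It is a simplified parser, not a full RFC 8288 implementation: it assumes parameter values contain no "<" or ";" characters, and the function name and example.com URLs are hypothetical.

```python
import re
from urllib.parse import urljoin


def find_links(header_value: str, rel: str, base_url: str) -> list[str]:
    """Return absolute URLs of Link header targets whose rel includes `rel`.

    Simplified: each link is matched as <target> followed by its parameters,
    up to the next "<".
    """
    wanted = rel.lower()
    results: list[str] = []
    for target, params in re.findall(r"<([^>]*)>([^<]*)", header_value):
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() != "rel":
                continue
            # rel may hold several space-separated relation types.
            relations = value.strip(' ,"').lower().split()
            if wanted in relations:
                results.append(urljoin(base_url, target))
    return results


header = '</sitemap-index.xml>; rel="sitemap", </about>; rel="author"'
sitemaps = find_links(header, "sitemap", "https://example.com/blog/post")
print(sitemaps)
```

Resolving the target against the request URL matters because the header in the example uses a path-only reference rather than an absolute URL.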
Markdown content negotiation (hours to days): configure the server to return .md responses when requests include Accept: text/markdown. Implementation options, limits, and risks are in Markdown for Agents.
Strong structured data (Article, Organization, FAQPage) underlies all three and remains the most load-bearing agent-facing investment for a content site. It is not new, but it is the primary signal agents use to identify what a page contains and who produced it.
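As one way to picture what that markup contains, the Python sketch below builds a small schema.org Article object and wraps it in the script tag that carries JSON-LD in a page. The function name and all values are placeholders, and the property set is a minimal selection; most sites would generate this from their CMS or templates instead.

```python
import json

SCRIPT_OPEN = '<script type="application/ld+json">'
SCRIPT_CLOSE = "</script>"


def article_jsonld(headline: str, author: str, publisher: str,
                   date_published: str, url: str) -> str:
    """Return a JSON-LD script element describing an Article."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "publisher": {"@type": "Organization", "name": publisher},
        "datePublished": date_published,
        "mainEntityOfPage": url,
    }
    # Escape "<" so text such as "</script>" in a field cannot end the element early.
    body = json.dumps(data, indent=2).replace("<", "\\u003c")
    return SCRIPT_OPEN + body + SCRIPT_CLOSE


snippet = article_jsonld(
    headline="Agent-readiness",
    author="Example Author",        # placeholder
    publisher="Example Publisher",  # placeholder
    date_published="2026-01-01",    # placeholder date
    url="https://example.com/agent-readiness",
)
print(snippet)
```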
What not to implement: API catalog, OAuth discovery, MCP Server Card, Agent Skills, WebMCP, and commerce layer protocols are irrelevant for informational sites. Publishing /.well-known/openid-configuration or similar OAuth discovery files without actual protected APIs creates a worse agent experience than omitting them: it signals capabilities the site does not have.
Current state and roadmap
Agent-readiness is moving from opt-in to default at the platform level. Shopify is making every store agent-ready without merchant action. WordPress is building the action layer into the core CMS. WebMCP is moving from an experimental flag to a formal browser origin trial with major commerce and travel brands already participating. Cloudflare’s scanner formalises checks that may appear in CDN-level defaults.
For most content publishers the near-term implication is limited: the action layer is handled by platforms, not individual sites. The decisions that remain are about access declarations (what Content Signals values to publish) and content formats (whether to serve Markdown). The infrastructure layer is developing quickly enough that checks which require monitoring in 2026 may require implementation in 2027.
As specific platforms (Shopify, WordPress) publish their own agent-readiness requirements, those will be covered in platform-specific articles. This article covers the common infrastructure layer that applies across site types.
Frequently asked questions
Is agent-readiness the same as technical SEO? Related but distinct. Technical SEO ensures search engines can crawl, index, and understand a site. Agent-readiness ensures AI agents can access, read, and act on it. The two overlap (structured data, crawlability, response codes) but agent-readiness extends into areas technical SEO does not cover: access declarations for different AI uses, machine-readable content formats, and action protocols.
Does improving agent-readiness affect search rankings? No confirmed mechanism. The infrastructure layer affects whether agents can access content, not how search algorithms rank it.
Should I implement WebMCP for a content site? No. WebMCP is the action layer, relevant when the site’s primary value involves completing tasks on behalf of users. For informational sites the relevant investment is the discovery and content format layers.
What is Content Signals and does it work? Content Signals is a draft IETF spec adding semantic intent to robots.txt: three fields declaring preferences for AI training, search indexing, and inference input. No major platform has publicly confirmed it reads these directives. The implementation cost is one line, which makes it worth adding regardless of confirmed impact.
How do I audit my site’s agent-readiness? isitagentready.com is a useful starting point. Use the Content Site preset rather than All Checks; the default preset penalises informational sites for missing API and commerce features they have no reason to implement. Of the 13 checks the scanner runs, seven cover API, authentication, and commerce infrastructure that does not apply to content sites. A low score on those checks reflects the scanner’s broad scope, not a site deficiency.[1]
Footnotes
- All you need to know about Cloudflare’s Agent Readiness Score (Search Engine Journal)
- Agentic commerce: benefits and how to get started (Shopify)
- From Abilities to AI Agents: introducing the WordPress MCP Adapter (WordPress Developer Blog)
- Chrome at I/O 2026: powering the agentic web (Chrome for Developers)