
Google, Microsoft and Hugging Face Publish the Agentic Resource Discovery Spec

22 June 2026

Figure: ARD publishes a site's agent-callable capabilities in an ai-catalog.json file that registries index so agents can discover them by natural-language query. (Illustration, AI-generated: a central JSON file connected by lines to a ring of AI agent icons querying it.)

On 17 June 2026, Google announced the Agentic Resource Discovery (ARD) specification, a draft open standard for publishing, discovering, and verifying AI capabilities across the web. Despite the Google Developers Blog announcement, ARD is not a Google-only project: the spec is co-authored by Junjie Bu (Google), R.V. Guha (Microsoft), and Shaun Smith (Hugging Face), with contributing organisations including Amazon Web Services, Cisco, Databricks, GitHub, GoDaddy, Nvidia, Salesforce, ServiceNow, and Snowflake.

ARD is infrastructure for the agentic web rather than a search ranking signal. It addresses how an AI agent finds out which tools, agents, and APIs a site exposes, and how it confirms who published them, before connecting.

What problem does ARD solve?

As more sites expose machine-callable capabilities (MCP servers, agent-to-agent endpoints, agent skills, OpenAPI tools), agents have no standard way to answer three questions: where those capabilities exist, which to use for a task, and whether they are safe to connect to. Capability declarations are currently fragmented across protocols and scattered across domains.

ARD proposes a common layer for that discovery problem. It builds on the AI Catalog data model from the Linux Foundation’s AI Catalog Working Group, which positions it as a shared standard across the ecosystem rather than a single vendor’s format.

How does ARD work?

The spec defines two primitives.

A static catalog file. Publishers host an ai-catalog.json manifest at /.well-known/ai-catalog.json. It describes the site’s agentic resources, each entry carrying a domain-anchored identifier, a display name, a media type, and a pointer to (or embedded copy of) the capability definition, alongside optional descriptions, tags, capabilities, and representative queries. Entries can reference MCP servers, A2A agents, skills, APIs, or nested catalogs.
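To make the shape of such a manifest concrete, here is a minimal Python sketch that assembles one catalog entry and serialises it. It is an illustration only: the key names (id, name, mediaType, url, representativeQueries, resources), the media type string, and the example.com values are all assumptions chosen to mirror the fields described above, not names taken from the spec.

```python
import json


def make_entry(identifier, name, media_type, url, **optional):
    """Build one catalog entry. Key names are illustrative, not from the spec."""
    entry = {
        "id": identifier,          # domain-anchored identifier (assumed format)
        "name": name,              # display name
        "mediaType": media_type,   # media type of the capability definition
        "url": url,                # pointer to the capability definition
    }
    # Optional descriptive fields are included only when a value is supplied.
    entry.update({key: value for key, value in optional.items() if value})
    return entry


catalog = {
    "resources": [
        make_entry(
            "example.com/booking-mcp",
            "Booking tools",
            "application/x-example-mcp",  # placeholder media type
            "https://example.com/mcp",
            description="Search and reserve rooms.",
            tags=["booking", "travel"],
            representativeQueries=["book a room in Lisbon for two nights"],
        )
    ]
}

# The text a publisher would serve at /.well-known/ai-catalog.json.
catalog_json = json.dumps(catalog, indent=2)
print(catalog_json)
```

Anyone publishing a real catalog should take the field names and required values from the published spec rather than from this sketch.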

A registry API. Registries crawl and index published catalogs and expose REST endpoints (POST /search, POST /explore, GET /agents) that return ranked matches to natural-language discovery queries. The model is deliberately search-engine-like: registries are to capability catalogs what search engines are to web pages. Agents either query a registry to find capabilities or fetch a known partner’s catalog directly.
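As a rough sketch of what an agent-side discovery call might look like, the Python below builds a POST /search request with the standard library. Only the endpoint path comes from the description above; the registry host (registry.example) and the request body fields (query, limit) are hypothetical, and the request is constructed but not sent so the example runs offline.

```python
import json
import urllib.request

REGISTRY_BASE = "https://registry.example"  # hypothetical registry host


def build_search_request(query: str, limit: int = 5) -> urllib.request.Request:
    """Build a POST /search request. The body schema is an assumption."""
    body = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        url=f"{REGISTRY_BASE}/search",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


req = build_search_request("book a hotel room in Lisbon")
print(req.get_method(), req.full_url)

# Sending is deliberately left out. Against a real registry it would be:
# with urllib.request.urlopen(req) as resp:
#     matches = json.load(resp)
```

The response format, authentication, and ranking fields would all need to be taken from the spec or from the registry's own documentation.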

Beyond the well-known path, a catalog can be advertised through an Agentmap directive in robots.txt, an HTML <link rel="ai-catalog"> tag, or a DNS service binding record, mirroring how sitemaps are declared today.
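A client that wants to locate a site's catalog could combine these hints into a list of candidate URLs. The Python sketch below does that for three of the four mechanisms; the DNS service binding lookup is omitted because it needs a resolver library. The exact syntax of the robots.txt line is assumed here to be "Agentmap: <url>", by analogy with the Sitemap directive, and the function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

WELL_KNOWN_PATH = "/.well-known/ai-catalog.json"


def from_robots(robots_txt: str) -> list:
    """Collect URLs from lines of the assumed form 'Agentmap: <url>'."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "agentmap" and value.strip():
            urls.append(value.strip())
    return urls


class _LinkFinder(HTMLParser):
    """Record the href of every <link> whose rel includes 'ai-catalog'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "ai-catalog" in rels and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def from_html(html: str, page_url: str) -> list:
    finder = _LinkFinder()
    finder.feed(html)
    # Resolve relative hrefs against the page they were found on.
    return [urljoin(page_url, href) for href in finder.hrefs]


def candidate_catalog_urls(origin: str, robots_txt: str = "", html: str = "") -> list:
    """Well-known path first, then robots.txt and HTML hints, de-duplicated."""
    candidates = [urljoin(origin, WELL_KNOWN_PATH)]
    candidates += from_robots(robots_txt)
    candidates += from_html(html, origin)
    return list(dict.fromkeys(candidates))  # keeps first-seen order


urls = candidate_catalog_urls(
    "https://example.com",
    robots_txt="User-agent: *\nAgentmap: https://example.com/catalogs/main.json\n",
    html='<head><link rel="ai-catalog" href="/ai/catalog.json"></head>',
)
print(urls)
```

Trying the well-known path first is a design choice of this sketch, not an ordering the article attributes to the spec.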

How is trust established?

Because each catalog sits on its own publisher’s domain, ARD uses domain ownership as the basis for identity: the domain embedded in a resource’s identifier must align with where the catalog is served. The spec also defines an optional trustManifest carrying cryptographic workload identifiers (such as SPIFFE IDs or DIDs) and attestation objects, so an agent can verify a publisher’s identity before connecting. Runtime connections then use each capability’s native protocol.
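The domain-alignment idea can be sketched as a simple host comparison. The Python below is a deliberately strict interpretation, assuming that identifiers are either full URLs or scheme-less strings that begin with a domain (for example example.com/booking-mcp) and that "align" means an exact hostname match. Both are assumptions; the spec's actual identifier syntax and alignment rule may differ, and trustManifest verification is not covered.

```python
from urllib.parse import urlsplit


def host_of(value: str) -> str:
    """Extract a lowercase hostname from a URL or a scheme-less 'domain/path' string."""
    if "://" not in value:
        value = "//" + value  # lets urlsplit treat the leading part as the host
    return (urlsplit(value).hostname or "").rstrip(".")


def is_domain_aligned(resource_id: str, catalog_url: str) -> bool:
    """True when the identifier's host equals the host serving the catalog.

    Exact matching is an assumption of this sketch, not the spec's rule.
    """
    id_host = host_of(resource_id)
    served_host = host_of(catalog_url)
    return bool(id_host) and id_host == served_host


catalog_url = "https://example.com/.well-known/ai-catalog.json"
print(is_domain_aligned("example.com/booking-mcp", catalog_url))
print(is_domain_aligned("other.test/booking-mcp", catalog_url))
```

An agent would run a check like this before trusting a catalog entry, and reject or flag entries whose identifier names a domain other than the one that served the file.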

How does this relate to llms.txt and sitemaps?

ARD sits at a different layer from the content-discovery conventions publishers already know. llms.txt and XML sitemaps help agents and crawlers find and read content. ai-catalog.json declares what an agent can do on a site: the callable tools and services it exposes. The same read-versus-act distinction separates agentic search from agent-readiness and WebMCP. ARD is the discovery layer that sits in front of those action protocols.

What this means

For content and editorial sites, ARD changes nothing in the near term. It is relevant only to sites that already expose, or plan to expose, agent-callable capabilities: e-commerce, SaaS, booking, and tool platforms. Those sites should treat it as a watch item, and the discovery declarations (a well-known file, a robots.txt line, a <link> tag) are low-cost to add once an MCP server or agent endpoint actually exists.

Two caveats temper the announcement. First, the spec is early: it is published at v0.9 with “draft” status under an Apache 2.0 licence, and the format may change before it stabilises. Second, the breadth of the backing coalition signals intent, not adoption. No major registry or agent runtime has yet been confirmed to read ai-catalog.json at scale. The standard is worth understanding because of who is behind it, but publishing a catalog earns nothing until registries and agents consume it.
