mentionable.dev

REST Transport — Mentionable v0.1

Status: Draft v0.1 — wire-major revision (request side rewritten; §4–§8 substantially unchanged from the previous v0.1 draft). Last updated: 2026-05-06 (#313 history field, #317 wire rewrite, #322/#323 envelope-signing on response side, #324 endpoint trailing-slash guidance, #task-lifecycle, #push-webhook, #artifact-channel, #388 lazy Connector file descriptors)

This spec defines how a Mentionable agent is reached over plain HTTP — no JSON-RPC, no SDK, no out-of-band setup. A web client (or an LLM with a WebFetch tool) issues an ordinary HTTP request, receives a content-negotiated response, and is done.

The motivation: any tool that can issue an HTTP request can talk to a Mentionable agent. Browsers see the agent as a normal web page (HTML); LLM clients see it as a markdown stream; agent-to-agent traffic sees it as JSON. This is the smallest possible adoption surface — the cost to talk to an agent collapses to one fetch.

REST is the third transport, published alongside A2A (agent-to-agent JSON-RPC), Email, and ActivityPub. PolicyPart wire mappings for HTTP statuses live in policy-part-v0.1.md §4.5.


Thesis (design anchor)

Every rule needed to talk to an agent already exists in the HTTP standards. Re-using those rules verbatim wins on two axes simultaneously: efficiency (the surface this spec must define, document, and test approaches zero) and reach (any tool that speaks HTTP can speak to a Mentionable agent without learning anything new). The moment we invent grammar of our own we lose both — and weaken the answer to “why does REST exist alongside A2A?”

So the rule of this rewrite is one thing: use only mechanisms that are already in HTTP standards; do not define new ones. Everything below falls out of that rule.

What this spec does define on top of those standards is small: three semantic role labels (user, assistant, session) and a part-dispatch table that simply names what HTTP already says about each Content-Type. Everything else is HTTP.


1. Conformance terms

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL are interpreted as described in BCP 14 (RFC 2119, RFC 8174) when, and only when, they appear in all capitals.

A Mentionable agent that conforms to this transport MUST advertise the extension URI https://mentionable.dev/ns/transport-rest/v0.1 in its agent card’s A2ACapabilities.extensions array (see agent-card.md). The same extension entry MUST also carry an endpoint field whose value is the agent’s REST base URL (see §2).

Deprecated alias. During the transition window (see ../wiki/url-scheme.md and issue #503) the legacy URI https://mentionable.dev/spec/transport-rest/v0.1 is recognised as an alias on inbound only. Outbound emit MUST use the canonical /ns/ URI.


2. Address resolution

An agent’s REST endpoint is the URL the agent card advertises — this spec does not prescribe a path layout.

Concretely: the REST extension entry on the agent card carries an endpoint field. A caller resolving @<local>@<host> goes WebFinger → agent card → reads a2a.capabilities.extensions[] → finds the entry whose uri equals https://mentionable.dev/ns/transport-rest/v0.1 (or, for backward compatibility, the deprecated alias https://mentionable.dev/spec/transport-rest/v0.1) → uses that entry’s endpoint as the REST base URL. The caller issues GET <endpoint>?... and POST <endpoint> directly; no derivation, no ?to= overlay, no host-root assumption.

{
  "address": "@lean@firemanager.info",
  "a2a": {
    "capabilities": {
      "extensions": [
        {
          "uri": "https://mentionable.dev/ns/transport-rest/v0.1",
          "endpoint": "https://firemanager.info/~lean"
        }
      ]
    }
  }
}

The endpoint MUST be an absolute https: URL. The host MUST equal the canonical host derived from the agent’s WebFinger record (see webfinger.md §2 and policy-part-v0.1.md §3.2 for the procedure). Beyond that the endpoint’s path layout is the host’s free choice.

Cards declaring the REST extension URI without an endpoint field are malformed and MUST be rejected by validateAgentCard. Runtime callers consuming a card whose REST extension entry has no endpoint MUST treat the agent as “REST not available” and fall through to A2A.
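The resolution steps above can be sketched as a small lookup over an already-fetched agent card. This is a minimal sketch, not the reference implementation: the AgentCardLike shape and the resolveRestEndpoint name are hypothetical, WebFinger and card fetching are assumed to have happened already, and treating a non-https or host-mismatched endpoint as "REST not available" is an assumption (the spec only states that outcome for a missing endpoint).

```typescript
// Canonical extension URI from §1, plus the deprecated inbound-only alias.
const REST_EXTENSION_URI = "https://mentionable.dev/ns/transport-rest/v0.1";
const REST_EXTENSION_ALIAS = "https://mentionable.dev/spec/transport-rest/v0.1";

// Hypothetical minimal card shape; a real agent card carries more fields.
interface AgentCardLike {
  a2a?: { capabilities?: { extensions?: Array<{ uri?: string; endpoint?: string }> } };
}

// Returns the REST base URL, or undefined when the caller should fall through to A2A.
function resolveRestEndpoint(card: AgentCardLike, canonicalHost: string): string | undefined {
  const extensions = card.a2a?.capabilities?.extensions ?? [];
  const entry = extensions.find(
    (e) => e.uri === REST_EXTENSION_URI || e.uri === REST_EXTENSION_ALIAS,
  );
  if (!entry || !entry.endpoint) return undefined; // no entry or no endpoint: REST not available
  let url: URL;
  try {
    url = new URL(entry.endpoint); // throws on relative or malformed URLs
  } catch {
    return undefined;
  }
  if (url.protocol !== "https:") return undefined; // endpoint MUST be an absolute https: URL
  if (url.host !== canonicalHost) return undefined; // host MUST equal the canonical host (assumed handling)
  return entry.endpoint;
}
```

The caller then issues GET and POST requests directly against the returned string, with no path derivation of its own.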

Reference path layout (informative, not normative). The reference implementation publishes https://<host>/~<local> as the endpoint for the agent named <local>. The leading ~ keeps agent paths from colliding with /api, /.well-known/..., /setup, and other host-side namespaces; it also makes “is this URL an agent endpoint?” trivially answerable from the path alone. Other hosts can pick any layout that fits their site as long as the chosen URL is the value advertised in endpoint.

Trailing-slash trap (informative). A publisher whose stack 308-redirects between trailing-slash and no-trailing-slash variants of the same path MUST advertise the URL form the route handler accepts directly — clients cannot follow a 308 on POST without losing the request body (most fetch implementations downgrade to GET on the redirect, dropping the multipart body, which then lands at the receiver as a content-typeless GET and trips a 415). The reference implementation publishes the no-trailing-slash form and configures its framework to skip the slash redirect (#324). If you publish /~lean/ and your framework redirects POSTs to /~lean, multipart uploads will silently fail.


3. Request

3.1 GET form — single-turn only

The canonical form for a single-turn mention is:

GET <endpoint>?user=<text>
Accept: <one or more media types per §4>
Accept-Language: <BCP 47 list, optional>

A user query parameter MAY be repeated within the same turn — this is the standard application/x-www-form-urlencoded encoding for “more than one entry under the same key”. Each entry is dispatched per the part-dispatch rules in §3.3:

GET <endpoint>?user=hello                               # one text entry
GET <endpoint>?user=hello&user=https://example/img.png   # text + URL attachment
GET <endpoint>?user=data:image/png%3Bbase64,iVBOR…       # data: URL inline attachment

| Parameter | Required | Description |
| --- | --- | --- |
| user | REQUIRED | One or more entries that together comprise this single user turn. Per §3.3, an entry is dispatched by inspecting its body. |
| session | OPTIONAL | Opaque correlation token returned by the agent on a prior turn (see §6). |
| lang | OPTIONAL | BCP 47 language tag. Hint to the agent for response language; the binding wire signal is Accept-Language. |

Implementations MAY accept additional query parameters for transport-specific features but MUST ignore unknown parameters; agents MUST NOT make conformance depend on them.

GET is the single-turn-only entry point. An assistant query parameter, OR any structure that would express more than one user turn (the spec defines none — these would be inventions of the caller), MUST be rejected with 400 Bad Request. The 400 body SHOULD point the caller at the POST form below (“multi-turn conversations require POST multipart”).

The combined query string size MUST NOT exceed 8 KiB. Servers reject longer requests with 413 Payload Too Large (the reference implementation does so directly in the GET handler — 414 URI Too Long is reserved for genuine path-length blowups by the platform layer above). Clients with payloads exceeding this MUST use the POST form (§3.2) instead.
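The GET-side rules above (single turn only, unknown parameters ignored, 8 KiB cap) can be sketched as a pre-check that runs before the agent is invoked. The checkGetQuery name is hypothetical, and answering 400 for a missing user parameter is an assumption: the spec marks user as REQUIRED but does not name the status.

```typescript
const MAX_QUERY_BYTES = 8 * 1024; // 8 KiB cap on the combined query string

// Returns the status a GET handler would answer with before dispatching to the agent.
function checkGetQuery(rawQuery: string): 200 | 400 | 413 {
  // Count bytes rather than UTF-16 code units so non-ASCII input is measured as sent.
  if (new TextEncoder().encode(rawQuery).length > MAX_QUERY_BYTES) return 413;
  const params = new URLSearchParams(rawQuery);
  if (params.has("assistant")) return 400; // multi-turn conversations require POST multipart
  if (params.getAll("user").length === 0) return 400; // assumed status for a missing REQUIRED user
  return 200; // unknown parameters are ignored, never rejected
}
```

A handler would pass the query string without its leading "?" and, on 400, point the caller at the POST form in its response body.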

A note on ordering: application/x-www-form-urlencoded does not normatively guarantee preservation of inter-key order across the wire, only the order of entries that share the same key (URL Living Standard parser, application-defined parser order). That is exactly why GET is restricted to a single turn — multi-turn ordering would need a normative-order body, which multipart/form-data (§3.2) provides.

3.2 POST form — multi-turn, multipart/form-data

For multi-turn conversations and for any payload too large for the GET form, callers POST multipart/form-data to the same endpoint:

POST <endpoint>
Content-Type: multipart/form-data; boundary=----X
Accept: <one or more media types per §4>

------X
Content-Disposition: form-data; name="user"
Content-Type: text/plain; charset=utf-8

안녕
------X
Content-Disposition: form-data; name="history"
Content-Type: application/json

[{"role":"user","sender":{"address":"slack:T123/U456","auth_method":"none","verified":false,"profile":{"display_name":"JC","provider":"slack","provider_subject":"slack:T123/U456"}},"parts":[{"kind":"text","mime":"text/plain","content":"이전 질문"}],"timestamp":"2026-05-06T00:00:00.000Z"}]
------X
Content-Disposition: form-data; name="assistant"
Content-Type: text/plain; charset=utf-8

이전 답
------X
Content-Disposition: form-data; name="parts"
Content-Type: application/json

[{"kind":"text","mime":"text/plain","content":"현재 질문"},{"kind":"file","mime":"application/pdf","name":"report.pdf","bytes_ref":{"kind":"url","url":"https://connector.example/api/slack/files/<signed-token>","expires_at":"2026-05-06T00:05:00.000Z"},"size_bytes":12345}]
------X
Content-Disposition: form-data; name="user"
Content-Type: text/plain; charset=utf-8

현재 질문
------X
Content-Disposition: form-data; name="user"
Content-Type: image/png

<binary bytes>
------X--

The wire vocabulary for the request body is exactly five names:

| name | Meaning |
| --- | --- |
| user | A turn entry authored by the caller. Receiver-relative (see §3.4). |
| assistant | A turn entry the agent itself authored on a prior turn. Receiver-relative (see §3.4). |
| history | Optional rich prior-turn HistoricalMessage[] JSON sidecar. See §3.2.2. |
| parts | Optional typed current-turn Part[] JSON sidecar. See §3.2.1. |
| session | The opaque session token from §6, when continuing a prior conversation. At most one part with this name. |

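Multi-turn structure rides on RFC 7578 §5.2’s normative guarantee that part order is significant. Concretely, receivers walk the parts in order and group consecutive same-name user / assistant parts into turns; the terminal user run is the current incoming turn (see the receive-side algorithm below and §3.2.3).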

3.2.1 Typed current-turn parts sidecar

parts is optional. Its value is a JSON array of Part objects for the current incoming turn only. It exists for Connectors that already normalized richer platform input before dispatching over REST. The motivating case is Slack files: the Slack Connector must not forward url_private or bot tokens, but it can safely forward a Mentionable file part whose bytes_ref.url is a short-lived Connector capability URL.

When parts is present and valid, receivers use it for NormalizedMessage.parts. The normal user field is still sent and still defines the terminal turn boundary, but it is only the legacy text fallback for receivers that do not understand parts.

Receivers MUST treat parts as body data, not identity or authorization evidence. If a file.bytes_ref.url points at a provider-private URL or a URL the receiver’s policy disallows, the receiver MAY drop that part or refuse the request.

3.2.2 Rich history sidecar

history is optional. Its value is a JSON array of HistoricalMessage objects. It exists for Connector-backed group/chat surfaces where prior turns need per-speaker sender.profile, timestamps, ids, or multi-party attribution that cannot be represented by the minimal receiver-relative user/assistant fields.

The current request sender is never taken from history or any body field. REST derives the current sender only from request headers/auth channels (§7). history.sender.profile is contextual data for LLM attribution; it is not authorization evidence. history.sender.identities MAY be present only when the evidence itself is portable and verifiable by the receiver. Receivers that do not verify historical evidence SHOULD strip history.sender.identities, ignore body-supplied auth_method / key_id, and carry historical senders with verified:false.
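For receivers that do not verify historical evidence, the stripping rule above might look like the following sketch. The HistoricalSender and HistoricalMessageLike shapes are hypothetical, inferred from the example sidecar in §3.2, and resetting auth_method to "none" is one assumed way to "ignore" the body-supplied value.

```typescript
// Hypothetical shapes inferred from the example sidecar; real types may carry more fields.
interface HistoricalSender {
  address?: string;
  auth_method?: string;
  key_id?: string;
  verified?: boolean;
  identities?: unknown[];
  profile?: Record<string, unknown>;
}
interface HistoricalMessageLike {
  role: string;
  sender?: HistoricalSender;
  parts: unknown[];
  timestamp?: string;
}

// Keeps profile data for LLM attribution but drops anything that looks like evidence.
function sanitizeHistory(history: HistoricalMessageLike[]): HistoricalMessageLike[] {
  return history.map((message) => {
    if (!message.sender) return message;
    // Copy first so the caller's objects are never mutated.
    const sender: HistoricalSender = { ...message.sender, auth_method: "none", verified: false };
    delete sender.identities; // unverified evidence is stripped
    delete sender.key_id; // body-supplied key ids are ignored
    return { ...message, sender };
  });
}
```

The current request sender is untouched by this pass, since it never comes from the body in the first place.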

application/json and application/x-www-form-urlencoded POST bodies are NOT part of v0.1 and MUST be rejected with 415 Unsupported Media Type. The only POST body shape this spec defines is multipart/form-data.

Body size (MUST). Servers MUST reject bodies larger than 1 MiB with 413 Payload Too Large. Body size is counted as raw request bytes, before multipart decoding. (A per-card opt-in to a higher cap is a candidate for v0.2 — v0.1 fixes the cap at 1 MiB so every conformant client knows the budget without reading the card.)
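The two rejection rules for POST (415 for any body shape other than multipart/form-data, 413 above 1 MiB) can be sketched as a header-level gate. The checkPostHeaders name is hypothetical, and checking the media type before the size is an assumed ordering; the spec does not say which rejection wins when both apply.

```typescript
const MAX_BODY_BYTES = 1024 * 1024; // 1 MiB, counted as raw bytes before multipart decoding

// Returns the status a POST handler would answer with before reading the form.
function checkPostHeaders(contentType: string | null, rawBodyBytes: number): 200 | 413 | 415 {
  // Media types are case-insensitive; parameters such as boundary follow the first ";".
  const [first = ""] = (contentType ?? "").split(";");
  const mediaType = first.trim().toLowerCase();
  if (mediaType !== "multipart/form-data") return 415; // JSON and urlencoded bodies are not part of v0.1
  if (rawBodyBytes > MAX_BODY_BYTES) return 413;
  return 200;
}
```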

Receive-side algorithm (informative). The wire is straight Request.formData() followed by a stable group-by-name. Reference pseudocode:

const formData = await req.formData() // Web standard; native on Node 18+, Vercel, Cloudflare, browsers.
const turns: Array<{ role: 'user' | 'assistant'; parts: Array<string | File> }> = []
let cur: { role: 'user' | 'assistant'; parts: Array<string | File> } | null = null
let session: string | undefined
let history: HistoricalMessage[] | undefined
let parts: Part[] | undefined
for (const [name, value] of formData.entries()) {
  if (name === 'parts') {
    if (typeof value === 'string') parts = parseParts(value)
    continue
  }
  if (name === 'history') {
    if (typeof value === 'string') history = parseHistoricalMessages(value)
    continue
  }
  if (name === 'session') {
    if (typeof value === 'string') session = value
    continue
  }
  if (name !== 'user' && name !== 'assistant') continue // unknown names: ignore (forward-compat)
  if (!cur || cur.role !== name) {
    if (cur) turns.push(cur)
    cur = { role: name, parts: [] }
  }
  cur.parts.push(value) // string for text/* parts; File for binary parts.
}
if (cur) turns.push(cur)
// If `parts` is absent, each terminal user entry is then resolved per §3.3.

No state machine, no grammar — just formData.entries() and a single pass.

3.2.3 History — prior turns on the wire

A multi-turn POST carries prior conversation as additional name="user" / name="assistant" parts BEFORE the terminal name="user" part. These fields are the simple transcript fallback and preserve curl-friendly interop. When a valid history sidecar is also present, receivers SHOULD use the sidecar for NormalizedMessage.history because it preserves sender/profile attribution that receiver-relative user / assistant fields cannot express:

------X
Content-Disposition: form-data; name="user"

이전 질문
------X
Content-Disposition: form-data; name="assistant"

이전 답
------X
Content-Disposition: form-data; name="user"

현재 질문   ← terminal user run = current incoming turn
------X--

Receivers reconstruct the conversation by walking parts in order and grouping consecutive same-name parts into turns (see the receive-side algorithm above). The terminal user run is always the current incoming turn; everything before it is history. Agents project that history onto NormalizedMessage.history for the runtime / LLM.

History parts MUST be text-kind only (multimodal history is the A2A escape hatch — clients that need to re-send images / files in history use A2A instead). Empty parts (zero-byte text) MUST be dropped before emission so the receiver’s part-grouping pass doesn’t collapse adjacent same-name turns over an empty boundary.

There is no MUST cap on history length per turn — the 1 MiB body cap (§3.2) is the operative constraint.
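On the sending side, the history rules above reduce to appending fields in conversation order and skipping empty text. The sketch below assumes a hypothetical PriorTurn shape and buildConversationForm helper; it relies only on the standard FormData API, which preserves append order.

```typescript
// Hypothetical caller-side representation of a prior text-only turn.
interface PriorTurn {
  role: "user" | "assistant";
  text: string;
}

function buildConversationForm(prior: PriorTurn[], current: string, session?: string): FormData {
  const form = new FormData();
  if (session !== undefined) form.append("session", session); // at most one session part
  for (const turn of prior) {
    if (turn.text.length === 0) continue; // empty parts MUST be dropped before emission
    form.append(turn.role, turn.text); // history parts are text-kind only
  }
  form.append("user", current); // terminal user run = current incoming turn
  return form;
}
```

Passing the result as the body of a fetch POST lets the runtime generate the multipart boundary and Content-Type header. If the last prior turn is itself a user turn, it merges into the terminal user run on the receiver, so callers would normally end their history on an assistant turn.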

3.3 Part dispatch

Each part is dispatched by inspecting its Content-Type header (a standard HTTP header on every multipart part, defaulting to text/plain when absent per RFC 7578 §4.4) and, for text parts, the leading characters of the body. Every rule below cites the standard it borrows from — this spec adds nothing.

| Condition | Interpretation |
| --- | --- |
| Content-Type: text/* AND the body starts with data: | An RFC 2397 data URL. Fetch it (built-in fetch() resolves data: natively). |
| Content-Type: text/* AND the body starts with http:// or https:// | An RFC 9110 URL reference. Fetch it. |
| Content-Type: text/* AND neither of the above | The body is the text content of this part. |
| Content-Type: <anything not text/*> (e.g. image/png, application/pdf, audio/*, …) | The body is the raw bytes of an attachment with the declared media type. |

For GET, the same dispatch applies to each user query value — strings starting with data: or http(s): are fetched, everything else is text. There is no notion of “binary GET part”; rich-content GETs that need a binary attachment carry it via data: or http(s):.
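The dispatch table can be expressed as one small classifier. This is a sketch under stated assumptions: the Dispatch labels and the dispatchPart name are hypothetical, the body is passed as a string (for non-text parts it is ignored), and prefix matching is done literally and case-sensitively as the table words it.

```typescript
type Dispatch = "data-url" | "http-url" | "text" | "attachment";

function dispatchPart(contentType: string | undefined, body: string): Dispatch {
  // A part without a Content-Type header defaults to text/plain (RFC 7578 §4.4).
  const [first = ""] = (contentType ?? "text/plain").split(";");
  const mediaType = first.trim().toLowerCase();
  if (!mediaType.startsWith("text/")) return "attachment"; // raw bytes with the declared type
  if (body.startsWith("data:")) return "data-url"; // RFC 2397 data URL, to be fetched
  if (body.startsWith("http://") || body.startsWith("https://")) return "http-url"; // to be fetched
  return "text";
}
```

For GET, each user query value would be classified with contentType left undefined, which yields only the three text-side outcomes.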

Clients MAY attach an RFC 9530 Repr-Digest (or, prior to 9530, Digest) header to a part as an immutability hint — useful for caches that want stable keys across re-uploads. Servers MAY use it; this spec does not require either side to.

3.4 Receiver-relative roles (informative)

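user and assistant are LLM-API-style placeholders, evaluated relative to the receiving agent: user marks a turn entry authored by the caller, and assistant marks a turn entry the receiving agent itself authored on a prior turn (see the name table in §3.2).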

This matches the placement convention of the major LLM APIs (Anthropic messages[], OpenAI messages[]) and lets a receiving agent map the wire shape to its model input with no role translation.

Multi-agent topologies (group threads, orchestrators, agent-to-agent forwarding) are expressed as the orthogonal product of (1) these receiver-relative wire roles and (2) the standard HTTP authentication channels in §7 — not by overloading user/assistant with sender metadata. The body wire never carries identity.

3.5 No other methods

PUT, PATCH, DELETE are not part of v0.1 and MUST be rejected with 405 Method Not Allowed on every endpoint, including task sub-resources. HEAD and OPTIONS follow standard HTTP semantics (HEAD returns the same headers as a GET would, with no body; OPTIONS returns the supported methods).

Task sub-resources introduced in §5.4–§5.6 use only GET and POST:

| Sub-resource | Allowed methods |
| --- | --- |
| GET /tasks/{id} | GET |
| POST /tasks/{id}/webhook | POST |
| GET /tasks/{id}/artifacts/{artifactId} | GET |

PUT, PATCH, and DELETE on any of these paths MUST also be rejected with 405.


4. Content negotiation

The response media type is selected by the request’s Accept header per RFC 9110 §12.5.1. Agents MUST support text/html and text/markdown; SHOULD support application/json and text/event-stream. The highest-quality match wins.

| Media type | Use | Default? | Required |
| --- | --- | --- | --- |
| text/html | Browser. Wraps the response in a minimal HTML skeleton with the agent’s name and a stylable container. The body is rendered markdown. | YES: when Accept is missing, */*, or text/html is the highest-quality match. | MUST |
| text/markdown | LLM clients via WebFetch. Pure markdown stream, no HTML wrapping. | When text/markdown is the highest-quality match. | MUST |
| application/json | Agent-to-agent. Full normalized response per §5.3. | When application/json is the highest-quality match. | SHOULD |
| text/event-stream | Streaming. Server-Sent Events carrying incremental tokens (data: <markdown chunk>). | When text/event-stream is the highest-quality match. | SHOULD |

The two MUST media types (text/html and text/markdown) cover the load-bearing cases — the human-via-browser path and the LLM-client-via-WebFetch path. Agents that don’t expose typed parts (e.g. simple echo agents) have no benefit emitting application/json and need not.

No-Accept default. If no Accept header is present, agents MUST treat the request as Accept: text/html, */*;q=0.5. This makes “paste the URL into a browser” the default user experience.

No matching media type. When the request’s Accept header lists only media types the agent does not support, the agent MUST respond with 406 Not Acceptable per RFC 9110 §15.5.7. The 406 response body is informational; clients SHOULD retry without an Accept header to receive the default text/html.

Library guidance (informative). Conformance to RFC 9110 §12.5.1 covers q-value sorting, wildcards (*/*, text/*), and parameter matching — non-trivial to hand-roll. The reference implementation uses negotiator (Node.js, used by Express). Other implementations MAY use any RFC 9110-compliant negotiator; correctness is verified by the test vectors shipped alongside this spec.
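To make the selection rules concrete, here is a deliberately simplified hand-rolled sketch; it is not the negotiator library and not a full RFC 9110 §12.5.1 implementation. It handles q-values, type/* and */* wildcards, the no-Accept default, and the 406 case, but ignores media-type parameters other than q. The negotiate name, the SUPPORTED order, and treating an empty Accept value as missing are assumptions.

```typescript
// Order matters: on equal q-values the earlier entry wins, so text/html is the tie-break default.
const SUPPORTED = ["text/html", "text/markdown", "application/json", "text/event-stream"];

// Returns the chosen media type, or null when the agent should answer 406 Not Acceptable.
function negotiate(accept: string | null, supported: string[] = SUPPORTED): string | null {
  // No-Accept default from this section.
  const header = accept !== null && accept.trim() !== "" ? accept : "text/html, */*;q=0.5";
  const ranges = header.split(",").map((item) => {
    const [range = "", ...params] = item.split(";").map((s) => s.trim().toLowerCase());
    const qParam = params.find((p) => p.startsWith("q="));
    const q = qParam ? Number(qParam.slice(2)) : 1;
    return { range, q: Number.isFinite(q) ? q : 0 };
  });
  let best: { type: string; q: number } | null = null;
  for (const type of supported) {
    const [major = ""] = type.split("/");
    // The most specific matching range decides the q-value for this type.
    const match =
      ranges.find((r) => r.range === type) ??
      ranges.find((r) => r.range === `${major}/*`) ??
      ranges.find((r) => r.range === "*/*");
    if (!match || match.q <= 0) continue; // unlisted or explicitly refused (q=0)
    if (best === null || match.q > best.q) best = { type, q: match.q };
  }
  return best ? best.type : null;
}
```

A production agent would more likely delegate to an RFC 9110-compliant negotiator and check it against the shipped test vectors, as the guidance above suggests.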

4.1 The browser default (text/html)

The HTML response is a minimal, stylable page:

<!doctype html>
<html lang="<response language>">
  <head>
    <meta charset="utf-8" />
    <title>@<local>@<host> — Mentionable</title>
    <link rel="alternate" type="text/markdown" href="<same URL>" />
    <link rel="alternate" type="application/json" href="<same URL>" />
    <meta name="mentionable:agent" content="@<local>@<host>" />
    <style>
      /* agent-chosen, optional */
    </style>
  </head>
  <body>
    <main class="mentionable-response">
      <header>...</header>
      <article><!-- rendered markdown of the agent's reply --></article>
    </main>
  </body>
</html>

Required elements:

Markdown renderer (informative). The reference implementation uses marked for CommonMark+GFM rendering. Other implementations MAY use any CommonMark+GFM-compliant renderer; the wire-visible output (HTML escaping, table structure, autolink behavior) is what conformance tests against.

Optional but RECOMMENDED:

4.2 The LLM-client default (text/markdown)

When the client requests text/markdown (highest quality), the response body is the agent’s markdown verbatim — no HTML, no JSON envelope, no preamble. Headers carry transport metadata:

HTTP/1.1 200 OK
Content-Type: text/markdown; charset=utf-8
Content-Language: <BCP 47>
X-Mentionable-Agent: @<local>@<host>
X-Mentionable-Session: <opaque>          (when the agent sets one; see §6)

This is the path optimized for LLM clients that already have a WebFetch tool. The LLM can treat the response identically to any other markdown URL it might fetch.

4.3 Streaming (text/event-stream)

When the client requests text/event-stream, the response is SSE per WHATWG HTML §9.2:

HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache
X-Mentionable-Agent: @<local>@<host>

data: First chunk of markdown
data:
data: Next chunk

event: end
data: {}

Each data: line carries a markdown fragment (no JSON wrapping). The terminal frame is event: end, whose data field is an empty JSON object (data: {}). Agents emitting a PolicyPart mid-generation MUST emit it as the last event before event: end using event: policy with the canonical PolicyPart JSON as the data field.

Agents MAY also emit zero or more event: tool_call frames between data: frames (and before the terminal event: policy / event: end) to carry structured tool_call parts:

event: tool_call
data: {"v":"v0.1","part":{"kind":"tool_call","id":"call_1","name":"search","args":{"q":"hello"}}}

event: tool_call
data: {"v":"v0.1","part":{"kind":"tool_call","id":"call_1","name":"search","args":{"q":"hello"},"result":{"hits":3}}}

Tool calls are non-terminal — agents emit one frame per tool_call part, and later frames MAY repeat the same id with an updated result/error once the tool resolves (matching A2A’s monotonic streaming semantics). Receivers MUST track tool calls keyed by id and overwrite earlier values. The data envelope is RFC 8785-canonical { "v": "v0.1", "part": <ToolCallPart> } so receivers parse it identically to the policy frame.

event: policy
data: {"v":"v0.1","part":{"kind":"payment_required", ...}}

event: end
data: {}
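The overwrite-by-id rule for tool_call frames can be sketched on the receiving side as a map update. The ToolCallPart shape below is a hypothetical reduction of the fields seen in the example frames, and applyToolCallFrame is an illustrative name; SSE line parsing is assumed to have already produced the data field as a string.

```typescript
// Hypothetical shape covering only the fields visible in the example frames.
interface ToolCallPart {
  kind: "tool_call";
  id: string;
  name: string;
  args?: unknown;
  result?: unknown;
  error?: unknown;
}

// Applies the data field of one event: tool_call frame to the receiver's state.
function applyToolCallFrame(calls: Map<string, ToolCallPart>, data: string): void {
  const envelope = JSON.parse(data) as { v?: string; part?: ToolCallPart };
  const part = envelope.part;
  if (!part || part.kind !== "tool_call") return; // not a tool_call envelope
  calls.set(part.id, part); // later frames with the same id overwrite earlier values
}
```

After the stream ends, the map holds one entry per tool call in its latest known state.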

When an SSE response carries a PolicyPart, the HTTP status remains 200 OK regardless of policy.kind. The policy is signaled in the event: policy frame. This is an explicit exception to PolicyPart §4.5’s HTTP-status mapping: SSE has already committed 200 OK by the time the policy is emitted, so the status code cannot change. For non-streaming responses, PolicyParts use the HTTP status mapping in policy-part-v0.1.md §4.5.

Streaming from non-streaming agents. If the agent returns a single Promise<NormalizedResponse> (not an AsyncIterable) and the client requested text/event-stream, the adapter MUST emit the concatenated parts[].text as a single data: frame, followed by one event: tool_call frame per tool_call part (in source order), then event: end. Streaming-only agents are not required for SSE conformance.
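The adapter rule for non-streaming agents can be sketched as a pure serializer. Several details are assumptions rather than spec text: the ResponsePart shape and toSseBody name are hypothetical, text parts are concatenated with no separator, multi-line text is emitted as one event with one data: line per line (as SSE framing requires), and plain JSON.stringify stands in for RFC 8785 canonicalization.

```typescript
// Hypothetical reduction of the normalized response parts to the two kinds used here.
type ResponsePart =
  | { kind: "text"; text: string }
  | { kind: "tool_call"; id: string; name: string; args?: unknown; result?: unknown };

function toSseBody(parts: ResponsePart[]): string {
  const frames: string[] = [];
  const text = parts.map((p) => (p.kind === "text" ? p.text : "")).join("");
  if (text.length > 0) {
    // A single event; each line of the payload gets its own data: line.
    frames.push(text.split("\n").map((line) => `data: ${line}`).join("\n"));
  }
  for (const part of parts) {
    if (part.kind === "tool_call") {
      // One frame per tool_call part, in source order.
      frames.push(`event: tool_call\ndata: ${JSON.stringify({ v: "v0.1", part })}`);
    }
  }
  frames.push("event: end\ndata: {}"); // terminal frame
  return frames.join("\n\n") + "\n\n"; // a blank line terminates each event
}
```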

4.4 The agent-to-agent default (application/json)

When the client requests application/json (highest quality), the response body is the full normalized response:

{
  "v": "v0.1",
  "agent": "@<local>@<host>",
  "session": "<opaque>",
  "parts": [
    { "kind": "text", "text": "Hello!" }
    /* optional further parts */
  ]
}

This is the form most useful for agent-to-agent flows, where the consumer wants typed parts (tool_call, link, file, etc.) rather than rendered text. PolicyParts in this mode appear as a policy field at the top level instead of inside parts:

{
  "v": "v0.1",
  "agent": "@<local>@<host>",
  "policy": {
    "kind": "payment_required",
    "message": "...",
    "accepted_payments": [...]
  }
}

The HTTP status of a JSON response carrying policy MUST match the policy.kind per the table in policy-part-v0.1.md §4.5; the body’s policy field is the canonical surface, the HTTP status is the additional signal for clients that branch on it before parsing.


5. Response

5.1 Status semantics

200 OK indicates a successful response containing the agent’s reply. 204 No Content is reserved (agents MAY use it for fire-and-forget acknowledgements; v0.1 clients MUST handle it as “no reply”).

Refusal responses use the HTTP status table in policy-part-v0.1.md §4.5 — 401, 402, 403, 429, 451, 503. The body of a refusal response MUST also be content-negotiated: HTML clients see a styled refusal page, markdown clients see the refusal text, JSON clients see the structured PolicyPart.

The Content-Language header MUST be set on every response to the language actually used in the body, regardless of media type.

5.2 Required response headers

Every response (success or refusal) MUST include:

Agents MAY also set:

(An X-Mentionable-Policy response header was floated as a structured-form sidecar to the body; it is not implemented in the reference and is removed from v0.1. Clients parse the body’s PolicyPart per §5.3. Restoring a header sidecar is a candidate for v0.2 if the parse cost becomes a bottleneck.)

5.3 Response shape per content type

| Accept | Status 2xx body | Status 4xx/5xx body |
| --- | --- | --- |
| text/html | HTML skeleton (§4.1) wrapping rendered markdown of parts[].text. | HTML skeleton wrapping the PolicyPart message and an optional action link. |
| text/markdown | Concatenated parts[].text (markdown). | The PolicyPart message, followed by the URL on its own line if url is set. |
| application/json | Full normalized response (§4.4). | { v, agent, policy } with the canonical PolicyPart. |
| text/event-stream | SSE data: frames + terminal event: end (§4.3). | event: policy frame followed by event: end; HTTP status remains 200 OK. |

5.4 Async response (#346)

For agents whose work spans tens of seconds — long generation, deep research, video synthesis — the synchronous request/response pattern of §3 hits HTTP timeouts in practice. v0.1 introduces an opt-in async response pattern using only standard HTTP mechanisms (RFC 7240 Prefer header + RFC 9110 §15.3.3 202 Accepted).

Client-side opt-in. The client signals async preference via the standard Prefer: respond-async request header, augmented with a callback parameter naming a URL the server SHOULD POST the completed task to:

POST <endpoint>
Prefer: respond-async; callback="https://caller.example/api/agent-response"
Content-Type: multipart/form-data; boundary=----X

Server-side response. When the server honors the request:

HTTP/1.1 202 Accepted
Content-Location: /tasks/01J…
X-Mentionable-Agent: @<local>@<host>

Server-side completion POST. When the agent finishes, the server POSTs the completed task envelope to the caller-supplied callback URL:

POST https://caller.example/api/agent-response
Content-Type: application/json

{
  "id": "01J…",                 // task identifier — matches A2A `Task.id`
  "contextId": "<uuid>",        // conversation/session id, when known
  "status": {
    "state": "completed",       // or "failed"
    "timestamp": "<RFC 3339>",
    "message": {
      "kind": "message",
      "role": "agent",
      "parts": [
        { "kind": "text", "text": "<agent reply>", "mime": "text/plain" }
      ]
    }
  }
}

The body shape mirrors A2A’s task push notification body (see a2a-agent-card.md §3) so a single receiver can handle both A2A and REST async completions without branching on transport. The primary task identifier is id — the same key A2A uses. Receivers MAY accept the legacy taskId key as a back-compat fallback for one release.

Same-origin enforcement. The callback URL MUST be same-origin with the inbound request (host comparison against the request’s Host / Origin header) and MUST NOT point at a private / loopback / link-local address (RFC 1918, 127.0.0.0/8, 169.254.0.0/16, IPv6 ::1, fc00::/7, fe80::/10, etc.). Non-conforming callbacks SHOULD be silently rejected with the request falling back to the synchronous path. Without this guard the Prefer: respond-async; callback=… header becomes an SSRF foothold — an attacker could hand the agent a third-party URL and have the server POST receipts there.
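A server-side guard for the callback parameter might look like the sketch below. It is illustrative only: isPrivateHost and isCallbackAllowed are hypothetical names, requestHost stands for whatever host the server derives from the inbound Host / Origin header, requiring https: and rejecting the name localhost are assumptions, and only IP literals are inspected (a hostname that resolves to a private address would need a DNS-time check this sketch omits).

```typescript
// True for loopback, RFC 1918, link-local, and unique-local IP literals.
function isPrivateHost(hostname: string): boolean {
  const host = hostname.toLowerCase();
  if (host === "localhost") return true; // assumed extra guard, not listed in the spec
  if (host.startsWith("[")) {
    const v6 = host.slice(1, -1); // URL.hostname keeps the brackets around IPv6 literals
    // ::1 loopback, fc00::/7 unique-local, fe80::/10 link-local
    return v6 === "::1" || /^f[cd][0-9a-f]{2}:/.test(v6) || /^fe[89ab][0-9a-f]:/.test(v6);
  }
  const octets = host.split(".").map(Number);
  if (octets.length !== 4 || octets.some((o) => !Number.isInteger(o) || o < 0 || o > 255)) {
    return false; // not an IPv4 literal
  }
  const [a = 0, b = 0] = octets;
  return (
    a === 10 || a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

function isCallbackAllowed(callback: string, requestHost: string): boolean {
  let url: URL;
  try {
    url = new URL(callback);
  } catch {
    return false; // relative or malformed URL
  }
  if (url.protocol !== "https:") return false; // assumed: callbacks must be https:
  if (isPrivateHost(url.hostname)) return false;
  return url.host.toLowerCase() === requestHost.toLowerCase(); // same-origin host comparison
}
```

When the guard returns false, the server would drop the callback silently and continue on the synchronous path.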

Conformance. Server support for async responses is OPTIONAL in v0.1. A server that does not support async responses MUST ignore the Prefer: respond-async directive and MAY respond synchronously (per RFC 7240 §2 — Prefer is advisory). Clients that strictly require async behavior SHOULD additionally check the agent card’s pushNotifications: true capability before issuing the request.

Failure handling. When the agent fails to deliver to the callback URL (network failure, 4xx/5xx), it SHOULD log the error and abandon the task — there is no automatic retry. Clients SHOULD poll Content-Location (when the server publishes a task-fetch endpoint) or treat absence of a callback within their own timeout as a failure.

5.4.1 Task polling — GET /tasks/{id}

The Content-Location header returned with 202 Accepted is the stable polling URL for the task. Clients SHOULD poll it when they have not received a callback within a reasonable timeout, or when no callback was supplied.

GET /tasks/01J…
Accept: application/json

Response — task still running:

HTTP/1.1 202 Accepted
Content-Type: application/json
Content-Location: /tasks/01J…

{
  "id": "01J…",
  "contextId": "<uuid>",
  "status": {
    "state": "working",
    "timestamp": "<RFC 3339>"
  }
}

Response — task complete:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "id": "01J…",
  "contextId": "<uuid>",
  "status": {
    "state": "completed",
    "timestamp": "<RFC 3339>",
    "message": {
      "kind": "message",
      "role": "agent",
      "parts": [
        { "kind": "text", "text": "<agent reply>", "mime": "text/plain" }
      ]
    }
  }
}

The status.state field MUST be one of:

| state | HTTP status | Meaning |
| --- | --- | --- |
| working | 202 | Agent is still processing. |
| completed | 200 | Agent finished successfully; status.message set. |
| failed | 200 | Agent finished with an error; status.message set with error detail. |

Task not found or expired. Tasks expire 1 hour after creation. A GET /tasks/{id} for an unknown or expired task MUST return 404 Not Found:

HTTP/1.1 404 Not Found
Content-Type: application/json

{ "error": "task not found or expired" }

Servers that do not implement the task polling endpoint MUST return 404 (not 405), so that clients can distinguish “this server does not expose task polling” from “this method is wrong.”

Polling interval (informative). Clients SHOULD wait at least 2 seconds between polls and apply exponential back-off. A maximum polling window of 10 minutes before treating the task as lost is RECOMMENDED.
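The polling guidance can be sketched as a back-off schedule plus a loop over it. The doubling factor, the pollDelays and pollTask names, and the injected sleep function are assumptions; the spec only asks for at least 2 seconds between polls, exponential back-off, and a roughly 10-minute window.

```typescript
// Delays in milliseconds whose running total stays within the polling window.
function pollDelays(baseMs = 2_000, factor = 2, windowMs = 600_000): number[] {
  const delays: number[] = [];
  let elapsed = 0;
  let delay = baseMs;
  while (elapsed + delay <= windowMs) {
    delays.push(delay);
    elapsed += delay;
    delay *= factor;
  }
  return delays;
}

// Polls the Content-Location URL until the task leaves the working state.
async function pollTask(url: string, sleep: (ms: number) => Promise<void>): Promise<unknown> {
  for (const delay of pollDelays()) {
    const res = await fetch(url, { headers: { Accept: "application/json" } });
    if (res.status === 200) return res.json(); // completed or failed: envelope is final
    if (res.status === 404) throw new Error("task not found or expired");
    await sleep(delay); // 202 (or anything else): treat as still working and back off
  }
  throw new Error("task lost: polling window exceeded");
}
```

With the defaults the schedule is 2 s, 4 s, 8 s, and so on up to 256 s, for a total of 510 s of waiting before the task is treated as lost.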


5.5 Push webhook registration

For callers that cannot receive inbound HTTP (e.g. serverless functions behind NAT) or that want durable push beyond a single task callback, servers MAY support webhook registration on a task. Server support is OPTIONAL — servers that do not support it MUST return 404 Not Found on the registration endpoint so callers can detect the absence cleanly.

Register a webhook:

POST /tasks/01J…/webhook
Content-Type: application/json

{
  "url": "https://caller.example/api/agent-events",
  "secret": "<optional random string>"
}

| Field | Required | Description |
| --- | --- | --- |
| url | REQUIRED | Absolute https: URL the server will POST state-change events to. |
| secret | OPTIONAL | Arbitrary string used to compute X-Hub-Signature-256 on each delivery (see below). Recommended. |

Success response:

HTTP/1.1 204 No Content

Task not found:

HTTP/1.1 404 Not Found

Server push on state change. When the task’s state changes (e.g. working → completed or working → failed), the server POSTs the full task envelope (same shape as §5.4.1’s 200 response body) to the registered url:

POST https://caller.example/api/agent-events
Content-Type: application/json
X-Hub-Signature-256: sha256=<hex>

{
  "id": "01J…",
  "contextId": "<uuid>",
  "status": { ... }
}

Signature. When a secret was supplied at registration, the server MUST set X-Hub-Signature-256 to sha256= followed by the lowercase hex HMAC-SHA-256 of the raw request body, keyed with the registration secret. This follows the same convention as GitHub webhooks (GitHub docs — Validating webhook deliveries). Receivers MUST verify this signature before acting on the payload when a secret was registered.
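The signature rule can be sketched with Python's standard library as follows. The function names are illustrative, and encoding the secret as UTF-8 is an assumption: the spec only says the HMAC is keyed with the registration secret.

```python
import hashlib
import hmac


def sign_body(secret: str, raw_body: bytes) -> str:
    """Return the X-Hub-Signature-256 value for a delivery body."""
    # hexdigest() is lowercase hex, as the signature rule requires.
    digest = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret: str, raw_body: bytes, header_value: str) -> bool:
    """Check a received header against the raw (unparsed) request body."""
    expected = sign_body(secret, raw_body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

A receiver would call `verify_signature` on the exact bytes it read from the socket, before JSON parsing; re-serializing the parsed body can change whitespace and break the comparison.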

SSRF guard. The same SSRF rules as §5.4 callback apply: the webhook url MUST be an https: URL, MUST NOT resolve to a private/loopback/link-local address (RFC 1918, 127.0.0.0/8, 169.254.0.0/16, ::1, fc00::/7, fe80::/10), and MUST NOT be same-host with the Mentionable server itself. Non-conforming URLs MUST be rejected with 400 Bad Request.
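One way a server might apply the guard at registration time is sketched below. The function name and its parameters are hypothetical; DNS resolution is left to the caller, who passes in the addresses the host resolved to. Rejecting IPv4-mapped IPv6 forms and hosts with no resolved address are extra precautions assumed here, not requirements stated by this spec.

```python
import ipaddress
from urllib.parse import urlsplit

# Address ranges named in the SSRF guard.
BLOCKED_NETWORKS = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918
    "127.0.0.0/8", "169.254.0.0/16",                   # loopback, link-local
    "::1/128", "fc00::/7", "fe80::/10",                # IPv6 equivalents
)]


def webhook_url_allowed(url: str, resolved_ips: list, server_host: str) -> bool:
    """Return True if `url` passes the guard; False means answer 400."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        return False
    if parts.hostname.lower() == server_host.lower():
        return False  # same-host with the Mentionable server itself
    if not resolved_ips:
        return False  # assumption: an unresolvable host is rejected
    for ip_text in resolved_ips:
        ip = ipaddress.ip_address(ip_text)
        if ip.version == 6 and ip.ipv4_mapped is not None:
            ip = ip.ipv4_mapped  # check ::ffff:a.b.c.d as its IPv4 address
        if any(ip in net for net in BLOCKED_NETWORKS):
            return False
    return True
```

The check should run against the addresses actually used for the outbound connection; resolving once for validation and again for delivery leaves room for DNS rebinding.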

Delivery semantics. Delivery is best-effort. The server SHOULD attempt the POST once; there is no mandatory retry. Receivers SHOULD respond with 2xx promptly; non-2xx responses MAY be logged by the server but do not trigger retries in v0.1.


5.6 Artifact channel

Long-running or compute-heavy tasks often produce binary artifacts (images, PDFs, audio, generated files). Rather than inline these as data: URIs in the task envelope, servers MAY host artifacts at stable URLs and reference them by pointer.

Artifact reference in task envelope. When a completed task includes artifacts, they appear as kind: "file" parts inside status.message.parts, with a url field pointing at the artifact endpoint:

{
  "id": "01J…",
  "status": {
    "state": "completed",
    "message": {
      "kind": "message",
      "role": "agent",
      "parts": [
        { "kind": "text", "text": "Here is the rendered chart.", "mime": "text/plain" },
        {
          "kind": "file",
          "id": "artifact-abc123",
          "mime": "image/png",
          "size": 204800,
          "url": "/tasks/01J…/artifacts/artifact-abc123"
        }
      ]
    }
  }
}

Fetching an artifact:

GET /tasks/01J…/artifacts/artifact-abc123

The response is content-negotiated on the request’s Accept header:

| Accept | Response |
|---|---|
| application/json | 200 OK with Content-Type: application/json and body { "id", "mime", "size", "url" } |
| anything else | 200 OK with Content-Type: <artifact mime> and the raw artifact bytes as the body |

Immutability. Artifacts are immutable once created — the same artifactId always returns the same bytes. Servers SHOULD set Cache-Control: public, max-age=3600 (matching the 1-hour TTL) and ETag on artifact responses to enable client-side caching.
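A server-side sketch of the negotiation and caching rules for an artifact response. The function name, the simplified Accept parsing (media types only, q-values ignored), and the choice of a SHA-256 content hash as the ETag are assumptions made for illustration.

```python
import hashlib
import json


def artifact_response(accept: str, artifact_id: str, mime: str,
                      data: bytes, url: str):
    """Build (status, headers, body) for GET /tasks/{id}/artifacts/{artifactId}."""
    # Artifacts are immutable, so a content hash is a stable validator.
    etag = '"' + hashlib.sha256(data).hexdigest() + '"'
    headers = {"Cache-Control": "public, max-age=3600", "ETag": etag}

    # Simplified parsing: strip parameters such as ";q=0.8" from each entry.
    media_types = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    if "application/json" in media_types:
        headers["Content-Type"] = "application/json"
        meta = {"id": artifact_id, "mime": mime, "size": len(data), "url": url}
        body = json.dumps(meta).encode("utf-8")
    else:
        headers["Content-Type"] = mime
        body = data  # raw artifact bytes
    return 200, headers, body
```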

TTL. Artifacts share the 1-hour TTL of their parent task. After expiry, GET /tasks/{id}/artifacts/{artifactId} MUST return 404 Not Found (same as the expired-task response in §5.4.1).

Servers without artifact support. A server that supports async responses (§5.4) but not artifacts MUST return 404 on artifact URLs, NOT 405, so callers can distinguish “no artifact support” from “wrong method.”

Security. Artifact URLs are task-scoped. Servers MUST NOT serve an artifact at a URL that does not include the parent taskId — i.e., there is no global /artifacts/{id} endpoint. This ensures that access to an artifact requires knowing the task ID, which is treated as a capability token. Servers MUST NOT embed user-supplied content (e.g., filenames from the request) in Content-Disposition headers without sanitizing for filename* injection (RFC 6266).


6. Multi-turn (sessions)

REST is stateless per request. Agents MAY support multi-turn conversation via an opaque session token.

When an agent wants to enable follow-up, the response includes:

The client passes the same session value back on the next request — as a ?session= query parameter on GET, or as a name="session" part on POST. Sessions are agent-private; clients MUST treat the value as opaque.

Session lifetime is the agent’s choice. The agent MAY expire sessions (returning 404 or treating the request as a fresh turn); clients MUST handle this gracefully by starting a new conversation.

Sessions are not authentication. A session token MUST NOT be used as an authorization credential. Agents that need authentication issue 401 with WWW-Authenticate per policy-part-v0.1.md §3.3.1.

REST is intentionally not an OAuth callback target in v0.1 — clients that need consent or payment flows redirect to the agent’s own page (the URL in consent_required.url or payment_required body) and complete the flow there. The session machinery is for “ask follow-up question,” not for “complete an OAuth handoff.”


7. Authentication and identity

7.1 Anonymous requests

The default REST request is anonymous — no Authorization header, no client certificate. Agents that wish to serve anonymous mentions accept the request and treat the Sender as { address: '', auth_method: 'none', verified: false } per normalized-message.md. PolicyPart unauthorized (§4.5) is the wire signal for “I need to know who you are.”

7.2 Existing HTTP authentication channels

REST identity proofs SHOULD ride on existing HTTP authentication channels first:

The token or signature binding to a Mentionable address is agent-defined in v0.1; richer REST auth profiles are future work. After a REST transport module verifies one of these channels, it MAY surface the normalized result as NormalizedMessage.sender.identities.

7.3 Forwarded identity evidence

REST callers that need to forward an already-normalized evidence array MAY send:

Mentionable-Identity-Evidence: <base64url(JSON array of IdentityEvidence)>

The shape is defined in identity-evidence-v0.1.md. This header is a carrier for normalized evidence, not the primary REST authentication mechanism. Receivers MUST structurally validate entries and MUST verify any portable proof before attaching them to NormalizedMessage.sender.identities. Malformed or unverifiable entries are ignored.
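The header encoding can be sketched as below. The entry fields are defined in identity-evidence-v0.1.md and are not reproduced here; the sketch treats each entry as an opaque JSON object. Omitting base64 padding on encode and tolerating it on decode are assumptions, since this section does not say whether padding is used. Structural validation here is only "a list of objects"; proof verification is a separate step.

```python
import base64
import binascii
import json


def encode_evidence(entries: list) -> str:
    """Serialize an IdentityEvidence array for Mentionable-Identity-Evidence."""
    raw = json.dumps(entries, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_evidence(header_value: str) -> list:
    """Decode the header; malformed input yields an empty list (ignored)."""
    padded = header_value + "=" * (-len(header_value) % 4)
    try:
        parsed = json.loads(base64.urlsafe_b64decode(padded))
    except (binascii.Error, ValueError):
        return []
    if not isinstance(parsed, list):
        return []
    # Keep only entries that are JSON objects; the rest are ignored.
    return [entry for entry in parsed if isinstance(entry, dict)]
```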

Public REST endpoints MUST NOT trust unsigned transport evidence merely because it arrived in this header. They SHOULD only accept forwarded evidence when the request itself is authenticated to a trusted Connector/gateway via §7.2, or when each evidence proof is independently verified (for example signed-attestation with audience/freshness checks). If verification requires fetching a Connector Card named by caller-provided evidence, the receiver MUST first match the issuer/Connector host against local Trusted Connector Issuer policy and MUST reject private, loopback, or non-HTTPS Connector Card hosts. The agent’s identity policy then decides whether the issuer/method/assurance is acceptable for the requested purpose.

Mentionable-Identity and X-Mentionable-Identity are legacy aliases for Mentionable-Identity-Evidence.

When the agent rejects an anonymous request, the response is 401 Unauthorized with a WWW-Authenticate header reconstructed from the PolicyPart’s auth_challenges[] (see policy-part-v0.1.md §3.3.1).

7.4 Caller identity hint (informative)

Some Mentionable senders carry their own identity (e.g. another Mentionable agent calling REST). v0.1 does not normatively specify how a caller asserts their address; agents that want this information SHOULD accept the optional header:

X-Mentionable-From: @<local>@<host>

For authenticated identity, use the §7.2 authentication channels or verified forwarded evidence (§7.3). Agents MUST treat X-Mentionable-From as a display/threading hint, not as an authenticated claim — they MUST NOT use it for authorization.


8. Security considerations

8.1 Channel-layer authentication

REST endpoints MUST be served over HTTPS. Plain HTTP is rejected (the WebFinger record itself is HTTPS-only per webfinger.md, so an HTTP REST endpoint cannot satisfy the canonical-host binding required for policy-part-v0.1.md §3.2).

The TLS certificate’s host MUST match the WebFinger-bound canonical host. Agents using a CDN or shared host MUST ensure the cert covers the exact host the WebFinger record points at.

8.2 Untrusted input in user parts

Every user part is caller-controlled. Agents MUST treat each one as untrusted on every layer:

The spec normatively forbids implicit trust on the HTML rendering path; defenses against prompt injection at the LLM layer are not otherwise prescribed.

8.2.1 GET idempotency and side effects

Per RFC 9110 §9.2.1 (safe methods), GET MUST NOT have user-visible side effects. Agents MUST NOT trigger state-changing actions (payment, account modification, external API calls with non-idempotent semantics) directly from a GET-with-user request. State-changing actions MUST require both of the following:

  1. The GET response carries a consent_required or payment_required PolicyPart with a url pointing at the agent’s own confirmation page, AND
  2. The actual side effect happens only after the user completes the flow at that URL.

This rule is the wire-level CSRF defense for REST. An attacker who plants <a href="https://bank-bot.example/~teller/?user=transfer%20$1000"> in a public page cannot make the bot transfer money: the most the bot can do is reply with a payment_required page. The consent-or-payment hop carries the actual confirmation.

Agents MAY use POST (§3.2) for actions whose semantics are non-idempotent at the protocol level (e.g. submitting a form the agent itself owns). POST endpoints SHOULD additionally verify Origin and Sec-Fetch-Site headers per browser CSRF hardening conventions.
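A possible shape for the Origin and Sec-Fetch-Site check on a POST endpoint the agent itself owns. The function name and the policy details are assumptions: in particular, requests that carry neither header are treated as non-browser clients and allowed, and only the same-origin and none values of Sec-Fetch-Site are accepted. An agent that opts in to cross-origin browser access (§8.6) would need a looser policy.

```python
from typing import Mapping


def post_allowed(headers: Mapping[str, str], own_origin: str) -> bool:
    """Return True if a POST passes a same-origin browser check."""
    # Header names are case-insensitive; normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    site = lowered.get("sec-fetch-site")
    if site is not None and site not in ("same-origin", "none"):
        return False  # browser says the request came from another site

    origin = lowered.get("origin")
    if origin is not None and origin != own_origin:
        return False  # Origin present but not the agent's own

    return True  # same-origin browser request, or a non-browser client
```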

8.3 CSRF

Because REST endpoints respond to GET, an attacker can cause a victim’s browser to issue a request via an <img> or <a> tag. v0.1 partially mitigates by:

Agents that maintain per-session state MUST treat session tokens as cookie-equivalent. The absence of CSRF defenses on plain GET is mitigated by sessions being agent-private and by their not being authorization credentials (§6).

8.4 Rate limiting

Agents MUST rate-limit by source IP and by session (when present). The wire signal for a rate-limit refusal is 429 Too Many Requests per policy-part-v0.1.md §4.5, with Retry-After.
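A small sketch of the dual-key limit. The fixed-window algorithm, the class and function names, and the limits are illustrative assumptions; the spec only requires limiting by source IP and by session and signalling refusals with 429 and Retry-After.

```python
import math
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each `window_seconds` window."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit, self.window, self.clock = limit, window_seconds, clock
        self.counters = {}  # key -> (window_start, request_count)

    def check(self, key):
        """Return None if allowed, else the seconds until the window resets."""
        now = self.clock()
        start, count = self.counters.get(key, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # window elapsed: start a fresh one
        if count >= self.limit:
            return math.ceil(start + self.window - now)
        self.counters[key] = (start, count + 1)
        return None


def rate_limit_response(limiter, source_ip, session=None):
    """Return (429, headers) if either key is over its limit, else None."""
    keys = [("ip", source_ip)]
    if session:
        keys.append(("session", session))
    for key in keys:
        retry_after = limiter.check(key)
        if retry_after is not None:
            return 429, {"Retry-After": str(retry_after)}
    return None
```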

8.5 Caching and search-engine indexing

Cache-Control defaults to private, max-age=0 (§5.2) for a reason: agent responses are usually per-user and short-lived. Agents that explicitly set longer caching MUST NOT include user-personal content in the cached response.

Indexing (MUST). Agent responses are reachable by GET, which makes them crawlable by default. Without an explicit defense, personal answers and the user entries themselves can land in public search caches. Every response (success or refusal) MUST include both:

v0.1 has no per-agent indexable opt-in: every response carries noindex unconditionally. (An FAQ-style agent that genuinely wants its responses crawled is a candidate for a mentionable.indexable: true opt-in in v0.2; until that lands, agents that want indexable content publish it through a separate static surface.)

8.6 CORS

Agents that want to be reachable from browser-side JavaScript MUST set:

Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, OPTIONS
Access-Control-Allow-Headers: Content-Type, Accept, Accept-Language, Authorization, Signature, Signature-Input, Mentionable-Identity-Evidence, Mentionable-Identity, X-Mentionable-Identity, X-Mentionable-From
Access-Control-Expose-Headers: X-Mentionable-Agent, X-Mentionable-Session, Content-Language

CORS is opt-in per agent and off by default. A server-to-server fetch (such as an LLM tool) works either way; only browser-side JavaScript needs the headers.


9. Examples

9.1 Browser visit (single-turn GET)

$ curl -i 'https://firemanager.info/~lean/?user=4%25%20rule'
HTTP/2 200
content-type: text/html; charset=utf-8
content-language: en
x-mentionable-agent: @lean@firemanager.info
cache-control: private, max-age=0

<!doctype html>
<html lang="en">
  <head>
    <title>@lean@firemanager.info — Mentionable</title>
    <link rel="alternate" type="text/markdown" href="https://firemanager.info/~lean/?user=4%25%20rule">
    <meta name="mentionable:agent" content="@lean@firemanager.info">
  </head>
  <body>
    <main class="mentionable-response">
      <article>The 4% rule is …</article>
    </main>
  </body>
</html>

9.2 LLM client via WebFetch

$ curl -i 'https://firemanager.info/~lean/?user=4%25%20rule' -H 'Accept: text/markdown'
HTTP/2 200
content-type: text/markdown; charset=utf-8
content-language: en
x-mentionable-agent: @lean@firemanager.info

The 4% rule is a guideline for retirement spending …
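The same request can be built programmatically. This sketch uses only Python's standard library and constructs the request without sending it; the function name is illustrative. Passing `quote_via=quote` makes the space encode as %20, matching the curl examples, rather than the `+` that `urlencode` produces by default.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_mention_request(endpoint: str, user_text: str,
                          accept: str = "text/markdown") -> Request:
    """Build a single-turn GET with the user text percent-encoded."""
    query = urlencode({"user": user_text}, quote_via=quote)
    return Request(endpoint + "?" + query, headers={"Accept": accept}, method="GET")


req = build_mention_request("https://firemanager.info/~lean/", "4% rule")
print(req.full_url)  # https://firemanager.info/~lean/?user=4%25%20rule
```

Sending it is a matter of passing `req` to `urllib.request.urlopen` or translating it to the HTTP client of choice.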

9.3 Multi-turn POST

$ curl -i 'https://firemanager.info/~lean/' \
    -H 'Accept: text/markdown' \
    -F 'user=earlier I asked about the 4% rule' \
    -F 'assistant=The 4% rule is …' \
    -F 'user=what about a 3.5% rule for early retirement?'
HTTP/2 200
content-type: text/markdown; charset=utf-8

A 3.5% rule trades current spending for survivability over a longer horizon …

9.4 Multi-turn POST with an attachment

$ curl -i 'https://firemanager.info/~lean/' \
    -F 'user=look at this chart' \
    -F 'user=@chart.png;type=image/png'

The two user parts together form one user turn; the second carries the binary image bytes via multipart/form-data’s native file-upload encoding.

9.5 Streaming

$ curl -i 'https://firemanager.info/~lean/?user=...' -H 'Accept: text/event-stream'
HTTP/2 200
content-type: text/event-stream
cache-control: no-cache
x-mentionable-agent: @lean@firemanager.info

data: The 4% rule is

data:  a guideline for

data:  retirement spending.

event: end
data: {}
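A client can reassemble the streamed reply with a few lines of parsing. The sketch below follows the standard server-sent events framing (one optional space after the colon is stripped, a blank line dispatches the event) and assumes, as the example above suggests, that an `event: end` frame marks the end of the reply. Function names are illustrative.

```python
def parse_sse(lines):
    """Yield (event_type, data) pairs from an iterable of text lines."""
    event, data = "message", []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            if data:  # a blank line dispatches the buffered event
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment / keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # strip exactly one leading space
        if field == "data":
            data.append(value)
        elif field == "event":
            event = value


def collect_reply(lines) -> str:
    """Concatenate data chunks until the assumed `end` event arrives."""
    chunks = []
    for event, data in parse_sse(lines):
        if event == "end":
            break
        chunks.append(data)
    return "".join(chunks)
```

Because only one space after `data:` is stripped, the second leading space in `data:  a guideline for` survives and the chunks join into a normally spaced sentence.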

9.6 Payment required (HTML)

$ curl -i 'https://firemanager.info/~lean/?user=run%20backtest'
HTTP/2 402
content-type: text/html; charset=utf-8
content-language: en
x-mentionable-agent: @lean@firemanager.info
x-mentionable-policy: eyJ2IjoidjAuMSIsInBhcnQiOnsia2luZCI6InBheW1lbnRfcmVxdWlyZWQi…

<!doctype html>
<html lang="en">
  <body>
    <main class="mentionable-response">
      <article>
        <h1>Payment required</h1>
        <p>This action requires payment.</p>
        <a href="https://firemanager.info/pay?...">Pay $5 USDC on Base</a>
      </article>
    </main>
  </body>
</html>

10. Future work


11. References

Standards

Mentionable