Normalized Message — Mentionable v0.1
Status: v0.1 (implemented)
This document defines the single in-memory shape that every Mentionable adapter produces on the way in and consumes on the way out. It is the contract between the adapter layer (ActivityPub / A2A / Email) and the agent runtime.
If this shape is wrong, the project is wrong. Every other document in this spec is downstream of getting this right.
1. Goals
- Lossless for common cases. A reply to a text message with one attachment and a thread reference must round-trip through the shape without losing anything that a sender would expect the recipient to see.
- Honest about what does not translate. Protocol-specific metadata that the agent runtime does not need (AP signature details, A2A task metadata, email routing headers) lives in raw. That field is a feature, not a failure.
- Small enough to hold in one head. If a new contributor cannot read this document and correctly implement a stub adapter in a day, the shape is too big.
2. Non-goals
- Perfect fidelity. An agent that needs the exact AP Create envelope reads raw. The normalized shape is not a superset of all three protocols.
- Protocol translation. The shape describes what arrived. Re-sending to a different protocol requires the outbound adapter to rebuild the native message from the normalized response; it is not a byte-for-byte transform.
- State machine. Sessions, threads, and tasks-in-flight are runtime concerns, not shape concerns. The shape carries identifiers; the runtime owns the state.
3. The shape
type NormalizedMessage = {
id: string
thread_id: string
in_reply_to?: string
sender: Sender
recipient: string
parts: Part[]
history?: HistoricalMessage[] // see §5.4
recipient_capabilities: RecipientCapabilities // see §3.3
received_via: 'activitypub' | 'a2a' | 'email'
received_at: string // ISO 8601, UTC
raw: unknown
policy_resolution?: PolicyResolution // see §3.4
received_trace?: NormalizedResponse // see §10
}
type HistoricalMessage = {
id?: string // adapter-known stable id when available
role: 'user' | 'assistant'
sender: Sender
parts: Part[]
timestamp: string // ISO 8601, UTC
}
type Sender = {
address: string // canonical: @user@domain or @agent@domain
display_name?: string
profile?: SenderProfile
auth_method:
| 'ap-http-signature'
| 'ap-object-integrity-proof'
| 'a2a-jwt'
| 'a2a-oauth'
| 'email-dkim'
| 'email-dmarc'
| 'none'
verified: boolean
key_id?: string // AP actor key URL, JWT kid, DKIM selector, etc.
identities?: IdentityEvidence[] // see identity-evidence-v0.1.md
}
type SenderProfile = {
display_name?: string
username?: string
url?: string // human-viewable account/profile/actor URL or deep link
avatar?: {
url: string
mime_type?: string
width?: number
height?: number
}
locale?: string // BCP 47
timezone?: string // IANA timezone
provider?: string // e.g. slack, activitypub, oauth
provider_subject?: string // e.g. slack:T123/U456
fetched_at?: string // ISO 8601, UTC
extensions?: Record<string, unknown> // namespaced, whitelisted provider facts
}
type Part =
| { kind: 'text'; mime: 'text/plain' | 'text/markdown' | 'text/html'; content: string }
| { kind: 'file'; mime: string; name?: string; bytes_ref: BytesRef; size_bytes?: number }
| { kind: 'link'; url: string; title?: string; description?: string }
| { kind: 'artifact'; mime: string; name?: string; bytes_ref: BytesRef; artifact_type?: string }
| {
kind: 'tool_call'
id: string // unique within the response (typically the LLM-assigned tool_use id)
name: string // tool name as exposed to the LLM
args: unknown // input arguments (JSON-serializable)
result?: unknown // output payload when the call succeeded (JSON-serializable)
error?: { message: string } // present when the call failed; mutually exclusive with `result`
duration_ms?: number // observed call duration
started_at?: string // ISO 8601 timestamp when the call started
}
type BytesRef =
| { kind: 'inline'; data_base64: string }
| { kind: 'url'; url: string; expires_at?: string }
| { kind: 'content_addressed'; algo: 'sha256'; digest: string; url?: string }
3.1 Field contracts
id — Globally unique within the receiving node. Adapters MUST generate stable ids (UUIDv7 recommended) so that retries and deduplication work. This is the Mentionable id, not the protocol id; the protocol id lives in raw.
thread_id — A stable string that groups messages belonging to one conversation. Derivation is protocol-specific (see §5). Two messages with the same thread_id SHOULD be treated by the runtime as part of the same conversation regardless of arrival protocol. Cross-protocol threading is out of scope for v0.1: if the same logical conversation arrives via AP and email, the runtime sees two thread_ids.
in_reply_to — The id of the parent message if this one is a direct reply, or the protocol-native id (AP activity IRI, email Message-ID, A2A message id) when the parent is not in local history. Adapters SHOULD prefer the Mentionable id when known. If only the protocol-native id is available, it is carried as-is.
sender.address — Always the @user@domain canonical form. Email addresses become @local@domain. AP actor IRIs are resolved back to their WebFinger acct: form. If this cannot be done confidently, auth_method is none and verified is false.
sender.verified — True only when the adapter has cryptographically verified the claim to the legacy sender.address field. AP requires HTTP Signatures or Object Integrity Proofs to pass. Email requires a DKIM signature from the sender’s domain with a passing body hash. A2A requires a validated JWT or OAuth token whose subject matches sender.address. Anything weaker — SPF pass, unsigned AP object — is verified: false.
sender.identities — Optional extensible identity evidence array. This is the preferred surface for platform-native and delegated identity: Slack workspace members, OAuth subjects, SIWE wallets, agent self-signatures, and forwarded upstream principals. See identity-evidence-v0.1.md. sender.auth_method / sender.verified remain for compatibility and MUST NOT be treated as the only identity signal once sender.identities is present.
sender.profile — Optional presentation/context facts about the sender: display label, username, profile URL, avatar URL, locale, timezone, and provider-specific whitelisted facts. sender.profile is for LLM attribution and UI rendering only. It MUST NOT be used for authorization, account linking, payment, delegation, or rate-limit keying unless a separate policy explicitly links it to verified sender.identities. Transport modules and Connectors SHOULD prefer the common fields above and place provider-specific facts under extensions.<provider>, not by dumping raw provider objects. When trusted IdentityEvidence.claims.profile is present, a receiver MAY project it into sender.profile.
recipient — The single agent address this delivery is for, in @agent@domain form. If the underlying message was addressed to multiple recipients, the adapter produces one NormalizedMessage per recipient served by this node. Cross-recipient fanout is the transport module’s job, not the runtime’s.
parts — Ordered. Order is semantic: it is the order an agent SHOULD render or process them. Adapters MUST preserve source order. Empty parts is allowed (e.g. an email with only a subject; the subject becomes a single text part).
history — Optional. Prior turns of the same conversation, chronologically sorted oldest-first, conceptually append-only across the lifetime of the thread. Adapters populate this per §5.4. Each entry is a structurally-thinner sibling of NormalizedMessage — see the HistoricalMessage shape above. The role discriminator is 'assistant' when the prior turn was authored by the recipient agent (i.e. sender.address === recipient) and 'user' otherwise. Agents MAY ignore history (single-turn behavior preserved); when an agent does consume it, the transport module’s HistoryPolicy has already trimmed the array to the configured budget — see agent-interface.md §10 for the responsibility split and the HistoryPolicy interface.
received_via — The inbound protocol. Agents SHOULD NOT branch on this field for business logic; it is primarily for observability and for the outbound adapter to choose a default reply channel.
received_at — When the adapter finished parsing and validating. Not the protocol-level timestamp (Date: header, AP published, A2A timestamp), which lives in raw.
raw — The parsed native message. Its shape depends on received_via. Agents that do not need protocol specifics MUST ignore this field; the runtime is not allowed to require it. See §6.
recipient_capabilities — A structured description of what the inbound platform offers an agent that wants to bring siblings into the conversation. Required, non-optional. Adapters MUST populate this from a static per-protocol value (see §3.3 and §5). Agents (and prompt-building code wrapping LLMs) read this field to decide whether to “type @handle and trust the platform” vs. “open a direct A2A call to the sibling and synthesize one cohesive reply.” The shape is small and additive: today it carries only mention_relay; future capabilities (attachments, reactions, threading model) extend the same wrapper.
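The sender.address contract earlier in this section can be sketched as a small canonicalizer. This is a minimal illustration, not the adapter code: the function name toCanonicalAddress is hypothetical, lowercasing the domain is an assumption this document does not state, and the WebFinger lookup that turns an AP actor IRI into an acct: URI is assumed to have happened already.

```typescript
// Hypothetical helper; not part of @mentionable/core.
// Returns the canonical @local@domain form, or null when the input cannot
// be canonicalized confidently. In that case the adapter reports
// auth_method 'none' and verified false, per the sender.address contract.
function toCanonicalAddress(input: string): string | null {
  // Accept "acct:user@host" (WebFinger) or a bare "user@host" mailbox.
  const bare = input.startsWith('acct:') ? input.slice('acct:'.length) : input
  const at = bare.lastIndexOf('@')
  if (at <= 0 || at === bare.length - 1) return null
  const local = bare.slice(0, at)
  // Assumption: domains compare case-insensitively, so lowercase them.
  const domain = bare.slice(at + 1).toLowerCase()
  // Reject whitespace or a stray '@' instead of guessing what was meant.
  if (/[\s@]/.test(local) || /[\s@]/.test(domain)) return null
  return `@${local}@${domain}`
}
```

Returning null rather than a best-effort string keeps the "cannot be done confidently" branch explicit for the caller.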
3.2 Parts — design notes
A flat ordered list beats a tree. Nesting (AP Collection, email multipart/related) tempts adapters to over-model structure that the agent will flatten anyway.
Text parts keep their mime type so the agent knows whether it is safe to treat as plain, markdown, or HTML. Email multipart alternatives collapse to a single text part by picking the best alternative (text/plain or text/markdown preferred; text/html only when plaintext is absent or empty). The unchosen alternatives are not carried in parts; they survive in raw.
File vs artifact. file is “the sender attached this.” artifact is “the sender’s agent produced this as output.” Inbound adapters produce file for email attachments and AP attachment, and artifact for A2A artifact parts. The distinction matters for outbound rendering: A2A artifacts have semantics (versioning, task association) that emails and Notes do not.
tool_call is “the agent’s LLM invoked a tool while producing this response.” A single part carries the full call lifecycle (arguments, then result-or-error) so that the part list does not need separate “start” and “end” entries; instead, streaming responses emit the same part twice — first with result/error absent (status 'partial'), then again with the call resolved (status 'partial' or, if it is the final frame, 'ok'). Adapters identify the lifecycle by id, so re-emitted parts MUST keep the same id. error and result are mutually exclusive: once a call completes, the part carries one but not both. args and result MUST be JSON-serializable so any adapter can persist them; agents that need to ship binary tool output SHOULD pair the tool_call part with a sibling file or artifact part. The part is intended for tool calls the agent wants the recipient to see (web search hits, computed analyses, lookups). Internal control-flow tool calls the agent would prefer to keep private MUST NOT be emitted as tool_call parts.
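The re-emission rule above (same id, later frame wins) can be sketched as a merge step on the receiving side. This is an illustration under assumptions: mergeToolCalls is a hypothetical name, the part type is a trimmed local copy of the §3 shape, and keeping the position of the first emission is a choice made here, not a documented requirement.

```typescript
// Trimmed local copy of the tool_call part from §3; other part kinds omitted.
type ToolCallPart = {
  kind: 'tool_call'
  id: string
  name: string
  args: unknown
  result?: unknown
  error?: { message: string }
}

// Hypothetical helper: collapse re-emitted tool_call parts so the latest
// emission for each id wins while the first emission's position is kept.
function mergeToolCalls(parts: ToolCallPart[]): ToolCallPart[] {
  const byId = new Map<string, ToolCallPart>()
  for (const part of parts) {
    if (part.result !== undefined && part.error !== undefined) {
      throw new Error(`tool_call ${part.id}: result and error are mutually exclusive`)
    }
    // Map preserves first-insertion order; set() on an existing key only
    // replaces the value, so the resolved frame takes the original slot.
    byId.set(part.id, part)
  }
  return Array.from(byId.values())
}
```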
The default outbound serialization for adapters that cannot render structured tool calls (Email, plaintext-only AP) is a one-line text block:
🔧 <name>(<args summary>) → <result summary> [success]
🔧 <name>(<args summary>) → ❌ <error.message> [failure]
🔧 <name>(<args summary>) → … [in-flight; partial frame only]
The <args summary> and <result summary> are JSON, truncated to a single line and an adapter-defined byte budget (default 200 bytes per side, replaced by … when over). @mentionable/core ships a helper, serializeToolCallToText, that produces this exact line; adapters MUST use it (or an explicitly-noted equivalent) so the on-the-wire shape stays consistent. Adapters that have a richer surface MAY render the part natively and skip the text fallback. A2A uses the a2a-tool-events-v0.1 extension: tool_call parts are carried as DataParts with AI SDK-style toolCallId, toolName, input, and output fields.
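To make the fallback line format concrete, here is a rough sketch of how such a line could be produced. It is not the serializeToolCallToText helper from @mentionable/core, whose exact behaviour this document does not spell out. Two assumptions are made: the bracketed labels ([success], [failure], [in-flight; ...]) in the template are annotations rather than output, and an over-budget summary is cut at the budget and suffixed with … rather than replaced wholesale. The names toolCallLine, summarize, and utf8Len are hypothetical.

```typescript
// Illustrative only; adapters are expected to use serializeToolCallToText.
type ToolCall = { name: string; args: unknown; result?: unknown; error?: { message: string } }

// Number of UTF-8 bytes needed for one Unicode code point.
function utf8Len(cp: number): number {
  return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4
}

// JSON.stringify without an indent argument already yields a single line.
// The output is cut on a code-point boundary once the byte budget is exceeded.
function summarize(value: unknown, budgetBytes = 200): string {
  const json: string = JSON.stringify(value) ?? 'null'
  let used = 0
  let out = ''
  for (let i = 0; i < json.length; ) {
    const cp = json.codePointAt(i) ?? 0
    const width = cp > 0xffff ? 2 : 1 // surrogate pairs occupy two UTF-16 units
    used += utf8Len(cp)
    if (used > budgetBytes) return out + '…'
    out += json.slice(i, i + width)
    i += width
  }
  return out
}

function toolCallLine(call: ToolCall, budgetBytes = 200): string {
  const head = `🔧 ${call.name}(${summarize(call.args, budgetBytes)}) → `
  if (call.error !== undefined) return head + `❌ ${call.error.message}`
  if (call.result !== undefined) return head + summarize(call.result, budgetBytes)
  return head + '…' // in-flight: neither result nor error yet
}
```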
BytesRef has three variants because each protocol prefers a different one. A2A messages often ship bytes inline; AP typically links; email is inline but large attachments are frequently linked. Adapters SHOULD prefer content_addressed when they have the option, because it lets the runtime deduplicate and store safely. inline is acceptable for small payloads (under 64 KiB; runtime MAY reject larger). url is acceptable when the URL is expected to live at least as long as the message (set expires_at when known).
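A runtime-side acceptance check for the three variants might look like the sketch below. The function names, the treatment of a url without expires_at as acceptable, and the lowercase-hex digest format are assumptions made for illustration; only the 64 KiB inline threshold comes from the paragraph above.

```typescript
type BytesRef =
  | { kind: 'inline'; data_base64: string }
  | { kind: 'url'; url: string; expires_at?: string }
  | { kind: 'content_addressed'; algo: 'sha256'; digest: string; url?: string }

const INLINE_LIMIT_BYTES = 64 * 1024

// Decoded size of a standard, padded base64 string, computed without decoding.
function base64DecodedSize(b64: string): number {
  const padding = b64.endsWith('==') ? 2 : b64.endsWith('=') ? 1 : 0
  return (b64.length / 4) * 3 - padding
}

// Hypothetical check mirroring the guidance above; not a documented API.
function isAcceptableBytesRef(ref: BytesRef, now: Date = new Date()): boolean {
  switch (ref.kind) {
    case 'inline':
      // "under 64 KiB": strictly smaller than the limit.
      return base64DecodedSize(ref.data_base64) < INLINE_LIMIT_BYTES
    case 'url':
      // Assumption: an unknown lifetime (no expires_at) is accepted here.
      return ref.expires_at === undefined || new Date(ref.expires_at).getTime() > now.getTime()
    case 'content_addressed':
      // Assumption: the digest is lowercase hex, 64 characters for SHA-256.
      return /^[0-9a-f]{64}$/.test(ref.digest)
  }
}
```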
When a Connector bridges a platform whose file URLs require Connector-held
credentials, bytes_ref.url MUST be a Connector-hosted capability URL rather
than the provider’s private URL. The capability URL should be short-lived,
audience-scoped, and budgeted. Provider OAuth tokens, bot tokens, and private
download URLs do not belong in parts, raw copies forwarded to agents, or
identity claims. A2A carries these descriptors as FilePart.file.uri; REST
callers that need typed current-turn parts use the parts JSON sidecar from
transport-rest-v0.1.md.
3.3 Recipient capabilities
When an LLM-backed agent receives a question and decides “this needs @gamebuilder’s perspective too,” the right next move depends entirely on the platform that just delivered the message:
- Slack / Discord: typing @gamebuilder in the reply body is enough. The platform reads it and notifies the addressed agent.
- Email: the body is opaque to the platform’s routing layer. To wake another recipient the agent MUST add them to To: or Cc:. A bare @gamebuilder in the body is decorative; the platform never sees it as a routing signal.
- ActivityPub: both are required, namely the actor IRI in the activity’s to/cc addressing collections AND a textual @actor@host mention in the body (Mastodon-compatible).
- A2A: single-request / single-response. The receiving agent has no platform-side relay; it must dispatch siblings itself (typically via outbound A2A) and synthesize one cohesive reply.
- SMS / pager-style channels: same as A2A. No platform relay.
Without this signal, an LLM agent reused across protocols will produce replies that work on Slack and silently fail on email. recipient_capabilities exposes the difference structurally so the agent (or the prompt scaffolding around it) can adapt.
type RecipientCapabilities = {
mention_relay: MentionRelayCapability
agent_chain?: AgentChain // see §3.3.5
}
type MentionRelayCapability =
| { kind: 'inline' }
// Body `@handle` is sufficient. The platform reads the reply text and
// routes mentions itself. Slack, Discord.
| { kind: 'recipient-field'; fields: ('to' | 'cc' | 'bcc')[] }
// The platform routes by recipient envelope, not body. Adding the
// sibling's address to one of `fields` is what wakes them. Body
// `@`-text is decorative (or absent). Email is `['to', 'cc']`; a
// platform that allows hidden recipients includes `'bcc'`.
| {
kind: 'addressing'
envelope_fields: ('to' | 'cc')[]
also_inline: true
}
// BOTH required. The actor IRI in the addressing collection AND a
// body `@actor@host` mention. ActivityPub on Mastodon-compatible
// servers — dropping either one breaks delivery on at least one
// implementation.
| { kind: 'none' }
// No platform relay. The agent must dispatch siblings itself
// (outbound A2A, etc.) and weave the answers into a single reply.
// A2A and SMS-style channels.
3.3.1 Per-protocol values
Adapters supply the value statically per inbound — there is no per-message variation in v0.1 for the AP and email transports. The values are:
| received_via | recipient_capabilities.mention_relay |
|---|---|
| 'activitypub' | { kind: 'addressing', envelope_fields: ['to', 'cc'], also_inline: true } |
| 'email' | { kind: 'recipient-field', fields: ['to', 'cc'] } |
| 'a2a' | { kind: 'none' } by default, but A2A inbounds MAY override via the mentionable.recipient_capabilities metadata extension (see §3.3.4). |
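The static per-protocol values in the table above, and what each variant asks of a reply, can be sketched as follows. The names defaultMentionRelay and relayRequirements are hypothetical and are not the helpers exported by @mentionable/core; the union is copied from §3.3 so the block stands alone.

```typescript
type MentionRelayCapability =
  | { kind: 'inline' }
  | { kind: 'recipient-field'; fields: ('to' | 'cc' | 'bcc')[] }
  | { kind: 'addressing'; envelope_fields: ('to' | 'cc')[]; also_inline: true }
  | { kind: 'none' }

type ReceivedVia = 'activitypub' | 'a2a' | 'email'

// Static default per inbound protocol, as listed in the table above.
function defaultMentionRelay(via: ReceivedVia): MentionRelayCapability {
  switch (via) {
    case 'activitypub':
      return { kind: 'addressing', envelope_fields: ['to', 'cc'], also_inline: true }
    case 'email':
      return { kind: 'recipient-field', fields: ['to', 'cc'] }
    case 'a2a':
      return { kind: 'none' }
  }
}

// What a reply needs in order to wake a sibling: a body mention, envelope
// fields, both, or neither (the agent dispatches the sibling itself).
function relayRequirements(cap: MentionRelayCapability): { body: boolean; envelope: string[] } {
  switch (cap.kind) {
    case 'inline':
      return { body: true, envelope: [] }
    case 'recipient-field':
      return { body: false, envelope: cap.fields }
    case 'addressing':
      return { body: true, envelope: cap.envelope_fields }
    case 'none':
      return { body: false, envelope: [] }
  }
}
```

Because both switches are exhaustive over the union, adding a fifth variant would surface as a compile error in every consumer written this way.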
Slack-reference is not a published adapter — it’s a Slack client that produces NormalizedMessage-shaped inputs for agent-side reuse. Where it constructs that shape, it MUST emit { kind: 'inline' }. The same rule applies to any future client / bridge: if it converts an inbound from a platform that routes body mentions, it emits 'inline'; otherwise it picks the matching kind from the union.
3.3.1.1 The A2A “client on behalf of a platform” case
When a Slack-style client (slack-connector, future Discord bridge, etc.) dispatches a user message to an A2A agent, the inbound transport on the receiving side is 'a2a', but the caller’s platform is what the eventual reply will be displayed on. That’s the platform whose mention-relay mechanism the agent’s reply needs to match — not A2A’s.
Example: a Slack user types @lean what does @gamebuilder think?. slack-connector dispatches the message to lean@firemanager.info over A2A. If lean’s LLM gets { kind: 'none' } (the conservative A2A default), it will laboriously dispatch gamebuilder itself and synthesize one reply. But Slack would have routed @gamebuilder automatically if the agent had just typed it — exactly the demo Mentionable wants.
The mentionable.recipient_capabilities A2A message-metadata extension fixes this (§3.3.4): the caller forwards its own platform’s capabilities, the inbound adapter lifts them onto NormalizedMessage, and the agent sees { kind: 'inline' } even though the wire transport is A2A.
3.3.2 Agent-side use
@mentionable/core exposes renderMentionRelayCapability(capability): string so every agent / prompt-builder gets the same human-readable description. LLM agents typically inject the rendered string into their system prompt:
The platform that delivered this message routes mentions to other agents as follows:
- mechanism: recipient-field
- to wake another agent, add them to To or Cc of your reply
- body @-mentions are NOT routed by this platform
If you can’t add recipients, dispatch the sibling directly via A2A and weave the answer into your reply.
Agents that don’t need mention-relay (e.g. single-shot Q&A agents) MAY ignore the field entirely. The field is required so adapters can’t accidentally omit it for an agent that does care.
3.3.3 Why a structured union, not a boolean
A boolean (platform_relays_mentions: true | false) was the first impulse and it’s wrong. Email is the counter-example: the platform does relay mentions to siblings — provided you put them in Cc:. The agent needs the structural signal to know how to relay, not just whether it can. The union encodes that without forcing every consumer to special-case each protocol.
3.3.4 A2A metadata extension: caller-forwarded capabilities
The A2A wire format carries arbitrary Message.metadata. Mentionable defines the namespaced key
message.metadata.mentionable.recipient_capabilities : RecipientCapabilities
so a client that dispatches A2A on behalf of a richer platform (slack-connector forwarding for a Slack workspace, a Discord bridge, etc.) can tell the receiving agent what the displayed-on platform’s relay mechanism actually is.
transport-a2a reads message.metadata.mentionable.recipient_capabilities on every inbound. When present and structurally valid, it replaces the default { kind: 'none' }. When absent or malformed, the default holds — A2A-native callers (no platform wrapper) get the conservative behaviour.
Validation rules at the inbound boundary:
- The metadata path is mentionable.recipient_capabilities (object, with a mention_relay field).
- mention_relay.kind must be one of the four union variants ('inline', 'recipient-field', 'addressing', 'none').
- Per-variant required fields (fields for recipient-field, envelope_fields + also_inline for addressing) must be present and shaped correctly.
- A malformed extension is silently ignored: the inbound never throws because of it, and an A2A-native caller that doesn’t ship the metadata gets { kind: 'none' } exactly as before.
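The validation rules above could be implemented roughly as below. This is a sketch, not the transport-a2a source: parseMentionRelay, isRecord, and isFieldList are hypothetical names, and treating an empty field list as malformed is an assumption about what "shaped correctly" means.

```typescript
type MentionRelayCapability =
  | { kind: 'inline' }
  | { kind: 'recipient-field'; fields: ('to' | 'cc' | 'bcc')[] }
  | { kind: 'addressing'; envelope_fields: ('to' | 'cc')[]; also_inline: true }
  | { kind: 'none' }

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v)
}

// True when v is a non-empty array drawn only from `allowed`.
// Assumption: an empty list counts as malformed.
function isFieldList<T extends string>(v: unknown, allowed: readonly T[]): v is T[] {
  return (
    Array.isArray(v) &&
    v.length > 0 &&
    v.every((f) => typeof f === 'string' && (allowed as readonly string[]).indexOf(f) !== -1)
  )
}

// Never throws: anything absent or malformed falls back to { kind: 'none' }.
function parseMentionRelay(metadata: unknown): MentionRelayCapability {
  const fallback: MentionRelayCapability = { kind: 'none' }
  if (!isRecord(metadata)) return fallback
  const ns = metadata['mentionable']
  if (!isRecord(ns)) return fallback
  const caps = ns['recipient_capabilities']
  if (!isRecord(caps)) return fallback
  const relay = caps['mention_relay']
  if (!isRecord(relay)) return fallback
  switch (relay['kind']) {
    case 'inline':
      return { kind: 'inline' }
    case 'none':
      return { kind: 'none' }
    case 'recipient-field': {
      const fields = relay['fields']
      return isFieldList(fields, ['to', 'cc', 'bcc'] as const) ? { kind: 'recipient-field', fields } : fallback
    }
    case 'addressing': {
      const fields = relay['envelope_fields']
      return isFieldList(fields, ['to', 'cc'] as const) && relay['also_inline'] === true
        ? { kind: 'addressing', envelope_fields: fields, also_inline: true }
        : fallback
    }
    default:
      return fallback
  }
}
```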
The receiving server trusts the caller’s claim. v0.1 does NOT cryptographically verify the metadata: a hostile A2A client could lie about its capabilities to confuse the agent. This is acceptable because (a) all of recipient_capabilities’s downstream effects are advisory (an LLM choice; the runtime never branches on it), and (b) the caller already controls the inbound message body. v0.2 may add signing or registry-based verification if abuse appears in practice.
Outbound clients SHOULD include the extension when they are bridging from a platform with a different relay mechanism. A client that is itself the user — running on the same platform that delivered the original message — has nothing to forward; the field is omitted.
3.3.5 A2A metadata extension: agent chain position
When a platform client chains multiple agents (A → B → C in the same Slack thread), each relay turn should tell the receiving agent where it sits in the chain so an LLM doesn’t keep inviting more agents past the configured cap.
Mentionable defines the namespaced key
message.metadata.mentionable.recipient_capabilities.agent_chain : AgentChain
carried inside the same recipient_capabilities extension as §3.3.4:
type AgentChain = {
hop: number // 1-based position of this agent in the chain
max_hops: number // operator ceiling (e.g. 3 = user → A → B → C max)
is_final: boolean // true when hop === max_hops
}
transport-a2a lifts agent_chain from the metadata object alongside mention_relay. When absent or malformed the field is omitted on RecipientCapabilities — a chain-unaware sender or an A2A-native caller produces the same conservative behaviour as before.
@mentionable/core exposes renderAgentChain(chain): string | null (returns null when the field is absent) and renderRecipientCapabilities(capabilities): string which combines both fields into a single system-prompt block. Prompt builders SHOULD call renderRecipientCapabilities rather than formatting the fields individually.
The is_final flag is the primary signal for LLMs:
- When is_final: false, the agent MAY invite another sibling if the task warrants it. Any body @mention the agent writes will be dispatched as the next hop.
- When is_final: true, the agent MUST synthesize a conclusion and MUST NOT invite further agents. The platform will not dispatch another hop even if the reply contains an @mention.
Validation rules mirror §3.3.4: hop and max_hops must be positive integers, 1 ≤ hop ≤ max_hops, is_final must be a boolean. Malformed values are silently ignored.
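A validator following these rules might look like the sketch below. The name parseAgentChain is hypothetical; it checks only what the paragraph above lists and deliberately does not cross-check is_final against hop === max_hops, since that consistency rule is not stated as a validation requirement.

```typescript
type AgentChain = { hop: number; max_hops: number; is_final: boolean }

// Returns the validated chain, or undefined when the value is absent or
// malformed so the caller can simply omit the field.
function parseAgentChain(value: unknown): AgentChain | undefined {
  if (typeof value !== 'object' || value === null) return undefined
  const { hop, max_hops, is_final } = value as Record<string, unknown>
  if (typeof hop !== 'number' || !Number.isInteger(hop)) return undefined
  if (typeof max_hops !== 'number' || !Number.isInteger(max_hops)) return undefined
  // 1 <= hop <= max_hops also guarantees that max_hops is positive.
  if (hop < 1 || hop > max_hops) return undefined
  if (typeof is_final !== 'boolean') return undefined
  return { hop, max_hops, is_final }
}
```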
3.4 policy_resolution
policy_resolution?: PolicyResolution
Present on a turn that carries a verified completion of a prior
payment_required or consent_required PolicyPart. Absent on all
other turns, including historical turns in history[].
This field is set by whichever component performs the final
verification — typically the agent’s own HTTP callback endpoint
(/api/x402/callback, /api/stripe/webhook, etc.) rather than
the transport adapter. See
policy-part-v0.1.md §5
for the full specification, trust model, and wire shape.
Key invariants:
- Agents MUST verify in_reply_to_state against their issuance store and atomically consume the token before acting on this signal. Presence of the field alone is not proof of payment.
- HistoricalMessage does not carry policy_resolution. Payment completion facts are carried as text context in parts (the agent framework injects a [Payment confirmed] text part when replaying the inner agent) and in the agent’s own database for audit purposes.
- One signal per turn. Sequential payments across multiple turns each carry their own policy_resolution on the respective resume turn.
The PolicyResolution type is exported from @mentionable/core.
| policy_resolution.kind | Populated by |
|---|---|
| 'payment_required' | Agent’s payment callback endpoint (x402, Stripe, PG, …) or A2A adapter for in-band x402 metadata |
| 'consent_required' | Agent’s consent callback endpoint or adapter consent cache hit |
4. The response shape
type NormalizedResponse = {
reply_to: string // NormalizedMessage.id being answered
parts: Part[]
status: 'ok' | 'partial' | 'error'
error?: { code: string; message: string; retriable: boolean }
streaming?: Streaming
push_back?: PushBackHint
}
type PushBackHint = {
channel?: 'activitypub' | 'a2a' | 'email' // default: received_via of the request
thread_ref?: string // override the auto-derived thread target
}
type Streaming = {
stream_id: string // groups frames that belong to one logical response
seq: number // 0-based, strictly increasing within stream_id
final: boolean // true on the last frame; MUST be true exactly once
}
- reply_to is required. A response is always about a specific incoming message.
- streaming is the knob for A2A’s intermediate updates and for AP/Email’s “ack now, result later” pattern. The same shape handles both: A2A serializes incremental frames into streamed task events; AP/Email send each non-final frame as its own reply.
  - stream_id is a stable opaque string generated by the agent (or the runtime on the agent’s behalf) to group frames; two frames with the same stream_id and the same reply_to belong to one logical response.
  - seq lets adapters deliver out-of-order frames in order or detect gaps.
  - final: true appears on exactly one frame per stream, and further frames MUST NOT be emitted after it.
  - An agent that produces exactly one response (non-streaming) omits streaming entirely.
- push_back lets the agent override the default reply channel. See agent-interface.md §4 for when this is appropriate.
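The streaming invariants can be illustrated with a small checker that an adapter might run over the buffered frames of one stream. The name orderFrames and the Frame type are hypothetical, and the sketch assumes seq values are contiguous from 0, which is one reading of "detect gaps" rather than something the shape itself guarantees.

```typescript
type Streaming = { stream_id: string; seq: number; final: boolean }
type Frame = { reply_to: string; streaming: Streaming }

type StreamCheck = { ok: true; ordered: Frame[] } | { ok: false; reason: string }

// Orders the frames of one stream by seq and checks the invariants:
// contiguous seq starting at 0 (assumption) and exactly one final frame,
// which must be the last one.
function orderFrames(frames: Frame[]): StreamCheck {
  if (frames.length === 0) return { ok: false, reason: 'no frames' }
  const ordered = frames.slice().sort((a, b) => a.streaming.seq - b.streaming.seq)
  const gapAt = ordered.findIndex((f, i) => f.streaming.seq !== i)
  if (gapAt !== -1) return { ok: false, reason: `gap or duplicate: expected seq ${gapAt}` }
  const finals = ordered.filter((f) => f.streaming.final)
  if (finals.length !== 1) return { ok: false, reason: `expected one final frame, saw ${finals.length}` }
  if (ordered.findIndex((f) => f.streaming.final) !== ordered.length - 1) {
    return { ok: false, reason: 'frames follow the final frame' }
  }
  return { ok: true, ordered }
}
```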
5. Protocol mapping
The table below is the contract. Every adapter must populate these fields exactly this way, or document a deviation.
| Field | ActivityPub (Create { Note }) | A2A (task message) | Email (MIME) |
|---|---|---|---|
| id | Adapter-minted UUIDv7 | Adapter-minted UUIDv7 | Adapter-minted UUIDv7 |
| thread_id | See §5.1.1 | A2A task_id | See §5.3.1 |
| in_reply_to | object.inReplyTo if within local history, else the IRI as-is | A2A parent message.id if present | In-Reply-To: message-id |
| sender.address | actor IRI → WebFinger lookup → @preferredUsername@host | JWT subject / OAuth sub, required to match @agent@domain form | From: mailbox → @local@domain |
| sender.display_name | AP actor name | Agent card name if subject is an agent, else token claim name | From: display-name |
| sender.profile | AP actor name/preferredUsername/url/icon when available | Verified token/evidence profile claims; Connector history sender profiles | From: display-name plus email provider/provider_subject; richer profile is provider-specific |
| sender.auth_method | ap-http-signature or ap-object-integrity-proof | a2a-jwt or a2a-oauth | email-dkim, email-dmarc, or none (see §5.3) |
| sender.verified | Signature verified against actor’s public key | Token verified, audience matches, not expired | DKIM pass, d= tag covers sender.address domain |
| sender.key_id | keyId from signature header | kid from JWT header | DKIM s= selector + d= domain |
| sender.identities | AP identity evidence after signature verification | A2A token evidence plus forwarded metadata.mentionable.identity_evidence | Email identity evidence after DKIM/DMARC pass |
| recipient | WebFinger-resolved address of the inbox owner | Agent card address for the target | To: (first Mentionable-handled address) |
| parts | See §5.1 | See §5.2 | See §5.3 |
| received_via | 'activitypub' | 'a2a' | 'email' |
| received_at | Now (UTC) | Now (UTC) | Now (UTC) |
| raw | Parsed AP activity + signature headers | Parsed A2A task envelope | Parsed MIME tree (PostalMime output) |
| recipient_capabilities.mention_relay | { kind: 'addressing', envelope_fields: ['to','cc'], also_inline: true } | { kind: 'none' } | { kind: 'recipient-field', fields: ['to','cc'] } |
5.1.1 ActivityPub → thread_id
ActivityPub 2.0 does not define a standard conversation-grouping field. The derivation order is:
- object.context: the closest thing to a standard (referenced by FEP-7888 and common Mastodon usage). Use it when present and when it is an IRI, not an inline object.
- object.conversation: a non-standard Mastodon extension. Use it when context is absent. Publishers MUST NOT treat this as authoritative beyond the Fediverse interop value.
- The root of the inReplyTo chain, walked up to a depth of 10 or until a cycle is detected. “Root” means the oldest ancestor reachable via inReplyTo.
- If all of the above fail, thread_id is the activity IRI itself: this message starts its own thread.
Resolvers walking inReplyTo MUST apply the SSRF rules from webfinger.md §7 and MUST bound total work; deep chains are truncated, not followed indefinitely.
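The derivation order above can be sketched as a single function. Several things here are assumptions for illustration: the name deriveApThreadId, the ApObject type, and the injected lookup function, which is synchronous only to keep the sketch short (a real resolver would be asynchronous and enforce the SSRF rules and work bounds). The sketch also treats an ancestor whose IRI is known but cannot be dereferenced as the root.

```typescript
type ApObject = { id: string; context?: unknown; conversation?: unknown; inReplyTo?: unknown }

// Hypothetical lookup for already-vetted ancestors; null means "could not dereference".
type Lookup = (iri: string) => ApObject | null

const MAX_DEPTH = 10

function deriveApThreadId(activityIri: string, object: ApObject, lookup: Lookup): string {
  // 1. context, only when it is an IRI string rather than an inline object.
  if (typeof object.context === 'string') return object.context
  // 2. conversation, only when context is absent altogether.
  if (object.context === undefined && typeof object.conversation === 'string') {
    return object.conversation
  }
  // 3. Walk inReplyTo upward, bounded by depth and guarded against cycles.
  const seen = new Set<string>([object.id])
  let root: string | null = null
  let next: unknown = object.inReplyTo
  for (let depth = 0; depth < MAX_DEPTH && typeof next === 'string' && !seen.has(next); depth++) {
    seen.add(next)
    root = next
    const parent = lookup(next)
    next = parent === null ? undefined : parent.inReplyTo
  }
  // 4. Nothing found: the message starts its own thread.
  return root ?? activityIri
}
```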
5.1 ActivityPub → parts
- object.content (HTML) → one text part with mime: 'text/html'. Adapter MAY additionally emit a text/plain part derived by stripping tags; when it does, text/plain comes first.
- Each entry in object.attachment → one part. If mediaType starts with image/, video/, audio/, or is a known document type → file. If it is a Link object with no payload → link.
- AP polls, questions, and other specialized Object subtypes are not part of v0.1. They are carried in raw only.
5.2 A2A → parts
- A2A TextPart → text with the declared mime (default text/plain).
- A2A FilePart → file when role is “user”, artifact when the enclosing message is a task artifact. If metadata.mentionable.bytes_ref.expires_at or metadata.mentionable.size_bytes is present, receivers preserve those fields on the resulting file part.
- A2A DataPart carrying the tool-events extension → tool_call.
- Other A2A DataPart values (structured JSON) → v0.1 carries them as a text part with mime: 'application/json' and a stringified body. A richer generic data part kind is deferred to v0.2.
5.3.1 Email → thread_id
Email threading in the wild is unreliable. Gmail, in particular, often omits References: on replies composed in its web UI and depends on server-side conversation grouping. The derivation order:
- The first message-id in the References: header if that header is present and syntactically valid. “First” means the root of the chain, not the most recent.
- Otherwise, In-Reply-To: if present. A Gmail-originated reply frequently has In-Reply-To: but no References:; this path is the common case, not a fallback.
- Otherwise, the current message’s own Message-ID:, meaning this message starts a new thread.
Adapters MUST NOT infer threads from Subject: normalization (Re: / Fwd: stripping). Subject-based threading is a heuristic that belongs in client UX, not in the adapter layer.
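The three-step derivation order for email can be sketched as below. The name deriveEmailThreadId, the EmailHeaders type, and the simplified message-id pattern are assumptions; a full RFC 5322 msg-id grammar is stricter than this regular expression, and whether the angle brackets are kept in thread_id is not specified here (the sketch keeps them). Subject is intentionally not an input.

```typescript
// Hypothetical, already-unfolded header values; messageId is always present.
type EmailHeaders = { references?: string; inReplyTo?: string; messageId: string }

// Simplified approximation of an RFC 5322 msg-id: <left@right>, no spaces.
const MSG_ID = /<[^<>\s@]+@[^<>\s@]+>/

function firstMessageId(header: string | undefined): string | undefined {
  if (header === undefined) return undefined
  const m = MSG_ID.exec(header)
  return m === null ? undefined : m[0]
}

function deriveEmailThreadId(h: EmailHeaders): string {
  // 1. Root of the References: chain (the first id, not the most recent one).
  // 2. Otherwise In-Reply-To:, the common case for Gmail-originated replies.
  // 3. Otherwise this message's own Message-ID: it starts a new thread.
  return firstMessageId(h.references) ?? firstMessageId(h.inReplyTo) ?? h.messageId
}
```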
When the same logical conversation is observed over multiple protocols (the same user replies by both email and AP), v0.1 produces two distinct thread_ids. Cross-protocol thread merging is out of scope for v0.1; see §9 Open questions.
5.3 Email → parts
- Subject line → prepended as the first text/plain part with content "Subject: <subject>" when the subject is non-empty. This is a lossy convention; agents that need the raw subject read raw.headers.subject.
- Body: pick one primary text representation. Preference order: text/markdown (if explicit), text/plain, text/html (converted to plaintext only if no alternative exists; original HTML survives in raw).
- Attachments (Content-Disposition: attachment or inline with filename) → file parts, in MIME order.
- Inline images referenced by cid: → file parts, ordered after the primary text part and before other attachments.
5.3.2 Email → sender.auth_method
Email adapters MUST resolve sender.auth_method as follows:
- 'email-dkim': DKIM pass AND the signing d= domain matches the From: domain.
- 'email-dmarc': DKIM is absent or does not cover the From: domain, but DMARC evaluation passes for the From: domain (i.e. aligned SPF or aligned DKIM per RFC 7489). A DMARC-only pass implies weaker key binding than a direct DKIM signature; agents that require per-message cryptographic authentication SHOULD treat this differently from 'email-dkim'.
- 'none': otherwise (including SPF-only pass, failed authentication, or no authentication performed).
sender.verified is true for 'email-dkim' and 'email-dmarc' and false for 'none'.
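The resolution rules above reduce to a short decision function. This is a sketch under assumptions: resolveEmailAuth and the EmailAuthInput type are hypothetical, the DKIM and DMARC verdicts are assumed to come from an upstream verifier, and "matches" is implemented as an exact case-insensitive domain comparison, which is stricter than DMARC's relaxed alignment.

```typescript
// Hypothetical input: verdicts produced by an upstream DKIM/DMARC verifier.
type EmailAuthInput = {
  fromDomain: string
  dkim?: { pass: boolean; d: string } // d is the signing d= domain
  dmarcPass: boolean
}

type EmailAuth = { auth_method: 'email-dkim' | 'email-dmarc' | 'none'; verified: boolean }

function resolveEmailAuth(input: EmailAuthInput): EmailAuth {
  const from = input.fromDomain.toLowerCase()
  const dkim = input.dkim
  // Assumption: "matches" means exact equality, ignoring case.
  const dkimCoversFrom = dkim !== undefined && dkim.pass && dkim.d.toLowerCase() === from
  if (dkimCoversFrom) return { auth_method: 'email-dkim', verified: true }
  // DKIM absent or not covering From:, but DMARC passed for the From: domain.
  if (input.dmarcPass) return { auth_method: 'email-dmarc', verified: true }
  // SPF-only pass, failed authentication, or nothing performed.
  return { auth_method: 'none', verified: false }
}
```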
5.4 history — per-transport population
NormalizedMessage.history is an optional, chronologically-sorted (oldest-first) array of prior turns from the same logical conversation, populated by the inbound adapter. Each transport derives it differently:
- ActivityPub. The adapter walks object.inReplyTo upward, dereferencing each ancestor under the same SSRF + bound-the-work rules as thread_id derivation (§5.1.1). The chain is flattened into an array and sorted by published. Ancestors that fail to dereference (deleted, unauthorized, or beyond depth bound) are dropped from history rather than failing the whole message. The current incoming message is not included in history. No inline-metadata history extension is defined or supported for AP; clients that want to supply history MUST structure their messages as reply chains so the adapter can walk inReplyTo. (This asymmetry with A2A is intentional: AP has no standard metadata namespace analogous to A2A’s message.metadata, and the inReplyTo chain walker already covers the Fediverse interop case.)
- A2A. The adapter maps Task.history (the A2A protocol’s own conversation field) into HistoricalMessage[] directly: Task.history[i].role becomes role, Task.history[i].parts becomes parts after the same per-part normalization in §5.2, and the message timestamp becomes timestamp. Order is preserved as A2A delivers it (oldest-first per A2A spec). As a fallback for Mentionable platform clients (slack-connector, future bridges) that dispatch over A2A on behalf of a richer platform, the adapter also reads message.metadata.mentionable.history when Task.history is absent or empty; this path covers the case where the caller pre-packages per-thread history in the A2A metadata envelope rather than relying on the receiving agent’s own history store.
- Email. history is left empty (or omitted). Email body conventions already inline prior turns as quoted blocks (> ...), so the agent reads context from parts directly. Adapters MUST NOT attempt to parse the quoted body into structured HistoricalMessage[] for v0.1: quote parsing across mail clients is unreliable and the cost is not justified. v0.2 may revisit if a robust parser emerges.
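The A2A source selection described above might look like the sketch below. The HistoryTurn and A2AInput types and the selectA2AHistory name are hypothetical, and the per-part normalization from §5.2 is left out so that only the fallback order is visible.

```typescript
// Hypothetical minimal shapes; the real Task and Message types carry far more.
type HistoryTurn = { role: string; parts: unknown[]; timestamp?: string }
type A2AInput = {
  task: { history?: HistoryTurn[] }
  message: { metadata?: { mentionable?: { history?: HistoryTurn[] } } }
}

function selectA2AHistory(input: A2AInput): HistoryTurn[] {
  // Prefer the protocol's own conversation field when it has content.
  const native = input.task.history
  if (native !== undefined && native.length > 0) return native
  // Otherwise fall back to history pre-packaged in the metadata envelope.
  return input.message.metadata?.mentionable?.history ?? []
}
```

Order is passed through untouched in both branches, on the assumption stated above that A2A delivers turns oldest-first.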
The size of history BEFORE policy trimming is bounded only by the upstream protocol (AP chain depth bound, A2A Task.history.length, etc.). Adapters apply a HistoryPolicy (see agent-interface.md §10) before invoking the agent, so the agent sees a budget-shaped array.
6. raw — what each adapter puts there
raw is typed as unknown because the runtime MUST NOT rely on its shape. For debuggability and for agents that deliberately opt into protocol specifics, these are the conventions.
AP (received_via: 'activitypub'):
{
  activity: Activity // the full Create / Update / … object
  signature: { keyId: string; algorithm: string; headers: string[]; signature: string }
  actor: Actor // dereferenced actor document
  delivery: { to: string[]; cc: string[]; bcc?: string[] }
}
A2A (received_via: 'a2a'):
{
  task: Task // full task object
  message: Message // the A2A message being normalized
  auth: {
    kind: 'jwt' | 'oauth'
    token_claims: Record<string, unknown>
  }
}
Email (received_via: 'email'):
{
  headers: Record<string, string | string[]>
  parsed: PostalMimeOutput // from PostalMime
  dkim: { results: Array<{ domain: string; selector: string; status: 'pass' | 'fail' | 'none' }> }
  spf: { status: 'pass' | 'fail' | 'softfail' | 'neutral' | 'none' }
  dmarc: { status: 'pass' | 'fail' | 'none' }
  envelope: { mail_from: string; rcpt_to: string[] } // SMTP envelope, may differ from headers
}
Agents that read raw take on a dependency on the protocol. That is intentional.
7. Canonicalization rules
- Addresses are canonicalized to lowercase for the domain and case-preserved for the local-part (matching acct: URI semantics). Comparison is case-insensitive on domain, case-sensitive on local-part, with the pragmatic exception that adapters SHOULD treat ASCII local-parts case-insensitively to match email and Fediverse convention.
- Timestamps are ISO 8601 with explicit UTC offset (Z form). No naïve local timestamps anywhere in the shape.
- Line endings in text parts are LF-normalized. Original CRLF survives in raw.
- JSON objects in raw are preserved as parsed; adapters do not re-canonicalize them.
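The first three rules can be illustrated with a few small helpers. This is a sketch under assumptions: the function names are invented for this example, addresses are assumed to have the form local@domain, and comparison applies the SHOULD-level ASCII case folding for local-parts.

```typescript
// Lowercase the domain, preserve the local-part as written.
function canonicalizeAddress(address: string): string {
  const at = address.lastIndexOf('@')
  if (at < 0) return address // not a local@domain address; left untouched in this sketch
  return address.slice(0, at) + '@' + address.slice(at + 1).toLowerCase()
}

// Fold only ASCII letters, leaving non-ASCII local-parts case-sensitive.
function foldAscii(text: string): string {
  return text.replace(/[A-Z]/g, c => c.toLowerCase())
}

function sameAddress(a: string, b: string): boolean {
  const ca = canonicalizeAddress(a)
  const cb = canonicalizeAddress(b)
  const ia = ca.lastIndexOf('@')
  const ib = cb.lastIndexOf('@')
  if (ia < 0 || ib < 0) return ca === cb
  const sameDomain = ca.slice(ia) === cb.slice(ib)
  return sameDomain && foldAscii(ca.slice(0, ia)) === foldAscii(cb.slice(0, ib))
}

// Date.prototype.toISOString always emits UTC in the Z form.
function toUtcTimestamp(date: Date): string {
  return date.toISOString()
}

// CRLF becomes LF; the original bytes are expected to survive in raw.
function normalizeLineEndings(text: string): string {
  return text.replace(/\r\n/g, '\n')
}
```

Canonicalization preserves the local-part while comparison folds it, so the stored address still shows what the sender actually wrote.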
8. Conformance
An adapter is conformant for v0.1 if:
- Every field in §3 is populated per §5 for every message it successfully parses.
- sender.verified is never true without a verified cryptographic binding to sender.address.
- Messages that cannot be mapped (unsupported AP object subtype, corrupt MIME, invalid A2A task) are rejected at the adapter boundary and never produce a NormalizedMessage. The runtime is not responsible for filtering malformed inputs.
- raw is populated with enough information to reconstruct the wire message for debugging, though not necessarily byte-for-byte.
- recipient_capabilities.mention_relay matches the per-protocol value in §3.3.1 / §5.
- It MUST NOT set policy_resolution without performing verification itself or delegating it to the agent’s own callback endpoint. Setting this field is an assertion that verification succeeded.
An agent is conformant for v0.1 if:
- It produces a NormalizedResponse whose reply_to matches a received NormalizedMessage.id.
- It does not branch on received_via for business logic that would be different across protocols (push_back.channel is the correct way to override channel choice).
- It tolerates adapters that populate only the fields listed as required in §3; it does not require any optional field.
- When policy_resolution is present, it MUST verify in_reply_to_state against its issuance store and atomically consume the token before acting. An agent that acts on policy_resolution without this check is non-conformant.
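One way to picture the verify-and-consume requirement in the last item is the single-process sketch below. Everything here is an assumption made for illustration: the IssuanceStore class, the mayActOn helper, and the idea that in_reply_to_state is a string field on policy_resolution. A production store shared across processes would need its own atomic compare-and-delete primitive.

```typescript
// Hypothetical one-field view of policy_resolution.
type PolicyResolutionSketch = { in_reply_to_state: string }

// In-memory stand-in for the agent's issuance store.
class IssuanceStore {
  private readonly issued = new Set<string>()

  issue(state: string): void {
    this.issued.add(state)
  }

  // Set.delete returns true only when the value was present, so the
  // membership check and the consumption happen in one step.
  consume(state: string): boolean {
    return this.issued.delete(state)
  }
}

// True only when a resolution is present and its token was issued and unused.
function mayActOn(resolution: PolicyResolutionSketch | undefined, store: IssuanceStore): boolean {
  if (resolution === undefined) return false
  return store.consume(resolution.in_reply_to_state)
}
```

Because consumption is destructive, a replayed policy_resolution fails the second time even if the first attempt succeeded.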
9. Open questions
These are flagged for review before v0.1 freezes.
- Cross-protocol threading. Is there a useful case in v0.1 for the runtime to recognize that an AP reply and an email reply belong to the same logical conversation? Current answer: no, defer to v0.2. Needs review.
- artifact_type vocabulary. A2A has an evolving notion of artifact types (code, image, document). Do we lift this into the normalized shape, or keep it opaque? Current answer: opaque string in v0.1.
- DataPart as first-class. Structured JSON payloads are common in A2A. Carrying them as stringified text loses type information. Promote to a data part kind in v0.1 or wait for v0.2? Leaning toward v0.2.
- Subject-line convention for email. The “prepend as Subject: text part” convention is ugly but honest. The alternative is a top-level subject?: string on NormalizedMessage, which bloats the shape for one protocol. Staying with the current approach pending review.
- Email-specific shape extensions (deferred to v0.2 per DESIGN.md §5.2). Bounce handling, List-Unsubscribe, Auto-Submitted, and quota/rate signalling all plausibly need dedicated top-level fields rather than living in raw.email.*. v0.1 carries all of this in raw; v0.2 MAY promote any that prove common across adapter implementations. Adapters SHOULD write these into raw.email.<feature> to reduce churn when promotion happens.
- Multi-recipient fan-out (deferred to v0.2 per DESIGN.md §11.3). Messages addressed to several agent handles in one activity / To-line / A2A task have no v0.1 spec. The current v0.1 contract is one-NormalizedMessage-per-recipient served by the local node; whether the adapter layer fans out, rejects, or normalizes a multi-mention as something richer is an open question. Until resolved, agents can assume at most one recipient per NormalizedMessage and the recipient field is always the local agent.
10. Structured trace annotation v0.1
10.1 Why
The mix of human and agent receivers on ActivityPub and Email keeps growing. A “human face only” transport policy underserves the agent-to-agent half: the visible body has to be standard prose (Mastodon/Gmail can’t render structured tool calls), but a pure prose body loses the structured payload (tool_call args, results, errors, timings) that Mentionable-aware receivers want. The dual-view annotation resolves this without changing what the visible body means: the standard rendering ships unchanged; an additive sidecar carries the full structured NormalizedResponse for receivers that can use it. The visible body is standalone-interpretable; the annotation is ignorable.
This applies only to the AP and Email transports. A2A needs no annotation — its Part kinds already carry tool_call structure losslessly via the a2a-tool-events-v0.1 extension.
10.2 Payload format
The annotation payload is the JSON-serialized NormalizedResponse for the outbound message, base64-encoded. AP Tier 1 links (§10.3) use base64url (RFC 4648 §5) because the payload sits in a URL fragment; AP Tier 2 data: URIs use standard base64, as do Email transports, where the payload sits in a MIME body with Content-Transfer-Encoding: base64. The decoded JSON shape is identical across transports.
The annotation is additive, not a replacement. The visible body continues to render every tool_call part via serializeToolCallToText (or the adapter’s richer surface, where one exists). Receivers that ignore the annotation see the prose-formatted tool calls in parts[] exactly as before; receivers that decode the annotation get the full structured response on NormalizedMessage.received_trace.
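The two encodings differ only in alphabet and padding, as the sketch below tries to show. The encodeTrace and decodeTrace names are hypothetical, the payload is typed as unknown because NormalizedResponse is defined elsewhere, and stripping the = padding from the base64url form is an assumption (RFC 4648 permits it but does not require it). The sketch relies on the btoa, atob, TextEncoder, and TextDecoder globals available in browsers and recent Node releases.

```typescript
// btoa works on "binary strings", so UTF-8 bytes are mapped to char codes first.
function utf8ToBinary(text: string): string {
  const bytes = new TextEncoder().encode(text)
  let binary = ''
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i])
  return binary
}

function binaryToUtf8(binary: string): string {
  const bytes = new Uint8Array(binary.length)
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i)
  return new TextDecoder().decode(bytes)
}

function encodeTrace(response: unknown, variant: 'base64' | 'base64url'): string {
  const standard = btoa(utf8ToBinary(JSON.stringify(response)))
  if (variant === 'base64') return standard
  // URL-safe alphabet; padding removed (assumption) so the fragment stays compact.
  return standard.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '')
}

function decodeTrace(encoded: string): unknown {
  // Accept either alphabet, then restore padding before decoding.
  let standard = encoded.replace(/-/g, '+').replace(/_/g, '/')
  while (standard.length % 4 !== 0) standard += '='
  return JSON.parse(binaryToUtf8(atob(standard)))
}
```

A decoder that tolerates both alphabets lets the AP and Email inbound parsers share one code path, which fits the statement that the decoded JSON shape is identical across transports.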
10.3 Wire patterns per transport
ActivityPub. Append one inline anchor to the Note’s HTML content, after the visible body:
<a class="mentionable-trace" data-version="0.1" href="<href>">[<anchor text>]</a>
<href> is either:
- Tier 1 (default): https://mentionable.dev/trace/#<base64url-encoded JSON> (the viewer page reads the fragment in-browser and renders typed DOM via textContent). Mastodon-compatible servers preserve the https href.
- Tier 2 (fallback): data:application/json;base64,<base64-encoded JSON>, used when the operator sets MENTIONABLE_TRACE_VIEWER_URL=none. Mastodon strips the data: scheme from href (its allow-list does not include data:), so human Mastodon users see only the anchor text; raw AP receivers and Mentionable-aware agents recover the payload directly from the unprocessed activity.
Servers MAY also emit an attachment array entry, a Link object with the same href, so that AP nodes that render attachment chips show the trace link; receivers MUST treat the inline anchor as canonical.
Receivers MUST process only the first <a class="mentionable-trace"> anchor in the Note content; senders MUST emit at most one. The attribute parser is normative for the double-quoted form (class="mentionable-trace"); single-quoted attributes are not required to be parsed.
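A rough sketch of both directions for the AP anchor follows. It is regex-based purely for brevity, and the emitTraceAnchor and findTraceHref names are hypothetical; a real adapter might prefer a proper HTML parser. Only the normative double-quoted attribute form is recognized.

```typescript
function escapeHtml(text: string): string {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;').replace(/"/g, '&quot;')
}

function unescapeHtml(text: string): string {
  return text.replace(/&lt;/g, '<').replace(/&gt;/g, '>').replace(/&quot;/g, '"').replace(/&amp;/g, '&')
}

// Sender side: at most one anchor, appended after the visible body.
function emitTraceAnchor(href: string, anchorText: string): string {
  return `<a class="mentionable-trace" data-version="0.1" href="${escapeHtml(href)}">[${escapeHtml(anchorText)}]</a>`
}

// Receiver side: only the FIRST mentionable-trace anchor is considered.
function findTraceHref(html: string): string | undefined {
  const openingTags = html.match(/<a\s[^>]*>/gi) ?? []
  for (const tag of openingTags) {
    if (!/\sclass="mentionable-trace"/.test(tag)) continue
    const href = /\shref="([^"]*)"/.exec(tag)
    // Return here even when href is missing: later anchors must not be consulted.
    return href ? unescapeHtml(href[1]) : undefined
  }
  return undefined
}
```

Returning from inside the loop on the first class match is what enforces the first-anchor-only rule; a second anchor injected further down the Note is never read.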
Email. Emit a multipart/alternative body with three siblings:
1. text/plain: human plain-text fallback (existing parts[] rendering, unchanged).
2. text/html: human HTML. Simple inline rendering close to “email like a normal email.” tool_call parts render as <p>✅ name(args) → result</p>; previous <div border> boxes and <details> collapsibles are intentionally removed because the structured surface has moved to part (3).
3. application/json; profile="https://mentionable.dev/ns/normalized-message/v0.1": full NormalizedResponse JSON, base64-encoded. The RFC 6906 profile parameter is the inbound parser’s positive-discrimination signal: a generic application/json attachment from another sender does not carry it. Mainstream mail clients (Gmail, Outlook, Apple Mail) preserve the part in the MIME chain but never display it.
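On the inbound side, the profile check for the third part could be sketched as below. The isTracePart name and the simple split-on-semicolon parsing are assumptions for this example; the split is adequate here only because the profile URL contains no semicolon, and a full MIME parameter parser would be more robust.

```typescript
const TRACE_PROFILE = 'https://mentionable.dev/ns/normalized-message/v0.1'

// True only for application/json parts that carry the Mentionable profile parameter.
function isTracePart(contentType: string): boolean {
  const [mediaType, ...params] = contentType.split(';').map(s => s.trim())
  if (mediaType.toLowerCase() !== 'application/json') return false
  return params.some(param => {
    const eq = param.indexOf('=')
    if (eq < 0) return false
    const name = param.slice(0, eq).trim().toLowerCase()
    // Strip one pair of surrounding double quotes, if present.
    const value = param.slice(eq + 1).trim().replace(/^"(.*)"$/, '$1')
    return name === 'profile' && value === TRACE_PROFILE
  })
}
```

A plain application/json attachment from an unrelated sender returns false, so it is never mistaken for a trace annotation.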
A2A. No annotation. The protocol’s Part kinds carry tool_call structure losslessly already.
10.4 Two-tier emit policy (env)
Both AP and Email transports read MENTIONABLE_TRACE_VIEWER_URL once and apply the same table:
| Value | Behaviour |
|---|---|
| unset (default) | Tier 1, https://mentionable.dev/trace baked in as default |
| https://… | Tier 1 using that URL |
| none (or empty) | Tier 2 (data:application/json;base64,…) |
| anything else | Tier 1 with default URL; one-shot stderr warning so operators notice the typo |
Operators who want to host their own viewer point the env at the fork (https://my-trace.example.com/). Operators who want zero third-party reliance set none and accept Mastodon human-face degradation. The email transport ignores Tier 1 vs Tier 2 (the payload always sits inline in the MIME part) but still honours the env variable so a single operator setting governs both transports.
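The table can be read as a small pure function. In this sketch the TracePolicy type and the resolveTracePolicy name are hypothetical, and matching none exactly (case-sensitively) is an assumption; the caller is expected to print the one-shot stderr warning when warn is true.

```typescript
const DEFAULT_VIEWER_URL = 'https://mentionable.dev/trace'

type TracePolicy =
  | { tier: 1; viewerUrl: string; warn: boolean }
  | { tier: 2; warn: false }

// value is the raw MENTIONABLE_TRACE_VIEWER_URL setting, or undefined when unset.
function resolveTracePolicy(value: string | undefined): TracePolicy {
  if (value === undefined) return { tier: 1, viewerUrl: DEFAULT_VIEWER_URL, warn: false }
  if (value === '' || value === 'none') return { tier: 2, warn: false }
  if (value.startsWith('https://')) return { tier: 1, viewerUrl: value, warn: false }
  // Anything else is treated as a probable typo: keep working, but flag it.
  return { tier: 1, viewerUrl: DEFAULT_VIEWER_URL, warn: true }
}
```

In a Node deployment this might be called once at startup as resolveTracePolicy(process.env.MENTIONABLE_TRACE_VIEWER_URL), with the result shared by the AP and Email transports.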
10.5 Size budget
The encoded annotation MUST fit in 64 KiB after base64 encoding. The cap keeps Tier 1 URLs pasteable and Tier 2 data: URIs below most AP server limits. When the natural payload is over, transports try once more with each tool_call part’s args and result summary truncated to 200 bytes (mirroring the serializeToolCallToText budget); when the truncated payload is still over, transports drop the annotation entirely and emit a one-line stderr warning. The visible body always ships regardless — the annotation never blocks a message from being sent.
Per-call args and result summaries SHOULD be truncated to 200 bytes each (matching serializeToolCallToText’s budget at §3.2) as the first truncation step, before evaluating the overall 64 KiB cap.
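The try, truncate, drop sequence might be sketched as follows. The SketchPart and SketchResponse types (including the args and result_summary field names) are hypothetical stand-ins for the real response shape, and the encoded size is computed from the base64 ratio of four output characters per three input bytes rather than by actually encoding.

```typescript
type SketchPart =
  | { kind: 'text'; content: string }
  | { kind: 'tool_call'; name: string; args: string; result_summary: string }
type SketchResponse = { reply_to: string; parts: SketchPart[] }

const MAX_ENCODED_BYTES = 64 * 1024
const PER_FIELD_BYTES = 200
const utf8 = new TextEncoder()

// Padded base64 length of the JSON-serialized response.
function encodedSize(response: SketchResponse): number {
  return Math.ceil(utf8.encode(JSON.stringify(response)).length / 3) * 4
}

// Cut to at most maxBytes of UTF-8 without splitting a code point.
function truncateUtf8(text: string, maxBytes: number): string {
  let out = ''
  let used = 0
  for (const ch of Array.from(text)) {
    const size = utf8.encode(ch).length
    if (used + size > maxBytes) break
    out += ch
    used += size
  }
  return out
}

// Returns the response to annotate with, or undefined when the annotation is dropped.
function fitAnnotation(response: SketchResponse): SketchResponse | undefined {
  if (encodedSize(response) <= MAX_ENCODED_BYTES) return response
  const truncated: SketchResponse = {
    ...response,
    parts: response.parts.map((p): SketchPart =>
      p.kind === 'tool_call'
        ? { ...p, args: truncateUtf8(p.args, PER_FIELD_BYTES), result_summary: truncateUtf8(p.result_summary, PER_FIELD_BYTES) }
        : p),
  }
  if (encodedSize(truncated) <= MAX_ENCODED_BYTES) return truncated
  console.warn('trace annotation dropped: payload exceeds size budget')
  return undefined
}
```

An undefined result only suppresses the annotation; the caller still sends the visible body, in line with the rule that the annotation never blocks a message.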
10.6 Symmetric in/out (acceptance gate)
Both transports MUST parse what they emit. The round-trip test is the gate: emit a non-trivial NormalizedResponse → wire shape → feed the wire back through the same transport’s inbound parser → the reconstructed received_trace deep-equals the original. No drift, no asymmetric capability. Receivers MUST treat inbound failures (no anchor, malformed payload, shape mismatch, oversize fragment) as non-fatal: log a warning, leave received_trace undefined, deliver the visible message normally.
10.7 Anchor text
The text inside the AP anchor MUST be human-meaningful prose summarizing the response — for example, [trace: web_search ✓ 412ms] or [trace: 3 tool calls — web_search ✓, file_write ✓, slow_lookup ✗]. The visible body MUST remain interpretable when Mastodon strips the href (Tier 2 degradation), so the anchor text is the last line of defence. Implementations cap the anchor text at ~120 characters and elide additional tool calls with a +N more suffix.
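A sketch of how the anchor text could be assembled, mirroring the two examples above. The CallSummary type and the buildAnchorText name are hypothetical, the 120-character cap is treated as a hard limit for multi-call summaries, and the wording for the zero-call case is invented for this example. The surrounding square brackets are added by the anchor template in §10.3, not here.

```typescript
type CallSummary = { name: string; ok: boolean; ms?: number }

const MAX_ANCHOR_CHARS = 120

function callLabel(call: CallSummary): string {
  return `${call.name} ${call.ok ? '✓' : '✗'}`
}

function buildAnchorText(calls: CallSummary[]): string {
  if (calls.length === 1) {
    const only = calls[0]
    const timing = only.ms !== undefined ? ` ${only.ms}ms` : ''
    return `trace: ${callLabel(only)}${timing}`
  }
  // \u2014 is the dash used in the multi-call example above.
  const head = `trace: ${calls.length} tool calls \u2014 `
  // Show as many calls as fit under the cap and elide the rest.
  for (let shown = calls.length; shown >= 1; shown--) {
    const rest = calls.length - shown
    const suffix = rest > 0 ? `, +${rest} more` : ''
    const text = head + calls.slice(0, shown).map(callLabel).join(', ') + suffix
    if (text.length <= MAX_ANCHOR_CHARS || shown === 1) return text
  }
  return 'trace: no tool calls' // assumption: wording for an empty list
}
```

With a single very long tool name the result can still exceed the cap, which seems tolerable given that the limit is described as approximate.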
10.8 received_trace field semantics
NormalizedMessage.received_trace?: NormalizedResponse is an optional sidecar populated by the inbound parser when (and only when) a trace annotation was successfully decoded. Receivers MUST NOT rely on its presence — it is undefined whenever:
- the inbound message came from a non-Mentionable sender;
- the inbound message came from a Mentionable sender deployed with MENTIONABLE_TRACE_VIEWER_URL=none AND the receiver fetched the message through a sanitizer that stripped the data: href (rare);
- the outbound transport dropped the annotation under the size budget;
- the inbound parser hit any failure mode (malformed, oversize, mismatched shape).
The visible parts[] body remains the source of truth in every case. Agents that branch on received_trace.parts[i].args (for example, to chain into a follow-up tool call) MUST gracefully degrade when the field is absent.
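A small sketch of what graceful degradation could look like on the agent side. The InboundSketch type and both function names are hypothetical, and the message shape is reduced to the two fields that matter for the example.

```typescript
// Hypothetical reduced shapes; the real types are defined in §3.
type TracePart = { kind: string; name?: string; args?: unknown }
type InboundSketch = {
  parts: Array<{ kind: 'text'; content: string }>
  received_trace?: { parts: TracePart[] }
}

// Structured args of the first tool call when a trace was decoded, else undefined.
function firstToolCallArgs(msg: InboundSketch): unknown {
  return msg.received_trace?.parts.find(p => p.kind === 'tool_call')?.args
}

function describeContext(msg: InboundSketch): string {
  const args = firstToolCallArgs(msg)
  if (args === undefined) {
    // No usable trace: fall back to the visible body, the source of truth.
    return msg.parts.map(p => p.content).join('\n')
  }
  return `structured args: ${JSON.stringify(args)}`
}
```

Optional chaining keeps the absent-trace path free of exceptions, so the agent behaves the same whether the annotation was stripped, dropped, or never sent.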
10.9 Capability negotiation — deferred
A future agent-card flag (mentionable:annotation_v1: true) could let senders skip emission for receivers that can’t parse, saving wire bytes. v0.1 always emits; non-Mentionable receivers ignore the annotation harmlessly (the AP anchor text remains valid prose; the email third part is silently dropped by mainstream clients). Capability negotiation is tracked as a follow-up to #498.
10.10 Anti-patterns
Implementations MUST NOT:
- Inject custom JSON-LD properties into the AP Note body to carry the structured payload: non-Mentionable AP servers would silently drop or mangle them, breaking the visible body.
- Embed <script> tags in the Note content: Mastodon and most other AP servers sanitize them out, defeating the purpose AND raising the XSS surface for any server that does not.
- Use hidden CSS (display: none, off-screen positioning) to smuggle structured content: server sanitizers strip style attributes inconsistently and the visible body becomes unpredictable.
The dual-view annotation works precisely because it is an additive inline element with standard semantics (<a href> on AP, a sibling MIME part on Email) that degrades cleanly when ignored. Anything that depends on receivers preserving non-standard wire shape will lose to the next sanitizer rollout.