Conformance

Conformance Requirements

AARM defines two conformance levels: AARM Core (R1–R6, all MUST) for baseline conformance, and AARM Extended (R1–R9), which adds three SHOULD requirements for mature implementations.

AARM Core

R1 – R6

All six requirements are MUST-level and form the baseline for AARM conformance.

AARM Extended

R1 – R9

All Core requirements plus three SHOULD requirements (R7–R9) for advanced governance.

Core Requirements — MUST

R1

Pre-execution interception (MUST)

The system MUST intercept every agent-initiated action before it is executed. No action may bypass the control plane.
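As a rough illustration of R1, the sketch below wraps tool execution so that a policy check always runs first and any failure of that check blocks the action (fail closed). The names `intercept`, `ActionDenied`, `read_only_policy`, and `unreachable_policy` are hypothetical and not part of AARM. For brevity, every decision other than ALLOW is treated as "do not execute".

```python
from typing import Any, Callable


class ActionDenied(Exception):
    """Raised when an action is stopped before it executes."""


def intercept(check: Callable[[str, dict], str],
              execute: Callable[[str, dict], Any],
              action: str, params: dict) -> Any:
    """Run `check` before `execute`; fail closed if the check itself fails."""
    try:
        decision = check(action, params)
    except Exception as exc:
        # Fail closed: an unreachable control plane must not let the action through.
        raise ActionDenied(f"{action}: control plane unavailable") from exc
    if decision != "ALLOW":
        raise ActionDenied(f"{action}: decision was {decision}")
    return execute(action, params)


# --- usage with hypothetical policies and a stand-in tool ---
executed: list[str] = []


def run_tool(action: str, params: dict) -> str:
    executed.append(action)
    return "ok"


def read_only_policy(action: str, params: dict) -> str:
    return "ALLOW" if action.startswith("read") else "DENY"


def unreachable_policy(action: str, params: dict) -> str:
    raise ConnectionError("control plane down")


def is_blocked(check: Callable[[str, dict], str], action: str) -> bool:
    """Return True if the action was stopped before reaching run_tool."""
    try:
        intercept(check, run_tool, action, {})
    except ActionDenied:
        return True
    return False
```

With these definitions, a read goes through, while a delete and any action submitted during a control-plane outage are both blocked without reaching `run_tool`.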

R2

Context accumulation (MUST)

The system MUST accumulate and maintain context about the agent's intent, prior actions, and the conversation or task thread.
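One possible shape for accumulated context is sketched below: a per-session record of the stated intent plus every prior action and its data classification. The class names, field names, and classification labels are illustrative assumptions, not a schema defined by AARM.

```python
from dataclasses import dataclass, field


@dataclass
class ContextEntry:
    action: str
    data_classification: str  # illustrative labels such as "public" or "sensitive"


@dataclass
class SessionContext:
    stated_intent: str
    history: list[ContextEntry] = field(default_factory=list)

    def record(self, action: str, data_classification: str = "public") -> None:
        """Append one executed action to the session history."""
        self.history.append(ContextEntry(action, data_classification))

    def snapshot(self) -> dict:
        """What a policy engine might receive at step N: intent plus all prior actions."""
        return {
            "intent": self.stated_intent,
            "prior_actions": [e.action for e in self.history],
            "classifications": sorted({e.data_classification for e in self.history}),
        }


ctx = SessionContext("summarize Q3 sales report")
ctx.record("read_file:q3.csv", "sensitive")
ctx.record("search_web:market trends")
snap = ctx.snapshot()
```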

R3

Policy evaluation with intent alignment (MUST)

The system MUST evaluate each action against a policy that considers both the action itself and its alignment with the stated agent intent.
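A minimal sketch of an evaluation order that would satisfy the R3 checks in the Technical Testing table, assuming a hypothetical `evaluate` function and made-up action sets. The `intent_aligned` flag stands in for whatever alignment signal a real implementation computes, and mapping a misaligned action to STEP_UP is an assumption of this sketch rather than something the requirement prescribes.

```python
FORBIDDEN = {"drop_database"}                  # hypothetical always-denied actions
EXFIL_CAPABLE = {"send_email", "http_post"}    # hypothetical actions that can move data out


def evaluate(action: str, context: dict) -> str:
    """Return a decision name for one action given accumulated context."""
    if action in FORBIDDEN:
        return "DENY"        # forbidden regardless of context
    touched_sensitive = "sensitive" in context.get("classifications", [])
    if action in EXFIL_CAPABLE and touched_sensitive:
        return "DENY"        # context-dependent deny after sensitive data access
    aligned = context.get("intent_aligned")  # True, False, or None when unknown
    if aligned is None:
        return "DEFER"       # ambiguous context: suspend rather than guess
    if aligned is False:
        return "STEP_UP"     # assumption: misalignment asks for human approval
    return "ALLOW"
```

For example, `evaluate("send_email", {"classifications": ["sensitive"], "intent_aligned": True})` yields DENY, while the same action with only public data in context yields ALLOW.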

R4

Five authorization decisions (MUST)

The policy engine MUST be capable of producing one of five decisions: ALLOW, DENY, MODIFY, STEP_UP, or DEFER.
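The five decisions can be modeled as an enumeration. The sketch below also shows the timeout behavior listed in the Technical Testing table, where an unanswered STEP_UP or DEFER falls back to DENY. The `resolve` function and the 300-second default are hypothetical.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"
    MODIFY = "MODIFY"
    STEP_UP = "STEP_UP"
    DEFER = "DEFER"


PENDING = {Decision.STEP_UP, Decision.DEFER}


def resolve(decision: Decision, response: Optional[Decision],
            waited_s: float, timeout_s: float = 300.0) -> Decision:
    """Collapse a pending decision into a final one.

    An approver's ALLOW or DENY wins; with no answer, the decision stays
    pending until `timeout_s` elapses and then becomes DENY.
    """
    if decision not in PENDING:
        return decision
    if response in (Decision.ALLOW, Decision.DENY):
        return response
    if waited_s >= timeout_s:
        return Decision.DENY
    return decision
```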

R5

Tamper-evident receipts (MUST)

The system MUST produce a tamper-evident receipt for every evaluated action, including the decision, timestamp, and relevant context.
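To illustrate what "tamper-evident" can mean in practice, the sketch below signs a canonical JSON encoding of the receipt with HMAC-SHA256 from the Python standard library. This is a simplification: HMAC uses a shared secret, so an implementation that needs third-party offline verification would more likely use an asymmetric signature. The key, field names, and function names are all hypothetical.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key-not-for-production"  # hypothetical; real systems use managed keys


def _canonical(body: dict) -> bytes:
    """Stable byte encoding so the same receipt always signs the same way."""
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def make_receipt(action: str, decision: str, timestamp: str, context: dict) -> dict:
    body = {"action": action, "decision": decision,
            "timestamp": timestamp, "context": context}
    signature = hmac.new(SIGNING_KEY, _canonical(body), hashlib.sha256).hexdigest()
    return {**body, "signature": signature}


def verify_receipt(receipt: dict) -> bool:
    """Recompute the signature over everything except the signature field."""
    body = {k: v for k, v in receipt.items() if k != "signature"}
    expected = hmac.new(SIGNING_KEY, _canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt.get("signature", ""))


receipt = make_receipt("send_email", "DENY", "2025-01-01T00:00:00Z",
                       {"policy_hash": "abc123"})
tampered = {**receipt, "decision": "ALLOW"}  # flip the decision after signing
```

Here `verify_receipt(receipt)` succeeds and `verify_receipt(tampered)` fails, because the altered decision no longer matches the signature.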

R6

Identity binding (MUST)

Every action receipt MUST be cryptographically bound to an agent identity.
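One way to picture identity binding: the agent identifier is part of the signed payload, and the signature is made with a key tied to that agent, so re-attributing a receipt to another agent breaks verification. The sketch below uses per-agent HMAC keys as a stand-in for real agent credentials. The registry, keys, and function names are hypothetical.

```python
import hashlib
import hmac
import json

AGENT_KEYS = {"agent-a": b"key-a", "agent-b": b"key-b"}  # hypothetical key registry


def sign_for(agent_id: str, receipt_body: dict) -> dict:
    """Bind a receipt to `agent_id` by signing it with that agent's key."""
    bound = {**receipt_body, "agent_id": agent_id}
    payload = json.dumps(bound, sort_keys=True).encode()
    bound["signature"] = hmac.new(AGENT_KEYS[agent_id], payload,
                                  hashlib.sha256).hexdigest()
    return bound


def verify_identity(receipt: dict) -> bool:
    """Check the signature against the key of the identity the receipt claims."""
    key = AGENT_KEYS.get(receipt.get("agent_id", ""))
    if key is None:
        return False
    body = {k: v for k, v in receipt.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt.get("signature", ""))


signed = sign_for("agent-a", {"action": "read_file", "decision": "ALLOW",
                              "role": "analyst"})
reattributed = {**signed, "agent_id": "agent-b"}  # claim another agent did it
```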

Extended Requirements — SHOULD

R7

Semantic distance tracking (SHOULD)

The system SHOULD track semantic distance between proposed actions and the original stated intent, flagging drift over long task horizons.
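The sketch below gives a toy version of drift detection. It uses bag-of-words cosine distance as a stand-in for a real embedding model, and the 0.8 threshold is an arbitrary assumption. A production system would likely use learned embeddings and a calibrated threshold.

```python
import math
from collections import Counter

DRIFT_THRESHOLD = 0.8  # hypothetical; would be tuned per deployment


def cosine_distance(a: str, b: str) -> float:
    """1 - cosine similarity over lowercase word counts (0 = same, 1 = unrelated)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return 1.0 - (dot / norm if norm else 0.0)


def drifted(intent: str, actions: list[str]) -> list[str]:
    """Return the actions whose distance from the stated intent exceeds the threshold."""
    return [a for a in actions if cosine_distance(intent, a) > DRIFT_THRESHOLD]


intent = "summarize the quarterly sales report"
actions = [
    "read the quarterly sales report",
    "email customer passwords to external address",
]
flagged = drifted(intent, actions)
```

In this example only the second action is flagged, since it shares no vocabulary with the stated intent.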

R8

Telemetry export (SHOULD)

The system SHOULD export action telemetry in a standard format (e.g. OpenTelemetry) for integration with SIEM and observability platforms.
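As a format-agnostic illustration, the sketch below writes decision events as JSON Lines after checking for a small set of required fields. It does not use the OpenTelemetry SDK. The field list and the `export_events` function are assumptions, and a real exporter would map these events onto whatever schema the target SIEM expects.

```python
import io
import json
from typing import Iterable, TextIO

REQUIRED_FIELDS = ("timestamp", "agent_id", "action", "decision")  # hypothetical schema


def export_events(events: Iterable[dict], sink: TextIO) -> int:
    """Write one JSON object per line; reject events missing required fields."""
    written = 0
    for event in events:
        missing = [f for f in REQUIRED_FIELDS if f not in event]
        if missing:
            raise ValueError(f"event missing fields: {missing}")
        sink.write(json.dumps(event, sort_keys=True) + "\n")
        written += 1
    return written


sink = io.StringIO()
count = export_events(
    [
        {"timestamp": "2025-01-01T00:00:00Z", "agent_id": "agent-a",
         "action": "read_file", "decision": "ALLOW"},
        {"timestamp": "2025-01-01T00:00:05Z", "agent_id": "agent-a",
         "action": "send_email", "decision": "DEFER"},
    ],
    sink,
)
lines = sink.getvalue().splitlines()
```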

R9

Least privilege enforcement (SHOULD)

The system SHOULD enforce least-privilege scoping of agent credentials and tool access at the time of action execution.
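A small sketch of execution-time scoping, assuming a hypothetical `ScopedCredential` type: the credential issued for an action covers only the tool and operation being executed, so a credential issued for a read cannot be reused for a write.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedCredential:
    tool: str
    operations: frozenset

    def permits(self, tool: str, operation: str) -> bool:
        """True only for the exact tool and an operation inside the issued scope."""
        return tool == self.tool and operation in self.operations


def issue_credential(tool: str, operation: str) -> ScopedCredential:
    """Issue a credential covering only the operation about to be executed."""
    return ScopedCredential(tool, frozenset({operation}))


cred = issue_credential("crm", "read")
```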

Validation Process

Conformance validation is performed through the AARM Conformance MCP server. An agent runs the assessment end to end against your implementation, so there are no manual checklists to complete.

Request an activation key

Submit the form below with your organization name, product name, and target conformance level (Core or Extended). If your organization is on the allow-list, you will receive an activation key by email. Only listed organizations can run the assessment.

Connect to the AARM MCP server

Add the server to Claude Desktop or Claude Code using the instructions below.

https://aarm-conformance-mcp.herman-d10.workers.dev/

Run the assessment

Start a conversation with Claude and ask it to run the AARM conformance assessment. The agent will walk through each check, collect evidence, and produce a validation report.

Install on Claude Code

Run this in your terminal:

claude mcp add --transport http aarm-conformance https://aarm-conformance-mcp.herman-d10.workers.dev/

Then launch Claude Code and ask it to start the AARM conformance assessment.

Install on Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "aarm-conformance": {
      "type": "http",
      "url": "https://aarm-conformance-mcp.herman-d10.workers.dev/"
    }
  }
}

Request access

Submit your organization details to receive an activation key.

Organizational Requirements

In addition to technical requirements, organizations must satisfy the following conditions to publicly describe their system as AARM-conformant.

Condition	Verification	Expected Result
Community engagement	Verify TWG membership or participation in conformance discussions	Organization has an active representative in the AARM community
Production deployment	Confirm the system is deployed and serving active customers	System is live in production with at least 5 active production customers running for a minimum of 3 months
Security certification	Request evidence of certification	Organization holds at least one recognized security certification (e.g., SOC 2 Type II, ISO 27001, FedRAMP) relevant to the operating environment
Benchmarking commitment	Confirm willingness to participate	Organization agrees to participate in future AARM benchmarking efforts measuring policy detection and enforcement metrics

Technical Testing

Each requirement has at least one minimum test to confirm conformance. Core conformance requires passing every MUST test; Extended conformance additionally requires passing every SHOULD test.

Req	Test	Expected Result	Level
`R1`	Submit action matching DENY policy	Action does not execute; denial receipt generated	MUST
`R1`	Submit action matching DEFER condition	Action suspended; no effects; deferral receipt generated	MUST
`R1`	Make AARM system unavailable, submit action	Action fails (no fail-open bypass)	MUST
`R2`	Execute action sequence, inspect context at step N	Policy engine receives all prior actions and data classifications	MUST
`R2`	Tamper with prior context entry (if hash-chained)	Tampering detected	SHOULD
`R3`	Submit forbidden action	Immediate DENY regardless of context	MUST
`R3`	Submit allowed action after sensitive data access (context-dependent deny)	DENY based on context	MUST
`R3`	Submit denied action with confirming context (context-dependent allow)	STEP_UP or ALLOW	MUST
`R3`	Submit action with ambiguous/conflicting context	DEFER	MUST
`R4`	Trigger each of 5 decision types	ALLOW executes, DENY blocks, MODIFY transforms, STEP_UP pauses, DEFER suspends	MUST
`R4`	STEP_UP with no response within timeout	DENY after timeout	MUST
`R4`	DEFER with no resolution within timeout	DENY after timeout	MUST
`R5`	Generate receipts for ALLOW, DENY, MODIFY, STEP_UP, DEFER	Requester context, delegation chain, and policy version/hash present per schema	MUST
`R5`	Verify receipt signature offline	Signature validates	MUST
`R5`	Tamper with requester context or policy hash in receipt	Signature verification fails	MUST
`R5`	Verify deferred action receipt	Deferral reason, resolution method, resolution timestamp present	MUST
`R6`	Submit from different principals and sessions	Receipts correctly attribute identity including role/privilege scope	MUST
`R6`	Defer action, then resolve	Original identity preserved in resolution receipt	MUST
`R7`	Execute diverging action sequence exceeding drift threshold	Alert, deferral, or escalation triggered	SHOULD
`R8`	Configure SIEM export	Events appear with correct schema including DEFER events	SHOULD
`R9`	Submit read operation	Issued credential cannot perform writes	SHOULD
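The relationship between test levels and conformance levels can be sketched as a small function over the rows of the table above. The `conformance_level` name, the tuple layout, and the "NONE" label for a failed Core assessment are assumptions of this sketch. The actual assessment is run through the AARM Conformance MCP server.

```python
def conformance_level(results: list[tuple[str, str, bool]]) -> str:
    """Derive a level from (requirement, level, passed) rows.

    Core needs every MUST test to pass; Extended also needs every SHOULD test.
    """
    must_ok = all(passed for _, level, passed in results if level == "MUST")
    should_ok = all(passed for _, level, passed in results if level == "SHOULD")
    if not must_ok:
        return "NONE"
    return "EXTENDED" if should_ok else "CORE"


# Abbreviated, made-up result sets for illustration only.
core_only = [("R1", "MUST", True), ("R5", "MUST", True), ("R7", "SHOULD", False)]
extended = [("R1", "MUST", True), ("R5", "MUST", True), ("R7", "SHOULD", True)]
failing = [("R1", "MUST", False), ("R7", "SHOULD", True)]
```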
