Conformance
Conformance Requirements
Two levels: AARM Core (R1–R6, all MUST) for baseline conformance and AARM Extended (R1–R9) for mature implementations.
AARM Core: all six requirements (R1–R6) are MUST. This is the baseline for AARM conformance.
AARM Extended: Core plus three SHOULD requirements (R7–R9) for advanced governance.
Core Requirements — MUST
R1 (MUST): The system MUST intercept every agent-initiated action before it is executed. No action may bypass the control plane.
R2 (MUST): The system MUST accumulate and maintain context about the agent's intent, prior actions, and the conversation or task thread.
R3 (MUST): The system MUST evaluate each action against a policy that considers both the action itself and its alignment with the stated agent intent.
R4 (MUST): The policy engine MUST be capable of producing one of five decisions: ALLOW, DENY, MODIFY, STEP_UP, or DEFER.
R5 (MUST): The system MUST produce a tamper-evident receipt for every evaluated action, including the decision, timestamp, and relevant context.
R6 (MUST): Every action receipt MUST be cryptographically bound to an agent identity.
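To make R4 through R6 more concrete, the following is a minimal sketch of a decision type and a tamper-evident receipt bound to an agent identity. It is an illustration under stated assumptions, not the AARM receipt schema: the field names (`agent_id`, `action`, `context`), the helper names, and the use of HMAC-SHA256 with a shared key are all hypothetical. A real implementation that needs offline verification by third parties would more likely use an asymmetric signature.

```python
import hashlib
import hmac
import json
import time
from enum import Enum


class Decision(Enum):
    """The five decisions named in R4."""
    ALLOW = "ALLOW"
    DENY = "DENY"
    MODIFY = "MODIFY"
    STEP_UP = "STEP_UP"
    DEFER = "DEFER"


def _canonical(body: dict) -> bytes:
    # Serialize deterministically so signer and verifier hash identical bytes.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def make_receipt(key: bytes, agent_id: str, action: dict,
                 decision: Decision, context: dict) -> dict:
    """Build a receipt whose signature covers every field, including agent_id.

    Assumption: HMAC with a shared key stands in for a real signature scheme.
    """
    body = {
        "agent_id": agent_id,          # identity binding (R6)
        "action": action,
        "decision": decision.value,    # decision (R5)
        "timestamp": time.time(),      # timestamp (R5)
        "context": context,            # relevant context (R5)
    }
    signature = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return {**body, "signature": signature}


def verify_receipt(key: bytes, receipt: dict) -> bool:
    """Recompute the signature over all fields except the signature itself."""
    body = {k: v for k, v in receipt.items() if k != "signature"}
    expected = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt.get("signature", ""))
```

Because the signature covers the whole receipt body, changing the agent identity, the decision, or the context after the fact causes `verify_receipt` to return `False`.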
Extended Requirements — SHOULD
R7 (SHOULD): The system SHOULD track semantic distance between proposed actions and the original stated intent, flagging drift over long task horizons.
R8 (SHOULD): The system SHOULD export action telemetry in a standard format (e.g., OpenTelemetry) for integration with SIEM and observability platforms.
R9 (SHOULD): The system SHOULD enforce least-privilege scoping of agent credentials and tool access at the time of action execution.
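One possible reading of R7 is sketched below: represent the stated intent and each proposed action as vectors, then flag the first action whose cosine distance from the intent exceeds a threshold. The requirement does not prescribe a distance measure, so cosine distance, the function names, and the `0.5` threshold are assumptions made for illustration. The vectors are presumed to come from some embedding model that is outside the scope of this sketch.

```python
import math
from typing import Optional, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; 0.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as unrelated to the intent
    return 1.0 - dot / (norm_a * norm_b)


def first_drift_step(intent_vec: Sequence[float],
                     action_vecs: Sequence[Sequence[float]],
                     threshold: float = 0.5) -> Optional[int]:
    """Return the index of the first action that drifts past the threshold.

    Returns None when every action stays within the threshold.
    """
    for step, vec in enumerate(action_vecs):
        if cosine_distance(intent_vec, vec) > threshold:
            return step
    return None
```

A flagged step could then feed the policy engine, for example by triggering an alert, a DEFER, or an escalation.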
Validation Process
Conformance validation is performed through the AARM Conformance MCP server. An agent runs the assessment end-to-end against your implementation — no manual checklists.
Request an activation key
Submit the form below with your organization name, product name, and target conformance level (Core or Extended). If your organization is on the allow-list, you will receive an activation key by email. Only listed organizations can run the assessment.
Connect to the AARM MCP server
Add the server to Claude Desktop or Claude Code using the instructions below.
https://aarm-conformance-mcp.herman-d10.workers.dev/
Run the assessment
Start a conversation with Claude and ask it to run the AARM conformance assessment. The agent will walk through each check, collect evidence, and produce a validation report.
Install on Claude Code
Run this in your terminal:
claude mcp add --transport http aarm-conformance https://aarm-conformance-mcp.herman-d10.workers.dev/
Then launch Claude Code and ask it to start the AARM conformance assessment.
Install on Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "aarm-conformance": {
      "type": "http",
      "url": "https://aarm-conformance-mcp.herman-d10.workers.dev/"
    }
  }
}
Request access
Submit your organization details to receive an activation key.
Organizational Requirements
In addition to technical requirements, organizations must satisfy the following conditions to publicly describe their system as AARM-conformant.
| Condition | Expected Result |
|---|---|
| Community engagement | Organization has an active representative in the AARM community |
| Production deployment | System is live in production with at least 5 active production customers running for a minimum of 3 months |
| Security certification | Organization holds at least one recognized security certification (e.g., SOC 2 Type II, ISO 27001, FedRAMP) relevant to the operating environment |
| Benchmarking commitment | Organization agrees to participate in future AARM benchmarking efforts measuring policy detection and enforcement metrics |
Technical Testing
Each requirement includes a minimum test to confirm conformance. All MUST tests are required for Core conformance; SHOULD tests are required for Extended.
| Req | Test | Expected Result | Level |
|---|---|---|---|
| R1 | Submit action matching DENY policy | Action does not execute; denial receipt generated | MUST |
| R1 | Submit action matching DEFER condition | Action suspended; no effects; deferral receipt generated | MUST |
| R1 | Make AARM system unavailable, submit action | Action fails (no fail-open bypass) | MUST |
| R2 | Execute action sequence, inspect context at step N | Policy engine receives all prior actions and data classifications | MUST |
| R2 | Tamper with prior context entry (if hash-chained) | Tampering detected | SHOULD |
| R3 | Submit forbidden action | Immediate DENY regardless of context | MUST |
| R3 | Submit allowed action after sensitive data access (context-dependent deny) | DENY based on context | MUST |
| R3 | Submit denied action with confirming context (context-dependent allow) | STEP_UP or ALLOW | MUST |
| R3 | Submit action with ambiguous/conflicting context | DEFER | MUST |
| R4 | Trigger each of 5 decision types | ALLOW executes, DENY blocks, MODIFY transforms, STEP_UP pauses, DEFER suspends | MUST |
| R4 | STEP_UP with no response within timeout | DENY after timeout | MUST |
| R4 | DEFER with no resolution within timeout | DENY after timeout | MUST |
| R5 | Generate receipts for ALLOW, DENY, MODIFY, STEP_UP, DEFER | Requester context, delegation chain, and policy version/hash present per schema | MUST |
| R5 | Verify receipt signature offline | Signature validates | MUST |
| R5 | Tamper with requester context or policy hash in receipt | Signature verification fails | MUST |
| R5 | Verify deferred action receipt | Deferral reason, resolution method, resolution timestamp present | MUST |
| R6 | Submit from different principals and sessions | Receipts correctly attribute identity including role/privilege scope | MUST |
| R6 | Defer action, then resolve | Original identity preserved in resolution receipt | MUST |
| R7 | Execute diverging action sequence exceeding drift threshold | Alert, deferral, or escalation triggered | SHOULD |
| R8 | Configure SIEM export | Events appear with correct schema including DEFER events | SHOULD |
| R9 | Submit read operation | Issued credential cannot perform writes | SHOULD |
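Two of the behaviors tested above, fail-closed interception (R1) and timeout-to-DENY for pending decisions (R4), can be sketched in a few lines. This is a hypothetical illustration, not the logic of the AARM Conformance MCP server: `guarded_execute`, `final_decision`, and the callback signatures are invented for the example, and mapping an unavailable policy engine to a `"DENY"` result is one way to satisfy "Action fails (no fail-open bypass)".

```python
from typing import Any, Callable, Optional, Tuple

PENDING = {"STEP_UP", "DEFER"}


def final_decision(decision: str, resolution: Optional[str]) -> str:
    """Collapse a pending decision once its timeout window has closed.

    `resolution` is the answer received ("ALLOW" or "DENY"), or None if
    nothing arrived before the timeout. Anything else resolves to DENY.
    """
    if decision not in PENDING:
        return decision
    return resolution if resolution in ("ALLOW", "DENY") else "DENY"


def guarded_execute(evaluate: Callable[[Any], str],
                    action: Any,
                    run: Callable[[Any], Any]) -> Tuple[str, Any]:
    """Run `action` only when policy evaluation succeeds and returns ALLOW."""
    try:
        decision = evaluate(action)
    except Exception:
        # Policy engine unavailable: fail closed, never execute the action.
        return "DENY", None
    if decision != "ALLOW":
        return decision, None
    return decision, run(action)
```

A test harness could pass an `evaluate` callback that raises to simulate an outage, then confirm that `run` was never called.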