Skip to content

Rul1an/assay

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2,006 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Assay

CI-native evidence compiler for MCP and A2A governance
Deterministic policy enforcement, canonical evidence, and reviewable trust artifacts for agent systems.

Crates.io CI License

See It Work · Quick Start · CI Guide · Discussions


Your MCP agent calls read_file, exec, web_search — but should it, and what can you honestly prove about that run afterward?

Assay compiles agent runtime signals into enforceable decisions and portable evidence artifacts. The wedge is familiar: sit between the agent and MCP servers, allow or deny tool calls from policy, and record every decision. The product story is broader: canonical evidence, bounded trust claims (what is verified vs merely visible), and outputs you can hand to CI, security review, or audit — without a hosted backend.

Positioning: Assay is best understood as a CI-native protocol-governance layer: canonical evidence compiler + protocol-aware policy checks. It is not a trust-score engine, a generic eval dashboard, or an observability product with a thin security veneer.

Enforce Intercept MCP tool calls, apply policy, ALLOW / DENY deterministically.
Compile Turn traces, decisions, and bundles into canonical evidence — not raw OTel or ad hoc logs as truth.
Prove Export tamper-evident bundles, Trust Basis (trust-basis.json), Trust Card (trustcard.json / trustcard.md), SARIF, and CI gates.

No hosted backend. No API keys for core flows. Deterministic — same input, same decision, every time.

Trust Compiler line: Release v3.5.0 is the current public trust-compiler line. It carries forward v3.3.0 as the first release that shipped both built-in evidence lint companion packs (mcp-signal-followup, a2a-signal-followup), v3.4.0 as the public line for G4-A Phase 1 (payload.discovery), built-in P2c (a2a-discovery-card-followup), and K1-A Phase 1 (payload.handoff), and now also publicly ships K2-A Phase 1 (episode_start.meta.mcp.authorization_discovery). Pack YAML still distinguishes the substrate floor >=3.2.3 from the G4-A / P2c floor >=3.3.0 — see MIGRATION — Trust Compiler 3.2.

Repository truth: main now tracks the released v3.5.0 trust-compiler surface, including the bounded MCP authorization-discovery seam in imported traces. Future trust-compiler slices may still land on main before the next public cut, so release notes and changelog remain the authority for what is actually public.

  Agent ──► Assay ──► MCP Server
              │
              ├─ ✅ ALLOW / ❌ DENY  (policy)
              ├─► 📋 Evidence bundle (verifiable)
              └─► 📊 Trust Basis → Trust Card → SARIF / CI

CLI: The mcp command group is hidden from top-level assay --help while the surface stabilizes; it is supported. Use assay mcp --help, assay mcp wrap …, or follow the MCP Quickstart.

Wedge, not category. “MCP firewall” describes the control plane; trust compilation describes the outcome: reviewable claims backed by evidence. See ADR-033 and RFC-005.

See It Work

SafeSkill 72/100

cargo install assay-cli

mkdir -p /tmp/assay-demo && echo "safe content" > /tmp/assay-demo/safe.txt

assay mcp wrap --policy examples/mcp-quickstart/policy.yaml \
  -- npx @modelcontextprotocol/server-filesystem /tmp/assay-demo
✅ ALLOW  read_file  path=/tmp/assay-demo/safe.txt  reason=policy_allow
✅ ALLOW  list_dir   path=/tmp/assay-demo/           reason=policy_allow
❌ DENY   read_file  path=/tmp/outside-demo.txt      reason=path_constraint_violation
❌ DENY   exec       cmd=ls                          reason=tool_denied

Inspect the audit artifact:

assay evidence show demo/fixtures/bundle.tar.gz

Evidence Bundle Inspector

The bundle is tamper-evident and cryptographically verifiable. Signed mandate events can include an Ed25519-backed authorization trail for high-risk actions.

Trust artifacts from a verified bundle

Install from crates.io or source (cargo install --path crates/assay-cli), then:

# Machine-readable claim basis (deterministic, claim-first)
assay trust-basis generate demo/fixtures/bundle.tar.gz > trust-basis.json

# Human + machine Trust Card (schema v2 — seven trust claims; key by `id`, not row count)
assay trustcard generate demo/fixtures/bundle.tar.gz --out-dir ./trust-out
# → trust-out/trustcard.json , trust-out/trustcard.md

trust-basis.json emits claims from a bounded, versioned vocabulary for this schema (examples: bundle_verified, delegation_context_visible, authorization_context_visible, containment_degradation_observed, …). Claim id values are stable across runs, but consumers must not rely on row count or ordering; always key by id. It is not a scalar trust score. The Trust Card is a deterministic render of the same claim rows plus frozen non-goals. Contract versions, pack floors, and release checklist: docs/architecture/MIGRATION-TRUST-COMPILER-3.2.md.

What you get

Output Role
Policy gate MCP wrap — deterministic allow/deny before tools run (see CLI note above the diagram).
Evidence bundle Offline-verifiable, tamper-evident archive for audit and replay.
Trust Basis Canonical trust-basis.json — bounded claim classification from verified bundles.
Trust Card trustcard.json / trustcard.md — same claims, review-friendly artifact.
SARIF / CI GitHub Action, Security tab integration, policy gates on PRs.

Evidence levels (trust vocabulary)

Trust claims use explicit epistemology, not a single “safety score”:

Level Meaning
verified Backed by direct evidence or offline verification in the bundle/path
self_reported Emitted by the system without stronger independent corroboration
inferred Derived from bounded, documented rules
absent No trustworthy evidence supports the claim

Assay does not ship a primary aggregate trust score or a safe/unsafe badge as the main output. See ADR-033.

Is This For Me?

Yes, if you:

  • Build with Claude Desktop, Cursor, Windsurf, or any MCP client
  • Ship agents that call tools and you need to control which ones
  • Want a CI gate that catches tool-call regressions before production
  • Need bounded auditability and trust artifacts, not only sampled observability

Not yet, if you:

  • Don't use MCP (Assay is MCP-native; other protocols use adapters)
  • Need a hosted dashboard (Assay is CLI-first and offline)
  • Want a magic trust score or badge as the main output

Add to Cursor in 30 Seconds

Assay ships a helper that finds your local Cursor MCP config path and prints a ready-to-paste entry:

assay mcp config-path cursor

It generates JSON like:

{
  "filesystem-secure": {
    "command": "assay",
    "args": [
      "mcp",
      "wrap",
      "--policy",
      "/path/to/policy.yaml",
      "--",
      "npx",
      "-y",
      "@modelcontextprotocol/server-filesystem",
      "/Users/you"
    ]
  }
}

The same wrapped command works in other MCP clients — see MCP Quick Start.

Policy Is Simple

version: "2.0"
name: "my-policy"

tools:
  allow: ["read_file", "list_dir"]
  deny: ["exec", "shell", "write_file"]

schemas:
  read_file:
    type: object
    additionalProperties: false
    properties:
      path:
        type: string
        pattern: "^/app/.*"
        minLength: 1
    required: ["path"]

Legacy constraints: policies still work. Use assay policy migrate for the v2 JSON Schema form, or assay init --from-trace trace.jsonl to generate from observed behavior.

See Policy Files.

OpenTelemetry In, Canonical Evidence Out

Assay ingests OpenTelemetry JSONL, builds replayable traces, and exports canonical evidence — OTel is a bridge, not the sole semantic authority.

assay trace ingest-otel \
  --input otel-export.jsonl \
  --db .eval/eval.db \
  --out-trace traces/otel.v2.jsonl

See OpenTelemetry & Langfuse.

Add to CI

# .github/workflows/assay.yml
name: Assay Gate
on: [push, pull_request]
permissions:
  contents: read
  security-events: write
jobs:
  assay:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: Rul1an/assay-action@v2

PRs that violate policy get blocked; SARIF can surface in the Security tab.

Why Assay (trust compiler)

Canonical evidence Assay’s evidence model is the stable contract; OTel and adapters map into it.
Deterministic Same input, same decision — not probabilistic.
Portable artifacts Bundles, Trust Basis, Trust Card, SARIF — for CI, review, audit.
Bounded claims Explicit about what is verified vs visible vs absent — no score-first UX.
MCP-native wedge assay mcp wrap is the fast path (the mcp group is hidden from assay --help; use assay mcp --help). Adapters extend the same engine.
Offline-first No backend required for core enforcement and bundle verification.

Beyond MCP: Protocol Adapters

Assay ships adapters that map protocol events into canonical evidence (same policy and evidence story, different transports):

Protocol Adapter What it maps
ACP (OpenAI/Stripe) assay-adapter-acp Checkout events, payment intents, tool calls
A2A (Google) assay-adapter-a2a Agent capabilities, task delegation, artifacts
UCP (Google/Shopify) assay-adapter-ucp Discover/buy/post-purchase state transitions

Adapter crates are workspace / binary–driven (not published as separate crates.io packages); consume them via this repo or released assay builds.

Governance stays protocol-agnostic; the evidence and claim layer stays the same as protocols evolve.

Measured Latency

On the M1 Pro/macOS fragmented-IPI harness, protected tool-decision path:

  • Main protection run: 0.771ms p50 / 1.913ms p95
  • Fast-path scenario: 0.345ms p50 / 1.145ms p95

These are tool-decision timings, not end-to-end model latency. (See Research & experiments for methodology context.)

Install

cargo install assay-cli

CI: GitHub Action. Python SDK: pip install assay-it

Learn More

Research, mappings & experiments

Bounded context: numbers below support mapping and experiments, not a product “security score.”

  • OWASP MCP Top 10 Mapping — how Assay relates to each risk category (coverage is not a scalar guarantee).
  • Third-party survey: popular MCP servers often show weak defaults — Assay adds policy + evidence; see discussion in the mapping doc.
  • Security experiments — attack vectors and harness notes (methodology matters more than headline counts).

Contributing

cargo test --workspace
cargo clippy --workspace --all-targets -- -D warnings

See CONTRIBUTING.md. Discussions: GitHub Discussions — seed topics for pinned threads live in docs/community/DISCUSSIONS.md.

License

MIT

About

Trust compiler for agent systems Deterministic MCP policy enforcement, verifiable evidence bundles, and reviewable trust artifacts.

Topics

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Contributors