Version: 1.3 | Status: Standards Track | License: CC0 1.0 Universal (Public Domain)
Backed by areev.ai.
OMS is an open standard for portable, auditable, and interoperable agent memory. It defines three layered specifications — the binary memory format (OMS), the query and assembly language (CAL), and the LLM context markup (SML) — covering the complete lifecycle from storing a memory grain to delivering it to an AI agent's context window.
| Layer | Spec | Role |
|---|---|---|
| Storage | OMS — Open Memory Specification | Binary .mg container: how grains are encoded, hashed, signed, and stored |
| Query | CAL — Context Assembly Language | Non-destructive language for recalling and assembling agent context |
| Output | SML — Semantic Markup Language | Tag-based LLM context format produced by CAL ASSEMBLE |
OMS defines the Memory Grain (.mg) container — a binary format for immutable, content-addressed knowledge units called grains. A memory grain is the atomic unit of agent knowledge: a single immutable belief, event, observation, or decision record, identified by the SHA-256 hash of its canonical binary representation.
The .mg container aims to be to agent memory what JSON is to APIs, or what .git objects are to version control: a universal, language-agnostic, self-describing interchange format.
| Property | Description |
|---|---|
| Deterministic serialization | Identical content always produces identical bytes |
| Content addressing | SHA-256 hash for integrity, deduplication, and identity |
| Compact binary encoding | MessagePack (default) or CBOR (optional) |
| Cryptographic verification | COSE Sign1 envelopes (optional) |
| Field-level privacy | Selective disclosure without exposing full grain |
| Compliance primitives | GDPR, CCPA, HIPAA support baked in |
| Multi-modal references | Links to external images, video, audio, and embeddings |
| Decentralized identity | W3C DIDs — no certificate authority required |
| Grain protection | Invalidation policies restricting supersession rights |
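The first two properties work together: if serialization is deterministic, the SHA-256 digest of the bytes can act as the grain's identity. The sketch below illustrates the idea only. It uses sorted-key JSON as a stand-in for the canonical MessagePack encoding that the specification defines, and the field values are made up for the example.

```python
import hashlib
import json


def canonical_bytes(grain: dict) -> bytes:
    """Serialize a grain deterministically.

    Stand-in only: sorted-key compact JSON replaces the canonical
    MessagePack encoding defined by the OMS specification.
    """
    return json.dumps(
        grain, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def content_address(grain: dict) -> str:
    """Return the SHA-256 hex digest of the grain's canonical bytes."""
    return hashlib.sha256(canonical_bytes(grain)).hexdigest()


# Two logically identical grains built with different key order.
a = {"subject": "customer:priya", "relation": "billing_cycle", "object": "annual"}
b = {"object": "annual", "relation": "billing_cycle", "subject": "customer:priya"}

# Storing both under their content address keeps a single copy (deduplication).
store: dict[str, bytes] = {}
for grain in (a, b):
    store[content_address(grain)] = canonical_bytes(grain)

# Integrity check: re-hash the stored bytes and compare with the key.
address, blob = next(iter(store.items()))
verified = hashlib.sha256(blob).hexdigest() == address
```

Because the address is derived from the bytes, a reader can detect tampering without trusting the store that served the grain.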
| Type | Byte | Description |
|---|---|---|
| Belief | 0x01 | Declarative knowledge: subject–relation–object triple |
| Event | 0x02 | Timestamped occurrence: message, interaction, or utterance |
| State | 0x03 | Agent state snapshot at a point in time |
| Workflow | 0x04 | Multi-step procedural record |
| Action | 0x05 | Tool invocation or code execution |
| Observation | 0x06 | Sensor or cognitive input |
| Goal | 0x07 | Intent, objective, or desired outcome |
| Reasoning | 0x08 | Inference chain and thought audit trail |
| Consensus | 0x09 | Multi-agent agreement record |
| Consent | 0x0A | DID-scoped permission grant or withdrawal |
| — | 0xF0–0xFF | Application-defined domain profile types |
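The type bytes map naturally onto an enumeration. The following is a minimal sketch, not part of the specification: the names `GrainType` and `classify_type_byte` are hypothetical, and returning "unknown" for unassigned bytes is an assumption, since this README does not say how a reader should treat them.

```python
from enum import IntEnum


class GrainType(IntEnum):
    """Standard grain type bytes as listed in the table above."""
    BELIEF = 0x01
    EVENT = 0x02
    STATE = 0x03
    WORKFLOW = 0x04
    ACTION = 0x05
    OBSERVATION = 0x06
    GOAL = 0x07
    REASONING = 0x08
    CONSENSUS = 0x09
    CONSENT = 0x0A


def classify_type_byte(value: int) -> str:
    """Name a type byte: a standard type, a domain profile type, or unknown."""
    if 0xF0 <= value <= 0xFF:
        return "domain-profile"
    try:
        return GrainType(value).name.lower()
    except ValueError:
        # Assumption: bytes outside both ranges are reported, not rejected.
        return "unknown"
```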
```
0       1       2       3   4   5       6       7       8       9       10 ...
+-------+-------+-------+---+---+-------+-------+-------+-------+-------+---
|  Ver  | Flags | Type  |NS hash|       created_at (u32)        | MsgPack
| 0x01  | uint8 | uint8 |uint16 |        (epoch seconds)        | payload
+-------+-------+-------+---+---+-------+-------+-------+-------+-------+---
                     Fixed header (9 bytes)                       Variable
```
The fixed 9-byte header enables O(1) field extraction without deserializing the payload — type, namespace, and timestamp are always at known byte offsets.
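As an illustration of reading those fixed offsets, here is a small sketch using Python's `struct` module. The field widths come from the layout above; the big-endian byte order is an assumption, because this README does not state it, and the names `GrainHeader` and `read_header` are hypothetical.

```python
import struct
from typing import NamedTuple

# Assumption: multi-byte fields are big-endian; check SPECIFICATION.md.
# Fields: version (u8), flags (u8), type (u8), ns_hash (u16), created_at (u32).
HEADER = struct.Struct(">BBBHI")


class GrainHeader(NamedTuple):
    version: int
    flags: int
    grain_type: int
    ns_hash: int
    created_at: int


def read_header(blob: bytes) -> GrainHeader:
    """Unpack the fixed header without touching the payload that follows it."""
    if len(blob) < HEADER.size:
        raise ValueError("blob shorter than the 9-byte fixed header")
    return GrainHeader(*HEADER.unpack_from(blob, 0))


# Build a sample blob: header plus an empty MessagePack map (0x80) as payload.
sample = HEADER.pack(0x01, 0x00, 0x01, 0xBEEF, 1_700_000_000) + b"\x80"
header = read_header(sample)
```

An index can therefore filter by type, namespace hash, or timestamp by reading nine bytes per grain, leaving payload decoding for the grains that match.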
- References, not blobs — Multi-modal content is referenced by URI, never embedded
- Additive evolution — New fields never break old parsers
- Minimal required fields — Only essential fields per memory type
- Semantic triples — Subject–relation–object model for knowledge graph compatibility
- Compliance by design — Provenance and identity in every grain
- No AI in the format — Deterministic serialization; LLMs belong in the engine layer
- Index without deserialize — Fixed headers enable O(1) field extraction
- Sign without PKI — DIDs enable verification without certificate authorities
- Share without exposure — Selective disclosure for privacy-preserving interchange
- One file, full memory — A `.mg` container is a portable, complete knowledge export
CAL is a non-destructive, deterministic, LLM-native language for assembling agent context from OMS memory stores. It answers a single question: what should be in the agent's context window right now?
CAL's core safety guarantee — it cannot destroy data — is enforced at the grammar level, not by convention. The lexer has no DELETE, DROP, FORGET, or ERASE tokens. Every write creates a new grain; old grains survive forever.
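To make the grammar-level guarantee concrete, the toy checker below accepts only the statement keywords listed in this README and rejects everything else, so a destructive verb fails before any execution happens. It is a sketch under assumptions: the real grammar is the EBNF in the CAL specification, and the required `CAL/1` prefix and case-sensitive keywords are inferred from the examples here.

```python
import re

# Statement keywords listed in this README; the full grammar is in the CAL spec.
STATEMENT_KEYWORDS = frozenset({
    "RECALL", "ASSEMBLE", "EXISTS", "HISTORY", "EXPLAIN", "BATCH",
    "ADD", "SUPERSEDE", "REVERT",
})

# Assumption: every statement starts with the "CAL/1" version prefix.
_PREFIX = re.compile(r"\s*CAL/1\s+([A-Za-z_]+)")


def statement_keyword(query: str) -> str:
    """Return the leading statement keyword, or raise if the grammar lacks it."""
    match = _PREFIX.match(query)
    if match is None:
        raise SyntaxError("expected 'CAL/1 <statement>'")
    word = match.group(1)
    if word not in STATEMENT_KEYWORDS:
        # DELETE, DROP, FORGET and ERASE land here: they are not tokens at all.
        raise SyntaxError(f"unknown statement {word!r}")
    return word
```

Since the rejection happens while reading the query text, no permission setting or runtime flag can turn a destructive statement into a valid one.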
| Statement | Tier | What it does |
|---|---|---|
| `RECALL` | Read | Retrieve grains matching filters |
| `ASSEMBLE` | Read | Compose a context block from multiple RECALL sources with a token budget |
| `EXISTS` | Read | Check whether a grain with a given hash is present |
| `HISTORY` | Read | Retrieve the supersession chain for a grain |
| `EXPLAIN` | Read | Return the execution plan without running the query |
| `BATCH` | Read | Run multiple queries in a single round trip |
| `ADD` | Evolve | Write a new grain (append-only) |
| `SUPERSEDE` | Evolve | Replace a grain's logical content; original grain survives |
| `REVERT` | Evolve | Undo a supersession; three grains exist afterward: original, supersession, revert |
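The Evolve-tier semantics can be pictured with a toy in-memory store in which every operation only adds grains. This is a sketch, not the store protocol: the `supersedes` and `reverts` link fields are hypothetical, and sorted-key JSON stands in for the canonical binary encoding.

```python
import hashlib
import json


class AppendOnlyStore:
    """Toy store: grains are only ever added, never changed or removed."""

    def __init__(self) -> None:
        self._grains: dict[str, dict] = {}

    def add(self, grain: dict) -> str:
        # Stand-in canonical form (sorted-key JSON), not the OMS encoding.
        blob = json.dumps(grain, sort_keys=True, separators=(",", ":")).encode()
        address = hashlib.sha256(blob).hexdigest()
        self._grains.setdefault(address, grain)
        return address

    def supersede(self, old: str, new_fields: dict) -> str:
        if old not in self._grains:
            raise KeyError(old)
        # "supersedes" is a hypothetical link field for this sketch.
        return self.add({**new_fields, "supersedes": old})

    def revert(self, superseding: str) -> str:
        target = self._grains[superseding]["supersedes"]
        original = self._grains[target]
        # The revert is a third grain that restores the earlier content.
        return self.add({**original, "supersedes": superseding, "reverts": True})

    def history(self, address: str) -> list[str]:
        """Walk the supersession chain from newest to oldest."""
        chain = [address]
        while "supersedes" in self._grains[chain[-1]]:
            chain.append(self._grains[chain[-1]]["supersedes"])
        return chain

    def __len__(self) -> int:
        return len(self._grains)


store = AppendOnlyStore()
v1 = store.add({"subject": "customer:priya", "object": "annual"})
v2 = store.supersede(v1, {"subject": "customer:priya", "object": "monthly"})
v3 = store.revert(v2)
```

After the revert the store holds three grains, and `store.history(v3)` walks back through the supersession to the original.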
A support agent handles an inbound ticket: "My invoice shows a charge I don't recognise." Before the LLM generates a reply it needs to know who the customer is, their account history, prior tickets, relevant policies, and what tools have already run this session. CAL assembles all of that in one statement.
```
CAL/1 ASSEMBLE support_context
  FOR "resolving billing dispute for customer:priya"
  FROM
    profile: (RECALL beliefs ABOUT "customer:priya"
              WHERE relation IS KNOWLEDGE
              LIMIT 10),
    history: (RECALL events
              WHERE user_id = "customer:priya"
              SINCE "last 90 days"
              LIMIT 20),
    tickets: (RECALL workflows
              WHERE subject = "customer:priya"
              AND tags INCLUDE ["support:billing"]
              RECENT 5),
    policy:  (RECALL beliefs
              WHERE tags INCLUDE ["policy:billing"]
              LIMIT 5),
    session: (RECALL actions
              WHERE session_id = "sess-20260303-priya"
              LIMIT 10)
  BUDGET 4000 tokens
  PRIORITY profile > history > tickets > policy > session
  FORMAT sml
```

The executor runs all five RECALL queries in parallel, applies the token budget and priority ordering, then emits SML (see below).
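One way to picture the budget and priority step is a greedy pass over the sources in priority order. The sketch below is an assumption about how such a step could work, not the executor's documented algorithm: `estimate_tokens` is a crude word count standing in for a real tokenizer, and `pack_by_priority` is a hypothetical name.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def pack_by_priority(
    sources: dict[str, list[str]], priority: list[str], budget: int
) -> list[tuple[str, str]]:
    """Keep grains from the highest-priority source first until the budget is spent.

    A grain that does not fit is skipped, and smaller grains after it are
    still considered.
    """
    kept: list[tuple[str, str]] = []
    remaining = budget
    for name in priority:
        for text in sources.get(name, []):
            cost = estimate_tokens(text)
            if cost <= remaining:
                kept.append((name, text))
                remaining -= cost
    return kept


sources = {
    "profile": ["account tier is Professional", "annual billing cycle"],
    "history": ["customer asked about monthly billing five days ago"],
    "policy": ["refund allowed"],
}
packed = pack_by_priority(sources, ["profile", "history", "policy"], budget=10)
```

With a budget of 10 the two profile grains use 7 tokens, the 8-token history grain is skipped, and the 2-token policy grain still fits.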
The agent can then persist its reasoning as a new grain with ADD:

```
CAL/1 ADD reasoning
  SET subject = "customer:priya"
  SET relation = "dispute_analysis"
  SET object = "charge-2026-02-28"
  SET content = "Charge matches annual plan renewal on 2026-02-28. Customer last contacted re: plan in Jan. Likely unrecognised due to annual cycle."
  SET confidence = 0.91
  SET tags = ["support:billing", "resolution:explain"]
  REASON "agent inferred cause from renewal date and contact history"
```

A RECALL can also run on its own, here listing open billing workflows as markdown:

```
CAL/1 RECALL workflows
  WHERE tags INCLUDE ["support:billing"]
  AND goal_state = "open"
  ORDER BY time DESC
  LIMIT 50
  FORMAT markdown
```

SML is the output format produced by a CAL `ASSEMBLE` statement with `FORMAT sml`. It is a flat, tag-based markup format designed for direct LLM consumption. Tag names are OMS grain types (`<belief>`, `<event>`, `<reasoning>`, …). The tag tells the LLM the epistemic status of the content; the attributes carry decision metadata; the element text is natural-language prose.
SML is not XML. It requires no parser, no schema, no escape sequences. An LLM reads it the same way a person reads a well-structured document.
- Tag names are grain types. `<belief>`, `<goal>`, `<event>`, `<action>`, `<observation>`, `<reasoning>`, `<state>`, `<workflow>`, `<consensus>`, `<consent>`, and no others.
- Flat only. No nesting beyond the `<context>` envelope.
- No storage internals. No hashes, namespaces, or OMS metadata in the output.
- Natural language content. Element text is prose, not decomposed triples.
- One envelope. `<context intent="…">` is the sole container element.
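The rules above are simple enough that an emitter fits in a few lines. The following is a minimal sketch rather than a reference implementation: the input shape (dictionaries with `type`, `attrs`, and `text` keys) and the function name `render_sml` are hypothetical, and no escaping is applied because SML is described as needing no escape sequences.

```python
GRAIN_TAGS = frozenset({
    "belief", "goal", "event", "action", "observation",
    "reasoning", "state", "workflow", "consensus", "consent",
})


def render_sml(intent: str, grains: list[dict]) -> str:
    """Render grains as a flat SML block inside a single <context> envelope."""
    lines = [f'<context intent="{intent}">']
    for grain in grains:
        tag = grain["type"]
        if tag not in GRAIN_TAGS:
            # Only grain types may be used as tag names.
            raise ValueError(f"not a grain type: {tag}")
        attrs = "".join(
            f' {key}="{value}"' for key, value in grain.get("attrs", {}).items()
        )
        # One element per grain, never nested: the text is plain prose.
        lines.append(f"<{tag}{attrs}>{grain['text']}</{tag}>")
    lines.append("</context>")
    return "\n".join(lines)


example = render_sml(
    "demo",
    [{
        "type": "belief",
        "attrs": {"subject": "customer:priya", "confidence": "0.97"},
        "text": "account tier is Professional",
    }],
)
```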
This is the SML block injected into the LLM system prompt for the billing dispute above:
```
<context intent="resolving billing dispute for customer:priya">
<belief subject="customer:priya" confidence="0.97">account tier is Professional, annual billing cycle</belief>
<belief subject="customer:priya" confidence="0.93">primary contact email is priya@example.com</belief>
<belief subject="customer:priya" confidence="0.89">enrolled in auto-renewal since 2024-03-01</belief>
<event role="user" time="2m ago">My invoice shows a charge I don't recognise — $299 on 28 Feb.</event>
<event role="agent" time="2m ago">Looking into that now, Priya. Retrieving your billing history.</event>
<event role="user" time="5d ago">Can I switch to monthly billing?</event>
<event role="agent" time="5d ago">Monthly billing is available — I've sent a link to make that change.</event>
<workflow trigger="billing_dispute_opened" state="open">1. verify charge 2. check renewal date 3. explain or escalate 4. offer billing-cycle change 5. close ticket</workflow>
<action tool="get_invoice" phase="completed">retrieved invoice INV-2026-02-28: $299 annual Professional plan renewal</action>
<action tool="get_plan_history" phase="completed">plan enrolled 2024-03-01, renewed annually; last renewal 2026-02-28</action>
<observation observer="billing-system">renewal processed automatically on 2026-02-28 at 00:01 UTC; no failed payment</observation>
<observation observer="system">customer last viewed billing page 2026-01-15</observation>
<reasoning type="deductive">charge is the annual plan renewal; customer enrolled in auto-renewal; charge is valid</reasoning>
<reasoning type="abductive">customer may be unaware of annual cycle because last billing interaction was January — explain renewal cadence before offering monthly switch</reasoning>
<belief subject="policy:billing" confidence="1.0">customers may switch billing cycle within 30 days of renewal with pro-rated refund</belief>
<consent action="granted" grantor="customer:priya" grantee="support-agent">access billing records and invoice history for dispute resolution</consent>
</context>
```

The LLM now knows who the customer is, the exact charge, the full support workflow, which tools have already run, the agent's inferred cause, and the applicable refund policy, all tagged by epistemic type and all within the 4000-token budget.
SML metadata density is controlled by disclosure level — the element shape never changes, only the number of attributes:
| Level | Example |
|---|---|
| summary | `<belief subject="customer:priya">enrolled in auto-renewal</belief>` |
| standard | `<belief subject="customer:priya" confidence="0.89">enrolled in auto-renewal since 2024-03-01</belief>` |
| full | `<belief subject="customer:priya" confidence="0.89" source="crm" observed="14d ago">enrolled in auto-renewal since 2024-03-01</belief>` |
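Disclosure levels can be modelled as attribute allow-lists applied to the same element. The sketch below reads the attribute sets off the belief examples in the table above. Treat it as illustrative: other grain types may expose different attributes, `render_belief` is a hypothetical name, and the text is passed through unchanged even though the summary example in the table also shortens it.

```python
# Attribute sets per level, taken from the belief rows in the table above.
LEVEL_ATTRS = {
    "summary": ("subject",),
    "standard": ("subject", "confidence"),
    "full": ("subject", "confidence", "source", "observed"),
}


def render_belief(text: str, meta: dict[str, str], level: str = "standard") -> str:
    """Emit a <belief> element carrying only the attributes the level allows."""
    attrs = "".join(
        f' {name}="{meta[name]}"' for name in LEVEL_ATTRS[level] if name in meta
    )
    return f"<belief{attrs}>{text}</belief>"


meta = {
    "subject": "customer:priya",
    "confidence": "0.89",
    "source": "crm",
    "observed": "14d ago",
}
summary = render_belief("enrolled in auto-renewal", meta, "summary")
full = render_belief("enrolled in auto-renewal since 2024-03-01", meta, "full")
```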
```
┌─────────────────────────────────────────────────────────┐
│  AI Agent / Orchestrator                                │
│  1. Issues a CAL ASSEMBLE query                         │
│  2. Receives SML context block                          │
│  3. Generates response                                  │
│  4. Issues CAL ADD / SUPERSEDE to persist new grains    │
└──────────────────────────┬──────────────────────────────┘
                           │ CAL queries
┌──────────────────────────▼──────────────────────────────┐
│  CAL Executor                                           │
│  • Runs RECALL queries in parallel                      │
│  • Applies token budget + priority ordering             │
│  • Emits SML (or markdown / JSON / TOON)                │
│  • Enforces namespace, policy, and rate limits          │
└──────────────────────────┬──────────────────────────────┘
                           │ OMS store protocol
┌──────────────────────────▼──────────────────────────────┐
│  OMS Memory Store                                       │
│  • .mg containers on disk / S3 / IPFS / database        │
│  • SHA-256 content addressing + hexastore index         │
│  • COSE Sign1 signatures (optional)                     │
│  • Append-only; no grain is ever overwritten            │
└─────────────────────────────────────────────────────────┘
```
| Document | Contents |
|---|---|
| `SPECIFICATION.md` | Full OMS wire format, grain types, signing, selective disclosure, conformance, domain profiles |
| `CONTEXT-ASSEMBLY-LANGUAGE-CAL-SPECIFICATION.md` | CAL grammar (EBNF), all statements, FORMAT system, streaming, policy integration, error codes |
| `SEMANTIC-MARKUP-LANGUAGE-SML-SPECIFICATION.md` | SML format definition, structural rules, comprehensive example, progressive disclosure |
SPECIFICATION.md table of contents:
- Blob Layout and Structure
- Canonical Serialization and Content Addressing
- Field Compaction
- Multi-Modal Content References
- Grain Types and Field Specifications
- Cryptographic Signing
- Selective Disclosure
- File Format (`.mg` files)
- Identity and Authorization
- Sensitivity Classification
- Cross-Links and Provenance
- Temporal Modeling
- Encoding Options
- Conformance Levels
- Error Handling and Security Considerations
- Test Vectors
- Grain Protection and Invalidation Policy
- Observer Type, Observation Mode, and Scope Registries
- Query Conventions
- Store Protocol Convention
- Domain Profile Registry (Healthcare, Legal, Finance, Robotics, Science, Consumer)
| Level | Name | Description |
|---|---|---|
| Level 1 | Minimal Reader | Deserialize, verify SHA-256 content addresses, field compaction |
| Level 2 | Full Implementation | Level 1 + serialization, canonical encoding, store protocol, invalidation policy enforcement |
| Level 3 | Production Store | Level 2 + persistent backend, encryption, per-user keys, hexastore index, audit trail |
In scope:
- Binary serialization format for individual grains
- `.mg` file container format for grain collections
- Deterministic encoding and hashing
- Cryptographic signing and selective disclosure
- Content reference and embedding reference schemas
- Identity and authorization models
- Sensitivity classification
- Cross-link and provenance tracking
- CAL query and assembly language
- SML LLM context output format
Out of scope:
- Storage layer implementation (filesystem, S3, database, IPFS)
- Transport protocols (HTTP, MQTT, Kafka, MCP)
- Encryption at rest
- Agent-to-agent communication protocol
Contributions are welcome. Please read CONTRIBUTING.md before submitting changes.
This specification is released into the public domain under CC0 1.0 Universal. See also the Open Web Foundation Final Specification Agreement (OWFa 1.0).
No copyright — use it freely.