fix: preserve unpersisted volatile live input in LCM assembly by jetd1 · Pull Request #688 · Martian-Engineering/lossless-claw

jetd1 · 2026-05-15T12:06:58Z

Problem

When contextEngine.assemble() rebuilds context from the DB context_items table, messages that were never persisted to DB are absent from the assembled output. This includes:

Inter-session subagent announcements (suppressPromptPersistence=true for inputProvenance.kind === "inter_session")
Retry/overflow retry prompts (suppressPromptPersistenceOnRetry)
Other internal runtime events that bypass DB persistence

The DB frontier lags behind params.messages, so the model never sees the current turn's volatile event — e.g., a subagent completion announcement that should trigger a response.

Root Cause

LCM's assembler rebuilds from DB only. OpenClaw's shouldSuppressAgentPromptPersistence() marks certain messages as non-persistable, so they never enter context_items. The assembled output has no way to include them.

Fix

After DB assembly, detect volatile live input in params.messages that is not represented in the assembled output and append it within the token budget.
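
The reconciliation step can be pictured with a minimal sketch. It assumes a simplified message shape and leaves out the token budget. `Msg`, `signature`, and `appendUncovered` are hypothetical names for illustration, and `normalizeToolName` is only a stand-in for the PR's normalizeToolNameForCoverage(); the real logic in src/engine.ts is more involved.

```typescript
// Hypothetical, simplified message shape (an assumption, not the real LCM type).
type Msg = {
  role: "user" | "assistant" | "system" | "toolResult";
  content: string;
  toolName?: string | null;
};

// Stand-in for tool-name normalization: null, undefined, "" and "unknown"
// all collapse to the same value so they compare equal in signatures.
function normalizeToolName(name: string | null | undefined): string {
  return !name || name === "unknown" ? "unknown" : name;
}

// Exact-match coverage signature for one message.
function signature(m: Msg): string {
  return JSON.stringify([m.role, normalizeToolName(m.toolName), m.content]);
}

// Append every volatile live message that has no exact counterpart in the
// assembled output. Matching is per occurrence: one assembled copy covers
// only one live copy.
function appendUncovered(assembled: Msg[], volatileLive: Msg[]): Msg[] {
  const available = new Map<string, number>();
  for (const m of assembled) {
    const s = signature(m);
    available.set(s, (available.get(s) ?? 0) + 1);
  }
  const out = [...assembled];
  for (const v of volatileLive) {
    const s = signature(v);
    const remaining = available.get(s) ?? 0;
    if (remaining > 0) {
      available.set(s, remaining - 1); // covered by an exact assembled match
    } else {
      out.push(v); // not represented in DB assembly, so append it
    }
  }
  return out;
}
```

In this sketch a live message that repeats an assembled one is covered once; a second identical live occurrence is still appended.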

Key design choices

Volatile inputs are never covered by summary substring matches. A summary containing similar text captures a past turn; the current volatile event is a distinct occurrence the model must see explicitly. Only exact assembled-message matches count as coverage for volatile inputs.
Occurrence-level bipartite maximum matching (DFS augmenting-path) for deduplication. Two-layer slot allocation: exact-match slots first, then normalized-overlap generic slots.
Live input takes priority. If appending volatile input exceeds the token budget, evict from the front of assembled DB context. Protected fresh-tail messages and exact live anchors are never evicted.
Tool-result/tool-call pair integrity. expandProtectedToolPairIndexes() ensures protected toolResults and their paired assistant tool-call messages are both preserved during budget trimming.
Tool name normalization. normalizeToolNameForCoverage() treats null/undefined/""/"unknown" as equivalent in coverage signatures, fixing anchor mismatches when the assembler fills missing tool names with "unknown".
System volatile normalization. System-role volatile messages are projected to user representation before coverage/anchor/append logic.
Fresh-tail hash protection. Assembler provides per-message hashes for fresh-tail messages; engine protects these alongside exact live anchors during volatile trimming.
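
As an illustration of the occurrence-level matching mentioned above, here is a generic DFS augmenting-path (Kuhn-style) bipartite matcher. It is a sketch under assumptions: `maxMatching` and the `adjacency` input are hypothetical, it models a single layer of slots, and it is not the code from src/engine.ts.

```typescript
// adjacency[i] lists the assembled-slot indexes that live message i may claim.
// Returns, for each live message, the slot it was matched to, or -1 if none.
function maxMatching(adjacency: number[][], slotCount: number): number[] {
  const slotOwner: number[] = new Array(slotCount).fill(-1);

  // Try to give `live` a slot, re-homing current owners along an augmenting path.
  const tryAssign = (live: number, seen: boolean[]): boolean => {
    for (const slot of adjacency[live]) {
      if (seen[slot]) continue;
      seen[slot] = true;
      if (slotOwner[slot] === -1 || tryAssign(slotOwner[slot], seen)) {
        slotOwner[slot] = live;
        return true;
      }
    }
    return false;
  };

  for (let live = 0; live < adjacency.length; live++) {
    tryAssign(live, new Array(slotCount).fill(false));
  }

  const matchOfLive: number[] = new Array(adjacency.length).fill(-1);
  slotOwner.forEach((live, slot) => {
    if (live !== -1) matchOfLive[live] = slot;
  });
  return matchOfLive;
}
```

A greedy pass would give live message 0 the first slot in `[[0, 1], [0]]` and leave message 1 unmatched; the augmenting path moves message 0 to slot 1 so both are covered. Any live message left at -1 is uncovered and becomes a candidate for appending.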

Files Changed

src/engine.ts — core reconciliation logic (appendUncoveredVolatileLiveInputsWithinBudget, coverage matching, budget eviction)
src/assembler.ts — fresh-tail message hash export for protection
test/engine.test.ts — 931 tests (278 engine-specific), including regression tests for volatile coverage, budget eviction, tool-pair integrity, and tool-name normalization
.changeset/assemble-live-tail.md — patch bump
.gitignore — added node_modules

Related

openclaw/openclaw#80760: Codex context-engine projection caps LCM output to 24k chars, hiding full-fit context (Codex projection 24K cap)
openclaw/openclaw#81164: feat(context-engine): add interceptCompaction contract for context-engine plugins (interceptCompaction proposal)
openclaw/openclaw#81176: feat(agents): add context-window-relative compaction budget shares (relative budget shares)

This is Track 1 (the stop-the-bleeding PR) from the three-track fix plan; Track 2 (OpenClaw RFC for the assemble() contract) and Track 3 (upstream PR for live-input fields) are separate.

jetd1 · 2026-05-15T12:15:01Z

Upstream RFC for the proper contract fix: openclaw/openclaw#82137

To frame what this PR is and isn't:

This is a Track 1 hot fix. The current LCM assemble() has no structured signal about which entries in params.messages are durable user/assistant turns vs runtime-injected volatile live input — synthesized subagent announces, inter-session bridge messages, internal-runtime-context blocks. So we detect them by string-matching the formatted prompt bodies ([Inter-session message], <<<BEGIN_OPENCLAW_INTERNAL_CONTEXT>>>, [Internal task completion event]) — i.e. inverse text-matching against the same OpenClaw code paths that emitted those messages.
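
A minimal sketch of what such marker sniffing could look like. The three marker strings are the ones quoted above; `detectVolatileKind`, the `VolatileKind` labels, and the choice of prefix versus substring matching are assumptions made for illustration.

```typescript
// Labels are hypothetical; they only name which marker matched.
type VolatileKind = "inter_session" | "internal_context" | "task_completion";

const INTER_SESSION_PREFIX = "[Inter-session message]";
const INTERNAL_CONTEXT_BEGIN = "<<<BEGIN_OPENCLAW_INTERNAL_CONTEXT>>>";
const TASK_COMPLETION_MARKER = "[Internal task completion event]";

// Classify a formatted prompt body by the marker text it carries.
// Returns null for ordinary durable turns.
function detectVolatileKind(text: string): VolatileKind | null {
  const body = text.trimStart();
  if (body.startsWith(INTER_SESSION_PREFIX)) return "inter_session";
  if (body.includes(INTERNAL_CONTEXT_BEGIN)) return "internal_context";
  if (body.includes(TASK_COMPLETION_MARKER)) return "task_completion";
  return null;
}
```

The weakness is visible in the sketch itself: if OpenClaw renames any of these strings, detection silently returns null.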

That's a real production fix and we should ship it, but the underlying issue is a contract gap, not an LCM bug. OpenClaw already has the structured metadata internally:

InputProvenance (src/sessions/input-provenance.ts): external_user / inter_session / internal_system, with a sourceTool discriminator (e.g. subagent_announce).
AgentInternalEvent (src/agents/internal-event-contract.ts, internal-events.ts) — AGENT_INTERNAL_EVENT_TYPE_TASK_COMPLETION etc., already used by OpenClaw's own shouldSuppressAgentPromptPersistence().
prePromptMessageCount — already passed to afterTurn, not to assemble.

The RFC proposes plumbing these three (existing) signals through to ContextEngine.assemble() as optional, non-breaking params. Once it lands, the engine can replace the marker-string sniff in this PR with structured detection:

// Subagent / task-completion notification — currently we detect via
// "[Internal task completion event]" text match.
if (params.internalEvents?.[i]?.some(e =>
  e.type === AGENT_INTERNAL_EVENT_TYPE_TASK_COMPLETION
)) {
  // structured, robust against marker renames
}

Same for inter_session bridge detection (currently [Inter-session message] prefix match → params.inputProvenance[i].kind === "inter_session") and for the <<<BEGIN_OPENCLAW_INTERNAL_CONTEXT>>> block (currently delimiter sniff → structured marker or provenance).

When the RFC lands and a version of OpenClaw ships the signals, we'll feature-detect and migrate. The marker constants in this PR stay as a fallback until the floor OpenClaw version in the wild has the new fields, then we delete them.
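
One way the feature-detect-then-fallback step might look, as a sketch only. The `internalEvents` shape follows the snippet above; `AssembleParams`, `isTaskCompletion`, and the string value given to the event-type constant are assumptions, since the RFC has not landed.

```typescript
// The real constant lives in OpenClaw; this value is a placeholder assumption.
const AGENT_INTERNAL_EVENT_TYPE_TASK_COMPLETION = "task_completion";
const TASK_COMPLETION_MARKER = "[Internal task completion event]";

type InternalEvent = { type: string };

// Hypothetical subset of the assemble() params proposed in the RFC.
type AssembleParams = {
  messages: { content: string }[];
  internalEvents?: (InternalEvent[] | undefined)[];
};

function isTaskCompletion(params: AssembleParams, i: number): boolean {
  if (params.internalEvents !== undefined) {
    // Host provides structured signals: trust them and ignore marker text.
    return (
      params.internalEvents[i]?.some(
        (e) => e.type === AGENT_INTERNAL_EVENT_TYPE_TASK_COMPLETION,
      ) ?? false
    );
  }
  // Older host without the new fields: fall back to the marker string.
  return params.messages[i]?.content.includes(TASK_COMPLETION_MARKER) ?? false;
}
```

Keying the branch on the presence of the field, rather than on a version number, would let the same engine build run against both old and new OpenClaw hosts.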

Tracking that migration as Track 3 in the three-track plan referenced in the PR description.

When OpenClaw's LCM context engine assembles context from the durable DB frontier, messages that were never persisted (inter-session subagent announcements, retry/overflow prompts with suppressPromptPersistence) are absent from the DB. The assembled output therefore omits them entirely, causing the model to miss live input events like subagent completions.

This patch adds a reconciliation step after DB assembly: detect volatile live input in params.messages that is not represented in the assembled output, and append it within the token budget.

Key design choices:

- Volatile live inputs are never covered by summary substring matches. A summary containing similar text captures a *past* turn; the current volatile event is a distinct occurrence the model must see explicitly.
- Only exact assembled-message matches count as coverage for volatile inputs. This prevents stale summaries from consuming live events.
- Occurrence-level bipartite maximum matching (DFS augmenting-path) deduplicates volatile inputs against assembled context.
- Live input takes priority: if appending volatile input exceeds the token budget, evict from the front of assembled DB context first.
- Tool-result/tool-call pairs are kept intact during budget trimming; protected fresh-tail messages and exact live anchors are never evicted.
- Tool names that are null/undefined/""/"unknown" are normalized to equivalent in coverage signatures, fixing anchor mismatches when the assembler fills missing tool names with "unknown".

Fixes: subagent completion announcements lost from model context
Related: openclaw/openclaw#80760, openclaw/openclaw#81164
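
The eviction and tool-pair rules in this commit message can be sketched as follows. This is an illustration under assumptions: `Item`, `callId`, `expandProtectedToolPairs`, and `evictFromFront` are hypothetical, token counts are taken as given, and the real expandProtectedToolPairIndexes() works on indexes rather than flags.

```typescript
// Hypothetical item shape: `callId` links a toolResult to its assistant tool call.
type Item = {
  role: "user" | "assistant" | "toolResult";
  tokens: number;
  protected: boolean;
  callId?: string;
};

// If a toolResult is protected, protect the assistant message that issued the call too.
function expandProtectedToolPairs(items: Item[]): Item[] {
  const protectedCalls = new Set(
    items
      .filter((m) => m.protected && m.role === "toolResult" && m.callId !== undefined)
      .map((m) => m.callId),
  );
  return items.map((m) =>
    m.role === "assistant" && m.callId !== undefined && protectedCalls.has(m.callId)
      ? { ...m, protected: true }
      : m,
  );
}

// Make room for volatile input by dropping the oldest unprotected items first.
function evictFromFront(assembled: Item[], volatileTokens: number, budget: number): Item[] {
  let total = assembled.reduce((sum, m) => sum + m.tokens, 0) + volatileTokens;
  const kept: Item[] = [];
  for (const item of assembled) {
    if (total > budget && !item.protected) {
      total -= item.tokens; // evicted
    } else {
      kept.push(item); // protected, or the budget is already satisfied
    }
  }
  return kept;
}
```

If everything remaining is protected, the sketch simply stops evicting and may stay over budget; how the real engine handles that case is not stated in the PR text.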

jalehman · 2026-05-18T17:04:40Z

Thank you!

jetd1 mentioned this pull request May 15, 2026

RFC: Expose existing InputProvenance / AgentInternalEvent signals to ContextEngine.assemble() openclaw/openclaw#82137

Open

jetd1 force-pushed the fix/assemble-live-tail-clean branch 2 times, most recently from 45f76a5 to 5e8abf7 Compare May 17, 2026 17:31

jetd1 force-pushed the fix/assemble-live-tail-clean branch from 5e8abf7 to 5965151 Compare May 17, 2026 17:41

fix: cover retry prompts and paired tool eviction

2f38067

jalehman merged commit d1bef05 into Martian-Engineering:main May 18, 2026
2 checks passed

github-actions Bot mentioned this pull request May 18, 2026

chore: version packages #703

Merged

jalehman merged 2 commits into Martian-Engineering:main from jetd1:fix/assemble-live-tail-clean
