fix(memory-core): prevent staged dream candidates from leaking into MEMORY.md#68774
Conversation
Greptile SummaryThis PR fixes two gaps in the dreaming-contamination guard: Confidence Score: 5/5Safe to merge — both fixes are logically correct, well-scoped, and thoroughly tested with no P0/P1 findings. Changes are minimal and targeted. The fence overlap logic is verified correct by manual trace and confirmed by four dedicated unit tests plus one E2E test. The lead-detector broadening is guarded by the existing five-signal AND gate, preventing false positives. Only a minor P2 observation about linear scan accumulation remains, which is inconsequential at current file sizes. No files require special attention. Prompt To Fix All With AIThis is a comment left during a code review.
Path: extensions/memory-core/src/short-term-promotion.ts
Line: 1470-1484
Comment:
**Linear scan per candidate may accumulate on large files**
The loop iterates from `i = 0` up to `safeEnd` for every candidate rehydration call. For a candidate whose stored line sits near the end of a large daily note, this is a full top-to-bottom scan each time. Today's daily files stay under the 12 KB bootstrap cap so it isn't a real concern, but if the per-call cost ever becomes noticeable (e.g., many candidates, large daily note due to a bug in a different path) a precomputed fence-interval structure or early-exit once `oneIndexed > safeEnd` is confirmed outside all fences would help. Consider caching the parsed fence intervals keyed by file path within a single `applyShortTermPromotions` run.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix(memory-core): correct PromotionCompo..." | Re-trigger Greptile |
|
Thanks for the review. On the linear-scan P2: agreed the concern is valid but I'd like to defer it. The fence scan is O(safeEnd) per candidate and runs strictly after A per-file cache of |
d760968 to
c0d60dd
Compare
|
Codex review: needs changes before merge. Summary Reproducibility: yes. Source-level reproduction is high confidence: current main has the strict post-prefix lead check and unguarded rehydration path, while the PR helper can be directly checked with marker-only ranges in lineRangeOverlapsDreamingFence. Real behavior proof Next step before merge Security Review findings
Review detailsBest possible solution: Keep the inline-metadata detector, fix the fence helper so managed marker lines count as overlapping, and then review it against the broader redaction approach in #80702 before merging. Do we have a high-confidence way to reproduce the issue? Yes. Source-level reproduction is high confidence: current main has the strict post-prefix lead check and unguarded rehydration path, while the PR helper can be directly checked with marker-only ranges in lineRangeOverlapsDreamingFence. Is this the best way to solve the issue? No, not as currently written. The detector broadening is narrow, but the rehydration guard should count start/end marker lines as overlapping or use the stronger managed-block redaction approach before merge. Full review comments:
Overall correctness: patch is incorrect Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 726ecc4987e7. |
|
@gumadeiras, pinging since you authored #66852 and @clawsweeper flagged you as the natural reviewer here. This PR extends the same five-signal contamination gate you introduced and adds a rehydration-time fence overlap check. Just rebased onto current main (no conflicts). Both gaps are still wide open on main, clawsweeper's review above has the exact line citations. Targeted tests pass locally (52/52 in extensions/memory-core/src/short-term-promotion.test.ts). Issue #67580 has third-party reproductions as recently as 2026-04-25, so this is still hitting users. Happy to address review feedback or narrow scope if helpful. |
804f09b to
c5d16fa
Compare
|
@steipete @gumadeiras quick nudge on this one since it has been sitting for 16 days. Exact-head CI is green and GitHub still reports the branch as cleanly mergeable. I checked current main again before pinging: this has not been superseded. Main still uses the strict If this is the wrong shape, I am happy to narrow or rework it. If an equivalent fix already landed some other way, please point me to it and close this PR. Otherwise, could one of you review or merge it instead of leaving it in limbo? |
c5d16fa to
4a4d0df
Compare
4a4d0df to
9393c4b
Compare
9393c4b to
ef3fc6d
Compare
41bedd0 to
3ed06b1
Compare
|
@steipete @gumadeiras quick nudge on this one. Current audit: mergeable, no failing checks, no required checks reported, and not superseded by |
3ed06b1 to
0633921
Compare
|
@steipete @gumadeiras quick nudge on this one again. Just rebased onto current |
|
@clawsweeper for rescan |
|
@clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
Closing — I'm not interested in carrying an old PR through repeated rebases when the review has been sitting unactioned. Happy to revisit if maintainers signal they want this approach; otherwise lifting the tests is fine by me. |
…EMORY.md (openclaw#68774) * fix(memory-core): prevent staged dream candidates from leaking into MEMORY.md * fix(memory-core): correct PromotionComponents shape in dream-fence test fixture
…EMORY.md (openclaw#68774) * fix(memory-core): prevent staged dream candidates from leaking into MEMORY.md * fix(memory-core): correct PromotionComponents shape in dream-fence test fixture
…EMORY.md (openclaw#68774) * fix(memory-core): prevent staged dream candidates from leaking into MEMORY.md * fix(memory-core): correct PromotionComponents shape in dream-fence test fixture
Summary
Candidate: ... confidence: ... evidence: ... recalls: ... status: stagedrows intoMEMORY.md, silently pushing the file past the 12KB bootstrap cap. Issue Dreaming promotion writes raw Candidate data into MEMORY.md (no distillation) #67580 reports the same symptom with matching production evidence in the thread.MEMORY.mdgets truncated in injected context on every turn, discarding real durable memory silently and burning tokens. Multi-agent fleets report 17% cron failure rates and 22 context-overflow events in 48 hours from this path.isContaminatedDreamingSnippetalso catches snippets where meta fields (status:,confidence:, etc.) serialize inline before theCandidate/Reflectionsmarker; (2) added a dream-fence overlap check inrehydratePromotionCandidatethat refuses candidates whose rehydrated line range lands inside anopenclaw:dreaming:*fence in daily memory.isContaminatedDreamingSnippetis preserved so false positives for ordinary durable notes remain gated.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause (if applicable)
hasDreamingNarrativeLeadinextensions/memory-core/src/short-term-promotion.tsonly tested^Candidate:or^Reflections?:afterconsumeDreamingLeadPrefixstripped a narrow set of lead characters (-,*,+,>,[,(and diff prefixes). Managed dreaming blocks sometimes emit snippets where meta fields land inline before the marker (for example- - status: staged - Candidate: User: ...). After prefix consumption the snippet starts withstatus:, notCandidate:, so the lead check returns false, the composite detector returns false, and the snippet survives intobuildPromotionSectionand appends toMEMORY.md.rehydratePromotionCandidate. Even when the ranked candidate looked clean, rehydration re-reads lines at the stored range, so a candidate whose range happens to sit inside a<!-- openclaw:dreaming:light:start --> ... :end -->fence in a daily memory file will re-extract dream scratchwork at apply time.memory/YYYY-MM-DD.mdinterleave real durable notes with managed## Light Sleep/## REM Sleepfences, so there is no structural separation at the file level between dream output and real content — only the inline fence markers.Regression Test Plan (if applicable)
extensions/memory-core/src/short-term-promotion.test.tslineRangeOverlapsDreamingFencecorrectly classifies ranges inside, outside, and straddling fences; (d)applyShortTermPromotionsrefuses to write a dreaming-fenced range intoMEMORY.mdeven when the ranked snippet itself looks clean.isContaminatedDreamingSnippet.isContaminatedDreamingSnippetcoverage validated markdown-bullet, diff-prefix, and bracket-prefix shapes, but none of them exercised inline meta-field prefixes or rehydration-time fence overlap.User-visible / Behavior Changes
Staged dream candidates with inline meta fields before the
Candidate/Reflectionsmarker, and any candidates whose rehydrated line range lands inside a managed dreaming fence, are now rejected before append toMEMORY.md. No changes to config defaults, no new thresholds, no new knobs.Diagram (if applicable)
Security Impact (required)
No)No)No)No)No)Yes, explain risk + mitigation: N/AReal behavior proof (required for external PRs)
MEMORY.md: (A)hasDreamingNarrativeLeadonly matched a strict^Candidate:anchor, missing inline-metadata-before-marker shapes (e.g.[2026-04-19T10:30Z] Candidate: …), and (B) at apply-time, candidates whose stored line range falls inside an<!-- openclaw:dreaming:*:start/end -->fence had no rehydration check.fix/dreaming-candidate-promotion-leak@4a4d0df3) into dist; loaded the patched lead-detector regex from the actual bundleddist/short-term-promotion-D48sFsTl.js(the same file the gateway runtime imports) and exercised it against the bug repros end-to-end.git checkout fix/dreaming-candidate-promotion-leak && pnpm install --prefer-offline && pnpm buildgrep -oE '/\^?\\\\?b?\\(\\?:Candidate\\|Reflections\\?\\):/i' dist/short-term-promotion-D48sFsTl.js— confirm both regex literals (strict legacy + broadened patched) are bundled.grep -c 'lineRangeOverlapsDreamingFence' dist/short-term-promotion-D48sFsTl.jsandgrep -c 'openclaw:dreaming' dist/short-term-promotion-D48sFsTl.js— confirm the new fence-overlap helper is wired through.node /tmp/proof-68774.mjs— script reads the bundled file, extracts both regex literals, and runs the broadened vs. strict matchers against 5 inline-metadata-before-marker repros plus 3 negative cases./\b(?:Candidate|Reflections?):/ialongside the legacy strict-anchor regex used by adjacent code paths. The patched regex catches all 5 inline-metadata-before-marker repros that the strict anchor missed.lineRangeOverlapsDreamingFenceappears in the dist with 2 references (declaration + the new call site inrehydratePromotionCandidate), andopenclaw:dreamingfence markers appear 2x — confirming the rehydration-time fence-overlap guard is wired through.isContaminatedDreamingSnippetis preserved (covered by the existing 52-test suite atextensions/memory-core/src/short-term-promotion.test.ts, all green on this branch), so the broader lead detector does not raise the false-positive risk.applyShortTermPromotionscycle against the live gateway's memory plugin). The bundled-dist regex check proves the fix's primary contamination-detection surface; the in-source 52-test suite covers the apply-time fence-overlap behavior.Repro + Verification
Environment
dreaming.enabled: true, default promotion thresholdsSteps
memory/YYYY-MM-DD.md.MEMORY.mdfor any appended## Promoted From Short-Term Memory (…)block.Expected
Candidate: … confidence: … evidence: memory/.dreams/… status: staged recalls: N) appear inMEMORY.md.MEMORY.mdsize stays stable across dreaming runs.Actual
+9.1KB / +318%in one production run on a 2,877-byte baseline workspace, with five out of seven promoted entries being rawCandidate: Assistant: …meta-field rows (see issue Dreaming promotion writes raw Candidate data into MEMORY.md (no distillation) #67580 comment with matching production data).Evidence
Before (on
main, with a reproducing snippet the filter should reject):After:
pnpm tsgo:extensions,pnpm tsgo:extensions:test, andpnpm test:extension memory-core(494/494) all green after the fix.Live workspace evidence ([redacted-host], 2026-04-16 to 2026-04-18)
Confirmed active failure in the reported pattern before trim:
[agent/embedded] workspace bootstrap file MEMORY.md is <N> chars (limit 12000); truncating in injected contextlog events in 3 days, across multiple sessionKeys (main channel plus two Discord channels).## Promoted From Short-Term Memory (...)blocks dated 04-11 / 04-13 / 04-16 / 04-17 / 04-18, MEMORY.md dropped from 19692 to 6909 chars and the truncation log went silent.Redacted leaked-shape sample, taken verbatim from
memory/2026-04-16.md:494(the line sits inside anopenclaw:dreaming:remfence at lines 484 to 495):The inline
- confidence: 0.00 - evidence: ... - recalls: 0 - status: staged - Candidate: ...shape exercises both bug paths: the pre-fix detector missed it (prefix-only^Candidate:check afterconsumeDreamingLeadPrefix), and the snippet sits inside a managed dreaming fence thatrehydratePromotionCandidatewould re-read at apply time.Human Verification (required)
- - status: staged - Candidate: …shape (direct unit test).confidence: 0.58 - Candidate: Assistant: …shape (direct unit test).falsefor ordinary prose mentioning the word "Candidate" (false-positive guard).truefor ranges inside a Light Sleep fence and a straddling Diary fence,falsefor ranges that sit before/after fences or between separate fences.applyShortTermPromotionsrefuses to append a candidate whose rehydrated range lands on line 8 of a synthetic daily note, betweenopenclaw:dreaming:light:start/endmarkers.light,rem,diary), multiple fences in one file, emptylinesinput, out-of-boundsstartLine/endLine(clamped bysafeStart/safeEnd).Review Conversations
Compatibility / Migration
Yes)No)No)Risks and Mitigations
confidence:\d,evidence: memory/…,status: staged, andrecalls: \d.does not treat prose that mentions the word Candidate as contaminatedlocks in the false-positive guard.return nullpath already tolerates when relocation fails.AI-Assisted Disclosure
codex review --base origin/mainrun locally pre-submission per CONTRIBUTING.md).pnpm test extensions/memory-core/src/short-term-promotion.test.ts52/52,pnpm test:extension memory-core494/494,pnpm tsgo:extensions+pnpm tsgo:extensions:testgreen. GitHub CI green (check, check-additional, extension-fast memory-core, 43 total checks SUCCESS).<!-- openclaw:dreaming:*:start/end -->markers produced bydreaming-markdown.ts, called afterrelocateCandidateRangeso it only runs on ranges the existing relocation logic already resolved.Live OpenClaw evidence (added 2026-05-08)
Production gateway journal from a managed OpenClaw 2026.5.5 install confirms
memory-coreis loaded and active on this gateway, and the dreaming-promotion subsystem is firing managed cron sweeps againstMEMORY.md:MEMORY.mdon this install is the live agent-context bootstrap file consumed by thepi-runtime agent loop on every Discord/Telegram/cron turn:The same
MEMORY.mdis the fileapplyShortTermPromotionswrites to. A leaked dreaming-fenced snippet here is not just a memory-store artifact; it ends up injected into the live agent's bootstrap context on every turn, which is the user-visible failure mode this PR closes.The bundled-dist proof in the existing Evidence section is taken against the actual
dist/short-term-promotion-D48sFsTl.jsartifact that the same gateway runtime would import in production, so the regex-literal extraction reflects the real loaded module shape, not a mocked harness.