fix(memory-core): redact managed dreaming blocks during promotion rehydration (#80613)#80702
MilosM348 wants to merge 2 commits
Thanks for the context here. I swept through the related work, and this is now duplicate or superseded.
Closing because the same memory-core leak half is now superseded by a merged current-main fix that rejects rehydrated ranges overlapping managed dreaming fences before appending to MEMORY.md. This branch's remaining distinction is preserving an adjacent human bullet by redaction, which is a separate follow-up tradeoff rather than a reason to keep a conflicting PR open.
So I'm closing this here and keeping the remaining discussion on the canonical linked item.
Review details
Best possible solution: Keep the merged current-main no-leak guard as canonical, continue the CJK dedupe work in #80620, and open a new focused follow-up only if maintainers want redaction that preserves adjacent human bullets instead of skipping fenced ranges.
Do we have a high-confidence way to reproduce the issue? No current-main reproduction remains: the merged fence-overlap guard skips rehydrated ranges that touch managed dreaming fences before MEMORY.md is written. The historical failure path is still understandable from the old raw rehydration flow and the PR's regression fixture.
Is this the best way to solve the issue? Not by merging this branch as-is: the leak is already handled on current main by the merged guard, while this PR's redaction-preserves-human-content behavior is a distinct refinement. A fresh, non-conflicting follow-up would be the better path if maintainers want that refinement.
Security review: Cleared. The diff only touches memory-core redaction logic, colocated tests, and changelog text; no concrete security or supply-chain regression was found.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 83b8289ee274; fix evidence: commit 314903417fcc, main fix timestamp 2026-05-12T21:07:43Z.
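The review above contrasts this PR's redaction with the merged guard, which rejects any rehydrated range that overlaps a managed dreaming fence. A minimal sketch of that overlap test follows. Every name in it (`LineRange`, `findManagedFenceRanges`, `overlapsManagedFence`) is hypothetical, and treating an unterminated fence as running to end of file is an assumption; the merged guard's actual code is not shown on this page and may differ.

```typescript
// Hypothetical names throughout; this is not the merged guard's actual code.
interface LineRange {
  start: number; // 0-based, inclusive
  end: number; // 0-based, inclusive
}

const FENCE_START = /^<!--\s*openclaw:dreaming:(?:light|rem):start\s*-->$/;
const FENCE_END = /^<!--\s*openclaw:dreaming:(?:light|rem):end\s*-->$/;

// Collect the line ranges covered by managed dreaming fences.
// Assumption: an unterminated fence extends to the end of the file.
function findManagedFenceRanges(lines: string[]): LineRange[] {
  const ranges: LineRange[] = [];
  let openIndex = -1;
  lines.forEach((line, i) => {
    const trimmed = line.trim();
    if (openIndex < 0 && FENCE_START.test(trimmed)) {
      openIndex = i;
    } else if (openIndex >= 0 && FENCE_END.test(trimmed)) {
      ranges.push({ start: openIndex, end: i });
      openIndex = -1;
    }
  });
  if (openIndex >= 0) {
    ranges.push({ start: openIndex, end: lines.length - 1 });
  }
  return ranges;
}

// True when the candidate range shares at least one line with any fence range.
function overlapsManagedFence(candidate: LineRange, fences: LineRange[]): boolean {
  return fences.some((f) => candidate.start <= f.end && f.start <= candidate.end);
}

const note = [
  "## Plan toggle field",
  "- Plan switches use exRule, not abConfig",
  "",
  "## Light Sleep",
  "<!-- openclaw:dreaming:light:start -->",
  "- Candidate: Plan toggle field summary",
  "- status: staged",
  "<!-- openclaw:dreaming:light:end -->",
];
const fences = findManagedFenceRanges(note);
// The human bullet alone does not touch the fence, so it would be kept.
const humanOnly = overlapsManagedFence({ start: 1, end: 1 }, fences);
// A window running from the bullet through the end marker would be rejected.
const straddle = overlapsManagedFence({ start: 1, end: 7 }, fences);
```

Under this approach a straddling candidate is skipped entirely, which is the tradeoff the review describes: no leak, but the adjacent human bullet is not promoted either.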
0544e0c to 9de646d
…0613 Address ClawSweeper P3 on PR openclaw#80702: the existing integration test recorded the recall snippet at the exact human bullet line, so relocateCandidateRange always took the exact-match path and never exercised the fuzzy window search that originally latched onto the managed-block straddle window in openclaw#80613. Add a regression that records a clean-leading multi-line straddle snippet at a stale 20..30 range, forcing rehydration through the fuzzy window search. Verified the new test FAILS without redactManagedDreamingLines (MEMORY.md receives '- - Plan switches use exRule, not abConfig ## Light Sleep <!-- openclaw:dreaming:light:start --> - Candidate: ... - status: staged <!-- openclaw:dreaming:light:end -->' verbatim) and PASSES with the sanitizer in place. All 88 tests across short-term-promotion and dreaming-phases pass. Co-authored-by: Cursor <cursoragent@cursor.com>
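The review point in this commit message rests on relocation trying the stored range before any fuzzy search. A minimal sketch of that first step, assuming a hypothetical `matchesStoredRange` helper, 1-based inclusive line numbers, and whitespace-insensitive comparison; the real `relocateCandidateRange` is not reproduced on this page and may work differently.

```typescript
// Hypothetical helper: does the stored 1-based, inclusive range still hold the snippet?
function matchesStoredRange(
  lines: string[],
  startLine: number,
  endLine: number,
  snippet: string,
): boolean {
  if (startLine < 1 || endLine < startLine || endLine > lines.length) {
    return false; // stale or out-of-bounds range: an exact match is impossible
  }
  // Assumption: comparison ignores differences in whitespace runs.
  const normalize = (text: string): string => text.replace(/\s+/g, " ").trim();
  const stored = lines.slice(startLine - 1, endLine).join("\n");
  return normalize(stored) === normalize(snippet);
}

const dailyNote = [
  "## Plan toggle field",
  "- Plan switches use exRule, not abConfig",
  "",
  "## Light Sleep",
];
const bullet = "- Plan switches use exRule, not abConfig";

// Recall recorded at the bullet's real line: the exact path is taken.
const exactHit = matchesStoredRange(dailyNote, 2, 2, bullet);
// Recall recorded at a stale 20..30 range: only a fuzzy search could relocate it.
const staleHit = matchesStoredRange(dailyNote, 20, 30, bullet);
```

A fixture whose recall always lands on `exactHit` never reaches the fuzzy search, which is why the added regression records the snippet at a range that cannot match.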
Thanks @clawsweeper for the P3 — fair catch. Pushed I verified the test fails without — matching the exact form reported in #80613. With the sanitizer in place, the fuzzy search collapses to a single human-bullet window and Full local sweep: @clawsweeper re-review please. |
…ydration (openclaw#80613) Daily memory notes can interleave human content with managed `<!-- openclaw:dreaming:light:* -->` and `<!-- openclaw:dreaming:rem:* -->` blocks. The chunk builder strips those regions before snippet capture, but `rehydratePromotionCandidate` re-reads the raw source file and feeds it to `relocateCandidateRange`, whose fuzzy window search will happily latch onto a window that straddles the human bullet and the adjacent dreaming bullets. That leaks `- Candidate: …` / `confidence: …` / `status: staged` lines into `MEMORY.md`. Add `redactManagedDreamingLines` and call it on the source split before relocation, mirroring the chunk-side `stripManagedDailyDreamingLines` heading-walk so the `## Light Sleep` / `## REM Sleep` heading is also zeroed when it sits directly above the start marker. Unterminated managed blocks are redacted through the end of file rather than left as a partial window. Cover with a unit test of the helper (terminated, unterminated, multiple markers) and an integration test that writes a note with a `## Light Sleep` dreaming block and asserts the promoted `MEMORY.md` keeps the human bullet and contains no `Candidate:` / `confidence:` / `status: staged` / `openclaw:dreaming:light` traces. Refs openclaw#80620 (CJK dedupe) — that PR fixes the second sub-bug from the issue; this one only addresses the promotion-leak half.
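The contract described in this commit message (zero the start..end range, zero the managed heading directly above the start marker, redact through end of file when the block is unterminated) can be sketched as below. This is an independent illustration, not the code from `short-term-promotion.ts`: the function name carries a `Sketch` suffix, the regular expressions are guesses at the marker shapes quoted above, and two details are assumptions: redacted lines become empty strings so line numbers stay stable, and blank lines between the heading and the start marker are tolerated.

```typescript
// Sketch only: an independent reimplementation of the described contract,
// not the code from short-term-promotion.ts.
const MANAGED_START = /^<!--\s*openclaw:dreaming:(?:light|rem):start\s*-->$/;
const MANAGED_END = /^<!--\s*openclaw:dreaming:(?:light|rem):end\s*-->$/;
const MANAGED_HEADING = /^##\s+(?:Light|REM) Sleep$/;

function redactManagedDreamingLinesSketch(lines: string[]): string[] {
  // Assumption: "zeroing" means replacing a line with "", keeping indexes stable.
  const out = [...lines];
  let i = 0;
  while (i < lines.length) {
    if (!MANAGED_START.test(lines[i].trim())) {
      i += 1;
      continue;
    }
    // Assumption: blank lines between the heading and the marker are skipped.
    let above = i - 1;
    while (above >= 0 && lines[above].trim() === "") above -= 1;
    if (above >= 0 && MANAGED_HEADING.test(lines[above].trim())) {
      out[above] = "";
    }
    // Zero everything from the start marker through the end marker,
    // or through end of file when the block is unterminated.
    let j = i;
    while (j < lines.length) {
      const isEnd = MANAGED_END.test(lines[j].trim());
      out[j] = "";
      j += 1;
      if (isEnd) break;
    }
    i = j;
  }
  return out;
}

const source = [
  "- Plan switches use exRule, not abConfig",
  "",
  "## Light Sleep",
  "<!-- openclaw:dreaming:light:start -->",
  "- Candidate: Plan toggle field summary",
  "- status: staged",
  "<!-- openclaw:dreaming:light:end -->",
  "## REM Sleep", // user-authored heading, not directly above a start marker
  "- my own note",
  "<!-- openclaw:dreaming:rem:start -->", // unterminated block
  "- Candidate: dangling",
];
const redacted = redactManagedDreamingLinesSketch(source);
const surviving = redacted.filter((line) => line !== "");
```

In this example the human bullet, the stand-alone `## REM Sleep` heading, and the note beneath it survive, while the terminated light block, its heading, and the unterminated REM block are blanked.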
c97b363 to 0766319
Summary
- `rehydratePromotionCandidate` re-reads the raw daily-note source file and feeds the unredacted lines into `relocateCandidateRange`, whose fuzzy window search will latch onto a window that straddles a human bullet and the adjacent managed `<!-- openclaw:dreaming:light:* -->` / `<!-- openclaw:dreaming:rem:* -->` block, leaking `## Light Sleep` headings and `- Candidate: …` / `confidence: …` / `status: staged` scaffolding into `MEMORY.md`.
- … `stripManagedDailyDreamingLines` in `dreaming-phases.ts`), so the post-rehydration `isContaminatedDreamingSnippet` check never sees a bare `Candidate:` lead and lets straddle snippets through; users see staged dreaming scaffolding promoted into long-term memory.
- Add `redactManagedDreamingLines(lines)` in `extensions/memory-core/src/short-term-promotion.ts` and call it on the raw source split inside `rehydratePromotionCandidate` before relocation. The helper mirrors the contract in `dreaming-phases.ts`: it zeroes the start..end markers and walks upward to also zero the `## Light Sleep` / `## REM Sleep` heading directly above the start marker. Unterminated managed blocks are redacted through end of file.
- … `MEMORY.md` writer, or the CJK dedupe path (the second sub-bug from issue #80613, "[Bug]: dreaming pipeline leaks raw candidate content into MEMORY.md and CJK dedup is ineffective in tokenizeSnippet", is being addressed separately by #80620, "fix(memory-core): use CJK-aware tokenizer for dreaming dedupe (#80613)").
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Real behavior proof (required for external PRs)
- `<!-- openclaw:dreaming:light:* -->` / `<!-- openclaw:dreaming:rem:* -->` blocks no longer leak into `MEMORY.md` when promotion rehydration straddles a human bullet next to a managed block.
- … `recordShortTermRecalls` → `rankShortTermPromotionCandidates` → `applyShortTermPromotions` pipeline used by `memory_search` recall capture, with workspace state on disk and a real `MEMORY.md` round-tripped through `fs.readFile`.
- `pnpm exec vitest run extensions/memory-core/src/short-term-promotion.test.ts extensions/memory-core/src/dreaming-phases.test.ts`
- `Test Files 2 passed (2) | Tests 88 passed (88) | Duration 19.76s` (49 short-term-promotion + 39 dreaming-phases). The new fuzzy-window-straddle integration regression (commit `c97b3633`) was verified to FAIL against current main without the sanitizer. The produced `MEMORY.md` was captured verbatim as `<!-- openclaw-memory-promotion:memory:memory/2026-04-13.md:20:30 --> - - Plan switches use exRule, not abConfig ## Light Sleep <!-- openclaw:dreaming:light:start --> - Candidate: Plan toggle field summary - confidence: 0.00 - evidence: memory/.dreams/session-corpus/2026-04-13.txt:1-3 - recalls: 0 - status: staged <!-- openclaw:dreaming:light:end --> [score=0.611 recalls=1 avg=0.940 source=memory/2026-04-13.md:2-12]` (matches the exact leak shape on issue #80613). With the sanitizer in place, `MEMORY.md` keeps only `- Plan switches use exRule, not abConfig`.
- `MEMORY.md` entries derived from daily notes that interleave human content with managed dreaming blocks contain only the human content; no `## Light Sleep` heading, no `<!-- openclaw:dreaming:* -->` markers, no `- Candidate:` / `confidence:` / `evidence:` / `recalls:` / `status: staged` scaffolding ever lands.
- … `MEMORY.md` writes, not against a streaming model session. Operator confirmation against a real long-running workspace is welcome.
- … `MEMORY.md` showing `- ## 教训:Plan 实验开关字段 - Plan 配置中实验开关字段是 \... ## Light Sleep <!-- openclaw:dreaming:light:start --> - Candidate: 教训…` (human bullet + heading + start marker + dreaming `Candidate:` line all promoted; the Chinese text reads roughly "Lesson: Plan experiment toggle field" and "In the Plan config, the experiment toggle field is ..."); ClawSweeper's earlier review on the issue confirmed the source-level path through promotion rehydration.
Root Cause (if applicable)
- `rehydratePromotionCandidate` (`extensions/memory-core/src/short-term-promotion.ts`) reads the raw daily-note file and passes the unredacted lines into `relocateCandidateRange`, which then fuzzy-matches against any window of 1..maxSpan lines. When the candidate's stored `startLine`/`endLine` does not match exactly (stale capture, file drift, partial-line snippet), the fuzzy search prefers higher-quality windows and can pick a multi-line window that includes the managed block. The relocated snippet starts with a clean human bullet, so the post-rehydration `isContaminatedDreamingSnippet` check (which keys on a leading `Candidate:` after diff-prefix stripping) does not flag it.
- … `stripManagedDailyDreamingLines` was protecting daily ingestion.
Regression Test Plan (if applicable)
- `extensions/memory-core/src/short-term-promotion.test.ts`: four new regressions: two unit tests for `redactManagedDreamingLines` (terminated light + REM blocks; unterminated light block through EOF), one integration test for the full record→rank→apply chain (exact-match path), and one integration test for the fuzzy-window straddle path (commit `c97b3633`, addresses ClawSweeper P3).
- … `## Light Sleep` + `<!-- openclaw:dreaming:light:start -->` + staged dreaming scaffolding + `<!-- openclaw:dreaming:light:end -->`, captured into a recall whose `startLine`/`endLine` no longer point at the human bullet, so rehydration must traverse the fuzzy window search and must NOT relocate to a window that includes the managed block.
- `dreaming-phases.test.ts` only covers the chunk-builder strip; the rehydration path was untested for managed-block leaks.
User-visible / Behavior Changes
- `MEMORY.md` will no longer accumulate `## Light Sleep` / `## REM Sleep` headings, `<!-- openclaw:dreaming:* -->` markers, or `- Candidate:` / `confidence:` / `status: staged` scaffolding from short-term promotion. Existing `MEMORY.md` files are not rewritten; only new promoted entries are sanitized.
Diagram (if applicable)
Security Impact (required)
- … (`Yes/No`) No
- … (`Yes/No`) No
- … (`Yes/No`) No
- … (`Yes/No`) No
- … (`Yes/No`) No
- … `Yes`, explain risk + mitigation: N/A
Repro + Verification
Environment
- … `minScore: 0`, `minRecallCount: 0`, `minUniqueQueries: 0` for the regression fixture)
Steps
1. … `## Plan toggle field` / human bullet / blank / `## Light Sleep` / `<!-- openclaw:dreaming:light:start -->` / staged scaffolding / `<!-- openclaw:dreaming:light:end -->`.
2. … `startLine`/`endLine` (e.g. `20`/`30`) so relocate must use the fuzzy window search.
3. … `rankShortTermPromotionCandidates` then `applyShortTermPromotions`, then read back `MEMORY.md`.
Expected
- `MEMORY.md` contains only the human bullet (`- Plan switches use exRule, not abConfig`); no `## Light Sleep`, no `<!-- openclaw:dreaming:* -->`, no `- Candidate:` / `confidence:` / `status: staged`.
Actual
- … `pnpm exec vitest run extensions/memory-core/src/short-term-promotion.test.ts`: all 49 tests pass, including the 4 new regressions).
Evidence
Human Verification (required)
What you personally verified (not just CI), and how:
- … `pnpm exec vitest run extensions/memory-core/src/short-term-promotion.test.ts extensions/memory-core/src/dreaming-phases.test.ts` locally (`Test Files 2 passed (2) | Tests 88 passed (88) | Duration 19.76s`); reverted the `redactManagedDreamingLines` call inside `rehydratePromotionCandidate` and re-ran the new fuzzy-window straddle regression, observing the leak in `MEMORY.md` matching the reporter's shape in issue #80613; restored the sanitizer and re-ran, and the leak is gone.
- … `## Light Sleep` / `## REM Sleep` heading directly above the start marker; diary marker (no heading) only redacts start..end markers; exact-match relocation path (existing test #1 at `short-term-promotion.test.ts:1249`); fuzzy-match straddle path (new test at `short-term-promotion.test.ts:1304`).
Review Conversations
ClawSweeper's earlier P3 (exercise the straddling rehydration path) is addressed by commit `c97b3633` and acknowledged in a follow-up comment with the verbatim leak captured under no-fix conditions.
Compatibility / Migration
- … (`Yes/No`) Yes
- … (`Yes/No`) No
- … (`Yes/No`) No
Risks and Mitigations
- … `## Light Sleep` / `## REM Sleep` heading that a user authored manually for their own notes (i.e. one that does NOT precede a managed start marker).
- … `## Light Sleep` or `## REM Sleep` heading sits directly above the matching managed start marker (mirrors `findManagedDailyDreamingHeadingIndex` in `dreaming-phases.ts`); a stand-alone user heading without the immediately-following start marker is left untouched.