[Behavior]: memory-core dreaming deep phase promotes raw verbatim daily-log snippets to MEMORY.md without distillation #67363

@automata2k

Bug type

Behavior bug (incorrect output quality)

Summary

The dreaming deep phase promotion writes raw verbatim text snippets from daily memory files directly into MEMORY.md with no summarization, distillation, or insight extraction step. This means unprocessed daily log fragments (timestamps, routine activity notes, casual conversation fragments) are promoted as-is to long-term memory, polluting the curated memory store with noise.

Root cause

The behavior comes from the buildPromotionSection function in short-term-promotion-0cyGzCF3.js:

```javascript
function buildPromotionSection(candidates, nowMs, timezone) {
  const lines = [
    "",
    `## Promoted From Short-Term Memory (${formatMemoryDreamingDay(nowMs, timezone)})`,
    ""
  ];
  for (const candidate of candidates) {
    const source = `${candidate.path}:${candidate.startLine}-${candidate.endLine}`;
    const snippet = candidate.snippet || "(no snippet captured)";
    lines.push(`<!-- ${PROMOTION_MARKER_PREFIX}${candidate.key} -->`);
    lines.push(`- ${snippet} [score=${candidate.score.toFixed(3)} recalls=${candidate.recallCount} avg=${candidate.avgScore.toFixed(3)} source=${source}]`);
  }
  lines.push("");
  return lines.join("\n");
}
```

The candidate.snippet is the raw text captured from the source daily memory file — copied verbatim. The scoring algorithm (weighted relevance 30%, frequency 24%, diversity 15%, recency 15%, consolidation 10%, conceptual 6%) determines whether something gets promoted, but there is no step that determines how it gets written. The raw snippet is written as-is.
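To illustrate why a high score says nothing about how an entry reads, here is a minimal sketch of the weighting. The weights are the ones listed above; the linear combination, the `PROMOTION_WEIGHTS` and `scoreCandidate` names, and the sample signal values are assumptions for illustration, not the actual memory-core implementation.

```javascript
// Weights as listed in this issue. How memory-core actually combines them is
// not shown here; a plain weighted sum is assumed for illustration.
const PROMOTION_WEIGHTS = {
  relevance: 0.30,
  frequency: 0.24,
  diversity: 0.15,
  recency: 0.15,
  consolidation: 0.10,
  conceptual: 0.06
};

// Hypothetical helper: combines per-signal values (clamped to [0, 1]) into one score.
function scoreCandidate(signals) {
  let total = 0;
  for (const [name, weight] of Object.entries(PROMOTION_WEIGHTS)) {
    const value = Math.min(1, Math.max(0, signals[name] ?? 0));
    total += weight * value;
  }
  return total;
}

// Made-up signal values for a routine log line that is recalled often and recently.
// Every input is a recall statistic; none of them inspects what the text says.
const routineLogScore = scoreCandidate({
  relevance: 0.7,
  frequency: 1,
  diversity: 0.8,
  recency: 1,
  consolidation: 0.6,
  conceptual: 0.2
});
console.log(routineLogScore.toFixed(3));
```

Under these assumptions the score only gates promotion. The text that reaches MEMORY.md is whatever `candidate.snippet` already contained.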

Result

MEMORY.md ends up with raw daily log fragments like:

```markdown
## Promoted From Short-Term Memory (2026-04-15)

<!-- openclaw-memory-promotion:memory:memory/2026-02-21.md:1:8 -->
- Started the day. 10:15 AM. Greeted the user. [score=0.820 recalls=3 avg=0.817 source=memory/2026-02-21.md:1-8]
```

This defeats the purpose of MEMORY.md as a curated long-term memory store. The entry above carries no actionable insight — it is a raw activity log fragment that scored well on frequency and recency but contains no distilled information worth retaining long-term.
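The entry can be reproduced outside OpenClaw with a small standalone script. `buildPromotionSection` is copied from the root-cause section. The `PROMOTION_MARKER_PREFIX` value is inferred from the marker comment in the output above, and `formatMemoryDreamingDay` is a hypothetical stand-in, since the real helper is not shown in this issue.

```javascript
// Inferred from the marker comment in the promoted entry; treat as an assumption.
const PROMOTION_MARKER_PREFIX = "openclaw-memory-promotion:";

// Hypothetical stand-in for the real helper: returns YYYY-MM-DD in the given timezone.
function formatMemoryDreamingDay(nowMs, timezone) {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: timezone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit"
  }).formatToParts(new Date(nowMs));
  const get = (type) => parts.find((part) => part.type === type).value;
  return `${get("year")}-${get("month")}-${get("day")}`;
}

// Copied from the root-cause section of this issue.
function buildPromotionSection(candidates, nowMs, timezone) {
  const lines = [
    "",
    `## Promoted From Short-Term Memory (${formatMemoryDreamingDay(nowMs, timezone)})`,
    ""
  ];
  for (const candidate of candidates) {
    const source = `${candidate.path}:${candidate.startLine}-${candidate.endLine}`;
    const snippet = candidate.snippet || "(no snippet captured)";
    lines.push(`<!-- ${PROMOTION_MARKER_PREFIX}${candidate.key} -->`);
    lines.push(`- ${snippet} [score=${candidate.score.toFixed(3)} recalls=${candidate.recallCount} avg=${candidate.avgScore.toFixed(3)} source=${source}]`);
  }
  lines.push("");
  return lines.join("\n");
}

// Candidate values taken from the promoted entry shown above.
const candidate = {
  key: "memory:memory/2026-02-21.md:1:8",
  path: "memory/2026-02-21.md",
  startLine: 1,
  endLine: 8,
  snippet: "Started the day. 10:15 AM. Greeted the user.",
  score: 0.82,
  recallCount: 3,
  avgScore: 0.817
};

const section = buildPromotionSection([candidate], Date.UTC(2026, 3, 15, 12), "UTC");
console.log(section);
```

Nothing between reading `candidate.snippet` and pushing the list item transforms the text, so the daily-log line reaches MEMORY.md unchanged.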

Documentation gap

The docs describe the deep phase as "decides what becomes long-term memory" and note it "Appends promoted entries to MEMORY.md" — but do not clarify whether the entries are intended to be raw verbatim copies or distilled insights.

Question for maintainers

What is the intended behavior? Is the raw snippet write:

  • By design — with the expectation that agents/humans review and curate promoted entries afterward?
  • An oversight — where a distillation/summarization step was planned but not yet implemented?
  • A deliberate tradeoff — where the scoring algorithm is the only filter, and raw text preserves provenance?

If raw promotion is intentional, consider adding a config option or flag for "distilled promotion" that summarizes the snippet into a concise insight before writing. This would make MEMORY.md a genuine curated store rather than an append-only log of scored fragments.
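One possible shape for such an option, as a sketch only: the `formatPromotedEntry` helper and the `distill` option are hypothetical names, not existing OpenClaw APIs. The distiller is an injected callback, so it could be an LLM summarizer or a rule-based filter. When it is absent, throws, or returns nothing usable, the raw snippet is written exactly as today, and the provenance suffix is kept either way.

```javascript
// Hypothetical helper: formats one promoted entry, optionally distilling the snippet.
// options.distill is an assumed callback (rawSnippet, candidate) => string.
function formatPromotedEntry(candidate, options = {}) {
  const raw = candidate.snippet || "(no snippet captured)";
  let text = raw;

  if (typeof options.distill === "function") {
    try {
      const distilled = options.distill(raw, candidate);
      // Only accept a non-empty string; otherwise keep the raw snippet.
      if (typeof distilled === "string" && distilled.trim() !== "") {
        text = distilled.trim();
      }
    } catch {
      // A failing distiller must not block promotion; fall back to the raw text.
    }
  }

  const source = `${candidate.path}:${candidate.startLine}-${candidate.endLine}`;
  return `- ${text} [score=${candidate.score.toFixed(3)} recalls=${candidate.recallCount} avg=${candidate.avgScore.toFixed(3)} source=${source}]`;
}

// Sample candidate based on the entry shown earlier in this issue.
const sampleCandidate = {
  path: "memory/2026-02-21.md",
  startLine: 1,
  endLine: 8,
  snippet: "Started the day. 10:15 AM. Greeted the user.",
  score: 0.82,
  recallCount: 3,
  avgScore: 0.817
};

// Placeholder distiller for demonstration; a real one would summarize the snippet.
const placeholderDistill = () => "(distilled insight would go here)";

console.log(formatPromotedEntry(sampleCandidate));
console.log(formatPromotedEntry(sampleCandidate, { distill: placeholderDistill }));
```

Keeping the `source=` reference on distilled entries preserves the provenance that raw promotion currently provides, so a reviewer can still trace an insight back to the daily file.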

OpenClaw version

2026.4.14

Operating system

macOS Darwin 21.6.0 (x64)

Install method

global npm install

Impact

Medium — MEMORY.md accumulates raw daily log fragments instead of curated insights, degrading long-term memory quality over time. Users who trust the automated promotion system may not notice the noise until their memory store becomes cluttered with unprocessed entries.

Metadata

Assignees

No one assigned

    Labels

    • P2: Normal backlog priority with limited blast radius.
    • clawsweeper:fix-shape-clear: ClawSweeper found a clear likely implementation shape for this issue.
    • clawsweeper:needs-maintainer-review: ClawSweeper marked this issue as needing maintainer review before automation.
    • clawsweeper:needs-product-decision: ClawSweeper marked this issue as needing a product or behavior decision.
    • clawsweeper:needs-security-review: ClawSweeper marked this issue as needing security-sensitive review.
    • clawsweeper:no-new-fix-pr: ClawSweeper does not recommend queueing a new automated fix PR for this issue.
    • clawsweeper:source-repro: ClawSweeper found a high-confidence source-level issue reproduction.
    • impact:security: Security boundary, credential, authz, sandbox, or sensitive-data risk.
    • impact:session-state: Session, memory, transcript, context, or agent state can drift or corrupt.
    • issue-rating: 🦞 diamond lobster: Very strong issue quality with high-confidence source-level or clear reproduction.
    • stale: Marked as stale due to inactivity.
