Dreaming needs configurable session/cron exclusions; isolated cron transcripts still enter session corpus #72611

@zqchris

Summary

Dreaming currently has no documented/configurable way to exclude specific sessions, cron jobs, groups/topics, or session-key prefixes from session transcript ingestion. On a real 2026.4.24 deployment, isolated cron runs are still appearing in memory/.dreams/session-corpus/YYYY-MM-DD.txt, even though the runtime contains generatedByCronRun / DIRECT_CRON_PROMPT_RE filtering logic.

This makes cron-heavy deployments difficult to operate without maintaining local Dreaming patches. Moving the cron delivery target to a Telegram group/topic does not help, because Dreaming scans agent session transcript files rather than the final delivery surface.

Related: #68449, but this issue is specifically about the missing operator-facing exclusion/config path and the remaining leak of isolated cron transcripts into the session corpus.

Environment

  • OpenClaw: 2026.4.24 (764822a)
  • Dreaming: enabled via plugins.entries.memory-core.config.dreaming.enabled: true
  • Memory backend: builtin
  • Affected corpus path: memory/.dreams/session-corpus/YYYY-MM-DD.txt
  • Cron job type: payload.kind=agentTurn, sessionTarget=isolated, delivery.mode=announce

What I expected

At least one of these should be true:

  1. sessionTarget: isolated cron runs are not ingested into Dreaming's session corpus, or
  2. Cron-run transcripts are reliably classified as generatedByCronRun and skipped, including deleted/rotated transcript filenames, or
  3. Operators can configure Dreaming to ignore specific cron jobs/session-key prefixes/agents/surfaces, e.g. an allowlist/blocklist such as:
    • dreaming.sessionFilter.excludeSessionKeyPrefixes
    • dreaming.sessionFilter.excludeCronJobIds
    • dreaming.sessionFilter.excludeAgents
    • dreaming.sessionFilter.excludeRegex

What happened

A cron job already configured as sessionTarget: isolated still appeared in the Dreaming session corpus.

Counts from local memory/.dreams/session-corpus inspection:

  • 2026-04-24.txt: 11 entries for the cron job
  • 2026-04-25.txt: 17 entries for the cron job
  • 2026-04-26.txt: 2 entries for the cron job
  • 2026-04-27.txt: 0 entries at the time of inspection

Example corpus line shape:

[main/sessions/<uuid>.jsonl.deleted.2026-04-25T06-33-10.801Z#L5] User: [cron:<cron-id> <cron-name>] ...
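For anyone who wants to reproduce the counts on their own deployment, here is a minimal sketch of how the corpus lines can be tallied per cron id. It assumes every corpus line follows the shape shown above; CORPUS_LINE_RE, CRON_PROMPT_RE and count_cron_entries are hypothetical names for illustration, and CRON_PROMPT_RE is only a stand-in, not the runtime's actual DIRECT_CRON_PROMPT_RE.

```python
import re
from collections import Counter

# Assumed corpus line shape, taken from the example line in this issue:
#   [<source path>#L<line>] <Role>: <text>
CORPUS_LINE_RE = re.compile(
    r"^\[(?P<source>[^\]#]+)#L(?P<line>\d+)\]\s+(?P<role>\w+):\s*(?P<text>.*)$"
)

# Hypothetical stand-in for the runtime's DIRECT_CRON_PROMPT_RE.
CRON_PROMPT_RE = re.compile(r"^\[cron:(?P<cron_id>[^\s\]]+)(?:\s+[^\]]*)?\]")


def count_cron_entries(lines):
    """Count User lines that start with a [cron:<id> ...] prompt, per cron id."""
    counts = Counter()
    for raw in lines:
        m = CORPUS_LINE_RE.match(raw.strip())
        if not m or m.group("role") != "User":
            continue  # not a corpus line, or not a direct user prompt
        cron = CRON_PROMPT_RE.match(m.group("text"))
        if cron:
            counts[cron.group("cron_id")] += 1
    return counts
```

Feeding it the lines of one memory/.dreams/session-corpus/YYYY-MM-DD.txt file gives a per-job tally for that day.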

The source paths are important: many contaminated entries reference deleted/rotated transcript filenames such as:

main/sessions/<uuid>.jsonl.deleted.<timestamp>
main/sessions/<uuid>.trajectory.jsonl.deleted.<timestamp>

The current runtime code appears to rely on session-store classification to mark cron transcripts:

  • loadSessionTranscriptClassificationForSessionsDir(...)
  • isCronRunSessionKey(...)
  • cronRunTranscriptPaths
  • generatedByCronRun

But once transcript filenames are rotated/deleted, the path in session-corpus may no longer match the live sessions.json sessionFile path, so classification can fail. DIRECT_CRON_PROMPT_RE should still catch direct User: [cron:...] messages, but the corpus evidence shows these prompt lines were already ingested in recent runs.
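The path mismatch described above can be illustrated with a small sketch. It assumes that rotated artifacts differ from the live transcript only by a trailing .deleted.<timestamp> suffix (as in the filenames listed earlier) and that a .trajectory.jsonl file belongs to the same session as the matching <uuid>.jsonl. All function names here are hypothetical and are not taken from the runtime.

```python
import re
from pathlib import PurePosixPath

# Assumption: rotation only appends ".deleted.<timestamp>" to the file name.
_ROTATED_SUFFIX_RE = re.compile(r"\.deleted\.[^/]+$")


def strip_rotation_suffix(path: str) -> str:
    """Return the path as it looked before the transcript was rotated/deleted."""
    return _ROTATED_SUFFIX_RE.sub("", path)


def transcript_session_id(path: str) -> str:
    """Reduce a live or rotated transcript path to its <uuid> stem."""
    name = PurePosixPath(strip_rotation_suffix(path)).name
    for suffix in (".jsonl", ".trajectory"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
    return name


def is_cron_transcript(path: str, cron_run_transcript_paths) -> bool:
    """Match on the session id rather than the exact path.

    An exact comparison against the live sessionFile path misses rotated
    files. This compares file stems only and ignores the directory, which
    assumes the <uuid> part is unique across agents.
    """
    cron_ids = {transcript_session_id(p) for p in cron_run_transcript_paths}
    return transcript_session_id(path) in cron_ids
```

With this kind of normalization, a rotated file would still resolve to the session that the session store classified as a cron run, as long as that classification is still available when the rotated file is scanned.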

Config gap checked

I checked the current config schema. memory only exposes:

  • memory.backend
  • memory.citations
  • memory.qmd.*

The memory-core Dreaming plugin config schema currently exposes only broad phase settings:

  • dreaming.enabled
  • dreaming.frequency
  • dreaming.timezone
  • dreaming.verboseLogging
  • dreaming.storage
  • dreaming.phases.light.*
  • dreaming.phases.deep.*
  • dreaming.phases.rem.*

I could not find a config option to skip a session, group/topic, cron job, or session-key prefix.

Why this matters

Cron jobs often contain operational prompts, tool stdout summaries, and repetitive status text that are useful as notifications but harmful as durable Dreaming input. In a personal-agent deployment, users should not have to fork/patch Dreaming just to keep maintenance cron transcripts out of memory.

Changing the Telegram delivery target is not a workaround: the transcript is still stored under the agent and scanned by Dreaming.

Suggested fixes

A robust fix probably needs both:

  1. Make cron transcript skipping reliable

    • classify by record/session metadata where available, not only by current session-store transcript path
    • handle .jsonl.deleted.<timestamp> and .trajectory.jsonl.deleted.<timestamp> rotated transcript artifacts
    • keep direct [cron:...] prompt filtering as a defense-in-depth path
  2. Add operator-facing Dreaming session filters

    • block by cron job id/name
    • block by session key prefix, e.g. agent:main:cron:
    • block by agent id
    • block by regex over rendered source path/snippet
    • optionally allow an explicit include-only policy for session ingestion
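To make the second item concrete, here is a rough sketch of how such a filter could be evaluated per session. None of this exists today: SessionFilter, should_ingest and the field names are hypothetical and simply mirror the dreaming.sessionFilter.* options proposed in this issue. The session key agent:main:telegram:1 in the usage below is also made up for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SessionFilter:
    """Hypothetical operator-facing filter; not part of the current schema."""
    exclude_session_key_prefixes: Tuple[str, ...] = ()
    exclude_cron_job_ids: Tuple[str, ...] = ()
    exclude_agents: Tuple[str, ...] = ()
    exclude_regex: Tuple[str, ...] = ()
    # When non-empty, only sessions matching one of these prefixes are ingested.
    include_only_session_key_prefixes: Tuple[str, ...] = ()


def should_ingest(
    flt: SessionFilter,
    *,
    session_key: str,
    agent_id: str,
    cron_job_id: Optional[str] = None,
    rendered: str = "",
) -> bool:
    """Return False if any rule excludes the session, True otherwise."""
    include = flt.include_only_session_key_prefixes
    if include and not session_key.startswith(include):
        return False
    # str.startswith accepts a tuple; an empty tuple never matches.
    if session_key.startswith(flt.exclude_session_key_prefixes):
        return False
    if agent_id in flt.exclude_agents:
        return False
    if cron_job_id is not None and cron_job_id in flt.exclude_cron_job_ids:
        return False
    # Regexes run over the rendered source path/snippet.
    if any(re.search(pattern, rendered) for pattern in flt.exclude_regex):
        return False
    return True
```

For example, a deployment could block all cron sessions of the main agent and keep everything else:

```python
flt = SessionFilter(exclude_session_key_prefixes=("agent:main:cron:",))
should_ingest(flt, session_key="agent:main:cron:abc", agent_id="main")    # excluded
should_ingest(flt, session_key="agent:main:telegram:1", agent_id="main")  # ingested
```

Exclusions are checked after the include-only rule, so a blocklist entry always wins over an allowlist entry in this sketch. Whether that precedence is the right one is an open design question.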

This would let deployments keep Dreaming on for real human conversations while excluding high-volume automation/cron sessions without local patches.
