[Bug]: QMD archived session hits can be dropped by visibility filter due to normalized path mapping #83506

@qxunomni-coder

Summary

QMD can index and retrieve archived/reset session transcripts, but memory_search(corpus="sessions") can still drop those hits after retrieval because the session-hit visibility filter cannot map QMD's normalized archived-transcript path back to the original session key.

This is distinct from #80736 and #30220. Those covered archived transcripts being skipped or producing zero chunks. In this case, the archived transcript has content, QMD returns a hit, but the tool/API path removes it during post-search visibility filtering.

Related: #53550, #80736, #30220.

Environment

  • OpenClaw: 2026.5.7 (eeef486)
  • Backend: memory.backend = "qmd"
  • QMD sessions enabled: memory.qmd.sessions.enabled = true
  • Session recall query path: memory_search(..., corpus="sessions")
  • Same-agent cross-session visibility configured for testing:
    • tools.sessions.visibility = "agent"
    • agents.defaults.sandbox.sessionToolsVisibility = "all"

Observed behavior

For a reset/archived session transcript, the QMD-exported source file exists with a name like:

<session-id>.jsonl.reset.<timestamp>.md

The exported Markdown contains the expected conversation text, and direct QMD/CLI search can retrieve it.

However, the memory_search tool path can return no archived hit for the same query. Non-archived session hits from the same QMD sessions collection do surface correctly.

Reproduction outline

  1. Enable QMD session export:
{
  memory: {
    backend: "qmd",
    qmd: {
      sessions: { enabled: true }
    }
  }
}
  2. Have a distinctive conversation in a session.
  3. Reset/archive the session so the transcript becomes a usage-counted archive, e.g.:
<session-id>.jsonl.reset.<timestamp>
  4. Let QMD export/index the archived transcript as Markdown.
  5. Search for distinctive text from that archived transcript using direct QMD/CLI search. The hit is returned from a QMD sessions collection.
  6. Search for the same text via the agent tool:
memory_search({ query: "<distinctive text>", corpus: "sessions", maxResults: 5, minScore: 0.1 })
  7. The archived hit may be absent, while ordinary non-archived session hits still appear.

Local diagnosis

The failure appears to happen after QMD retrieval, inside the session-hit visibility filter:

  • extensions/memory-core/src/session-search-visibility.ts
  • filterMemorySearchHitsBySessionVisibility(...)
  • extractTranscriptIdentityFromSessionsMemoryHit(...)
  • resolveTranscriptStemToSessionKeys(...)

For normal QMD session result paths like:

qmd/sessions-main/<session-id>.md

extractTranscriptIdentityFromSessionsMemoryHit() derives <session-id>, and resolveTranscriptStemToSessionKeys() maps it back to the corresponding session store key. The visibility guard can then allow it.

For archived/reset hits, QMD may return a normalized/slugified result path like:

qmd/sessions-main/<session-id>-jsonl-reset-<timestamp>.md

The current identity extraction treats the whole slug as a plain .md stem:

<session-id>-jsonl-reset-<timestamp>

That stem does not match the canonical session id or session file basename in the combined session store, so resolveTranscriptStemToSessionKeys() returns no keys. With no mapped session key, filterMemorySearchHitsBySessionVisibility() drops the hit.
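
To make the failure concrete, here is a minimal model of the lookup miss. It is a sketch, not the real code: `naiveStem`, `resolveKeys`, the `SessionStore` shape, the session key format, and the timestamp value are all hypothetical stand-ins for the functions and data named above.

```typescript
// Hypothetical stand-ins; names, shapes, and values are assumptions for illustration.
type SessionStore = Map<string, string>; // transcript stem -> session key

// Mirrors the described behaviour: strip ".md" and treat the rest as the stem.
function naiveStem(resultPath: string): string {
  const base = resultPath.split("/").pop() ?? resultPath;
  return base.replace(/\.md$/, "");
}

// Returns the session keys known for a stem, or an empty list on a miss.
function resolveKeys(store: SessionStore, stem: string): string[] {
  const key = store.get(stem);
  return key === undefined ? [] : [key];
}

// The store only knows the canonical session id (placeholder key format).
const store: SessionStore = new Map([["sess-1", "agent:main:sess-1"]]);

// Live transcript path: the stem equals the session id, so a key is found.
const liveKeys = resolveKeys(store, naiveStem("qmd/sessions-main/sess-1.md"));

// Archived, slugified path: the whole slug is used as the stem, so nothing matches.
const archivedKeys = resolveKeys(
  store,
  naiveStem("qmd/sessions-main/sess-1-jsonl-reset-20260507.md"),
);
```

With an empty key list, a visibility filter that requires at least one mapped session key has nothing to evaluate and drops the hit.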

Expected behavior

If QMD returns an archived/reset session transcript hit, OpenClaw should map the QMD result path back to the original session id/key before applying session visibility.

For example, both of these should resolve to the same underlying session id:

<session-id>.jsonl.reset.<timestamp>.md
<session-id>-jsonl-reset-<timestamp>.md

Then the existing visibility policy (self / tree / agent / all, plus sandbox clamp and agent-to-agent rules) can decide whether to surface it.
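
One possible shape for that mapping is sketched below. `sessionIdFromQmdStem` is a hypothetical helper, not an existing OpenClaw function, and the pattern assumes that archived names always follow `<session-id>`, `jsonl`, `reset` or `deleted`, then a timestamp, separated by `.` in the exported file name and by `-` after QMD normalization. The timestamp format is not assumed.

```typescript
// Hypothetical helper; the separator and suffix layout are assumptions
// based on the example names in this issue.
const ARCHIVED_STEM = /^(.+?)[.-]jsonl[.-](?:reset|deleted)[.-].+$/;

function sessionIdFromQmdStem(resultPath: string): string {
  // Keep only the file name, then drop the ".md" added by the export.
  const base = resultPath.split("/").pop() ?? resultPath;
  const stem = base.replace(/\.md$/i, "");
  // The lazy first group stops at the first "jsonl" + "reset"/"deleted" marker,
  // so hyphens inside the session id itself are preserved.
  const match = ARCHIVED_STEM.exec(stem);
  // Non-archived stems are returned unchanged.
  return match?.[1] ?? stem;
}
```

Under these assumptions, the dotted and slugified forms both reduce to the same id, and a plain `<session-id>.md` path is left as it is. A session id that itself contains `-jsonl-reset-` would be truncated, so a real fix may prefer to confirm the candidate id against the session store before accepting it.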

Actual behavior

The archived hit is indexed and retrievable by QMD/CLI, but can be dropped by memory_search(corpus="sessions") because the normalized archived result path cannot be mapped back to a session key.

Why this matters

This creates a confusing failure mode:

  • QMD has the transcript content.
  • Direct search proves the archived session is indexed.
  • memory_search(corpus="sessions") returns no relevant archived hit.
  • Debug output reports only final hit counts, so users cannot tell whether the backend found hits that were later removed by visibility filtering.

From the user's perspective, reset/archive can still look like forgotten conversation history even after the #80736 indexing fix.

Suggested fixes

  1. Teach extractTranscriptIdentityFromSessionsMemoryHit() / related path handling to recognize QMD-normalized archived transcript names such as:
<session-id>-jsonl-reset-<timestamp>.md
<session-id>-jsonl-deleted-<timestamp>.md

and map them back to <session-id>.

  2. Add a regression test where:

    • QMD returns a session hit path with a normalized archived/reset stem.
    • The combined session store has the canonical archived session id/file.
    • filterMemorySearchHitsBySessionVisibility() keeps the hit when visibility allows it.
  3. Add memory_search debug counters for raw backend hits vs post-visibility-filter hits, with redacted reason buckets such as:

    • visibility_tree
    • sandbox_clamp
    • unmapped_transcript_stem
    • cross_agent_denied

This would make #53550-style failures much easier to diagnose without weakening the privacy model.
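
As a rough illustration of the counters idea, the sketch below wraps a per-hit decision and tallies drops by bucket. Only the bucket labels come from the list above; `Hit`, `FilterDebug`, `Decide`, and `filterWithDebug` are hypothetical names, and the real filter may be structured quite differently.

```typescript
// Hypothetical types and names; only the bucket labels come from the issue text.
type DropReason =
  | "visibility_tree"
  | "sandbox_clamp"
  | "unmapped_transcript_stem"
  | "cross_agent_denied";

interface Hit {
  path: string;
  score: number;
}

interface FilterDebug {
  rawHits: number; // hits returned by the backend
  keptHits: number; // hits that survive visibility filtering
  dropped: Record<DropReason, number>;
}

// A decision is either "keep" or a redacted reason for dropping the hit.
type Decide = (hit: Hit) => "keep" | DropReason;

function filterWithDebug(
  hits: Hit[],
  decide: Decide,
): { kept: Hit[]; debug: FilterDebug } {
  const dropped: Record<DropReason, number> = {
    visibility_tree: 0,
    sandbox_clamp: 0,
    unmapped_transcript_stem: 0,
    cross_agent_denied: 0,
  };
  const kept: Hit[] = [];
  for (const hit of hits) {
    const decision = decide(hit);
    if (decision === "keep") {
      kept.push(hit);
    } else {
      // Count only the bucket; never record the path or session id.
      dropped[decision] += 1;
    }
  }
  return { kept, debug: { rawHits: hits.length, keptHits: kept.length, dropped } };
}
```

Because the debug object carries counts only, a user could see that the backend returned hits that were later removed, and roughly why, without learning anything about sessions they are not allowed to see.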

Privacy note

I intentionally used placeholder session ids/paths here. The underlying issue is filename/slug normalization and session-key resolution, not any user-specific data.

Metadata

Labels

  • P2: Normal backlog priority with limited blast radius.
  • clawsweeper:fix-shape-clear: ClawSweeper found a clear likely implementation shape for this issue.
  • clawsweeper:queueable-fix: ClawSweeper marked this issue as an existing queue_fix_pr work candidate.
  • clawsweeper:source-repro: ClawSweeper found a high-confidence source-level issue reproduction.
  • impact:message-loss: Channel message delivery can be lost, duplicated, or misrouted.
  • impact:session-state: Session, memory, transcript, context, or agent state can drift or corrupt.
  • issue-rating: 🦞 diamond lobster: Very strong issue quality with high-confidence source-level or clear reproduction.