Skip to content

fix: fall back to canonical session transcripts during cleanup#69954

Open
Blahdude wants to merge 1 commit into
openclaw:mainfrom
Blahdude:fix/50248-sessions-cleanup-fallback
Open

fix: fall back to canonical session transcripts during cleanup#69954
Blahdude wants to merge 1 commit into
openclaw:mainfrom
Blahdude:fix/50248-sessions-cleanup-fallback

Conversation

@Blahdude

Copy link
Copy Markdown

Summary

  • Problem: sessions cleanup --fix-missing can prune a live session when the persisted sessionFile path is stale even though the canonical <sessionId>.jsonl transcript still exists.
  • Why it matters: valid sessions can be incorrectly deleted during cleanup, which makes the command destructive in a recoverable metadata-drift case.
  • What changed: cleanup now checks the canonical transcript path before pruning and repairs stale sessionFile metadata when that canonical transcript exists.
  • What did NOT change (scope boundary): this does not change broader session pruning rules, transcript rotation behavior, or unrelated cleanup paths.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

  • Root cause: cleanup treated a missing persisted sessionFile as authoritative and never checked whether the canonical transcript path for the same sessionId still existed.
  • Missing detection / guardrail: there was no regression test covering stale sessionFile metadata with a live canonical transcript.
  • Contributing context (if known): older or migrated session metadata can retain a structurally valid but outdated transcript path.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/commands/sessions-cleanup.test.ts
  • Scenario the test should lock in: when entry.sessionFile points to a missing file but the canonical <sessionId>.jsonl exists, --fix-missing should keep the session and repair the stored path instead of pruning it.
  • Why this is the smallest reliable guardrail: the bug is isolated to cleanup-path resolution logic and can be reproduced deterministically without a broader integration harness.
  • Existing test that already covers this (if any): none
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

  • sessions cleanup --fix-missing no longer prunes sessions solely because a stale stored transcript path is missing when the canonical transcript still exists.
  • In that case, the command repairs persisted sessionFile metadata to the canonical transcript path.

Diagram (if applicable)

Before:
cleanup -> check stale persisted sessionFile -> file missing -> prune session

After:
cleanup -> check stale persisted sessionFile -> file missing
        -> check canonical transcript path -> file exists
        -> repair sessionFile -> keep session

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node v20.20.2 in local dev checkout
  • Model/provider: N/A
  • Integration/channel (if any): N/A
  • Relevant config (redacted): default local session cleanup path

Steps

  1. Create a session-store entry with sessionId=<id> and a stale but structurally valid sessionFile.
  2. Ensure the stale sessionFile target is missing but the canonical <sessionId>.jsonl transcript exists.
  3. Run sessions cleanup --fix-missing.

Expected

  • The session is kept and its stored sessionFile is repaired to the canonical transcript path.

Actual

  • Before this change, the session was pruned.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: dry-run accounting for missing entries, stale-path fallback to canonical transcript, applied cleanup repairing sessionFile instead of pruning.
  • Edge cases checked: true missing transcripts still count as missing; existing canonical transcript path is only used when it differs from the stale persisted path.
  • What you did not verify: full end-to-end CLI repro against a live user session store.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps:

Risks and Mitigations

  • Risk: cleanup may preserve an entry in cases where the canonical path exists unexpectedly.
    • Mitigation: fallback only applies when the persisted path is missing and the canonical path for the same sessionId exists; otherwise cleanup behavior is unchanged.

@Blahdude Blahdude changed the title [codex] preserve canonical session transcripts during cleanup fix: fall back to canonical session transcripts during cleanup Apr 22, 2026
@openclaw-barnacle openclaw-barnacle Bot added commands Command implementations size: S labels Apr 22, 2026
@Blahdude Blahdude marked this pull request as ready for review April 22, 2026 03:15
@greptile-apps

greptile-apps Bot commented Apr 22, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes sessions cleanup --fix-missing incorrectly pruning sessions whose persisted sessionFile path is stale but whose canonical <sessionId>.jsonl transcript still exists. The fix adds a fallback check in pruneMissingTranscriptEntries that repairs the stored path instead of deleting the entry, and covers both the dry-run preview and the applied path.

  • The wouldMutate field in the JSON output remains false when only path-repairs occur (no prunings), because the function returns only the pruned count. Callers relying on wouldMutate to detect store mutations would see a misleading no-op signal for repair-only runs.

Confidence Score: 5/5

Safe to merge; the bug fix is correct and tests cover both dry-run and apply paths.

All remaining findings are P2. The core fallback logic and in-place repair are correct, the guard conditions are sound, and the test mocking strategy is appropriate. The only gap is a minor output inaccuracy in wouldMutate that does not affect actual cleanup behavior.

No files require special attention.

Comments Outside Diff (1)

  1. src/commands/sessions-cleanup.ts, line 435-441 (link)

    P2 wouldMutate stays false when only repairs happen

    When pruneMissingTranscriptEntries repairs a stale sessionFile but prunes nothing, it returns 0. Both the dry-run summary (missing > 0 || ...) and the applied summary (missingApplied > 0 || ...) use that count as their sole signal from this function, so wouldMutate is reported as false even though the store was mutated. A caller or monitoring script that inspects wouldMutate in the JSON output would incorrectly conclude cleanup was a no-op when in reality session metadata was rewritten.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/commands/sessions-cleanup.ts
    Line: 435-441
    
    Comment:
    **`wouldMutate` stays `false` when only repairs happen**
    
    When `pruneMissingTranscriptEntries` repairs a stale `sessionFile` but prunes nothing, it returns `0`. Both the dry-run summary (`missing > 0 || ...`) and the applied summary (`missingApplied > 0 || ...`) use that count as their sole signal from this function, so `wouldMutate` is reported as `false` even though the store was mutated. A caller or monitoring script that inspects `wouldMutate` in the JSON output would incorrectly conclude cleanup was a no-op when in reality session metadata was rewritten.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/commands/sessions-cleanup.ts
Line: 435-441

Comment:
**`wouldMutate` stays `false` when only repairs happen**

When `pruneMissingTranscriptEntries` repairs a stale `sessionFile` but prunes nothing, it returns `0`. Both the dry-run summary (`missing > 0 || ...`) and the applied summary (`missingApplied > 0 || ...`) use that count as their sole signal from this function, so `wouldMutate` is reported as `false` even though the store was mutated. A caller or monitoring script that inspects `wouldMutate` in the JSON output would incorrectly conclude cleanup was a no-op when in reality session metadata was rewritten.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix: preserve canonical session transcri..." | Re-trigger Greptile

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 881bd5fbc1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +164 to +165
entry.sessionFile = fallbackTranscriptPath;
continue;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Report repaired transcript paths as cleanup mutations

When this branch repairs entry.sessionFile, the function still returns only a prune count (removed), so downstream summary fields derived from that count can report no mutation even though the store is rewritten. In practice, sessions cleanup --fix-missing --dry-run --json can show missing: 0 / wouldMutate: false for stale-path repair cases, which makes dry-run output inaccurate for operators or automation that rely on it.

Useful? React with 👍 / 👎.

@clawsweeper

clawsweeper Bot commented Apr 29, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge.

Latest ClawSweeper review: 2026-05-20 09:46 UTC / May 20, 2026, 5:46 AM ET.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR modifies the command-local sessions cleanup helper to fall back to canonical <sessionId>.jsonl transcripts, repair stale sessionFile values, and add command tests.

Reproducibility: yes. at source level. A store entry with a valid sessionId, a missing stale generated sessionFile, and an existing canonical <sessionId>.jsonl still follows the single resolved-path cleanup check on current main and would be pruned.

PR rating
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🧂 unranked krab
Summary: Useful bug-fix direction, but the PR is not quality-ready because proof is missing and the patch is stale/incomplete for current main.

Rank-up moves:

  • Add redacted after-fix real behavior proof from a real cleanup scenario.
  • Port the fix into src/config/sessions/cleanup-service.ts and keep command code as a caller only.
  • Count repaired transcript paths in dry-run and applied mutation summaries.
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Needs real behavior proof before merge: The PR body describes local verification, but it has no inspectable after-fix terminal output, logs, screenshot, recording, copied live output, or linked artifact; the contributor should add redacted proof and update the PR body to trigger a fresh review, or ask a maintainer to comment @clawsweeper re-review.

Risk before merge

  • Merging the branch as-is would not fix current-main cleanup because the active pruning helper now lives in src/config/sessions/cleanup-service.ts, not the command-local helper changed by the PR.
  • Repair-only runs can still report wouldMutate: false unless repaired transcript paths are counted in preview and applied summaries.
  • The branch is currently conflicting against main, so the exact merge result needs maintainer or contributor handling.
  • The external PR body describes local verification but provides no inspectable after-fix terminal output, logs, screenshot, recording, copied live output, or linked artifact.

Maintainer options:

  1. Port service path and count repairs (recommended)
    Move the fallback and metadata repair into cleanup-service.ts and surface repaired-path mutations in both dry-run and applied summaries before merge.
  2. Replace the stale branch
    If rebasing this branch is awkward, open a fresh current-main replacement that credits this PR and carries the same narrow cleanup-service fix.
  3. Require real cleanup proof
    Wait for redacted after-fix CLI output or logs showing stale-path repair and true-missing pruning before accepting the session-state behavior change.

Next step before merge
Needs human PR handling because the branch is conflicting, lacks inspectable real behavior proof, and must be ported into the cleanup service with mutation accounting.

Security
Cleared: The diff only changes local session cleanup path checks and tests, with no dependency, workflow, network, secret-handling, permission, or code-execution surface added.

Review findings

  • [P2] Port the fallback into the cleanup service — src/commands/sessions-cleanup.ts:152-164
  • [P2] Report repaired transcript paths as mutations — src/commands/sessions-cleanup.ts:164-165
Review details

Best possible solution:

Port the fallback-and-repair behavior into src/config/sessions/cleanup-service.ts, count repaired paths in cleanup summaries, preserve true-missing pruning, add service-level regression coverage, and require redacted CLI proof before merge.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level. A store entry with a valid sessionId, a missing stale generated sessionFile, and an existing canonical <sessionId>.jsonl still follows the single resolved-path cleanup check on current main and would be pruned.

Is this the best way to solve the issue?

No, not as submitted. The fallback-and-repair direction is the right narrow fix, but it needs to be ported to the cleanup service, count repair mutations, and include real behavior proof.

Label justifications:

  • P2: This is a focused session cleanup safety fix for a real linked bug, with limited surface area but meaningful session-history impact.
  • merge-risk: 🚨 session-state: The PR changes cleanup behavior that preserves, prunes, and rewrites session transcript metadata, so an incomplete merge can stale or mis-associate session state.
  • rating: 🧂 unranked krab: Current PR rating is 🧂 unranked krab because proof is 🧂 unranked krab, patch quality is 🧂 unranked krab, and Useful bug-fix direction, but the PR is not quality-ready because proof is missing and the patch is stale/incomplete for current main.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body describes local verification, but it has no inspectable after-fix terminal output, logs, screenshot, recording, copied live output, or linked artifact; the contributor should add redacted proof and update the PR body to trigger a fresh review, or ask a maintainer to comment @clawsweeper re-review.

Full review comments:

  • [P2] Port the fallback into the cleanup service — src/commands/sessions-cleanup.ts:152-164
    Current main runs sessions cleanup through runSessionsCleanup, whose active pruneMissingTranscriptEntries lives in src/config/sessions/cleanup-service.ts. Keeping this fallback only in the stale command-local helper will not fix current main after rebase; move the behavior and coverage to the service path.
    Confidence: 0.9
  • [P2] Report repaired transcript paths as mutations — src/commands/sessions-cleanup.ts:164-165
    When this branch repairs entry.sessionFile, the helper still returns only the prune count. Repair-only dry runs or applied JSON summaries can therefore show missing: 0 and wouldMutate: false even though cleanup rewrote session metadata; return a repaired count or mutation flag and include it in summary accounting.
    Confidence: 0.95

Overall correctness: patch is incorrect
Overall confidence: 0.93

What I checked:

  • Current active cleanup path still uses one resolved transcript path: On current main, pruneMissingTranscriptEntries lives in src/config/sessions/cleanup-service.ts, resolves resolveSessionFilePath(entry.sessionId, entry, ...) once, and deletes the store entry when that one path is absent; there is no canonical transcript fallback or metadata repair in the active service path. (src/config/sessions/cleanup-service.ts:198, 110042d840bb)
  • Command now delegates cleanup to the service: The CLI command calls runSessionsCleanup from src/config/sessions.js, so a fix kept in the old command-local helper will not cover current-main cleanup behavior after rebase. (src/commands/sessions-cleanup.ts:219, 110042d840bb)
  • Repair-only mutation accounting is missing: Current summary logic sets wouldMutate from prune/count signals; the PR's helper mutates entry.sessionFile but still returns only the removed count, so repair-only cleanup can be reported as a no-op unless repairs are counted separately. (src/config/sessions/cleanup-service.ts:354, 110042d840bb)
  • Transcript candidate contract supports the requested fallback model: resolveSessionFilePath prefers a persisted sessionFile, while resolveSessionTranscriptCandidates also adds the store-directory canonical <sessionId>.jsonl candidate; cleanup currently does not use that broader candidate behavior. (src/gateway/session-transcript-files.fs.ts:71, 110042d840bb)
  • PR diff targets stale command-local code: The PR commit changes only src/commands/sessions-cleanup.ts and src/commands/sessions-cleanup.test.ts, adding fallback and repair logic to the old command helper rather than the current cleanup service. (src/commands/sessions-cleanup.ts:150, 881bd5fbc1a2)
  • GitHub state and related item context: Live PR metadata shows this PR closes the still-open canonical issue, is currently conflicting, has a failed Real behavior proof check, and prior review comments already flagged the mutation-accounting gap.

Likely related people:

  • SidQin-cyber: Commit 71e45ce added the --fix-missing cleanup path whose single-path transcript check is implicated by this PR. (role: introduced related cleanup behavior; confidence: high; commits: 71e45ceecced; files: src/commands/sessions-cleanup.ts)
  • Peter Steinberger: Commit b151694 moved sessions cleanup into cleanup-service.ts, and nearby history includes cleanup mutation-check refactors and the latest release commit touching this area. (role: recent area contributor; confidence: high; commits: b151694e0048, bdf3b4a317e2, 50a2481652b6; files: src/config/sessions/cleanup-service.ts, src/commands/sessions-cleanup.ts)
  • Gustavo Madeira Santana: Commit eff3c5ce introduced the sessions cleanup command and store pruning integration coverage that later received the --fix-missing behavior. (role: cleanup command foundation contributor; confidence: medium; commits: eff3c5c70778; files: src/commands/sessions-cleanup.ts, src/config/sessions/store.pruning.integration.test.ts)
  • Tak Hoffman: Commit 81818df changed session-file persistence around fresh sessions, adjacent to the stale sessionFile producer-side behavior discussed in the linked issue. (role: adjacent session-file contributor; confidence: medium; commits: 81818df1b483; files: src/config/sessions/session-file.ts, src/config/sessions/sessions.test.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 110042d840bb.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. labels May 20, 2026
@clawsweeper

clawsweeper Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature or rarity is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

@openclaw-barnacle openclaw-barnacle Bot added the triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. label May 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

commands Command implementations merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. P2 Normal backlog priority with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. size: S status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. triage: refactor-only Candidate: refactor/cleanup-only PR without maintainer context.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: sessions cleanup --fix-missing can prune fresh cron sessions with valid transcripts

2 participants