Skip to content

fix: persist orphaned user leaf repair#76969

Closed
serkonyc wants to merge 2 commits into
openclaw:mainfrom
serkonyc:fix/persist-orphan-user-repair
Closed

fix: persist orphaned user leaf repair#76969
serkonyc wants to merge 2 commits into
openclaw:mainfrom
serkonyc:fix/persist-orphan-user-repair

Conversation

@serkonyc

@serkonyc serkonyc commented May 3, 2026

Copy link
Copy Markdown
Contributor

Summary

  • persist orphaned trailing user leaf removal instead of only moving the in-memory SessionManager branch
  • keep branch/reset fallback for managers that do not expose file rewrite internals
  • add unit coverage for persisted leaf removal and fallback behavior
  • fix an existing oxlint unbound-method failure in the Discord draft preview callback path that CI surfaced on this branch

Fixes #76968.

Test plan

  • pnpm lint --threads=8
  • pnpm exec vitest run src/agents/pi-embedded-runner/run/attempt.test.ts --config test/vitest/vitest.unit.config.ts
  • pnpm tsgo:test:src

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S labels May 3, 2026
@clawsweeper

clawsweeper Bot commented May 3, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge.

Summary
The PR adds a persisted leaf-removal helper for file-backed embedded-runner SessionManager orphan-user repair and unit tests for rewrite and fallback behavior.

Reproducibility: yes. source-reproducible: a file-backed SessionManager with a trailing user leaf reaches the orphan-repair block, and current main only calls branch() or resetLeaf() before rebuilding active messages. I did not run a live OpenClaw session in this read-only review.

Real behavior proof
Needs real behavior proof before merge: The PR body and comments provide lint/unit/typecheck evidence only, not after-fix evidence from a real OpenClaw setup. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, ask a maintainer to comment @clawsweeper re-review.

Next step before merge
Contributor action is needed for real behavior proof; an automated repair lane cannot provide evidence from the contributor's real setup.

Security
Cleared: The diff is limited to embedded-runner transcript persistence logic and focused unit tests, with no dependency, workflow, permission, secret-handling, install, or publishing change.

Review details

Best possible solution:

Land this or an equivalent narrow persisted leaf-removal fix after redacted real behavior proof is added and focused embedded-runner validation stays green.

Do we have a high-confidence way to reproduce the issue?

Yes, source-reproducible: a file-backed SessionManager with a trailing user leaf reaches the orphan-repair block, and current main only calls branch() or resetLeaf() before rebuilding active messages. I did not run a live OpenClaw session in this read-only review.

Is this the best way to solve the issue?

Yes for the code direction, but not yet for merge readiness. Rewriting the file-backed SessionManager entries while retaining the branch/reset fallback is the narrowest maintainable fix I saw, but the external PR still needs real behavior proof.

What I checked:

  • current_main_in_memory_only: Current main repairs a trailing user leaf by calling sessionManager.branch(...) or sessionManager.resetLeaf() and rebuilding active in-memory messages; this block does not remove the persisted file entry or call a rewrite hook. (src/agents/pi-embedded-runner/run/attempt.ts:3197, 333f65fc8a12)
  • helper_contract_requests_removal: mergeOrphanedTrailingUserPrompt returns removeLeaf: true for empty, already-queued, and merged orphaned user content, so the runtime path intentionally removes the active leaf after carrying or dropping the prompt text. (src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts:437, 333f65fc8a12)
  • current_tests_cover_merge_not_persistence: Current tests cover single-leaf merge and removeLeaf behavior, but current main has no removeSessionManagerLeafEntry helper or persisted JSONL rewrite coverage for this orphan-user repair path. (src/agents/pi-embedded-runner/run/attempt.test.ts:689, 333f65fc8a12)
  • dependency_contract_checked: OpenClaw pins @earendil-works/pi-coding-agent to 0.74.0; the upstream v0.74.0 SessionManager keeps private fileEntries, byId, and leafId, _rewriteFile() rewrites fileEntries, and branch()/resetLeaf() only move leafId. (package.json:1759, 333f65fc8a12)
  • pr_implements_narrow_rewrite_path: The PR adds removeSessionManagerLeafEntry, which removes the leaf entry from fileEntries, deletes it from byId, moves leafId to the parent, calls _rewriteFile(), and falls back to branch/reset when rewrite internals are unavailable. (src/agents/pi-embedded-runner/run/attempt.ts:563, e42bceefd658)
  • real_behavior_proof_missing: The PR body lists lint, a focused Vitest command, and tsgo:test:src; the author follow-up also states the current blocker is the real behavior proof gate, not the fix direction. (2b4518a0d0c2)

Likely related people:

  • vincentkoc: Commit fcae3bf9433b98b22c1d995644e9e33f0621d8f8 added the active-turn queued user prompt preservation helper and tests in the same embedded-runner path this PR extends. (role: introduced current orphan-merge behavior; confidence: high; commits: fcae3bf9433b; files: src/agents/pi-embedded-runner/run/attempt.ts, src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts, src/agents/pi-embedded-runner/run/attempt.test.ts)
  • steipete: Recent GitHub path history and local blame show Peter Steinberger repeatedly touching attempt.ts and adjacent runtime-state/transcript areas, including current-line snapshot ownership in this checkout. (role: recent embedded-runner and transcript-state contributor; confidence: medium; commits: e79e5dbbdf5a, c6ddb1afb7bc, 694ca50e9775; files: src/agents/pi-embedded-runner/run/attempt.ts, src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts, src/agents/pi-embedded-runner/transcript-file-state.ts)
  • jalehman: Recent session repair/replay work touches the same embedded-runner and transcript consistency area, so this is a useful secondary routing candidate for review coordination. (role: adjacent session repair contributor; confidence: medium; commits: 4bfd7416f0f9, 75e7fc97f804; files: src/agents/pi-embedded-runner/run/attempt.ts, src/agents/session-file-repair.ts)

Remaining risk / open question:

  • The contributor has not supplied after-fix evidence from a real OpenClaw setup, which is required before merge for an external bug-fix PR.
  • The implementation intentionally relies on private SessionManager internals from the pinned Pi dependency; that matches existing local repair patterns but remains coupled to upstream internals.
  • The broader stale-reply transcript cluster has adjacent open work, so this PR should stay scoped to persisted orphaned-user leaf removal rather than claiming to solve every stale-context symptom.
  • No live OpenClaw run or test command was executed in this read-only review; the reproduction assessment is source-backed plus issue discussion evidence.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 333f65fc8a12.

@openclaw-barnacle openclaw-barnacle Bot added triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. and removed channel: discord Channel integration: discord labels May 8, 2026
@serkonyc

serkonyc commented May 8, 2026

Copy link
Copy Markdown
Contributor Author

Maintenance sweep after v2026.5.7:

  • Checked the latest release notes and current origin/main; I don't see this fix shipped yet, so this PR should stay open.
  • Updated the branch from latest main via gh pr update-branch.
  • Current blocker: the new Real behavior proof gate is failing because the PR body only has unit/lint/typecheck evidence. It needs after-fix evidence from a real OpenClaw setup before merge.

Not closing this as fixed-by-release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: orphaned user repair leaves persisted transcript out of sync

1 participant