Skip to content

File-history follow-ups: persistence, shell tracking, bridge tests, perf, failure reasons #4204

@doudouOUC

Description

@doudouOUC

Tracks deferred follow-up items from PR #4064 (/rewind file restoration). All real but each is large enough to belong outside the introduction PR. Sibling follow-ups: #4173 (session backup cleanup), #4187 (RewindSelector + handleRewindConfirm tests), #4216 (TOCTOU + sticky failed-marker fixes).

Item A — Cross-session snapshot persistence

FileHistoryService.getSnapshots() and restoreFromSnapshots() exist but are never called outside tests. After a process restart or /resume, the in-memory state is empty, so /rewind cannot see snapshots from earlier sessions; the on-disk backup files in ~/.qwen/file-history/{prevSessionId}/ are orphaned with no metadata to restore them.

Upstream reference: claude-code/src/utils/sessionStorage.ts builds a FileHistorySnapshotChain keyed by message uuid into the session transcript; claude-code/src/utils/sessionRestore.ts calls fileHistoryRestoreStateFromLog on resume to hydrate appState.fileHistory. The full loop is closed.

Acceptance criteria:

  • ChatRecordingService (or sidecar) persists each turn's FileHistorySnapshot keyed by promptId
  • Session resume / /resume reads the persisted chain and calls restoreFromSnapshots
  • Schema is versioned so older session logs without snapshots load cleanly
  • Tests for resume → rewind round-trip

Estimated: ~80-150 LOC across chatRecordingService.ts, sessionService.ts, client.ts, plus tests.

Item B — Shell-tool change tracking

PR #4064 hooks file history into edit and write_file only. Changes made via run_shell_command (sed -i, cp, mv, rm, npm run …, git apply, etc.) and out-of-tool manual edits are not captured.

This matches upstream claude-code scope (only edit/write/notebook/simulated-sed-edit hook in). PR #4064 added a class-level JSDoc on FileHistoryService and a footer hint on the rewind UI ("Rewinding does not affect files edited manually or via shell commands.") so users are not misled.

Two concrete next steps:

  1. Port the simulated-sed-edit path from upstream — when the model emits sed -i 's/a/b/g' file, parseSedEditCommand re-routes execution through applySedEdit (a direct file write), which lets fileHistoryTrackEdit fire on the recognized file path. Brings a common bash idiom into the rewind covered set.
  2. Generic shell tracking is much harder — the proposed approach in the PR review (git status --porcelain post-shell) has correctness issues (only works for git repos, semantics are post-hoc not pre-edit, can clobber in-flight backups). Open question: is there a fs-watcher-or-pre-shell-snapshot design worth doing? Upstream punts on this.

Acceptance criteria:

  • (B1) port parseSedEditCommand + applySedEdit so qualifying sed -i invocations write through file history
  • (B2) decide on / prototype generic shell tracking, or formally close it as out of scope

Item C — GeminiClient.sendMessageStreammakeSnapshot bridge tests

client.ts calls getFileHistoryService().makeSnapshot(prompt_id) for SendMessageType.UserQuery turns. The integration is not directly tested:

  • not asserted that makeSnapshot is called for UserQuery
  • not asserted that it is skipped for non-UserQuery turns and retries
  • not asserted that a rejection from makeSnapshot is swallowed and the chat flow proceeds

Acceptance criteria: focused sendMessageStream test that mocks getFileHistoryService() and covers the three cases above.

Estimated: ~80-120 LOC test + setup.

Item D — getDiffStats concurrency limit

FileHistoryService.getDiffStats runs Promise.all over every entry in state.trackedFiles with no bound. Each iteration does stat + readFile (both original and backup) + diffLines. On a long-running session with many tracked large files this can spike memory and freeze the UI during the rewind selector's phase-2 diff load.

Reference: upstream claude-code/src/utils/fileHistory.ts fileHistoryGetDiffStats has the exact same unbounded Promise.all pattern, so this is a "go beyond upstream" improvement rather than a port defect. Worth doing because the typical session can grow into the danger zone, but not urgent.

Acceptance criteria: bound the parallelism to a small concurrency (e.g. 10) using a worker-pool helper or p-limit; add a regression test for the bounded behavior on a large trackedFiles set.

Estimated: ~30-50 LOC + 1 test.

Item E — Per-file failure reason for filesFailed

When /rewind reports Failed to restore N file(s): foo.ts, the user has no information about why — disk full, permission, race, missing backup. The cause is in debugLogger.error(...) but invisible under default logging.

Why this is deferred:

  • Adds failedReason?: string to FileHistoryBackup, which becomes part of the on-disk schema once item A lands. Want to settle the persistence shape and this field at the same time rather than churn the schema twice.
  • RewindResult.filesFailed is currently string[]; surfacing per-file reasons either changes that to Array<{ file, reason }> (API break for callers) or adds a parallel string[]. The right shape is easier to settle after item A.
  • Translating raw I/O strings (EACCES, ENOSPC) into user-friendly text is its own UX call.

Interim option: append a hint to the user-visible error item — e.g. " (run with --debug for details)". Cheap, no schema change.

Acceptance criteria:

  • FileHistoryBackup.failedReason?: string populated by makeSnapshot's catch
  • applySnapshot surfaces the reason alongside the file name in filesFailed
  • Reason rides along in the persisted schema (depends on item A)

Estimated: ~30-40 LOC + i18n + tests, blocked on item A's schema decision.

claude-code alignment and current scope

Local upstream reference checked in ~/Projects/claude-code on 2026-06-12.

  • Item A (cross-session persistence): supported upstream. claude-code records standalone JSONL entries with type: 'file-history-snapshot', keyed by the user message UUID. fileHistoryMakeSnapshot(...) records the initial snapshot for the turn, and fileHistoryTrackEdit(...) records isSnapshotUpdate: true for the same message after an edit mutates that snapshot. On resume, sessionStorage.ts builds the file-history snapshot chain in transcript order and fileHistoryRestoreStateFromLog(...) hydrates appState.fileHistory. When the session id changes, copyFileHistoryForResume(...) hard-links/copies backup files into the new session's file-history directory. Upstream does not use a dedicated file-history schemaVersion; older logs simply have no file-history-snapshot entries and load without file-history state.
  • Item B1 (simulated sed -i edit): supported upstream. BashTool/sedEditParser.ts recognizes simple in-place sed substitutions, the permission UI previews the exact file diff, and after approval an internal-only _simulatedSedEdit is injected. BashTool then applies the edit by direct file write and calls fileHistoryTrackEdit(...) before writing. This is the only shell-command path upstream routes through file history.
  • Item B2 (generic shell tracking): not supported upstream; out of scope for now. Arbitrary shell writes such as cp, mv, rm, git apply, package scripts, and manually edited files are not tracked by claude-code. We should not implement generic shell tracking in this issue unless upstream adds a concrete design later.
  • Item C (qwen-code GeminiClient.sendMessageStream bridge tests): qwen-specific. claude-code has no GeminiClient equivalent; snapshots are triggered from prompt-submit / query-engine paths. Keep this as qwen-code focused test coverage for our own bridge.
  • Item D (getDiffStats concurrency limit): not supported upstream; defer for now. claude-code/src/utils/fileHistory.ts:fileHistoryGetDiffStats still uses unbounded Promise.all over state.trackedFiles. Treat this as a possible qwen-code hardening task, not a port-alignment requirement.
  • Item E (per-file failure reasons): not supported upstream; defer for now. claude-code logs per-file restore failures and telemetry events but does not expose filesFailed reasons to users or persist a failedReason field. Keep qwen-code's current file-name-only failure reporting unless a separate UX/schema decision is made.

Current active scope from upstream parity: finish/verify Item A's persistence semantics, including snapshot-update behavior where needed; port Item B1 simulated sed edit; keep Item C bridge tests. Generic shell tracking, diff concurrency limiting, and per-file failure reasons are intentionally deferred because claude-code does not support them today.

Related

🤖 Generated with Qwen Code

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions