File-history follow-ups: persistence, shell tracking, bridge tests, perf, failure reasons

Tracks deferred follow-up items from PR #4064 (`/rewind` file restoration). All real but each is large enough to belong outside the introduction PR. Sibling follow-ups: #4173 (session backup cleanup), #4187 (RewindSelector + handleRewindConfirm tests), #4216 (TOCTOU + sticky failed-marker fixes).

## Item A — Cross-session snapshot persistence

`FileHistoryService.getSnapshots()` and `restoreFromSnapshots()` exist but are never called outside tests. After a process restart or `/resume`, the in-memory state is empty, so `/rewind` cannot see snapshots from earlier sessions; the on-disk backup files in `~/.qwen/file-history/{prevSessionId}/` are orphaned with no metadata to restore them.

**Upstream reference**: `claude-code/src/utils/sessionStorage.ts` builds a `FileHistorySnapshotChain` keyed by message uuid into the session transcript; `claude-code/src/utils/sessionRestore.ts` calls `fileHistoryRestoreStateFromLog` on resume to hydrate `appState.fileHistory`. The full loop is closed.

**Acceptance criteria**:
- `ChatRecordingService` (or sidecar) persists each turn's `FileHistorySnapshot` keyed by `promptId`
- Session resume / `/resume` reads the persisted chain and calls `restoreFromSnapshots`
- Schema is versioned so older session logs without snapshots load cleanly
- Tests for resume → rewind round-trip

Estimated: ~80-150 LOC across `chatRecordingService.ts`, `sessionService.ts`, `client.ts`, plus tests.

## Item B — Shell-tool change tracking

PR #4064 hooks file history into `edit` and `write_file` only. Changes made via `run_shell_command` (`sed -i`, `cp`, `mv`, `rm`, `npm run …`, `git apply`, etc.) and out-of-tool manual edits are not captured.

This matches upstream `claude-code` scope (only edit/write/notebook/simulated-sed-edit hook in). PR #4064 added a class-level JSDoc on `FileHistoryService` and a footer hint on the rewind UI ("Rewinding does not affect files edited manually or via shell commands.") so users are not misled.

**Two concrete next steps**:
1. **Port the simulated-sed-edit path** from upstream — when the model emits `sed -i 's/a/b/g' file`, `parseSedEditCommand` re-routes execution through `applySedEdit` (a direct file write), which lets `fileHistoryTrackEdit` fire on the recognized file path. Brings a common bash idiom into the rewind covered set.
2. **Generic shell tracking** is much harder — the proposed approach in the PR review (`git status --porcelain` post-shell) has correctness issues (only works for git repos, semantics are post-hoc not pre-edit, can clobber in-flight backups). Open question: is there a fs-watcher-or-pre-shell-snapshot design worth doing? Upstream punts on this.

**Acceptance criteria**:
- (B1) port `parseSedEditCommand` + `applySedEdit` so qualifying `sed -i` invocations write through file history
- (B2) decide on / prototype generic shell tracking, or formally close it as out of scope

## Item C — `GeminiClient.sendMessageStream` → `makeSnapshot` bridge tests

`client.ts` calls `getFileHistoryService().makeSnapshot(prompt_id)` for `SendMessageType.UserQuery` turns. The integration is not directly tested:
- not asserted that `makeSnapshot` is called for UserQuery
- not asserted that it is *skipped* for non-UserQuery turns and retries
- not asserted that a rejection from `makeSnapshot` is swallowed and the chat flow proceeds

**Acceptance criteria**: focused `sendMessageStream` test that mocks `getFileHistoryService()` and covers the three cases above.

Estimated: ~80-120 LOC test + setup.

## Item D — `getDiffStats` concurrency limit

`FileHistoryService.getDiffStats` runs `Promise.all` over every entry in `state.trackedFiles` with no bound. Each iteration does `stat` + `readFile` (both original and backup) + `diffLines`. On a long-running session with many tracked large files this can spike memory and freeze the UI during the rewind selector's phase-2 diff load.

**Reference**: upstream `claude-code/src/utils/fileHistory.ts fileHistoryGetDiffStats` has the exact same unbounded `Promise.all` pattern, so this is a "go beyond upstream" improvement rather than a port defect. Worth doing because the typical session can grow into the danger zone, but not urgent.

**Acceptance criteria**: bound the parallelism to a small concurrency (e.g. 10) using a worker-pool helper or `p-limit`; add a regression test for the bounded behavior on a large `trackedFiles` set.

Estimated: ~30-50 LOC + 1 test.

## Item E — Per-file failure reason for filesFailed

When `/rewind` reports `Failed to restore N file(s): foo.ts`, the user has no information about *why* — disk full, permission, race, missing backup. The cause is in `debugLogger.error(...)` but invisible under default logging.

**Why this is deferred**:
- Adds `failedReason?: string` to `FileHistoryBackup`, which becomes part of the on-disk schema once item A lands. Want to settle the persistence shape and this field at the same time rather than churn the schema twice.
- `RewindResult.filesFailed` is currently `string[]`; surfacing per-file reasons either changes that to `Array<{ file, reason }>` (API break for callers) or adds a parallel `string[]`. The right shape is easier to settle after item A.
- Translating raw I/O strings (`EACCES`, `ENOSPC`) into user-friendly text is its own UX call.

**Interim option**: append a hint to the user-visible error item — e.g. `" (run with --debug for details)"`. Cheap, no schema change.

**Acceptance criteria**:
- `FileHistoryBackup.failedReason?: string` populated by `makeSnapshot`'s catch
- `applySnapshot` surfaces the reason alongside the file name in `filesFailed`
- Reason rides along in the persisted schema (depends on item A)

Estimated: ~30-40 LOC + i18n + tests, blocked on item A's schema decision.

## claude-code alignment and current scope

Local upstream reference checked in `~/Projects/claude-code` on 2026-06-12.

- **Item A (cross-session persistence): supported upstream.** `claude-code` records standalone JSONL entries with `type: 'file-history-snapshot'`, keyed by the user message UUID. `fileHistoryMakeSnapshot(...)` records the initial snapshot for the turn, and `fileHistoryTrackEdit(...)` records `isSnapshotUpdate: true` for the same message after an edit mutates that snapshot. On resume, `sessionStorage.ts` builds the file-history snapshot chain in transcript order and `fileHistoryRestoreStateFromLog(...)` hydrates `appState.fileHistory`. When the session id changes, `copyFileHistoryForResume(...)` hard-links/copies backup files into the new session's file-history directory. Upstream does not use a dedicated file-history `schemaVersion`; older logs simply have no `file-history-snapshot` entries and load without file-history state.
- **Item B1 (simulated `sed -i` edit): supported upstream.** `BashTool/sedEditParser.ts` recognizes simple in-place sed substitutions, the permission UI previews the exact file diff, and after approval an internal-only `_simulatedSedEdit` is injected. `BashTool` then applies the edit by direct file write and calls `fileHistoryTrackEdit(...)` before writing. This is the only shell-command path upstream routes through file history.
- **Item B2 (generic shell tracking): not supported upstream; out of scope for now.** Arbitrary shell writes such as `cp`, `mv`, `rm`, `git apply`, package scripts, and manually edited files are not tracked by `claude-code`. We should not implement generic shell tracking in this issue unless upstream adds a concrete design later.
- **Item C (qwen-code `GeminiClient.sendMessageStream` bridge tests): qwen-specific.** `claude-code` has no `GeminiClient` equivalent; snapshots are triggered from prompt-submit / query-engine paths. Keep this as qwen-code focused test coverage for our own bridge.
- **Item D (`getDiffStats` concurrency limit): not supported upstream; defer for now.** `claude-code/src/utils/fileHistory.ts:fileHistoryGetDiffStats` still uses unbounded `Promise.all` over `state.trackedFiles`. Treat this as a possible qwen-code hardening task, not a port-alignment requirement.
- **Item E (per-file failure reasons): not supported upstream; defer for now.** `claude-code` logs per-file restore failures and telemetry events but does not expose `filesFailed` reasons to users or persist a `failedReason` field. Keep qwen-code's current file-name-only failure reporting unless a separate UX/schema decision is made.

**Current active scope from upstream parity:** finish/verify Item A's persistence semantics, including snapshot-update behavior where needed; port Item B1 simulated sed edit; keep Item C bridge tests. Generic shell tracking, diff concurrency limiting, and per-file failure reasons are intentionally deferred because `claude-code` does not support them today.

## Related

- PR #4064 — introduces all integration points
- PR #4216 — TOCTOU ordering + sticky failed marker fixes
- #4173 — file-history session backup cleanup (mtime-based housekeeping)
- #4187 — RewindSelector + handleRewindConfirm tests

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

File-history follow-ups: persistence, shell tracking, bridge tests, perf, failure reasons #4204

Item A — Cross-session snapshot persistence

Item B — Shell-tool change tracking

Item C — `GeminiClient.sendMessageStream` → `makeSnapshot` bridge tests

Item D — `getDiffStats` concurrency limit

Item E — Per-file failure reason for filesFailed

claude-code alignment and current scope

Related

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

File-history follow-ups: persistence, shell tracking, bridge tests, perf, failure reasons #4204

Description

Item A — Cross-session snapshot persistence

Item B — Shell-tool change tracking

Item C — GeminiClient.sendMessageStream → makeSnapshot bridge tests

Item D — getDiffStats concurrency limit

Item E — Per-file failure reason for filesFailed

claude-code alignment and current scope

Related

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

Item C — `GeminiClient.sendMessageStream` → `makeSnapshot` bridge tests

Item D — `getDiffStats` concurrency limit