fix(backup): gracefully skip session files deleted during backup create#67472
fix(backup): gracefully skip session files deleted during backup create#67472zhouhe-xydt wants to merge 1 commit intoopenclaw:mainfrom
Conversation
`openclaw backup create` could fail with ENOENT when the gateway's session compaction deleted a session file between enumeration and tar archiving. Replace `tar.c` with `tar.Pack` so that ENOENT errors during archiving are caught and skipped rather than failing the entire backup. Skipped files are reported in the backup summary. Fixes openclaw#67417
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ca60d00251
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| await fs.writeFile(manifestPath, `${JSON.stringify(manifest, null, 2)}\n`, "utf8"); | ||
|
|
||
| await tar.c( | ||
| const skippedPaths = await createTarArchiveWithMissingSkip( |
There was a problem hiding this comment.
Regenerate manifest after archive-time path skips
createBackupArchive writes manifest.json before createTarArchiveWithMissingSkip discovers ENOENT paths and appends them to result.skipped. If an entire planned asset path disappears between planning and packing (for example, an external workspace dir), the archive omits that payload but manifest.assets still declares it present, which later fails verification in verifyManifestAgainstEntries (src/commands/backup-verify.ts:236-247) and leaves non-verified backups internally inconsistent. Rebuild or rewrite the manifest after applying skippedPaths (or drop missing assets from assets) before publishing.
Useful? React with 👍 / 👎.
Greptile SummaryThis PR fixes a race condition in Three minor observations:
Confidence Score: 5/5Safe to merge; all findings are P2 style or test-quality observations that don't affect production correctness. The production fix is logically sound — tar.Pack emits file-level errors without erroring the write stream, so the Promise correctly resolves via stream.on('close') even when ENOENT entries are skipped. No P0 or P1 issues were found. The three noted issues are style (redundant file option), documentation (in-archive manifest gap), and test quality (mock can't exercise the new path), none of which affect runtime correctness. src/commands/backup.test-support.ts — MockPack.end() propagates all errors to the stream, blocking future regression tests for the ENOENT-skip path. Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/commands/backup.test-support.ts
Line: 41-47
Comment:
**Mock propagates ENOENT to stream, preventing testing of skip behavior**
In `MockPack.end()`, any `tarCreateMock` rejection always reaches `this.stream?.emit("error", err)` — even for ENOENT errors that the pack error listener (line 155–161 in `backup-create.ts`) intentionally swallows by returning early. In production, a file-level ENOENT emitted on the Pack doesn't error the write stream; the stream keeps going and eventually closes. In the mock, that same ENOENT would still fire `stream.on("error", reject)` on line 154, causing the Promise to reject instead of resolving with the populated `skipped` array.
This means no regression test can be written for the primary fix using the current mock — any test that simulates a missing file via `tarCreateMock.mockRejectedValue({ code: "ENOENT", path: "/..." })` would see the operation fail rather than succeed with a skipped entry. The mock needs to distinguish pack-level file errors (which don't propagate to the write stream in production) from genuine stream errors.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/infra/backup-create.ts
Line: 142-150
Comment:
**`file` option is unused by `tar.Pack` — only `tar.c`/`createFile` opens the stream**
`tar.Pack` stores `file` in its options object but never opens a WriteStream for it; that behaviour belongs exclusively to the higher-level `tar.c` / `createFile` helpers. The actual output here is driven entirely by `pack.pipe(createWriteStream(params.file))` two lines below. Passing `file` to the Pack constructor is harmless but misleading — it looks like it controls where the archive is written (as it does in `tar.c`), but it doesn't.
```suggestion
const pack = new tar.Pack({
gzip: params.gzip,
portable: params.portable,
preservePaths: params.preservePaths,
onWriteEntry: params.onWriteEntry,
});
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/infra/backup-create.ts
Line: 397-414
Comment:
**Manifest written before archiving won't reflect ENOENT-skipped files**
The manifest is serialised and written to `manifestPath` before `createTarArchiveWithMissingSkip` runs. ENOENT-skipped entries are appended to `result.skipped` only after archiving completes, so the manifest embedded in the archive will never list them. Any restore tool (or human) inspecting the in-archive manifest would see the sessions asset as fully present even when individual files were silently omitted.
This is a trade-off inherent in writing the manifest first so it can be included in the archive, and the CLI summary output does report the skipped files correctly. Worth an inline comment explaining the gap so future maintainers don't assume the in-archive manifest is a complete record of what was actually archived vs. skipped.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix(backup): gracefully skip session fil..." | Re-trigger Greptile |
| } catch (err) { | ||
| for (const listener of this.errorListeners) { | ||
| listener(err); | ||
| } | ||
| this.stream?.emit("error" as never, err); | ||
| } | ||
| } |
There was a problem hiding this comment.
Mock propagates ENOENT to stream, preventing testing of skip behavior
In MockPack.end(), any tarCreateMock rejection always reaches this.stream?.emit("error", err) — even for ENOENT errors that the pack error listener (line 155–161 in backup-create.ts) intentionally swallows by returning early. In production, a file-level ENOENT emitted on the Pack doesn't error the write stream; the stream keeps going and eventually closes. In the mock, that same ENOENT would still fire stream.on("error", reject) on line 154, causing the Promise to reject instead of resolving with the populated skipped array.
This means no regression test can be written for the primary fix using the current mock — any test that simulates a missing file via tarCreateMock.mockRejectedValue({ code: "ENOENT", path: "/..." }) would see the operation fail rather than succeed with a skipped entry. The mock needs to distinguish pack-level file errors (which don't propagate to the write stream in production) from genuine stream errors.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/commands/backup.test-support.ts
Line: 41-47
Comment:
**Mock propagates ENOENT to stream, preventing testing of skip behavior**
In `MockPack.end()`, any `tarCreateMock` rejection always reaches `this.stream?.emit("error", err)` — even for ENOENT errors that the pack error listener (line 155–161 in `backup-create.ts`) intentionally swallows by returning early. In production, a file-level ENOENT emitted on the Pack doesn't error the write stream; the stream keeps going and eventually closes. In the mock, that same ENOENT would still fire `stream.on("error", reject)` on line 154, causing the Promise to reject instead of resolving with the populated `skipped` array.
This means no regression test can be written for the primary fix using the current mock — any test that simulates a missing file via `tarCreateMock.mockRejectedValue({ code: "ENOENT", path: "/..." })` would see the operation fail rather than succeed with a skipped entry. The mock needs to distinguish pack-level file errors (which don't propagate to the write stream in production) from genuine stream errors.
How can I resolve this? If you propose a fix, please make it concise.| const pack = new tar.Pack({ | ||
| file: params.file, | ||
| gzip: params.gzip, | ||
| portable: params.portable, | ||
| preservePaths: params.preservePaths, | ||
| onWriteEntry: params.onWriteEntry, | ||
| }); | ||
| const stream = createWriteStream(params.file); | ||
| pack.pipe(stream); |
There was a problem hiding this comment.
file option is unused by tar.Pack — only tar.c/createFile opens the stream
tar.Pack stores file in its options object but never opens a WriteStream for it; that behaviour belongs exclusively to the higher-level tar.c / createFile helpers. The actual output here is driven entirely by pack.pipe(createWriteStream(params.file)) two lines below. Passing file to the Pack constructor is harmless but misleading — it looks like it controls where the archive is written (as it does in tar.c), but it doesn't.
| const pack = new tar.Pack({ | |
| file: params.file, | |
| gzip: params.gzip, | |
| portable: params.portable, | |
| preservePaths: params.preservePaths, | |
| onWriteEntry: params.onWriteEntry, | |
| }); | |
| const stream = createWriteStream(params.file); | |
| pack.pipe(stream); | |
| const pack = new tar.Pack({ | |
| gzip: params.gzip, | |
| portable: params.portable, | |
| preservePaths: params.preservePaths, | |
| onWriteEntry: params.onWriteEntry, | |
| }); |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/infra/backup-create.ts
Line: 142-150
Comment:
**`file` option is unused by `tar.Pack` — only `tar.c`/`createFile` opens the stream**
`tar.Pack` stores `file` in its options object but never opens a WriteStream for it; that behaviour belongs exclusively to the higher-level `tar.c` / `createFile` helpers. The actual output here is driven entirely by `pack.pipe(createWriteStream(params.file))` two lines below. Passing `file` to the Pack constructor is harmless but misleading — it looks like it controls where the archive is written (as it does in `tar.c`), but it doesn't.
```suggestion
const pack = new tar.Pack({
gzip: params.gzip,
portable: params.portable,
preservePaths: params.preservePaths,
onWriteEntry: params.onWriteEntry,
});
```
How can I resolve this? If you propose a fix, please make it concise.| await fs.writeFile(manifestPath, `${JSON.stringify(manifest, null, 2)}\n`, "utf8"); | ||
|
|
||
| await tar.c( | ||
| const skippedPaths = await createTarArchiveWithMissingSkip( | ||
| { | ||
| file: tempArchivePath, | ||
| gzip: true, | ||
| portable: true, | ||
| preservePaths: true, | ||
| onWriteEntry: (entry) => { | ||
| entry.path = remapArchiveEntryPath({ | ||
| entryPath: entry.path, | ||
| (entry as { path: string }).path = remapArchiveEntryPath({ | ||
| entryPath: (entry as { path: string }).path, | ||
| manifestPath, | ||
| archiveRoot, | ||
| }); | ||
| }, | ||
| }, | ||
| [manifestPath, ...result.assets.map((asset) => asset.sourcePath)], | ||
| ); |
There was a problem hiding this comment.
Manifest written before archiving won't reflect ENOENT-skipped files
The manifest is serialised and written to manifestPath before createTarArchiveWithMissingSkip runs. ENOENT-skipped entries are appended to result.skipped only after archiving completes, so the manifest embedded in the archive will never list them. Any restore tool (or human) inspecting the in-archive manifest would see the sessions asset as fully present even when individual files were silently omitted.
This is a trade-off inherent in writing the manifest first so it can be included in the archive, and the CLI summary output does report the skipped files correctly. Worth an inline comment explaining the gap so future maintainers don't assume the in-archive manifest is a complete record of what was actually archived vs. skipped.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/infra/backup-create.ts
Line: 397-414
Comment:
**Manifest written before archiving won't reflect ENOENT-skipped files**
The manifest is serialised and written to `manifestPath` before `createTarArchiveWithMissingSkip` runs. ENOENT-skipped entries are appended to `result.skipped` only after archiving completes, so the manifest embedded in the archive will never list them. Any restore tool (or human) inspecting the in-archive manifest would see the sessions asset as fully present even when individual files were silently omitted.
This is a trade-off inherent in writing the manifest first so it can be included in the archive, and the CLI summary output does report the skipped files correctly. Worth an inline comment explaining the gap so future maintainers don't assume the in-archive manifest is a complete record of what was actually archived vs. skipped.
How can I resolve this? If you propose a fix, please make it concise.|
Codex review: found issues before merge. What this changes: The PR replaces backup creation's Maintainer follow-up before merge: This is an open contributor PR with real correctness findings and an overlapping open backup-race PR; maintainers should choose whether to revise this Pack-based approach, fold in volatile filtering, or supersede it with #72251 before an automated repair lane attempts code changes. Security review: Security review cleared: Security review cleared: the diff changes local backup archiving code and a Vitest mock only, with no new dependency, workflow, package script, permission, publishing, secret-handling, or downloaded-artifact surface. Review findings:
Review detailsBest possible solution: Revise or replace the patch with a backup archiving path that either avoids feeding known-volatile session/log files to tar or handles archiver errors in a way that actually drains/completes the tar job. The shipped behavior should keep result, manifest, and Do we have a high-confidence way to reproduce the issue? Yes. #67417 gives a concrete reproduction path: include live session transcripts in a backup plan, delete one before tar archives it, and current main can fail through Is this the best way to solve the issue? No. Full review comments:
Overall correctness: patch is incorrect Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 68c010906a94. |
|
Hi @zhouhe-xydt — author of #72251 here. Just spotted the overlap; we're solving adjacent problems in the same file. #67472: ENOENT during #72251: Tar EOF race + proactive filter for known volatile paths. Belt-and-suspenders via pre-filter + post-retry. I think these are complementary. Your Proposal: once #67472 merges, I'll rebase #72251 on top, drop the retry scaffolding, keep only the volatile-filter layer. Or we can fold it all into #67472 if you prefer — happy either way. Which route works for you? |
Summary
openclaw backup createcould fail with ENOENT when the gateway's session compaction deleted a session file between enumeration and tar archiving. This change replacestar.cwith the underlyingtar.PackAPI so that ENOENT errors during archiving are caught and skipped rather than failing the entire backup. Skipped files are reported in the backup summary with the reason "cleaned up during backup".Changes
src/infra/backup-create.ts: addcreateTarArchiveWithMissingSkiphelper usingtar.Packwith a custom error handler that ignores ENOENT and records skipped paths; replace thetar.ccall increateBackupArchivesrc/commands/backup.test-support.ts: updatetarmock to includePackso tests continue to work with the new implementationRelated issue
Fixes #67417