fix(backup): retry on tar EOF race and skip known volatile files#72251
steipete merged 10 commits into openclaw:main
Conversation
Greptile Summary

This PR is safe to merge; the implementation is correct and well-tested, with no P0 or P1 issues remaining. All logic is sound: the backoff indexing is correct (a 2-element array covers exactly the 2 inter-attempt sleeps needed for 3 attempts), the EOF regex is appropriately narrowed, the closure counter reset inside runTar() prevents cumulative overcounting on retries, and the finally block ensures temp-file cleanup even on final failure. Test coverage addresses the key behavioral contracts. No new security surface or data-loss path was introduced. No files require special attention.

Confidence Score: 5/5. Reviews: 2. Last reviewed commit: "fix(backup): narrow EOF retry classifier..."
```ts
let skippedVolatileCount = 0;
const tarFilter = (entryPath: string): boolean => {
  // The manifest is staged in a tmp dir outside any state directory and
  // is always safe to include.
  if (path.resolve(entryPath) === manifestPath) {
    return true;
  }
  if (isVolatileBackupPath(entryPath, volatilePlan)) {
    skippedVolatileCount += 1;
    return false;
  }
  return true;
};
await writeTarArchiveWithRetry({
  tempArchivePath,
  log: opts.log,
  runTar: () =>
    tar.c(
      {
        file: tempArchivePath,
        gzip: true,
        portable: true,
        preservePaths: true,
        filter: tarFilter,
        onWriteEntry: (entry) => {
          entry.path = remapArchiveEntryPath({
            entryPath: entry.path,
            manifestPath,
            archiveRoot,
          });
        },
      },
      [manifestPath, ...result.assets.map((asset) => asset.sourcePath)],
    ),
});
result.skippedVolatileCount = skippedVolatileCount;
```
**`skippedVolatileCount` accumulates across retry attempts**

`skippedVolatileCount` is a closure variable that is never reset between `runTar()` invocations. On each retry `tar.c()` re-traverses every file and calls `tarFilter` again, incrementing the counter for the same volatile files a second (and third) time. With 5 volatile files and 3 attempts the reported count would be 15, not 5.
Fix: reset the counter at the top of `runTar` before each `tar.c()` call:

```ts
runTar: () => {
  skippedVolatileCount = 0;
  return tar.c(
    {
      file: tempArchivePath,
      gzip: true,
      portable: true,
      preservePaths: true,
      filter: tarFilter,
      onWriteEntry: (entry) => {
        entry.path = remapArchiveEntryPath({
          entryPath: entry.path,
          manifestPath,
          archiveRoot,
        });
      },
    },
    [manifestPath, ...result.assets.map((asset) => asset.sourcePath)],
  );
},
```
Fixed in 96243ba0b2. Reset skippedVolatileCount = 0 at the top of runTar() before tar.c(), plus a unit test asserting the counter reflects the final attempt rather than the cumulative sum across retries. Thanks for the catch.
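The contract that unit test pins down can be sketched as a minimal synchronous harness. Names here (`withRetrySync`, the fake `runTar`) are illustrative, not the real test's API, and the shipped retry loop is async:

```typescript
// Sketch: a retrying harness whose tar step "skips" the same 5 volatile
// files on every traversal. Resetting the closure counter inside each
// attempt keeps the reported count at the final attempt's value.
function withRetrySync(fn: () => void, attempts = 3): void {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      fn();
      return;
    } catch (err) {
      lastErr = err; // retry on the next loop iteration
    }
  }
  throw lastErr;
}

let skippedVolatileCount = 0;
let attempt = 0;

const runTar = (): void => {
  skippedVolatileCount = 0; // the fix: reset per attempt
  attempt += 1;
  for (let i = 0; i < 5; i++) skippedVolatileCount += 1; // 5 volatile skips
  if (attempt < 3) throw new Error("did not encounter expected EOF");
};

withRetrySync(runTar);
console.log(skippedVolatileCount); // 5 (would be 15 without the reset)
```

Without the reset line, each of the three traversals adds 5 to the running total, reproducing the 15-vs-5 discrepancy from the review comment.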
```ts
  return true;
}
const message = (err as Error).message ?? "";
return /did not encounter expected|unexpected end of|EOF/i.test(message);
```
**Broad `EOF` regex alternative may match unrelated errors**

The bare `/EOF/i` alternative will match any error message containing the substring "EOF" (e.g., SSL `EOF occurred in violation of protocol`, OpenSSL `unexpected eof while reading`). In a tar context this only causes an extra unnecessary retry rather than data loss, but narrowing the pattern reduces the risk of masking non-tar errors.
Suggested change:

```ts
- return /did not encounter expected|unexpected end of|EOF/i.test(message);
+ return /did not encounter expected|unexpected end of file|TAR_BAD_ARCHIVE/i.test(message);
```
Fixed in dded2d10ab. Narrowed the regex to /(did not encounter expected|encountered unexpected) EOF|TAR_BAD_ARCHIVE/i — your suggested alternation didn't cover the shrink case ("encountered unexpected EOF"), so the shipped version handles both node-tar WriteEntry#onread messages plus the parser's TAR_BAD_ARCHIVE code. Added a comment citing the node-tar source and a unit test table pinning down match/no-match behavior (including SSL/OpenSSL strings we explicitly don't retry on). Thanks.
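The shipped classifier's match/no-match behavior can be sketched as follows (`isTarEofRace` is an illustrative name; the real helper lives in `src/infra/backup-create.ts`):

```typescript
// Sketch of the narrowed classifier: match only node-tar's two EOF-race
// messages (grow and shrink) plus the parser's TAR_BAD_ARCHIVE code, and
// never generic "EOF" substrings from SSL/OpenSSL errors.
const TAR_EOF_RE =
  /(did not encounter expected|encountered unexpected) EOF|TAR_BAD_ARCHIVE/i;

function isTarEofRace(err: unknown): boolean {
  // Structured fast path for errors whose code survived propagation.
  if (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: string }).code === "EOF"
  ) {
    return true;
  }
  const message = err instanceof Error ? err.message : String(err ?? "");
  return TAR_EOF_RE.test(message);
}

console.log(isTarEofRace(new Error("did not encounter expected EOF"))); // true (grow race)
console.log(isTarEofRace(new Error("encountered unexpected EOF"))); // true (shrink race)
console.log(isTarEofRace(new Error("EOF occurred in violation of protocol"))); // false (SSL)
console.log(isTarEofRace(new Error("unexpected eof while reading"))); // false (OpenSSL)
```

Note the alternation groups the two tar message prefixes before the literal ` EOF`, so the bare substring "EOF" alone can no longer trigger a retry.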
Addressed both Greptile findings:
Branch pushed, lint clean, backup tests green (56/56).
Codex review: needs real behavior proof before merge.

Overall correctness: patch is incorrect.

Do we have a high-confidence way to reproduce the issue? Yes. Source inspection is enough here: current main writes sessions under `stateDir/agents//sessions/*.jsonl` and cron logs under `cron/runs/.jsonl`, while the PR filter does not match those paths and globally skips meaningful workspace `*.lock` files.

Is this the best way to solve the issue? No, not yet. Retry-plus-filter is a maintainable direction for the node-tar EOF race, but the filter must be scoped to state/runtime volatility and aligned with current storage contracts before this is the best fix.

Best possible solution: revise the PR so volatile skips are scoped to known state/runtime paths, current agent session and cron JSONL paths are covered, JSON mode stays machine-readable, docs describe the new backup exclusions, and latest-head proof shows the real backup path passing.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 678c41a222cf.
Force-pushed dded2d1 to 1c5217a.
Rebased onto latest.
Follow-up from cross-review:
Tests: 19 passed (volatile-filter) + 21 passed (backup-create) = 40 green.
Force-pushed b369e67 to 07b9899.
Force-pushed 07b9899 to 053c118.
Status update — re-verified against v2026.5.7

Ran a sanity check against openclaw v2026.5.7 (two runs). Run 2 is exactly the tar EOF race this PR's retry path already handles. Run 1 is a new volatile-file class the filter didn't cover. What changed in this push:
Force-pushed 9d2812f to c7ae86a.
tar.c() rejects with 'did not encounter expected EOF' when a source file
grows between lstat() and stream consumption. On a running install this
is nearly deterministic against live session transcripts, cron run logs,
and gateway .log/.jsonl files — the same files that have no meaningful
backup value mid-write.
- Add a built-in filter that skips known volatile paths:
{stateDir}/sessions/**/*.{jsonl,log}
{stateDir}/cron/runs/**/*.log
{stateDir}/logs/**/*.{jsonl,log}
*.{sock,pid,tmp,lock} anywhere
- Retry tar.c() up to 3 times on EOF-class errors with 10s/20s backoff,
cleaning the partial tmp archive between attempts.
- Enrich the final error message with err.path and attempt count.
- Track and surface skippedVolatileCount in stdout and --json output.
No breaking changes: filter is additive, --json adds a new field, and
the error-message format change is only visible on failure.
Fixes openclaw#72249
See also openclaw#67417, openclaw#67990
tar.c re-walks the staged tree on every invocation, which re-invokes the filter closure. Previously skippedVolatileCount was initialized once outside runTar(), so on retry it accumulated across attempts -- e.g. 5 volatile files over 3 attempts reported 15 in both the --json output and the stderr summary line. Reset the counter at the top of runTar() so the reported count reflects only the final (successful or final-failed) attempt, matching what ended up in the archive. Adds a unit test that exercises the retry path and asserts the counter equals the final attempt's value, not the running sum.
The previous regex included a bare /EOF/i alternative that also matched SSL/OpenSSL strings such as "EOF occurred in violation of protocol" and "unexpected eof while reading", causing pointless retries on unrelated failures. Replace it with an alternation that matches only the two tar-specific EOF-class messages thrown by node-tar's WriteEntry#onread (grow and shrink races, citing write-entry.ts) plus the TAR_BAD_ARCHIVE code surfaced by the parser on truncated input. The code === 'EOF' fast path stays as-is for errors that propagate with structured fields intact. Adds a unit test table that pins down the classifier's match and no-match behavior, including SSL/OpenSSL strings we explicitly do not want to treat as tar EOF races.
…cleanup

Per Argus review:

- isVolatileBackupPath() now runs inputs through path.posix.normalize() so traversal segments like sessions/../config.jsonl cannot be misclassified as volatile via a bare startsWith() ancestry check.
- Add tests for '..' escape and Windows-style backslash inputs on the predicate itself.
- Retry cleanup between tar attempts now only silently ignores ENOENT; any other fs.rm error (EPERM/EACCES/etc.) is logged before the retry continues so temp-file hygiene issues are not hidden.

Built-by: Vulcan 🔨
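The normalization hardening from this commit can be sketched with a simplified ancestry check (`isUnder` is an illustrative helper name; the real predicate also consults the volatile plan):

```typescript
import * as path from "node:path";

// Sketch: normalize before the ancestry check so '..' segments collapse
// and backslash separators are unified, closing the traversal hole that a
// bare startsWith() check leaves open.
function isUnder(entryPath: string, dir: string): boolean {
  const normalized = path.posix.normalize(entryPath.replace(/\\/g, "/"));
  return normalized === dir || normalized.startsWith(dir + "/");
}

console.log(isUnder("sessions/s-1/a.jsonl", "sessions")); // true
console.log(isUnder("sessions/../config.jsonl", "sessions")); // false: '..' escapes the dir
console.log(isUnder("sessions\\s-1\\a.jsonl", "sessions")); // true: backslashes unified
```

Without the normalize step, `"sessions/../config.jsonl".startsWith("sessions/")` is true, and a non-volatile config file would be silently dropped from the archive.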
Real-world run of `openclaw backup create` on v2026.5.7 reproduced an
ENOENT race on a delivery-queue entry:
Error: ENOENT: no such file or directory, lstat
'$STATE_DIR/delivery-queue/3fac5e46-...json'
Files under `{stateDir}/delivery-queue/**/*.json` are transient IPC
state for the messaging delivery queue: they are created and consumed
continuously and routinely vanish between tar's directory read and
`lstat()`. They have no restoration value — capturing a partial tail
only risks tar corruption without benefit.
Extends the volatile filter with the same `isUnder` + `hasExtension`
rule style as the existing sessions / cron-runs / logs rules, and
keeps the JSDoc rule list in sync.
Tests: 44/44 green across backup-volatile-filter + backup-create.
Built-by: Vulcan 🔨
Force-pushed 78f2b61 to 3766457.
Landed via squash merge.
Thanks @abnershang!
Problem
`openclaw backup create` fails with `Error: did not encounter expected EOF` on any live installation. Root cause: node-tar's `WriteEntry` records `size` from the initial `lstat()`; if the file grows between `lstat()` and the end of `fs.read()`, the header no longer matches the consumed bytes, the stream errors, and `tar.c()` aborts the whole archive. Common culprits are append-only JSONL transcripts and `.log` files under the state directory.

Details and generic reproducer: #72249. Related but distinct races: #67417 (ENOENT when a session file is deleted mid-backup). Broader exclude-rule proposal: #67990.
Impact
Affects any live install where backup source files are written concurrently — session transcripts, cron run logs, gateway logs. Today the workaround is to stop the gateway before backing up, which is not viable for scheduled backups.
Fix
Default volatile-path filter in the backup archiver — no user config required:
- `{stateDir}/sessions/**/*.{jsonl,log}`
- `{stateDir}/cron/runs/**/*.log`
- `{stateDir}/logs/**/*.{jsonl,log}`
- `*.{sock,pid,tmp,lock}` anywhere

Rationale: these files are either actively appended to (logs), transient (sockets/pid/lock), or have no restoration value from a partial tail. Snapshotting them racily has no upside.
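A rough sketch of these rules as a predicate (the rule-table shape and helper names here are assumptions; the shipped `isVolatileBackupPath` in the backup filter module takes a precomputed plan and handles more cases):

```typescript
// Sketch: classify a path (relative to stateDir) as volatile using
// directory-scoped extension rules plus a set of anywhere-extensions.
const VOLATILE_DIR_RULES: Array<{ dir: string; exts: string[] }> = [
  { dir: "sessions", exts: [".jsonl", ".log"] },
  { dir: "cron/runs", exts: [".log"] },
  { dir: "logs", exts: [".jsonl", ".log"] },
];
const VOLATILE_ANYWHERE_EXTS = [".sock", ".pid", ".tmp", ".lock"];

function isVolatilePath(relToStateDir: string): boolean {
  const p = relToStateDir.replace(/\\/g, "/");
  const dot = p.lastIndexOf(".");
  const ext = dot === -1 ? "" : p.slice(dot);
  if (VOLATILE_ANYWHERE_EXTS.includes(ext)) return true;
  return VOLATILE_DIR_RULES.some(
    (rule) => p.startsWith(rule.dir + "/") && rule.exts.includes(ext),
  );
}

console.log(isVolatilePath("sessions/s-1/transcript.jsonl")); // true: session transcript
console.log(isVolatilePath("gateway.pid")); // true: anywhere-extension
console.log(isVolatilePath("openclaw.json")); // false: stable config stays in the archive
```

The point of the table shape is that adding a new volatile class (like the delivery-queue rule added later in this PR) is a one-line rule, not new branching logic.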
Retry on EOF-class errors (up to 3 attempts, 10s / 20s backoff) for residual races on non-filtered files. The partial temp archive is removed between attempts, so no half-written artifact escapes.
Error enrichment: on final failure, surface `err.path` (the offending file) and attempt count. Turns an opaque error into an actionable one.

`skippedVolatileCount` observability: extend `BackupCreateResult` with the skip count; surface it in both the stdout summary and the `--json` output.

Backwards compatibility
`--json` output gains a new optional field `skippedVolatileCount`. Existing consumers that ignore unknown fields are unaffected.
Added:
- `src/infra/backup-volatile-filter.test.ts` — predicate unit tests for each rule and the anywhere-patterns, plus negative cases.
- `src/infra/backup-create.test.ts` — retry tests using a mocked tar runner; `formatBackupCreateSummary` updated to include the skip-count line when > 0.

Verified locally: a previously-failing live-install backup (large state tree with actively written session and cron logs) now completes cleanly; the skipped-volatile count is reported and the archive is usable.
Future work (not in this PR)
A `--quiesce` flag that pauses cron and drains in-flight work before archival, for users who want stronger consistency guarantees than "retry-plus-filter".

Fixes #72249
See also #67417, #67990
Real behavior proof
Behavior or issue addressed:
`openclaw backup create` on a live install with actively mutating session/log files. The fix should let the backup complete by skipping known volatile files instead of aborting the whole archive on tar EOF races.

Real environment tested: Remote Linux VM (Blacksmith sandbox), exact PR commit `053c118900e6afddd05b899fba2b03f21f237131`, fresh `pnpm install --frozen-lockfile`, Node `v20.20.0`, pnpm `10.33.2`. No local gateway on this machine was restarted or used for the proof.

Exact steps or command run after this patch: `corepack prepare pnpm@10.33.0 --activate && pnpm install --frozen-lockfile`, then `node --import tsx /home/runner/proof-run/openclaw-backup-proof.mjs` from `/home/runner/crabbox-proof/openclaw`. The script appends to `state/sessions/s-live/transcript.jsonl` and `state/logs/gateway.jsonl`, keeps `state/gateway.pid` and `state/pending.tmp` present, runs the real `backupCreateCommand(..., { output: outDir, includeWorkspace: false, verify: true })` code path, and then inspects the resulting tarball entries.
Observed result after fix: The backup completed successfully and
backup verifypassed while the transcript and gateway log kept growing during the run (writesDuringBackup: 75, both byte counts increased). The resulting archive excluded the live-mutation targets (transcript.jsonl,gateway.jsonl,gateway.pid,pending.tmp) and still kept stable state (meta.json,openclaw.json).What was not tested: I did not run the packaged
pnpm openclaw backup createwrapper on this sandbox; this proof used the real backup command path from the PR checkout vianode --import tsxafter a fresh install. I also did not run a Windows or macOS live proof in this pass.