Bug Description
openclaw backup create aborts with Error: did not encounter expected EOF when the archiver reads a file that is being actively appended to during the tar.c() stream. On any live OpenClaw installation, this is reliably reproducible against session transcript .jsonl files, cron run .log files, and state logs/*.jsonl — all of which are append-only on a running gateway.
Exact Error
Error: did not encounter expected EOF
at WriteEntry.<anonymous> (.../node_modules/tar/dist/...)
...
Process exited with code 1
err.path points at the file being appended to mid-stream (typically a live session transcript or gateway log).
Root Cause
node-tar's WriteEntry records the file size from the initial lstat(). It then streams file contents via fs.read(). If the file grows between the lstat() and the end of the read, the header size no longer matches the byte count actually consumed, the stream errors with "did not encounter expected EOF", and tar.c() rejects. Because the backup is a single transaction, the whole archive is aborted — partial writes are not salvaged.
This is not an OpenClaw bug in the strict sense: it's a fundamental mismatch between how tar packs files and how any live service writes logs. But every user who runs openclaw backup create against a running install will eventually hit it, so the CLI needs to handle it.
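The mismatch described above can be illustrated without node-tar at all. The following is a minimal sketch, not node-tar's actual code: `readAgainstRecordedSize` is a hypothetical helper that records the size first, lets the file grow (standing in for a gateway appending to a log), and then reads to compare the byte counts.

```typescript
import {
  appendFileSync, closeSync, mkdtempSync, openSync,
  readSync, rmSync, statSync, writeFileSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper (not part of node-tar): mimics "stat first, read later"
// and reports whether more bytes were readable than the recorded size.
function readAgainstRecordedSize(
  path: string,
  mutate: () => void,
): { recorded: number; consumed: number; grew: boolean } {
  const recorded = statSync(path).size; // the size an archiver would put in the header
  mutate(); // stand-in for the live service appending mid-archive
  const fd = openSync(path, "r");
  try {
    const buf = Buffer.alloc(recorded + 1024); // headroom so extra bytes are noticed
    let consumed = 0;
    let n: number;
    while ((n = readSync(fd, buf, consumed, buf.length - consumed, consumed)) > 0) {
      consumed += n;
      if (consumed === buf.length) break; // buffer full, stop reading
    }
    return { recorded, consumed, grew: consumed > recorded };
  } finally {
    closeSync(fd);
  }
}

const dir = mkdtempSync(join(tmpdir(), "backup-eof-"));
const file = join(dir, "live.log");
writeFileSync(file, "line 1\n");
const result = readAgainstRecordedSize(file, () => appendFileSync(file, "line 2\n"));
console.log(result);
rmSync(dir, { recursive: true, force: true });
```

Here `grew` is true because the read sees both lines while the recorded size covers only the first. A tar header cannot be amended once the size has been written, which is presumably why the archiver treats this as a hard error.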
Steps to Reproduce
Generic reproducer (no OpenClaw state needed):
mkdir -p /tmp/backup-eof-repro/src
cd /tmp/backup-eof-repro
# Generate a non-trivial file that will be appended to during archiving.
yes "$(head -c 2000 /dev/urandom | base64)" | head -c 100M > src/live.log
# In one shell, keep appending:
while true; do date >> src/live.log; sleep 0.05; done &
APPENDER=$!
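# In another, archive it. The error in this report is node-tar's wording, so the
# faithful reproducer is node-tar's c({ file, gzip: true }, ['src']).
# System tar is shown for comparison; it reports this race with its own message, if at all.
tar -czf out.tar.gz src/
# Observe (node-tar): intermittent "did not encounter expected EOF" error.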
kill $APPENDER
In a real OpenClaw install, just run openclaw backup create against a state directory that has any active session (i.e. a running gateway). With a state tree in the tens of GB and dozens of live sessions, failure is nearly deterministic.
Environment
- OpenClaw: observed on the 2026.4.x line
- Node: v22–v25
- OS: macOS and Linux both reproduce
- Triggering files: {stateDir}/sessions/**/*.{jsonl,log}, {stateDir}/cron/runs/**/*.log, {stateDir}/logs/**/*.{jsonl,log}
Impact
- Severity: High for users with a running gateway and a non-trivial state directory — backups can fail repeatedly until the gateway is stopped.
- Workaround today: stop the gateway before running backup create. Not viable for scheduled/automated backups.
Expected Behavior
openclaw backup create should complete on a running install. Files known to be volatile (live logs, sockets, pid/lock markers) are not meaningful to snapshot anyway and can safely be skipped; transient races on other files should be retried.
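One way to decide which files count as volatile is sketched below. It hand-rolls matching for the glob patterns named in this issue rather than using a glob library. `isVolatilePath` and `partitionVolatile` are hypothetical names, not existing OpenClaw functions, and the rules assume paths are checked relative to the state directory.

```typescript
import { relative, sep } from "node:path";

// Hypothetical rule table mirroring the volatile paths named in this issue.
const VOLATILE_DIR_RULES: Array<{ prefix: string; exts: string[] }> = [
  { prefix: "sessions/", exts: [".jsonl", ".log"] },
  { prefix: "cron/runs/", exts: [".log"] },
  { prefix: "logs/", exts: [".jsonl", ".log"] },
];
// Extensions treated as volatile wherever they appear.
const VOLATILE_ANYWHERE_EXTS = [".sock", ".pid", ".tmp", ".lock"];

function isVolatilePath(stateDir: string, absPath: string): boolean {
  // Normalize to forward slashes so the prefixes work on any platform.
  const rel = relative(stateDir, absPath).split(sep).join("/");
  if (VOLATILE_ANYWHERE_EXTS.some((ext) => rel.endsWith(ext))) return true;
  if (rel.startsWith("../")) return false; // outside the state directory
  return VOLATILE_DIR_RULES.some(
    (rule) => rel.startsWith(rule.prefix) && rule.exts.some((ext) => rel.endsWith(ext)),
  );
}

// Splits candidate paths into those to archive and a count of skipped ones,
// so the skipped-volatile count can be reported to the user.
function partitionVolatile(
  stateDir: string,
  paths: string[],
): { kept: string[]; skipped: number } {
  const kept: string[] = [];
  let skipped = 0;
  for (const p of paths) {
    if (isVolatilePath(stateDir, p)) skipped += 1;
    else kept.push(p);
  }
  return { kept, skipped };
}

const sample = partitionVolatile("/state", [
  "/state/config.json",
  "/state/sessions/abc/transcript.jsonl",
  "/state/run/gateway.sock",
]);
console.log(sample);
```

A predicate like this could presumably be passed through node-tar's `filter` option, which is called per entry and omits the entry when it returns false. How the paths handed to that callback relate to the state directory depends on the `cwd` used, so that wiring is left out here.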
Proposed Fix
- Default-exclude known volatile paths in the backup archiver:
  - {stateDir}/sessions/**/*.{jsonl,log}
  - {stateDir}/cron/runs/**/*.log
  - {stateDir}/logs/**/*.{jsonl,log}
  - *.{sock,pid,tmp,lock} anywhere
- Retry tar.c() on EOF-class errors (up to 3 attempts, 10s/20s backoff) for residual races on other files. Clean the partial temp archive between attempts.
- On final failure, include err.path and attempt count in the thrown message so users get an actionable report.
- Surface the skipped-volatile count in stdout and in --json output for observability.
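The retry behavior in the list above could be sketched roughly as follows. This is an illustration under assumptions, not the forthcoming PR: `createArchiveWithRetry`, `isEofClassError`, and the `RetryOptions` shape are hypothetical, and the check for a `code` of `"EOF"` is an assumption about how node-tar tags these errors; the message match is the fallback. The `run` callback stands in for the real tar.c() call.

```typescript
// Assumed error shape: an Error that may carry `path` and `code` fields.
interface ArchiveError extends Error {
  path?: string;
  code?: string;
}

function isEofClassError(err: unknown): err is ArchiveError {
  if (!(err instanceof Error)) return false;
  const code = (err as ArchiveError).code;
  return code === "EOF" || /expected EOF|unexpected EOF/i.test(err.message);
}

interface RetryOptions {
  attempts?: number; // total tries, default 3
  backoffMs?: number[]; // waits between tries, default 10s then 20s
  cleanup?: () => Promise<void>; // e.g. delete the partial temp archive
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

// Runs `run` until it succeeds or the attempt budget is spent.
// Resolves with the number of attempts used.
async function createArchiveWithRetry(
  run: () => Promise<void>,
  opts: RetryOptions = {},
): Promise<number> {
  const attempts = opts.attempts ?? 3;
  const backoffMs = opts.backoffMs ?? [10000, 20000];
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  for (let attempt = 1; ; attempt++) {
    try {
      await run();
      return attempt;
    } catch (err) {
      if (opts.cleanup) await opts.cleanup(); // never leave a partial archive behind
      if (!isEofClassError(err)) throw err; // only EOF-class races are retried
      if (attempt >= attempts) {
        const where = err.path ?? "unknown path";
        throw new Error(
          `backup failed after ${attempt} attempts: ${err.message} (file: ${where})`,
        );
      }
      await sleep(backoffMs[Math.min(attempt - 1, backoffMs.length - 1)] ?? 0);
    }
  }
}
```

A caller would wrap the archive step, for example `await createArchiveWithRetry(() => writeArchive(), { cleanup: removeTempArchive })`, where `writeArchive` and `removeTempArchive` are placeholders for the real steps. Non-EOF errors are rethrown immediately so that unrelated failures are not hidden behind 30 seconds of backoff.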
This is distinct from #67417 (ENOENT when a session file is deleted mid-backup — same race family, different failure mode and different fix). A broader exclude-rule system is proposed in #67990; this bug asks for the minimum viable built-in filter that makes backup create work out of the box on any live install, without requiring user configuration.
A PR implementing the above is on the way.