Skip to content

IntegrityError: Invalid segment entry size #2099

@bket

Description

@bket

In an attempt to expunge multiple checkpoints I used a script similar to:

#!/bin/sh
set -eu
    borg list :: | grep checkpoint | cut -d ' ' -f1 | while read l; do
    borg delete ::${l}
done

The client is running borg 1.0.9 on OSX and the server (OpenBSD) is also running 1.0.9. Connection is via SSH.
After some time the client errored with:

Remote: Borg 1.0.9: exception in RPC call:
Remote: Traceback (most recent call last):
Remote:   File "/usr/local/lib/python3.6/site-packages/borg/remote.py", line 113, in serve
Remote:     res = f(*args)
Remote:   File "/usr/local/lib/python3.6/site-packages/borg/repository.py", line 200, in commit
Remote:     self.compact_segments(save_space=save_space)
Remote:   File "/usr/local/lib/python3.6/site-packages/borg/repository.py", line 300, in compact_segments
Remote:     for tag, key, offset, data in self.io.iter_objects(segment, include_data=True):
Remote:   File "/usr/local/lib/python3.6/site-packages/borg/repository.py", line 705, in iter_objects
Remote:     (TAG_PUT, TAG_DELETE, TAG_COMMIT))
Remote:   File "/usr/local/lib/python3.6/site-packages/borg/repository.py", line 770, in _read
Remote:     size, segment, offset))
Remote: borg.helpers.IntegrityError: Invalid segment entry size 1195725663 - too big [segment 131098, offset 5300305]Remote:
Remote: Platform: OpenBSD aurora.lan 6.0 GENERIC#136 i386 i386
Remote: Borg: 1.0.9  Python: CPython 3.6.0
Remote: PID: 44224  CWD: /home/snapshot
Remote: sys.argv: ['/usr/local/bin/borg', 'serve', '--restrict-to-path', '/storage/d66e1f9b5845327b.a/borg/macbook-n_nancy.mac']
Remote: SSH_ORIGINAL_COMMAND: 'borg serve --umask=077 --info'
Remote:
Data integrity error: b'Remote Exception (see remote log for the traceback)'
Traceback (most recent call last):
  File "borg/archiver.py", line 2052, in main
  File "borg/archiver.py", line 1997, in run
  File "borg/archiver.py", line 90, in wrapper
  File "borg/archiver.py", line 503, in do_delete
  File "borg/remote.py", line 403, in commit
  File "borg/remote.py", line 288, in call
  File "borg/remote.py", line 327, in call_many
  File "borg/remote.py", line 309, in handle_error
borg.helpers.IntegrityError: b'Remote Exception (see remote log for the traceback)'

Platform: Darwin macbook-n 16.3.0 Darwin Kernel Version 16.3.0: Thu Nov 17 20:23:58 PST 2016; root:xnu-3789.31.2~1/RELEASE_X86_64 x86_64 i386
Borg: 1.0.9  Python: CPython 3.5.2
PID: 29277  CWD: /Users/n.mac/.snapshot/keys
sys.argv: ['borg', 'delete', '-v', '::macbook-n_nancy.mac_1485099919.checkpoint']
SSH_ORIGINAL_COMMAND: None

Subsequential delete (or create) attempts resulted in the same error. Error was resolved by running borg check --repair.

Output of the specific segment file 131098 at offset 5300305 is included (dd if=131098 of=output
bs=1 skip=5300305 count=1024). What is interesting is that offset 0 as well as 5300305 start with "BORG_SEG".

output.zip

Issue has been discussed on IRC:

2017-01-23 19:16:21     ThomasWaldmann  if that superfluous BORG_SEG magic header was the only problem, it should be able to recover all data
2017-01-23 19:16:51     bket    other strange thing is the file size: ~2 * 5MB
2017-01-23 19:17:15     bket    I expected a segment to be ~5M
2017-01-23 19:17:29     ThomasWaldmann  as these were checkpoints, there likely was some issue interrupting the backup. otherwise they would not exist.
2017-01-23 19:18:02     ThomasWaldmann  so dime segments were written to, but no final commit entry.
2017-01-23 19:18:08     ThomasWaldmann  some*
2017-01-23 19:19:15     ThomasWaldmann  the bug might be an "ab" file mode. if a seg file already exists, that appends...
2017-01-23 19:20:12     bket    that explains the double BORG_SEG I guess?
2017-01-23 19:20:15     ThomasWaldmann              self._write_fd = open(self.segment_filename(self.segment), 'ab')
2017-01-23 19:20:17     ThomasWaldmann              self._write_fd.write(MAGIC)
2017-01-23 19:20:25     ThomasWaldmann  that looks suspicious
2017-01-23 19:21:16     ThomasWaldmann  will to some digging into code and history
2017-01-23 19:21:39     ThomasWaldmann  but as i said, recover_segment() can cope with that, so no data loss, right?
2017-01-23 19:22:32     ThomasWaldmann  it skips forward bytewise until crc matches again
2017-01-23 19:23:50     ThomasWaldmann  if you look into that seg file, it also starts with BORG_SEG at the beginning (offset 0), right?
2017-01-23 19:24:04     bket    yes
2017-01-23 19:26:14     ThomasWaldmann  19:21  ThomasWaldmann$ but as i said, recover_segment() can cope with that, so no data loss, right?
2017-01-23 19:28:12     bket    I have not observed data loss
2017-01-23 19:29:30     bket    Both offests start with BORG_SEG
2017-01-23 19:33:38     ThomasWaldmann  e22fcdc0 darc/repository.py  (Jonas Borgström  2013-06-20 12:48:54 +0200 673)             self._write_fd = open(self.segment_filename(self.se
2017-01-23 19:33:41     ThomasWaldmann  gment), 'ab')
2017-01-23 19:33:43     ThomasWaldmann  b718a443 attic/repository.py (Jonas Borgström  2013-07-08 23:38:27 +0200 674)             self._write_fd.write(MAGIC)
2017-01-23 19:33:52     ThomasWaldmann  hah, ancient code
2017-01-23 19:37:32     ThomasWaldmann  about that time, Jonas renamed from "darc" to "attic"

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions