Steps to reproduce
In short, the steps to reproduce it are:
- Create a new repository from a large data set (I used 7.2 TB, across about 4 million files and 5 million entities).
- Change some files and run a second backup.
At this point, attempting a full restore fails with "Data integrity error". Running attic check reports "Index mismatch for key". Running attic check --repair does not fix the issue.
After removing the second backup, an attic check reports no errors. However, upon creating another backup, the errors return exactly as above.
Environment details
I am using the latest version of Attic from the master branch of its Git repository.
Corruption at the storage level seems unlikely, as the source data and the Attic repository both reside on a RAID 6 array.
The system is Debian 8 (Jessie) 64-bit, if that helps at all.
Detailed steps with output
Here are the actual commands I ran, with their output.
1. Initial backup
The initial backup was created with the following command and completed successfully. However, I did not capture all the output at that point.
# date; HOME=/mnt/lsi0/.attic time attic create --stats /mnt/lsi0/backup/local/tardis/attic::tardis/$(date +%F) /etc /mnt/lsi0/git /mnt/lsi0/shares /root; date
2. Extraction (partial)
This worked fine - no errors.
# date; HOME=/mnt/lsi0/.attic time attic extract /mnt/lsi0/backup/local/tardis/attic::tardis/2015-03-16 mnt/lsi0/git; date
Fri 20 Mar 18:21:08 GMT 2015
195.33user 11.05system 5:57.84elapsed 57%CPU (0avgtext+0avgdata 5519440maxresident)k
21718136inputs+12970368outputs (2major+1811562minor)pagefaults 0swaps
Fri 20 Mar 18:27:06 GMT 2015
3. Second backup
This second backup was to sync any changed files - there weren't many.
# date; HOME=/mnt/lsi0/.attic time attic create --stats /mnt/lsi0/backup/local/tardis/attic::tardis/$(date +%F) /etc /mnt/lsi0/git /mnt/lsi0/shares /root; date
Sat 21 Mar 09:57:29 GMT 2015
------------------------------------------------------------------------------
Archive name: tardis/2015-03-21
Archive fingerprint: 1c900f1490f7fef3b2ce78eca2909c55e96fcc5de6293282cdf28f9fea8a2ca3
Start time: Sat Mar 21 09:57:29 2015
End time: Sat Mar 21 10:49:32 2015
Duration: 52 minutes 2.68 seconds
Number of files: 4384741
                       Original size      Compressed size    Deduplicated size
This archive:                7.81 TB              6.97 TB            114.22 MB
All archives:               15.61 TB             13.94 TB              6.10 TB
------------------------------------------------------------------------------
1017.30user 327.74system 52:06.24elapsed 43%CPU (0avgtext+0avgdata 16533176maxresident)k
178235032inputs+407023640outputs (175major+19663106minor)pagefaults 0swaps
Sat 21 Mar 10:49:36 GMT 2015
4. List backup sets
Both backup sets are listed, and no problems are reported.
# HOME=/mnt/lsi0/.attic time attic list /mnt/lsi0/backup/local/tardis/attic
tardis/2015-03-16 Fri Mar 20 16:43:56 2015
tardis/2015-03-21 Sat Mar 21 10:48:42 2015
0.20user 2.77system 0:05.74elapsed 51%CPU (0avgtext+0avgdata 5281780maxresident)k
5371536inputs+0outputs (36major+1322226minor)pagefaults 0swaps
5. Extraction (full)
(I should note that the current working directory was emptied between restore commands.)
# date; HOME=/mnt/lsi0/.attic time attic extract /mnt/lsi0/backup/local/tardis/attic::tardis/2015-03-21; date
Mon 23 Mar 13:58:12 GMT 2015
attic: Error: Data integrity error
Command exited with non-zero status 1
2962.62user 380.72system 1:12:41elapsed 76%CPU (0avgtext+0avgdata 5478780maxresident)k
358242808inputs+753630256outputs (41major+1632191minor)pagefaults 0swaps
Mon 23 Mar 15:10:54 GMT 2015
6. Repository check
Ouch!
# date; HOME=/mnt/lsi0/.attic time attic check /mnt/lsi0/backup/local/tardis/attic
Mon 23 Mar 17:42:37 GMT 2015
Starting repository check...
Index mismatch for key b'\xa3*;y\xe2\x9c\xabw\x9f-f\x15\x8cG\xf1\xc3\xe6\xa8\x88X \x81z\xf5\x8e\xfb-\x12\x1b\xda\xafL'. (284324, 3431159) != (-1, -1)
Index mismatch for key b'\t\x9dsJ\xb4O\xb8\x1c`\xb6\xac\xe4\xb2\x91\xf4\xf0$(\x01\xa8\xca\xc2Mf\x97\xebB\xa1\x03G\xcd\x04'. (896973, 2792174) != (-1, -1)
Index mismatch for key b'\xd6\xaf\xd3\nI\xea\xa3Ju\xf11\xaaumB\x0e[E(\xeb\x95\xa8e\xa0<\x02\x8c\xbbf\xabH\x9d'. (64814, 3158785) != (60718, 3158785)
Index mismatch for key b'\xd6\x1d"\xfbf&\xb7\x9eg#\xe2\xd7\x1at3\xd2I\xf2\xcd\x80\x1b\'\x89\xd6O\xe3\'/./&\xb8'. (932547, 1968998) != (928451, 1968998)
Index mismatch for key b'\xd6Ae3\x82d\xd7?\x9a\xfe\xe5O\xe9\x9f\x15v\x19\xfc\xfcj\x8b\xf1\xe1\xee\xd2QD\xd2)I\xbc\xaa'. (151909, 424462) != (147813, 424462)
Index mismatch for key b'\xd6\xdd\xc1\xe3Y\x9f+\xce\xa1O\xe9{\x116\x0ex\x84\xf3\xa3:\x9a\xbb\xf4"\xd6\x0e\x13\x1c\xe3\x8b\x8c\x16'. (31049, 4234035) != (26953, 4234035)
Index mismatch for key b'\xa2Lye\x18\x0cE\x9c@\xa9=\xb8j\xfb\xcc\xd5H+A\xd7(\xda|j\xc1\x1e\x89\xb1\xd4\xa6L/'. (50865, 102435) != (-1, -1)
attic: Exiting with failure status due to previous errors
Command exited with non-zero status 1
9083.93user 2191.83system 6:05:05elapsed 51%CPU (0avgtext+0avgdata 13394448maxresident)k
11931422072inputs+0outputs (30major+4071002minor)pagefaults 0swaps
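The mismatches above appear to fall into two groups, which the following minimal Python sketch makes explicit. It uses only the two tuples from each "Index mismatch" line (the 32-byte keys are left out), and it makes no assumption about which side is the stored index and which is the rebuilt one. The names MISMATCHES, PAIR and summarize are made up for this illustration and are not part of Attic.

```python
import re

# Tuples copied from the "Index mismatch" lines of the check output above.
# The keys are omitted: only the two tuples matter for this comparison.
MISMATCHES = """\
(284324, 3431159) != (-1, -1)
(896973, 2792174) != (-1, -1)
(64814, 3158785) != (60718, 3158785)
(932547, 1968998) != (928451, 1968998)
(151909, 424462) != (147813, 424462)
(31049, 4234035) != (26953, 4234035)
(50865, 102435) != (-1, -1)
"""

# Matches "(a, b) != (c, d)" with optionally negative integers.
PAIR = re.compile(r"\((-?\d+), (-?\d+)\) != \((-?\d+), (-?\d+)\)")


def summarize(text):
    """Split mismatches into 'other side is (-1, -1)' and 'values differ'."""
    missing, shifted = [], []
    for match in PAIR.finditer(text):
        a1, b1, a2, b2 = map(int, match.groups())
        if (a2, b2) == (-1, -1):
            missing.append((a1, b1))
        else:
            # Element-wise difference between the left and right tuples.
            shifted.append((a1 - a2, b1 - b2))
    return missing, shifted


missing, shifted = summarize(MISMATCHES)
print(len(missing), "entries compared against (-1, -1)")
print("differences for the rest:", shifted)
# 3 entries compared against (-1, -1)
# differences for the rest: [(4096, 0), (4096, 0), (4096, 0), (4096, 0)]
```

In the four entries where both sides hold real values, the second number is identical and the first differs by exactly 4096 (2**12). All four of those keys also begin with the byte \xd6. Whether either detail is significant is not something this output alone can establish.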
7. Repository repair
I tried running attic check --repair to see whether it would help; it didn't. I still have all the original data, so running potentially risky operations while looking for the cause is not a problem.
Note that the first backup appears to be fine in this check; the error seems to be encountered in the second backup.
# date; HOME=/mnt/lsi0/.attic time attic check --repair /mnt/lsi0/backup/local/tardis/attic
Tue 24 Mar 12:49:21 GMT 2015
attic: Warning: 'check --repair' is an experimental feature that might result
in data loss.
Type "Yes I am sure" if you understand this and want to continue.
Do you want to continue? Yes I am sure
Starting repository check...
Repository check complete, no problems found.
Starting archive consistency check...
Analyzing archive tardis/2015-03-16 (1/2)
Analyzing archive tardis/2015-03-21 (2/2)
attic: Error: Data integrity error
Command exited with non-zero status 1
10177.83user 2157.50system 6:21:42elapsed 53%CPU (0avgtext+0avgdata 13378152maxresident)k
11953745528inputs+19954344outputs (35major+10261162minor)pagefaults 0swaps
8. Remove second backup
No errors reported...
# date; HOME=/mnt/lsi0/.attic time attic delete /mnt/lsi0/backup/local/tardis/attic::tardis/2015-03-21; date
Thu 26 Mar 10:11:23 GMT 2015
318.83user 30.28system 9:19.93elapsed 62%CPU (0avgtext+0avgdata 11369940maxresident)k
41607640inputs+42205112outputs (34major+3335066minor)pagefaults 0swaps
Thu 26 Mar 10:20:43 GMT 2015
9. List backup sets
...and yup, it's gone.
# date; HOME=/mnt/lsi0/.attic time attic list /mnt/lsi0/backup/local/tardis/attic
Thu 26 Mar 16:22:00 GMT 2015
tardis/2015-03-16 Fri Mar 20 16:43:56 2015
0.18user 1.74system 0:02.37elapsed 81%CPU (0avgtext+0avgdata 5269672maxresident)k
10984inputs+0outputs (27major+1318418minor)pagefaults 0swaps
10. Repository check
No problems this time, confirming that the problem occurred in the second backup set.
# date; HOME=/mnt/lsi0/.attic time attic check /mnt/lsi0/backup/local/tardis/attic
Thu 26 Mar 16:22:22 GMT 2015
Starting repository check...
Repository check complete, no problems found.
Starting archive consistency check...
Analyzing archive tardis/2015-03-16 (1/1)
Archive consistency check complete, no problems found.
9785.26user 2155.90system 6:16:57elapsed 52%CPU (0avgtext+0avgdata 13378892maxresident)k
11941453632inputs+8outputs (0major+8857085minor)pagefaults 0swaps
11. Second backup (again)
Once more, no problems reported. The set of changed files was much the same as for the previous second backup.
# date; HOME=/mnt/lsi0/.attic time attic create --stats /mnt/lsi0/backup/local/tardis/attic::tardis/$(date +%F) /etc /mnt/lsi0/git /mnt/lsi0/shares /root; date
Fri 27 Mar 12:12:30 GMT 2015
------------------------------------------------------------------------------
Archive name: tardis/2015-03-27
Archive fingerprint: 99b56abb3413a7dd88a08796994cbb38cbd242ce08d6993785211caac59bb918
Start time: Fri Mar 27 12:12:31 2015
End time: Fri Mar 27 13:03:57 2015
Duration: 51 minutes 26.51 seconds
Number of files: 4384741
                       Original size      Compressed size    Deduplicated size
This archive:                7.81 TB              6.97 TB            114.95 MB
All archives:               15.61 TB             13.94 TB              6.10 TB
------------------------------------------------------------------------------
982.53user 330.76system 51:30.23elapsed 42%CPU (0avgtext+0avgdata 16568576maxresident)k
234467296inputs+407040208outputs (349major+19967580minor)pagefaults 0swaps
Fri 27 Mar 13:04:01 GMT 2015
12. Repository check
...and the error's back. Interestingly, last time there were seven mismatches on the backup set, and this time there is just one. Still, something's going wrong!
# date; HOME=/mnt/lsi0/.attic time attic check /mnt/lsi0/backup/local/tardis/attic
Fri 27 Mar 13:30:13 GMT 2015
Starting repository check...
Index mismatch for key b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'. (-4097, 0) != (1153907, 3864090)
attic: Exiting with failure status due to previous errors
Command exited with non-zero status 1
8989.32user 2143.78system 6:00:53elapsed 51%CPU (0avgtext+0avgdata 13395500maxresident)k
11931614400inputs+0outputs (32major+4071397minor)pagefaults 0swaps
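The odd-looking -4097 in this output equals -1 - 4096, and 4096 is also the difference between the first numbers in four of the mismatches from the earlier check (step 6). The short Python sketch below looks at those values bit by bit. The 32-bit width is an assumption made for illustration, since nothing in the output says how the index stores its integers, and flipped_bits is a hypothetical helper, not an Attic function.

```python
def flipped_bits(a, b, width=32):
    """Return the bit positions at which a and b differ.

    Values are treated as two's-complement integers of the given width.
    The width of 32 is an assumption, not something taken from Attic.
    """
    mask = (1 << width) - 1
    diff = (a ^ b) & mask
    return [i for i in range(width) if (diff >> i) & 1]


# First numbers of the four step-6 mismatches where both sides had real values.
STEP6_PAIRS = [(64814, 60718), (932547, 928451), (151909, 147813), (31049, 26953)]

for left, right in STEP6_PAIRS:
    print(left, right, flipped_bits(left, right))   # each pair prints [12]

# -4097 from this check, compared with the -1 seen in the (-1, -1) entries.
print(flipped_bits(-4097, -1))                      # prints [12]
```

Under that assumption, every one of these pairs differs in bit 12 only. This is merely a pattern in the reported numbers; it does not by itself identify where the wrong values come from.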