
restorer: Preallocate files #2893

Merged
MichaelEischer merged 3 commits into restic:master from MichaelEischer:restore-preallocate
Sep 8, 2020


@MichaelEischer
Member

@MichaelEischer MichaelEischer commented Aug 19, 2020

What does this PR change? What problem does it solve?

The new restorer implementation writes data blobs to their destination files in random order. On filesystems that implicitly create sparse files when a write lands somewhere beyond the current end of the file (probably most Unix-like systems), this can lead to a very large number of file fragments. Even without sparse files (e.g. the default on Windows), the file grows in multiple increments, which can also cause fragmentation.
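The access pattern described above can be reproduced with a small sketch. It is not the restorer's code: `writeBlobsOutOfOrder` and the blob layout are hypothetical names chosen for illustration. It writes fixed-size blobs at their final offsets in an arbitrary order and records how the file size jumps after each write.

```go
package main

import (
	"fmt"
	"os"
)

// writeBlobsOutOfOrder (hypothetical helper) writes fixed-size blobs to a
// fresh temporary file in the given index order and returns the file size
// observed after each write.
func writeBlobsOutOfOrder(order []int, blobSize int) ([]int64, error) {
	f, err := os.CreateTemp("", "restore-demo-*")
	if err != nil {
		return nil, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	blob := make([]byte, blobSize)
	sizes := make([]int64, 0, len(order))
	for _, idx := range order {
		// A write beyond the current end of file extends the file. On
		// filesystems with implicit sparse files the skipped range is
		// left as a hole that gets filled by a later write.
		if _, err := f.WriteAt(blob, int64(idx)*int64(blobSize)); err != nil {
			return nil, err
		}
		fi, err := f.Stat()
		if err != nil {
			return nil, err
		}
		sizes = append(sizes, fi.Size())
	}
	return sizes, nil
}

func main() {
	sizes, err := writeBlobsOutOfOrder([]int{2, 0, 3, 1}, 4096)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sizes)
}
```

The file reaches its final length in jumps rather than growing from the front, which is the situation in which the filesystem has to guess where to place each newly written range.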

The restore loop itself now queues pack files for download in the order of their first reference in one of the restored files. For a large file this should usually mean that the pack download order closely resembles the order in which data blobs are referenced. That way the access pattern more closely resembles that of a file which grows over time.
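The ordering idea can be sketched as follows. This is a simplified illustration, not the code from this PR: `fileInfo`, `blobPacks`, and `packOrder` are hypothetical names, and each blob is assumed to live in exactly one pack.

```go
package main

import "fmt"

// fileInfo is a hypothetical, simplified stand-in for a file to restore:
// for each blob, in file order, the ID of the pack file that contains it.
type fileInfo struct {
	name      string
	blobPacks []string
}

// packOrder returns every referenced pack ID exactly once, in the order in
// which the packs are first referenced when walking the files and their
// blobs from start to end.
func packOrder(files []fileInfo) []string {
	seen := make(map[string]bool)
	var order []string
	for _, f := range files {
		for _, pack := range f.blobPacks {
			if !seen[pack] {
				seen[pack] = true
				order = append(order, pack)
			}
		}
	}
	return order
}

func main() {
	files := []fileInfo{
		{name: "big.img", blobPacks: []string{"p2", "p1", "p2", "p3"}},
		{name: "small.txt", blobPacks: []string{"p3", "p0"}},
	}
	fmt.Println(packOrder(files))
}
```

For a single large file, downloading packs in this order means that blobs near the start of the file tend to arrive before blobs near the end, although later blobs stored in an early pack are still written ahead of time.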

In addition, after a file is created, the fileswriter will issue a call to preallocate the full file size. This allows the filesystem to allocate space with as little fragmentation as possible.
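A minimal sketch of the create-then-preallocate flow, under stated assumptions: `preallocate`, `createPreallocated`, and `restoredSize` are hypothetical names, and `Truncate` is used here only as a portable stand-in. On most Unix-like filesystems `Truncate` merely sets the file length and leaves a hole, so it shows where the call sits in the flow rather than a real reservation of disk space.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// preallocate is a portable stand-in (an assumption of this sketch): it
// sets the file to its final size with Truncate. A platform-specific
// implementation would reserve disk space here instead.
func preallocate(f *os.File, size int64) error {
	return f.Truncate(size)
}

// createPreallocated creates the file at path and requests its full size
// before any blob is written.
func createPreallocated(path string, size int64) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0o600)
	if err != nil {
		return nil, err
	}
	if err := preallocate(f, size); err != nil {
		f.Close()
		return nil, err
	}
	return f, nil
}

// restoredSize is a demo helper: it preallocates a file in a temporary
// directory, writes the last few bytes first, and returns the final size.
func restoredSize(size int64) (int64, error) {
	dir, err := os.MkdirTemp("", "prealloc-demo-")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	f, err := createPreallocated(filepath.Join(dir, "file"), size)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	// Blobs can now be written at arbitrary offsets inside the known size.
	if _, err := f.WriteAt([]byte("tail"), size-4); err != nil {
		return 0, err
	}
	fi, err := f.Stat()
	if err != nil {
		return 0, err
	}
	return fi.Size(), nil
}

func main() {
	size, err := restoredSize(1 << 20)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("final size:", size)
}
```

Since preallocation is an optimization, a caller could also choose to ignore a failure of `preallocate` and continue with the restore; the sketch returns the error only to keep the control flow explicit.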

There's currently a preallocate implementation for Linux and macOS. On Windows, the generic implementation uses Truncate, which calls SetEndOfFile and is supposed to preallocate disk space. On Linux the file system is able to track not-yet-initialized file extents, whereas on Windows there's only an offset up to which the file content was initialized. This means that on Windows a write into the middle of a file has to zero everything up to that point.
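For illustration, a Linux-only sketch of what such a preallocate call might look like, assuming it is built on fallocate(2) as the title of the linked issue suggests. The use of the keep-size flag, the locally defined constant `fallocFlKeepSize`, and the helper names are assumptions of this sketch, not a description of the code in this PR.

```go
//go:build linux

package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// fallocFlKeepSize mirrors FALLOC_FL_KEEP_SIZE from <linux/falloc.h>.
// It is defined locally because this sketch only uses the standard library.
const fallocFlKeepSize = 0x1

// preallocate asks the filesystem to reserve size bytes for f without
// changing the reported file length (assumption: keep-size mode).
func preallocate(f *os.File, size int64) error {
	if size <= 0 {
		return nil
	}
	return syscall.Fallocate(int(f.Fd()), fallocFlKeepSize, 0, size)
}

// demoPreallocate preallocates a temporary file and returns the file size
// reported afterwards, plus whether the filesystem supports fallocate.
func demoPreallocate(size int64) (int64, bool, error) {
	f, err := os.CreateTemp("", "fallocate-demo-*")
	if err != nil {
		return 0, false, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	err = preallocate(f, size)
	if errors.Is(err, syscall.EOPNOTSUPP) || errors.Is(err, syscall.ENOSYS) {
		// Not every filesystem implements fallocate.
		return 0, false, nil
	}
	if err != nil {
		return 0, false, err
	}
	fi, err := f.Stat()
	if err != nil {
		return 0, false, err
	}
	return fi.Size(), true, nil
}

func main() {
	size, supported, err := demoPreallocate(1 << 20)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("supported:", supported, "reported size:", size)
}
```

With the keep-size flag the reserved extents stay marked as uninitialized and the reported length does not change, which matches the extent-tracking behaviour described above for Linux.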

The preallocate implementation is currently located in internal/restorer which is probably not the ideal place, but internal/restic is already overcrowded. There's also a test in internal/restorer/preallocate_test.go, which tests something, but I'm not really sure how fragile it is.

Was the change discussed in an issue or in the forum before?

Closes #2675

Checklist

  • I have read the Contribution Guidelines
  • I have enabled maintainer edits for this PR
  • I have added tests for all changes in this PR
  • No user visible changes. [ ] I have added documentation for the changes (in the manual)
  • Not necessary, as the new restorer implementation is not yet released. [ ] There's a new file in changelog/unreleased/ that describes the changes for our users (template here)
  • I have run gofmt on the code in all commits
  • All commit messages are formatted in the same style as the other commits in the repo
  • I'm done, this Pull Request is ready for review

@MichaelEischer MichaelEischer force-pushed the restore-preallocate branch 2 times, most recently from 94e5322 to 23bd285 on August 20, 2020 18:40
@MichaelEischer
Member Author

@ifedorenko could you take a look at this PR?

@ifedorenko
Contributor

sorry for the slow response... will try to find time to review this next week

@MichaelEischer MichaelEischer force-pushed the restore-preallocate branch 2 times, most recently from 25fb1b2 to e5dd6bf on September 5, 2020 15:46
@MichaelEischer MichaelEischer force-pushed the restore-preallocate branch 2 times, most recently from b9f39d1 to d0264b0 on September 6, 2020 19:41
@MichaelEischer MichaelEischer merged commit 88664ba into restic:master Sep 8, 2020
@MichaelEischer MichaelEischer deleted the restore-preallocate branch September 8, 2020 20:43
@rawtaz
Contributor

rawtaz commented Sep 8, 2020

Really great work by you guys. Thanks so much.



Development

Successfully merging this pull request may close these issues.

Use of fallocate in restore to prevent extreme fragmentation of (large) files

3 participants