Add the skeleton, but leave filling in the details to later commits.
The implementations are 90% copy and paste from the Go standard library, as the existing code does not offer any way to read the symlink target based on a filehandle. Fall back to a standard readlink on platforms other than Linux and Windows, as those either don't provide the necessary syscall at all or, in the case of macOS, the syscall is not yet available in Go.
The xattrs are read from /proc/self/fd/%d, but errors show the original file path instead to make the error message more useful.
What does this PR change? What problem does it solve?
Currently the archiver collects metadata and the file content in multiple steps, during which a file could disappear or be renamed. The underlying reason is that files are currently accessed in multiple steps (stat, xattrs, etc.) using their filepath. This PR introduces a new code path that instead opens a filehandle once and uses that to collect metadata and file content. This ensures that the file cannot disappear in the middle of the operation and that the metadata actually belongs to the intended file.
The implementation turned out to be far more complicated than I expected. Unfortunately, there is no standardized way across operating systems to open filehandles for arbitrary filetypes (and reading all required metadata from a filehandle is also much more complex than it should be). In particular, symlinks turned out to be a major problem. Each filesystem API is broken or weird in its own way:
- Linux: [...] `O_PATH|O_NOFOLLOW` flags. However, this filehandle cannot be used to retrieve xattrs. Luckily there is a workaround: read the xattrs from `/proc/self/fd/%d`. Reading the file content requires atomically reopening the file, which is possible by opening the procfs filepath just mentioned. Linux before 3.6 does not support the required syscalls (fstat on a file handle opened using `O_PATH`).
- Windows: [...] `ReOpenFile`. Symlinks are relatively straightforward to handle by specifying `OPEN_REPARSE_POINT` where necessary.
- macOS: [...] `O_SYMLINK` flag. There is no concept of a metadata-only filehandle, so the file always has to be opened for reading. This could become a problem if restic is not allowed to do that. However, this case has already resulted in an error in the past. xattrs can be read from this filehandle. In contrast, reading the symlink target from a filehandle is only possible since macOS 13, but is not exposed in Go, therefore requiring a fallback.

This leads to a few requirements.
For this purpose, the implementation introduces a new `metadataHandle`, which is used by `nodeFromFileInfo` to collect all metadata. The interface is implemented by `fdMetadataHandle` (the filehandle approach just described) and `pathMetadataHandle`. The active implementation can be selected at runtime.

`fs.Local` internally used a `localFile` type. It is now wrapped by either `fdLocalFile` or `pathLocalFile`. Those two structs implement the open and reopen lifecycle described in the requirements.

The PR copies a lot of code for `freadlink` from the Go standard library, as the standard library unfortunately only exposes a path-based interface, whereas this PR absolutely requires filehandles.

As a drive-by bugfix, the PR fixes xattr retrieval if a component of the backup source paths is a symlink. For example, if `/test` is a symlink to `/example`, which contains `/example/file`, then a backup of `/test/file` should retrieve all metadata for `/test` from `/example`. However, only basic metadata was collected from `/example`, whereas the xattrs were read from `/test`.

Remaining TODOs
Was the change previously discussed in an issue or on the forum?
Part of #5021
Builds upon #5143
Fixes #3098
Fixes #2165
Checklist
`changelog/unreleased/` that describes the changes for our users (see template).