fileutils: new function to check for path existence#1875
fileutils: new function to check for path existence#1875openshift-merge-bot[bot] merged 13 commits intocontainers:mainfrom
Conversation
pkg/fileutils/fileutils_darwin.go
Outdated
| } | ||
|
|
||
| // PathExists checks whether a file or directory exists at the given path. | ||
| func PathExists(path string, followSymlinks bool) error { |
There was a problem hiding this comment.
Should the name of this function just return an err.
err:=fileutils.Exist(PATH);
err = nil -> True
err == ENOEXIST == False
err == (!nil && not ENOEXIST) == err.
There was a problem hiding this comment.
Then perhaps have a fileutils.Lexist(PATH) for not follow symlinks.
There was a problem hiding this comment.
I didn't change the type of the returned value because it is easier to replace the existing calls to os.Stat(). Every existing call to _, err := os.Stat(...) simply becomes err := fileutils.PathExists(..., true) and the existing code for handling err is the same.
There was a problem hiding this comment.
I don't see the difference, since my function will do the same, except there is two functions, Exists, and Lexists, which is similar to Stat and Lstat.
There was a problem hiding this comment.
pushed a new version with Exist and Lexist
|
LGTM, can you squash the commits. |
|
I think keeping them separate helps with the review and potentially the backport. That's what @mtrmac asked me to do last time I've sent a big patch |
|
Except that you are really only making a single change here. Adding two functions and then using them everywhere. I don't see any difference between the changes, so no need to split them out. |
pkg/fileutils/fileutils_darwin.go
Outdated
|
|
||
| // Exist checks whether a file or directory exists at the given path. | ||
| func Exist(path string) error { | ||
| _, err := os.Stat(path) |
There was a problem hiding this comment.
golang.org/x/sys/unix defines an Faccessat() on Darwin. What's the rationale for not using it?
There was a problem hiding this comment.
I've no way to test it (and if it is worth in terms of performance)
There was a problem hiding this comment.
FWIW, it works the same (the test just needs the OS check removed), and
BenchmarkExist-10 413298 2890 ns/op
BenchmarkStat-10 333810 3559 ns/op
… but also, the tests of that package are now failing on macOS (because /tmp is a symlink), we are clearly not running them.
So, at this moment I’d also lean towards keeping the simplest possible implementation. (Or, alternatively, we could start testing things in CI, but I can’t build a case for doing that based on this PR alone.)
Maybe add a todo here to help future maintainers?
There was a problem hiding this comment.
thanks for testing it, I will add it to Darwin too
mtrmac
left a comment
There was a problem hiding this comment.
- Efficiency improvements are always nice
- Looking at
git grep -C5 '_, .*os.Lstat'andit grep -C5 '_, .*os.Stat', there seem to be more places where this could be used — e.g.drivers/aufs, and other files inpkg/archive. That’s not blocking this PR, of course. - (Aesthetically, the name feels a bit off to me.
fileutils.{Exists,Lexists}would be a small improvement, but really I’d expect a function named like that to return abool, not anerror. OTOH the returned error is clearly useful… and I can’t think of a good name.EnsurePathExistsis so ugly.) - Do we want this in a ~stable public API of the module? If it is expected to be beneficial in other repos like Podman, sure, let’s make it stable. OTOH making it private would make the naming concerns much less relevant. I don’t really have an opinion here, I just want to make sure this is an intentional decision.
My personal view of this is that it is ideal if there is one idea per patch. Introduce one new feature. Fix one bug, or a homogenous class of bugs. Do one refactor. (Ideally the refactors are so trivial that in case of a merge conflict, they could just be thrown away and made again just based on the commit description.) In that sense, the migration to the new API could well be a single commit. OTOH more commits never really hurt — and they make allocating time for reviews a tiny bit easier, because I know there are natural stopping points and I know before starting to look at that patch that I can make definite progress in 15 minutes, (as opposed to allocating an hour or two to stare at a huge do-12-things commit; that can happen only a few times a day). Or course, everyone thinks and works a bit differently. Seeing small commits probably makes me more likely to take on the review, and to do it quickly — OTOH there is a definite cost to creating those commits in the first place. |
872d522 to
b1b8f9a
Compare
I've fixed these callers too.
Renamed the functions to `fileutils.{Exists, Lexists}. My first version of the API looked like
Yes, let's make it stable and fix the same pattern in other repos. |
mtrmac
left a comment
There was a problem hiding this comment.
LGTM; let’s maybe wait a few hours to confirm consensus on the macOS situation.
mtrmac
left a comment
There was a problem hiding this comment.
LGTM other than that one thing, feel free to merge without another review.
On unix they use the access syscall which is faster than stat for simply checking the existence of a file. with a file in the cache I see on my machine: $ go test -bench=. goos: linux goarch: amd64 pkg: github.com/containers/storage/pkg/fileutils cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz BenchmarkExists-12 925832 1245 ns/op BenchmarkStat-12 745896 1602 ns/op PASS I'd expect even better results if the file is not already in the cache. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
@nalind PTAL |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe, rhatdan The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
fix for c/common: containers/common#1953 |
On unix it uses the access syscall which is faster than stat for simply checking the existence of a file.
with a file in the cache I see on my machine:
$ go test -bench=.
goos: linux
goarch: amd64
pkg: github.com/containers/storage/pkg/fileutils
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
BenchmarkPathExists-12 1000000 1037 ns/op
BenchmarkStat-12 897534 1330 ns/op
I'd expect even better results if the file is not already in the cache.
Signed-off-by: Giuseppe Scrivano gscrivan@redhat.com