storage: filesystem/mmap, Add PackScanner to handle large repos#1776
Conversation
3942799 to
d14e796
Compare
|
In my attempt to improve performance, I did come across this issue and did think about using mmap. There is one drawback to be aware of with mmap, you can end up with blocking the entire system thread on page fault while the OS fetch the missing file page. It is my understanding that the go ecosystem has been avoiding this system call sadly. There is a lot of benefit to it with reduce syscall, better OS interaction for memory usage. Still having this blocking an entire system thread, means that we could see latency spike everywhere. Way to minimize the problem would be to use a goroutine pined to a specific system thread and before touching any page send a read request to that goroutine to make sure that only one system thread at most would get blocked. Maybe another approach is to use io.ReaderAt. Or make the mmap code path optional. |
| ) | ||
|
|
||
| // mmapFile creates a memory-mapped region for the given file. | ||
| func mmapFile(f billy.File) ([]byte, func() error, error) { |
There was a problem hiding this comment.
What are your thoughts on adding Mmap to the billy? It'll be nice if other filesystems like memfs or customized ones support mmap.
There was a problem hiding this comment.
Mmap is a syscall that maps files or devices into memory, it is generally backed by a file descriptor. The file descriptor has been exposed in go-bilyl for the osfs abstractions.
Given that we already have an in-memory abstraction (memfs and memory storage), I'm not sure we would benefit from tying those to an OS-specific syscall.
There was a problem hiding this comment.
My thought was that the billy.Filesystem interface is conceptually based on an actual OS filesystem (especially posix). I’m aware that we probably wouldn’t benefit much from supporting mmap in filesystems like memfs, but I thought it should still be available for extensibility, similar to how we can mmap files on tmpfs.
I’m fine with either decision, though. It might be an unnecessary API addition for a very specific use case.
There was a problem hiding this comment.
To be fair I thought about this a bit more, and I actually agree that we should aim to push this into the filesystem abstraction layer. I created an issue to track that in go-billy as a separate effort.
| if o.diskType.IsDelta() { | ||
| data := o.scanner.packMmap[start : start+o.size] | ||
| br := sync.GetBufioReader(bytes.NewReader(data)) | ||
| return io.NopCloser(br), nil |
There was a problem hiding this comment.
| return io.NopCloser(br), nil | |
| rc := ioutil.NewReadCloser(br, ioutil.CloserFunc(func() error { | |
| sync.PutBufioReader(br) | |
| return nil | |
| })) | |
| return rc, nil |
It seems like currently br isn't being returned to the pool.
There was a problem hiding this comment.
Really nice spot, thanks for that. 🙇
| return io.NopCloser(br), nil | ||
| } | ||
|
|
||
| data := o.scanner.packMmap[start:] |
There was a problem hiding this comment.
| data := o.scanner.packMmap[start:] | |
| data := o.scanner.packMmap[start:start+o.size] |
If I understand correctly, Read should be limited to the size of the object.
There was a problem hiding this comment.
That was my initial implementation, but when trying to limit the returning slice here I had some consistency issues - IIRC some objects would break the Zlib process. Note that this is already in place for the raw reader (L95) - which is slightly more important in this case.
Given that this is just a new slice, pointing to the same underlying array, and that the Zlib process stops at the end of the compressed data, I'm not sure it is worth changing it. But happy to change if you find a way around the issue I mentioned above.
There was a problem hiding this comment.
Given that this is just a new slice, pointing to the same underlying array, and that the Zlib process stops at the end of the compressed data, I'm not sure it is worth changing it.
I wasn't aware that the zlib reader only consumes a single stream. I agree that it can stay as is. Thanks for letting me know.
| var baseOffset uint64 | ||
| var baseHash plumbing.Hash | ||
| var err error | ||
|
|
||
| if o.diskType == plumbing.OFSDeltaObject { | ||
| reader := bytes.NewReader(o.scanner.packMmap[pos:]) | ||
| negativeOffset, err := binary.ReadVariableWidthInt(reader) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to read OFS delta offset: %w", err) | ||
| } | ||
|
|
||
| baseOffset = uint64(o.offset) - uint64(negativeOffset) | ||
| consumed := len(o.scanner.packMmap[pos:]) - reader.Len() | ||
| pos += int64(consumed) | ||
| } else { | ||
| hashSize := o.scanner.hashSize | ||
| if pos+int64(hashSize) > int64(len(o.scanner.packMmap)) { | ||
| return fmt.Errorf("invalid REF delta: hash out of bounds") | ||
| } | ||
|
|
||
| baseHash, _ = plumbing.FromBytes(o.scanner.packMmap[pos : pos+int64(hashSize)]) | ||
| pos += int64(hashSize) | ||
| } | ||
|
|
||
| // Get the base object to inherit its type. | ||
| var base plumbing.EncodedObject | ||
| if o.diskType == plumbing.OFSDeltaObject { | ||
| base, err = o.scanner.GetByOffset(baseOffset) | ||
| } else { | ||
| base, err = o.scanner.Get(baseHash) | ||
| } |
There was a problem hiding this comment.
| var baseOffset uint64 | |
| var baseHash plumbing.Hash | |
| var err error | |
| if o.diskType == plumbing.OFSDeltaObject { | |
| reader := bytes.NewReader(o.scanner.packMmap[pos:]) | |
| negativeOffset, err := binary.ReadVariableWidthInt(reader) | |
| if err != nil { | |
| return fmt.Errorf("failed to read OFS delta offset: %w", err) | |
| } | |
| baseOffset = uint64(o.offset) - uint64(negativeOffset) | |
| consumed := len(o.scanner.packMmap[pos:]) - reader.Len() | |
| pos += int64(consumed) | |
| } else { | |
| hashSize := o.scanner.hashSize | |
| if pos+int64(hashSize) > int64(len(o.scanner.packMmap)) { | |
| return fmt.Errorf("invalid REF delta: hash out of bounds") | |
| } | |
| baseHash, _ = plumbing.FromBytes(o.scanner.packMmap[pos : pos+int64(hashSize)]) | |
| pos += int64(hashSize) | |
| } | |
| // Get the base object to inherit its type. | |
| var base plumbing.EncodedObject | |
| if o.diskType == plumbing.OFSDeltaObject { | |
| base, err = o.scanner.GetByOffset(baseOffset) | |
| } else { | |
| base, err = o.scanner.Get(baseHash) | |
| } | |
| // Get the base object to inherit its type. | |
| var base plumbing.EncodedObject | |
| var err error | |
| if o.diskType == plumbing.OFSDeltaObject { | |
| reader := bytes.NewReader(o.scanner.packMmap[pos:]) | |
| negativeOffset, err := binary.ReadVariableWidthInt(reader) | |
| if err != nil { | |
| return fmt.Errorf("failed to read OFS delta offset: %w", err) | |
| } | |
| baseOffset := uint64(o.offset) - uint64(negativeOffset) | |
| consumed := len(o.scanner.packMmap[pos:]) - reader.Len() | |
| pos += int64(consumed) | |
| base, err = o.scanner.GetByOffset(baseOffset) | |
| } else { | |
| hashSize := o.scanner.hashSize | |
| if pos+int64(hashSize) > int64(len(o.scanner.packMmap)) { | |
| return fmt.Errorf("invalid REF delta: hash out of bounds") | |
| } | |
| baseHash, _ := plumbing.FromBytes(o.scanner.packMmap[pos : pos+int64(hashSize)]) | |
| pos += int64(hashSize) | |
| base, err = o.scanner.Get(baseHash) | |
| } |
IMO it looks cleaner to limit the scope of baseOffset and baseHash.
There was a problem hiding this comment.
Nice one, completely agree. 👍
| } | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to get base object: %w", err) | ||
| } |
There was a problem hiding this comment.
These lines above seems to be duplicate of the logic in resolveMetadata. Moving this to a separate function could be a good idea.
There was a problem hiding this comment.
I agree with the sentiment, however a big part of the delta resolution logic is duplicated from what already exists in the packfile package - but in a slightly different shape and flow. A follow-up PR should take care of extracting that logic (potentially) to its own package that can then be used there and here.
| return s.offset(pos) | ||
| } | ||
|
|
||
| // FindOffset is deprecated. Use OffsetByHash instead. |
There was a problem hiding this comment.
| // FindOffset is deprecated. Use OffsetByHash instead. | |
| // Deprecated: Use OffsetByHash instead. |
This enables some great features. https://go.dev/wiki/Deprecated
There was a problem hiding this comment.
I thought about this a bit more, and I think we are better off completely dropping off these deprecation for the time being. Ideally that would be done across all the "Pack handlers", so it can be done as a separate PR later.
| return id, nil | ||
| } | ||
|
|
||
| // FindHash is deprecated. Use HashByOffset instead. |
There was a problem hiding this comment.
| // FindHash is deprecated. Use HashByOffset instead. | |
| // Deprecated: Use HashByOffset instead. |
Same as above.
| // resetGlobalSyncPools resets the global sync pools. This is needed as | ||
| // the Suite execution ends up copying global vars by value, which in this | ||
| // case specifically results in the `New` becoming nil. | ||
| func resetGlobalSyncPools() { | ||
| bufioPool = sync.Pool{New: func() any { | ||
| return bufio.NewReader(nil) | ||
| }} | ||
| } |
There was a problem hiding this comment.
I'm struggling to find out how this exactly happens. I would really appreciate it if you could elaborate on this.
There was a problem hiding this comment.
Not sure I understand the question, a) are unsure on why the problem occurs or b) on how the fix works?
For the former, it seems that Testify.Suite would value copy global vars to ensure the different suites instances don't impact on one another. sync.Pool objects are not meant to be copied, so once you have 2-3 Suites tests you can step debug and see New set to nil, which leads to a panic (due to the reset that follows it).
As for the latter, I'm using linkname (L57) to make the remote sync.bufioReader available via the local bufioPool var. That way I can reset it to a new sync.Pool.
There was a problem hiding this comment.
Thanks for answering my ambiguous question. 😅
I was curious about the former, that testify.Suite can copy global variables.
sync.Poolobjects are not meant to be copied, so once you have 2-3 Suites tests you can step debug and seeNewset tonil, which leads to a panic (due to the reset that follows it).
However, in my local debug this seems to be not the case. I have debugged TestPackfileWithFS and got the result below.
bufioReader.New wasn't nil. It was x (the value from the pool) being non-nil, with nil data inside.
I believe the panic was caused by Packfile.Close() putting nil reader(p.rbuf) to the pool, specifically on the tests of FindHash() and FindOffset(). These two methods don't call p.init() to initialize p.rbuf, as they belong to idxfile.
Removing resetGlobalSyncPools() from SetupTest and calling handler.Get() in those tests to manually trigger p.init() fixed the issue for me.
I think we should make Packfile.Close() handle this case.
There was a problem hiding this comment.
For future reference, while debugging this without resetGlobalSyncPools, r is nil as the pool is empty and the func New is nil and therefore not called (as stepped into golib code). As per stack trace, init is being called:
--- FAIL: TestPackfileEmbedFS (0.00s)
--- FAIL: TestPackfileEmbedFS/TestGet (0.00s)
--- FAIL: TestPackfileEmbedFS/TestGet/valid_commit_hash (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered, repanicked]
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x7b4e18]
goroutine 29 [running]:
testing.tRunner.func1.2({0x89eaa0, 0x3b8c610})
/usr/lib64/go/1.25/src/testing/testing.go:1872 +0x237
testing.tRunner.func1()
/usr/lib64/go/1.25/src/testing/testing.go:1875 +0x35b
panic({0x89eaa0?, 0x3b8c610?})
/usr/lib64/go/1.25/src/runtime/panic.go:783 +0x132
bufio.(*Reader).Reset(...)
/usr/lib64/go/1.25/src/bufio/bufio.go:81
github.com/go-git/go-git/v6/utils/sync.GetBufioReader({0x0, 0x0})
/home/levi/git/go-git/go-git/utils/sync/bufio.go:23 +0x78
github.com/go-git/go-git/v6/plumbing/format/packfile.(*Packfile).init.func1()
/home/levi/git/go-git/go-git/plumbing/format/packfile/packfile.go:211 +0x39
sync.(*Once).doSlow(0x3b46440?, 0x8dd400?)
/usr/lib64/go/1.25/src/sync/once.go:78 +0xac
sync.(*Once).Do(...)
/usr/lib64/go/1.25/src/sync/once.go:69
github.com/go-git/go-git/v6/plumbing/format/packfile.(*Packfile).init(0xc0000c6370)
/home/levi/git/go-git/go-git/plumbing/format/packfile/packfile.go:200 +0x45
github.com/go-git/go-git/v6/plumbing/format/packfile.(*Packfile).Get(0xc0000c6370, {{0xe8, 0xd3, 0xff, 0xab, 0x55, 0x28, 0x95, 0xc1, 0x9b, ...}, ...})
/home/levi/git/go-git/go-git/plumbing/format/packfile/packfile.go:69 +0x37
github.com/go-git/go-git/v6/tests/pack_test.(*PackHandlerSuite[...]).TestGet.func1()
/home/levi/git/go-git/go-git/tests/pack/pack_test.go:280 +0x52
testing.tRunner(0xc0001eb180, 0xc0000e0510)
/usr/lib64/go/1.25/src/testing/testing.go:1934 +0xea
created by testing.(*T).Run in goroutine 28
/usr/lib64/go/1.25/src/testing/testing.go:1997 +0x465
FAIL github.com/go-git/go-git/v6/tests/pack 0.008s
FAIL
We can leave this as is for now, and review on a follow-up PR as this is related to the testing code as opposed to the main change proposed.
@cedric-appdirect Thanks for pointing that out. For context, handling very large repositories with the vanilla filesystem implementation results in GB's of memory churn. That combined with the impact of the GC pressure results in a very tough time for memory-constraint machines, as that may starve the rest of the system of both CPU and memory. Now considering the benchmark I shared, the mmap implementation has a rough cost of
It would make a lot of sense to use that to add support for Windows. Perhaps that becomes one additional back-end for the PackScanner.
This is already optional - as per PR description and linked issue. |
Absolutely. mmap is really good at working with the OS at evicting page and only bringing and keeping the bit of the file actively needed in memory.
In general, this is true. It is more an edge case scenario problem. Once your process is under memory pressure, things can become really bad. The closer I can describe it is when you computer start to be using the swap for data needed by active code, there is nothing that can bring back that computer to live. You are good to restart. That can get that bad. The main concern not being for the performance of go-git, but the entire go application itself as it could end up with all its system thread being blocked (the less CPU cores you have the more likely that would be to a problem). This would add latency to unrelated goroutine even if the go-git one would perform better than previous version.
Or for people that use the package inside things like web server. It would be less efficient/performant, I would expect, than the mmap variant, but without the random latency penalty.
Excellent, I am not yet accustomed to the full code base and missed it. Sorry. |
|
I may have an interest in implementing the io.ReaderAt version, as I use go-git in web service, once this PR is merged. Let me know if you do not plan to work on it. |
The latest version of both dependencies enable the use of OSFixture and Fd(), both being requirements to test the mmap implementation. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
The new PackScanner is able to navigate through a packfile and inflate objects on demand. This avoids costly operations such as loading into memory the entire index and resolving all delta objects which may or may not be used. Performance-wise, the impact is negligible for small repositories, with the exception of the 60% drop in number of allocations: BenchmarkPackHandlers/packfile-cache-osfs:_basic-fixture-16 4020 288678 ns/op 294714 B/op 743 allocs/op BenchmarkPackHandlers/packfile-nocache-memfs:basic-fixture-16 4272 280586 ns/op 294884 B/op 742 allocs/op BenchmarkPackHandlers/mmap-pack-scanner:basic-fixture-16 5646 212349 ns/op 293650 B/op 286 allocs/op In very large repositories is where most gains can be observed. For the linux kernel, the memory churn of fetching 4 objects drops from 1.3GB to 1MB. Both allocations and time per operation are a fraction of the existing implementation: BenchmarkPackHandlers/packfile-cache-osfs:_linux-kernel-16 1 2240495709 ns/op 1331433920 B/op 35933 allocs/op BenchmarkPackHandlers/packfile-nocache-memfs:linux-kernel-16 1 2026880001 ns/op 1331437040 B/op 35935 allocs/op BenchmarkPackHandlers/mmap-pack-scanner:linux-kernel-16 477 2513275 ns/op 1011512 B/op 1488 allocs/op Signed-off-by: Paulo Gomes <pjbgf@linux.com>
onee-only
left a comment
There was a problem hiding this comment.
LGTM. For some unknown reason, I can't resolve those reviews.
The new PackScanner is able to navigate through a packfile and inflate objects on demand. This avoids costly operations such as loading into memory the entire index and resolving all delta objects which may or may not be used.
Performance-wise, the impact is negligible for small repositories, with the exception of the 60% drop in number of allocations:
In very large repositories is where most gains can be observed. For the linux kernel, the memory churn of fetching 4 objects drops from
1.3GBto1MB.Both allocations and time per operation are a fraction of the existing implementation:
Additionally, this PR:
Relates to: #1451 #1691 #1673 #1601.
Part of: #1774.
This is part of a openSUSE Hack Week 25 project.
Special thanks to @runxiyu for the initial nudge on mmap, which triggered the research that led to this PR. 🙇