storage: filesystem/mmap, Add PackScanner to handle large repos by pjbgf · Pull Request #1776 · go-git/go-git

pjbgf · 2025-12-03T23:15:40Z

The new PackScanner is able to navigate through a packfile and inflate objects on demand. This avoids costly operations such as loading into memory the entire index and resolving all delta objects which may or may not be used.

Performance-wise, the impact is negligible for small repositories, with the exception of the 60% drop in number of allocations:

BenchmarkPackHandlers/packfile-cache-osfs:_basic-fixture-16   	    4020	    288678 ns/op	  294714 B/op	     743 allocs/op
BenchmarkPackHandlers/packfile-nocache-memfs:basic-fixture-16 	    4272	    280586 ns/op	  294884 B/op	     742 allocs/op
BenchmarkPackHandlers/mmap-pack-scanner:basic-fixture-16      	    5646	    212349 ns/op	  293650 B/op	     286 allocs/op

In very large repositories is where most gains can be observed. For the linux kernel, the memory churn of fetching 4 objects drops from 1.3GB to 1MB.
Both allocations and time per operation are a fraction of the existing implementation:

BenchmarkPackHandlers/packfile-cache-osfs:_linux-kernel-16    	       1	2240495709 ns/op	1331433920 B/op	   35933 allocs/op
BenchmarkPackHandlers/packfile-nocache-memfs:linux-kernel-16  	       1	2026880001 ns/op	1331437040 B/op	   35935 allocs/op
BenchmarkPackHandlers/mmap-pack-scanner:linux-kernel-16       	     477	   2513275 ns/op	   1011512 B/op	    1488 allocs/op

Additionally, this PR:

Refactors some internal pack functions so that it can be used by new and existing "Pack Handlers".
Creates a new test/ dir, to amass tests that goes across implementations. This is also the home for some high-level benchmarks, which will help monitor performance improvements over time.

Relates to: #1451 #1691 #1673 #1601.
Part of: #1774.

This is part of a openSUSE Hack Week 25 project.

Special thanks to @runxiyu for the initial nudge on mmap, which triggered the research that led to this PR. 🙇

cedric-appdirect · 2025-12-04T16:30:11Z

In my attempt to improve performance, I did come across this issue and did think about using mmap. There is one drawback to be aware of with mmap, you can end up with blocking the entire system thread on page fault while the OS fetch the missing file page. It is my understanding that the go ecosystem has been avoiding this system call sadly. There is a lot of benefit to it with reduce syscall, better OS interaction for memory usage. Still having this blocking an entire system thread, means that we could see latency spike everywhere. Way to minimize the problem would be to use a goroutine pined to a specific system thread and before touching any page send a read request to that goroutine to make sure that only one system thread at most would get blocked. Maybe another approach is to use io.ReaderAt. Or make the mmap code path optional.

onee-only · 2025-12-04T14:57:01Z

+)
+
+// mmapFile creates a memory-mapped region for the given file.
+func mmapFile(f billy.File) ([]byte, func() error, error) {


What are your thoughts on adding Mmap to the billy? It'll be nice if other filesystems like memfs or customized ones support mmap.

Mmap is a syscall that maps files or devices into memory, it is generally backed by a file descriptor. The file descriptor has been exposed in go-bilyl for the osfs abstractions.

Given that we already have an in-memory abstraction (memfs and memory storage), I'm not sure we would benefit from tying those to an OS-specific syscall.

My thought was that the billy.Filesystem interface is conceptually based on an actual OS filesystem (especially posix). I’m aware that we probably wouldn’t benefit much from supporting mmap in filesystems like memfs, but I thought it should still be available for extensibility, similar to how we can mmap files on tmpfs.

I’m fine with either decision, though. It might be an unnecessary API addition for a very specific use case.

To be fair I thought about this a bit more, and I actually agree that we should aim to push this into the filesystem abstraction layer. I created an issue to track that in go-billy as a separate effort.

onee-only · 2025-12-04T15:06:23Z

+	if o.diskType.IsDelta() {
+		data := o.scanner.packMmap[start : start+o.size]
+		br := sync.GetBufioReader(bytes.NewReader(data))
+		return io.NopCloser(br), nil


Suggested change

return io.NopCloser(br), nil

rc := ioutil.NewReadCloser(br, ioutil.CloserFunc(func() error {

sync.PutBufioReader(br)

return nil

}))

return rc, nil

It seems like currently br isn't being returned to the pool.

Really nice spot, thanks for that. 🙇

onee-only · 2025-12-04T15:10:13Z

+		return io.NopCloser(br), nil
+	}
+
+	data := o.scanner.packMmap[start:]


Suggested change

data := o.scanner.packMmap[start:]

data := o.scanner.packMmap[start:start+o.size]

If I understand correctly, Read should be limited to the size of the object.

That was my initial implementation, but when trying to limit the returning slice here I had some consistency issues - IIRC some objects would break the Zlib process. Note that this is already in place for the raw reader (L95) - which is slightly more important in this case.

Given that this is just a new slice, pointing to the same underlying array, and that the Zlib process stops at the end of the compressed data, I'm not sure it is worth changing it. But happy to change if you find a way around the issue I mentioned above.

Given that this is just a new slice, pointing to the same underlying array, and that the Zlib process stops at the end of the compressed data, I'm not sure it is worth changing it.

I wasn't aware that the zlib reader only consumes a single stream. I agree that it can stay as is. Thanks for letting me know.

onee-only · 2025-12-04T15:22:19Z

+	var baseOffset uint64
+	var baseHash plumbing.Hash
+	var err error
+
+	if o.diskType == plumbing.OFSDeltaObject {
+		reader := bytes.NewReader(o.scanner.packMmap[pos:])
+		negativeOffset, err := binary.ReadVariableWidthInt(reader)
+		if err != nil {
+			return fmt.Errorf("failed to read OFS delta offset: %w", err)
+		}
+
+		baseOffset = uint64(o.offset) - uint64(negativeOffset)
+		consumed := len(o.scanner.packMmap[pos:]) - reader.Len()
+		pos += int64(consumed)
+	} else {
+		hashSize := o.scanner.hashSize
+		if pos+int64(hashSize) > int64(len(o.scanner.packMmap)) {
+			return fmt.Errorf("invalid REF delta: hash out of bounds")
+		}
+
+		baseHash, _ = plumbing.FromBytes(o.scanner.packMmap[pos : pos+int64(hashSize)])
+		pos += int64(hashSize)
+	}
+
+	// Get the base object to inherit its type.
+	var base plumbing.EncodedObject
+	if o.diskType == plumbing.OFSDeltaObject {
+		base, err = o.scanner.GetByOffset(baseOffset)
+	} else {
+		base, err = o.scanner.Get(baseHash)
+	}


Suggested change

var baseOffset uint64

var baseHash plumbing.Hash

var err error

if o.diskType == plumbing.OFSDeltaObject {

reader := bytes.NewReader(o.scanner.packMmap[pos:])

negativeOffset, err := binary.ReadVariableWidthInt(reader)

if err != nil {

return fmt.Errorf("failed to read OFS delta offset: %w", err)

}

baseOffset = uint64(o.offset) - uint64(negativeOffset)

consumed := len(o.scanner.packMmap[pos:]) - reader.Len()

pos += int64(consumed)

} else {

hashSize := o.scanner.hashSize

if pos+int64(hashSize) > int64(len(o.scanner.packMmap)) {

return fmt.Errorf("invalid REF delta: hash out of bounds")

}

baseHash, _ = plumbing.FromBytes(o.scanner.packMmap[pos : pos+int64(hashSize)])

pos += int64(hashSize)

}

// Get the base object to inherit its type.

var base plumbing.EncodedObject

if o.diskType == plumbing.OFSDeltaObject {

base, err = o.scanner.GetByOffset(baseOffset)

} else {

base, err = o.scanner.Get(baseHash)

}

// Get the base object to inherit its type.

var base plumbing.EncodedObject

var err error

if o.diskType == plumbing.OFSDeltaObject {

reader := bytes.NewReader(o.scanner.packMmap[pos:])

negativeOffset, err := binary.ReadVariableWidthInt(reader)

if err != nil {

return fmt.Errorf("failed to read OFS delta offset: %w", err)

}

baseOffset := uint64(o.offset) - uint64(negativeOffset)

consumed := len(o.scanner.packMmap[pos:]) - reader.Len()

pos += int64(consumed)

base, err = o.scanner.GetByOffset(baseOffset)

} else {

hashSize := o.scanner.hashSize

if pos+int64(hashSize) > int64(len(o.scanner.packMmap)) {

return fmt.Errorf("invalid REF delta: hash out of bounds")

}

baseHash, _ := plumbing.FromBytes(o.scanner.packMmap[pos : pos+int64(hashSize)])

pos += int64(hashSize)

base, err = o.scanner.Get(baseHash)

}

IMO it looks cleaner to limit the scope of baseOffset and baseHash.

Nice one, completely agree. 👍

onee-only · 2025-12-04T15:31:56Z

+	}
+	if err != nil {
+		return nil, fmt.Errorf("failed to get base object: %w", err)
+	}


These lines above seems to be duplicate of the logic in resolveMetadata. Moving this to a separate function could be a good idea.

I agree with the sentiment, however a big part of the delta resolution logic is duplicated from what already exists in the packfile package - but in a slightly different shape and flow. A follow-up PR should take care of extracting that logic (potentially) to its own package that can then be used there and here.

onee-only · 2025-12-04T15:44:55Z

+	return s.offset(pos)
+}
+
+// FindOffset is deprecated. Use OffsetByHash instead.


Suggested change

// FindOffset is deprecated. Use OffsetByHash instead.

// Deprecated: Use OffsetByHash instead.

This enables some great features. https://go.dev/wiki/Deprecated

I thought about this a bit more, and I think we are better off completely dropping off these deprecation for the time being. Ideally that would be done across all the "Pack handlers", so it can be done as a separate PR later.

onee-only · 2025-12-04T15:48:00Z

+	return id, nil
+}
+
+// FindHash is deprecated. Use HashByOffset instead.


Suggested change

// FindHash is deprecated. Use HashByOffset instead.

// Deprecated: Use HashByOffset instead.

Same as above.

onee-only · 2025-12-04T16:17:09Z

+// resetGlobalSyncPools resets the global sync pools. This is needed as
+// the Suite execution ends up copying global vars by value, which in this
+// case specifically results in the `New` becoming nil.
+func resetGlobalSyncPools() {
+	bufioPool = sync.Pool{New: func() any {
+		return bufio.NewReader(nil)
+	}}
+}


I'm struggling to find out how this exactly happens. I would really appreciate it if you could elaborate on this.

Not sure I understand the question, a) are unsure on why the problem occurs or b) on how the fix works?

For the former, it seems that Testify.Suite would value copy global vars to ensure the different suites instances don't impact on one another. sync.Pool objects are not meant to be copied, so once you have 2-3 Suites tests you can step debug and see New set to nil, which leads to a panic (due to the reset that follows it).

As for the latter, I'm using linkname (L57) to make the remote sync.bufioReader available via the local bufioPool var. That way I can reset it to a new sync.Pool.

Thanks for answering my ambiguous question. 😅
I was curious about the former, that testify.Suite can copy global variables.

sync.Pool objects are not meant to be copied, so once you have 2-3 Suites tests you can step debug and see New set to nil, which leads to a panic (due to the reset that follows it).

However, in my local debug this seems to be not the case. I have debugged TestPackfileWithFS and got the result below.

bufioReader.New wasn't nil. It was x (the value from the pool) being non-nil, with nil data inside.

I believe the panic was caused by Packfile.Close() putting nil reader(p.rbuf) to the pool, specifically on the tests of FindHash() and FindOffset(). These two methods don't call p.init() to initialize p.rbuf, as they belong to idxfile.

Removing resetGlobalSyncPools() from SetupTest and calling handler.Get() in those tests to manually trigger p.init() fixed the issue for me.

I think we should make Packfile.Close() handle this case.

For future reference, while debugging this without resetGlobalSyncPools, r is nil as the pool is empty and the func New is nil and therefore not called (as stepped into golib code). As per stack trace, init is being called:

--- FAIL: TestPackfileEmbedFS (0.00s) --- FAIL: TestPackfileEmbedFS/TestGet (0.00s) --- FAIL: TestPackfileEmbedFS/TestGet/valid_commit_hash (0.00s) panic: runtime error: invalid memory address or nil pointer dereference [recovered, repanicked] [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x7b4e18] goroutine 29 [running]: testing.tRunner.func1.2({0x89eaa0, 0x3b8c610}) /usr/lib64/go/1.25/src/testing/testing.go:1872 +0x237 testing.tRunner.func1() /usr/lib64/go/1.25/src/testing/testing.go:1875 +0x35b panic({0x89eaa0?, 0x3b8c610?}) /usr/lib64/go/1.25/src/runtime/panic.go:783 +0x132 bufio.(*Reader).Reset(...) /usr/lib64/go/1.25/src/bufio/bufio.go:81 github.com/go-git/go-git/v6/utils/sync.GetBufioReader({0x0, 0x0}) /home/levi/git/go-git/go-git/utils/sync/bufio.go:23 +0x78 github.com/go-git/go-git/v6/plumbing/format/packfile.(*Packfile).init.func1() /home/levi/git/go-git/go-git/plumbing/format/packfile/packfile.go:211 +0x39 sync.(*Once).doSlow(0x3b46440?, 0x8dd400?) /usr/lib64/go/1.25/src/sync/once.go:78 +0xac sync.(*Once).Do(...) /usr/lib64/go/1.25/src/sync/once.go:69 github.com/go-git/go-git/v6/plumbing/format/packfile.(*Packfile).init(0xc0000c6370) /home/levi/git/go-git/go-git/plumbing/format/packfile/packfile.go:200 +0x45 github.com/go-git/go-git/v6/plumbing/format/packfile.(*Packfile).Get(0xc0000c6370, {{0xe8, 0xd3, 0xff, 0xab, 0x55, 0x28, 0x95, 0xc1, 0x9b, ...}, ...}) /home/levi/git/go-git/go-git/plumbing/format/packfile/packfile.go:69 +0x37 github.com/go-git/go-git/v6/tests/pack_test.(*PackHandlerSuite[...]).TestGet.func1() /home/levi/git/go-git/go-git/tests/pack/pack_test.go:280 +0x52 testing.tRunner(0xc0001eb180, 0xc0000e0510) /usr/lib64/go/1.25/src/testing/testing.go:1934 +0xea created by testing.(*T).Run in goroutine 28 /usr/lib64/go/1.25/src/testing/testing.go:1997 +0x465 FAIL github.com/go-git/go-git/v6/tests/pack 0.008s FAIL

We can leave this as is for now, and review on a follow-up PR as this is related to the testing code as opposed to the main change proposed.

pjbgf · 2025-12-04T22:24:40Z

There is one drawback to be aware of with mmap, you can end up with blocking the entire system thread on page fault while the OS fetch the missing file page.

@cedric-appdirect Thanks for pointing that out. For context, handling very large repositories with the vanilla filesystem implementation results in GB's of memory churn. That combined with the impact of the GC pressure results in a very tough time for memory-constraint machines, as that may starve the rest of the system of both CPU and memory.

Now considering the benchmark I shared, the mmap implementation has a rough cost of 2.5ms per operation. The page fault would need to make it at least 500 times slower, for the performance to match the current implementation.

Maybe another approach is to use io.ReaderAt.

It would make a lot of sense to use that to add support for Windows. Perhaps that becomes one additional back-end for the PackScanner.

Or make the mmap code path optional.

This is already optional - as per PR description and linked issue.

cedric-appdirect · 2025-12-04T23:44:45Z

There is one drawback to be aware of with mmap, you can end up with blocking the entire system thread on page fault while the OS fetch the missing file page.

@cedric-appdirect Thanks for pointing that out. For context, handling very large repositories with the vanilla filesystem implementation results in GB's of memory churn. That combined with the impact of the GC pressure results in a very tough time for memory-constraint machines, as that may starve the rest of the system of both CPU and memory.

Absolutely. mmap is really good at working with the OS at evicting page and only bringing and keeping the bit of the file actively needed in memory.

Now considering the benchmark I shared, the mmap implementation has a rough cost of 2.5ms per operation. The page fault would need to make it at least 500 times slower, for the performance to match the current implementation.

In general, this is true. It is more an edge case scenario problem. Once your process is under memory pressure, things can become really bad. The closer I can describe it is when you computer start to be using the swap for data needed by active code, there is nothing that can bring back that computer to live. You are good to restart. That can get that bad. The main concern not being for the performance of go-git, but the entire go application itself as it could end up with all its system thread being blocked (the less CPU cores you have the more likely that would be to a problem). This would add latency to unrelated goroutine even if the go-git one would perform better than previous version.

Maybe another approach is to use io.ReaderAt.

It would make a lot of sense to use that to add support for Windows. Perhaps that becomes one additional back-end for the PackScanner.

Or for people that use the package inside things like web server. It would be less efficient/performant, I would expect, than the mmap variant, but without the random latency penalty.

Or make the mmap code path optional.

This is already optional - as per PR description and linked issue.

Excellent, I am not yet accustomed to the full code base and missed it. Sorry.

cedric-appdirect · 2025-12-04T23:47:39Z

I may have an interest in implementing the io.ReaderAt version, as I use go-git in web service, once this PR is merged. Let me know if you do not plan to work on it.

The latest version of both dependencies enable the use of OSFixture and Fd(), both being requirements to test the mmap implementation. Signed-off-by: Paulo Gomes <pjbgf@linux.com>

Signed-off-by: Paulo Gomes <pjbgf@linux.com>

The new PackScanner is able to navigate through a packfile and inflate objects on demand. This avoids costly operations such as loading into memory the entire index and resolving all delta objects which may or may not be used. Performance-wise, the impact is negligible for small repositories, with the exception of the 60% drop in number of allocations: BenchmarkPackHandlers/packfile-cache-osfs:_basic-fixture-16 4020 288678 ns/op 294714 B/op 743 allocs/op BenchmarkPackHandlers/packfile-nocache-memfs:basic-fixture-16 4272 280586 ns/op 294884 B/op 742 allocs/op BenchmarkPackHandlers/mmap-pack-scanner:basic-fixture-16 5646 212349 ns/op 293650 B/op 286 allocs/op In very large repositories is where most gains can be observed. For the linux kernel, the memory churn of fetching 4 objects drops from 1.3GB to 1MB. Both allocations and time per operation are a fraction of the existing implementation: BenchmarkPackHandlers/packfile-cache-osfs:_linux-kernel-16 1 2240495709 ns/op 1331433920 B/op 35933 allocs/op BenchmarkPackHandlers/packfile-nocache-memfs:linux-kernel-16 1 2026880001 ns/op 1331437040 B/op 35935 allocs/op BenchmarkPackHandlers/mmap-pack-scanner:linux-kernel-16 477 2513275 ns/op 1011512 B/op 1488 allocs/op Signed-off-by: Paulo Gomes <pjbgf@linux.com>

onee-only

LGTM. For some unknown reason, I can't resolve those reviews.

pjbgf mentioned this pull request Dec 3, 2025

Introduce a Pack Scanner for handling large repositories #1774

Open

3 tasks

pjbgf force-pushed the perf2 branch 7 times, most recently from 3942799 to d14e796 Compare December 4, 2025 13:17

onee-only reviewed Dec 4, 2025

View reviewed changes

pjbgf mentioned this pull request Dec 5, 2025

Built-in support for mmap go-git/go-billy#175

Open

pjbgf force-pushed the perf2 branch from d14e796 to 7117323 Compare December 5, 2025 12:06

pjbgf added 2 commits December 5, 2025 12:07

build: Bump go-billy and go-git-fixtures

e2ffb47

The latest version of both dependencies enable the use of OSFixture and Fd(), both being requirements to test the mmap implementation. Signed-off-by: Paulo Gomes <pjbgf@linux.com>

plumbing: format/packfile, Extract pack and patch logic into new package

0f1f07a

Signed-off-by: Paulo Gomes <pjbgf@linux.com>

github-advanced-security AI found potential problems Dec 5, 2025

View reviewed changes

Comment thread storage/filesystem/mmap/fsobject.go Fixed

pjbgf force-pushed the perf2 branch from 7117323 to 5086513 Compare December 5, 2025 12:12

github-advanced-security AI found potential problems Dec 5, 2025

View reviewed changes

Comment thread storage/filesystem/mmap/fsobject.go Dismissed

onee-only approved these changes Dec 5, 2025

View reviewed changes

pjbgf merged commit 4aac0c5 into go-git:main Dec 5, 2025
16 checks passed

pjbgf deleted the perf2 branch December 5, 2025 12:56

Bluebugs mentioned this pull request Dec 23, 2025

storage: filesystem/readerat, Add PackScanner to handle repository more efficiently in a portable way #1809

Closed

pjbgf mentioned this pull request Jan 20, 2026

Adopt use of .rev files for reverse index lookup #1819

Closed

7 tasks

naemono mentioned this pull request Feb 3, 2026

Optimize memory usage of clone operation in operatorhub tool elastic/cloud-on-k8s#9104

Merged

haines mentioned this pull request May 6, 2026

storage: filesystem/mmap, Surface error getting object by delta offset #2063

Open

-		return io.NopCloser(br), nil
+		rc := ioutil.NewReadCloser(br, ioutil.CloserFunc(func() error {
+			sync.PutBufioReader(br)
+			return nil
+		}))
+		return rc, nil

	data := o.scanner.packMmap[start:]
	data := o.scanner.packMmap[start:start+o.size]

	// FindOffset is deprecated. Use OffsetByHash instead.
	// Deprecated: Use OffsetByHash instead.

	// FindHash is deprecated. Use HashByOffset instead.
	// Deprecated: Use HashByOffset instead.

Conversation

pjbgf commented Dec 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cedric-appdirect commented Dec 4, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

onee-only Dec 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pjbgf commented Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cedric-appdirect commented Dec 4, 2025

Uh oh!

cedric-appdirect commented Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

onee-only left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

pjbgf commented Dec 3, 2025 •

edited

Loading

onee-only Dec 5, 2025 •

edited

Loading

pjbgf commented Dec 4, 2025 •

edited

Loading

cedric-appdirect commented Dec 4, 2025 •

edited

Loading