[DRAFT] plumbing: fully support TREE, REUC, LINK, UNTR, EOIE, FSMN, IEOT index extensions #1622

Draft
christian-roggia wants to merge 12 commits into go-git:main from christian-roggia:main

Conversation

christian-roggia commented Aug 9, 2025

Contributor

This pull request introduces full support for the TREE, REUC, LINK, UNTR, FSMN, IEOT, and EOIE index extensions. Partial decoding support for the TREE, EOIE and REUC extensions already existed, but encoding was missing. There are a few other official extensions not yet implemented, which can be added in future updates.

I would appreciate an initial review of these changes as I continue testing and validating support for the new index extensions in our environment.

christian-roggia commented Aug 9, 2025

Contributor Author

NOTE: The TREE extension decoder has been updated to match the behavior of the C implementation. The decoder should continue reading the entry value until a newline is encountered to ensure the buffer advances correctly to the next entry. Additionally, invalidated TREE entries should be preserved rather than discarded as in the original logic. Preserving these entries retains valuable information and enables re-encoding the index byte-for-byte exactly as intended.
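A minimal Go sketch of the decoding rule described above, not the code from this PR. It assumes the TREE entry layout documented for Git's index format (NUL-terminated path, ASCII entry count, a space, ASCII subtree count, a newline, then an object ID only when the entry count is non-negative) and 20-byte SHA-1 object IDs. The names treeEntry, readTreeEntry, and readTree are hypothetical.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// treeEntry is a hypothetical stand-in for the decoded TREE entry type.
type treeEntry struct {
	Path    string
	Entries int      // -1 marks an invalidated entry
	Trees   int      // number of subtrees
	Hash    [20]byte // only present on disk when Entries >= 0 (SHA-1 assumed)
}

// readTreeEntry decodes one entry and always consumes input through the
// newline, so the reader is positioned at the next entry even when the
// current one is invalidated.
func readTreeEntry(r *bufio.Reader) (treeEntry, error) {
	var e treeEntry

	path, err := r.ReadString(0)
	if err != nil {
		return e, err
	}
	e.Path = strings.TrimSuffix(path, "\x00")

	count, err := r.ReadString(' ')
	if err != nil {
		return e, err
	}
	if e.Entries, err = strconv.Atoi(strings.TrimSuffix(count, " ")); err != nil {
		return e, err
	}

	trees, err := r.ReadString('\n')
	if err != nil {
		return e, err
	}
	if e.Trees, err = strconv.Atoi(strings.TrimSuffix(trees, "\n")); err != nil {
		return e, err
	}

	// Invalidated entries carry no object ID; keep them instead of dropping them.
	if e.Entries < 0 {
		return e, nil
	}
	_, err = io.ReadFull(r, e.Hash[:])
	return e, err
}

// readTree decodes every entry in data, preserving invalidated ones.
func readTree(data []byte) ([]treeEntry, error) {
	r := bufio.NewReader(bytes.NewReader(data))
	var out []treeEntry
	for {
		if _, err := r.Peek(1); err == io.EOF {
			return out, nil
		}
		e, err := readTreeEntry(r)
		if err != nil {
			return nil, err
		}
		out = append(out, e)
	}
}

func main() {
	// An invalidated root entry followed by a valid "sub" entry with a zero hash.
	data := append([]byte("\x00-1 1\nsub\x002 0\n"), make([]byte, 20)...)
	entries, err := readTree(data)
	fmt.Println(len(entries), err)
}
```

Keeping the invalidated entry in the result is what allows an encoder to write the extension back out unchanged.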

@christian-roggia

Contributor Author

NOTE: The REUC decoder has been updated to decode stages in the intended order. Previously, it iterated over a map, which produced the stages in a nondeterministic order because Go does not guarantee map iteration order.
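A small Go sketch of the ordering fix, under the assumption that REUC stores the three stage modes in stage order 1 to 3 and uses mode 0 for a missing stage (as described in Git's index format documentation). The stage type, its constants, and modesInOrder are hypothetical names, not go-git's API.

```go
package main

import "fmt"

// stage is a hypothetical stand-in for the index merge stage type.
type stage int

const (
	ancestor stage = 1
	ours     stage = 2
	theirs   stage = 3
)

// orderedStages fixes the order in which stages are read and written.
var orderedStages = [...]stage{ancestor, ours, theirs}

// modesInOrder returns the modes for stages 1..3 in that order.
// Ranging over the fixed array instead of the map makes the result
// deterministic; a stage missing from the map yields mode 0.
func modesInOrder(modes map[stage]uint32) []uint32 {
	out := make([]uint32, 0, len(orderedStages))
	for _, s := range orderedStages {
		out = append(out, modes[s])
	}
	return out
}

func main() {
	modes := map[stage]uint32{theirs: 0o100755, ours: 0o100644}
	for _, m := range modesInOrder(modes) {
		fmt.Printf("%o\n", m)
	}
}
```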

Comment thread plumbing/format/index/encoder.go Outdated
christian-roggia changed the title from "plumbing: fully support TREE, REUC, LINK, UNTR, EOIE index extensions" to "[DRAFT] plumbing: fully support TREE, REUC, LINK, UNTR, EOIE, FSMN, IEOT index extensions" on Aug 27, 2025

pjbgf left a comment

Member

@christian-roggia thanks for looking into this. The changes are looking good, although I need to take a closer look around the extensions on a follow-up review.

Please add some tests around the ewah code and rebase the PR.

Comment thread plumbing/ewah/ewah_io.go
)

func ReadFrom(r io.Reader) (*Bitmap, error) {
	var bits uint32

Member

Missing a nil check for r.

Comment thread plumbing/ewah/bitmap.go
	RLWLargestLiteralCount = (1 << RLWLiteralBits) - 1
)

func GetRunBit(rlw uint64) bool {

Member

Suggested change
- func GetRunBit(rlw uint64) bool {
+ // RunBit returns whether the run bit in rlw is set.
+ func RunBit(rlw uint64) bool {

Comment thread plumbing/ewah/bitmap.go
	return rlw&1 != 0
}

func GetRunningLen(rlw uint64) uint64 {

Member

Suggested change
- func GetRunningLen(rlw uint64) uint64 {
+ // RunningLen extracts rlw's running length.
+ func RunningLen(rlw uint64) uint64 {

Comment thread plumbing/ewah/bitmap.go
	return uint64((rlw >> 1) & RLWLargestRunningCount)
}

func GetLiteralWords(rlw uint64) uint64 {

Member

Suggested change
- func GetLiteralWords(rlw uint64) uint64 {
+ // LiteralWords extracts the number of literal words in rlw.
+ func LiteralWords(rlw uint64) uint64 {
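For context, a self-contained Go sketch of how the three accessors above split a running-length word (RLW). The bit widths (1 run bit, 32 running-length bits, 31 literal-count bits) are an assumption taken from Git's C implementation; the lower-case names and packRLW are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// Assumed RLW layout, lowest bit first:
// 1 run bit | 32 bits running length | 31 bits literal word count.
const (
	rlwRunningBits         = 32
	rlwLiteralBits         = 64 - 1 - rlwRunningBits
	rlwLargestRunningCount = (1 << rlwRunningBits) - 1
	rlwLargestLiteralCount = (1 << rlwLiteralBits) - 1
)

// runBit reports whether the run bit in rlw is set.
func runBit(rlw uint64) bool { return rlw&1 != 0 }

// runningLen extracts rlw's running length.
func runningLen(rlw uint64) uint64 { return (rlw >> 1) & rlwLargestRunningCount }

// literalWords extracts the number of literal words in rlw.
func literalWords(rlw uint64) uint64 { return rlw >> (1 + rlwRunningBits) }

// packRLW is a hypothetical helper that builds an RLW from its fields,
// masking each field to its assumed width.
func packRLW(run bool, length, literals uint64) uint64 {
	var rlw uint64
	if run {
		rlw = 1
	}
	rlw |= (length & rlwLargestRunningCount) << 1
	rlw |= (literals & rlwLargestLiteralCount) << (1 + rlwRunningBits)
	return rlw
}

func main() {
	rlw := packRLW(true, 5, 3)
	fmt.Println(runBit(rlw), runningLen(rlw), literalWords(rlw))
}
```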

Comment thread plumbing/ewah/bitmap.go
return false
}

// ForEach calls fn() for each set bit.

Member

Suggested change
- // ForEach calls fn() for each set bit.
+ // ForEach calls fn() for each set bit.
+ // The returning bool from fn defines whether iteration should continue.
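To illustrate the callback contract the suggested comment describes, here is a sketch over a plain uncompressed bitset rather than the PR's Bitmap type. The bitset type and forEach are hypothetical; the only thing carried over is the convention that fn returns false to stop iteration.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitset is a hypothetical uncompressed bitmap: bit i of word w is
// position w*64 + i.
type bitset []uint64

// forEach calls fn for each set bit in ascending order.
// Iteration stops as soon as fn returns false.
func (b bitset) forEach(fn func(pos uint64) bool) {
	for i, w := range b {
		for w != 0 {
			tz := bits.TrailingZeros64(w)
			if !fn(uint64(i)*64 + uint64(tz)) {
				return
			}
			w &= w - 1 // clear the lowest set bit
		}
	}
}

func main() {
	b := bitset{0b1010, 1}
	b.forEach(func(pos uint64) bool {
		fmt.Println(pos)
		return pos < 3 // stop after reaching position 3
	})
}
```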

Comment thread plumbing/ewah/bitmap.go
}
}

func (b *Bitmap) NumBits() uint64 {

Member

What's the core difference between NumBits and Bits? Please document both funcs.

Comment thread plumbing/ewah/bitmap.go
	return uint64(rlw >> (1 + RLWRunningBits))
}

func (b *Bitmap) Get(pos uint64) bool {

Member

Wouldn't At be a better name here? I'm assuming this checks whether a bit is set at a given position. Is that right?

Suggested change
- func (b *Bitmap) Get(pos uint64) bool {
+ func (b *Bitmap) At(pos uint64) bool {

Please document this func.

Comment on lines +239 to +244
	if idx.ResolveUndo != nil {
		if err := e.encodeREUC(idx.ResolveUndo); err != nil {
			return err
		}
	}

Member

Is this not duplicated with L233-L237?

Suggested change
- 	if idx.ResolveUndo != nil {
- 		if err := e.encodeREUC(idx.ResolveUndo); err != nil {
- 			return err
- 		}
- 	}

pjbgf added the "no-autoclose" label (Issues/PRs to be ignored by stale bot) on Jan 22, 2026
christian-roggia marked this pull request as draft on May 11, 2026, 15:09

Labels

no-autoclose Issues/PRs to be ignored by stale bot


2 participants