Skip to content

fix(table): fix refcount leak in enrichRecordsWithPosDeleteFields#762

Merged
zeroshade merged 5 commits intoapache:mainfrom
laskoviymishka:fix/enrich-records-arrow-refcount-leak
Mar 4, 2026
Merged

fix(table): fix refcount leak in enrichRecordsWithPosDeleteFields#762
zeroshade merged 5 commits intoapache:mainfrom
laskoviymishka:fix/enrich-records-arrow-refcount-leak

Conversation

@laskoviymishka
Copy link
Contributor

Arrays returned by NewArray() have refcount=1. NewRecordBatch calls Retain() on each column, bumping to refcount=2. Without an explicit Release() on the temporary arrays, the count never drops back to 1 when the record batch is released by the caller.

Fix by assigning NewArray() results to local variables and deferring their Release(), so the lifecycle is: NewArray() -> refcount 1, NewRecordBatch Retain() -> refcount 2, deferred Release() -> refcount 1 (owned by outData), caller releases outData -> refcount 0 -> freed.

Also extend TestEnrichRecordsWithPosDeleteFields to use memory.NewCheckedAllocator with mem.AssertSize(t, 0) to catch this class of leak going forward.

Fixes leak introduced in #721.

…teFields

Arrays returned by NewArray() have refcount=1. NewRecordBatch calls
Retain() on each column, bumping to refcount=2. Without an explicit
Release() on the temporary arrays, the count never drops back to 1
when the record batch is released by the caller.

Fix by assigning NewArray() results to local variables and deferring
their Release(), so the lifecycle is: NewArray() -> refcount 1,
NewRecordBatch Retain() -> refcount 2, deferred Release() -> refcount 1
(owned by outData), caller releases outData -> refcount 0 -> freed.

Also extend TestEnrichRecordsWithPosDeleteFields to use
memory.NewCheckedAllocator with mem.AssertSize(t, 0) to catch this
class of leak going forward.

Fixes leak introduced in apache#721.
@laskoviymishka laskoviymishka marked this pull request as ready for review March 2, 2026 22:02
@laskoviymishka
Copy link
Contributor Author

@zeroshade i see some flackiness in one integration test, i will spend some cycle to try to track this, but this PR seems reviewable now.

@laskoviymishka laskoviymishka requested a review from zeroshade March 3, 2026 12:21
Copy link
Member

@zeroshade zeroshade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a couple small nitpicks, but otherwise looks good!

@laskoviymishka laskoviymishka requested a review from zeroshade March 4, 2026 16:24
Comment on lines +200 to +209
panic("position delete schema should have required field 'pos'")
}

newColsSchema := arrow.NewSchema([]arrow.Field{filePathField[0], posField[0]}, nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we control the PositionalDeleteArrowSchema as a global, do we really need these checks?

Can't we just use PositionalDeleteArrowSchema as-is without needing to create an entirely new schema object here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense, fixed.

Comment on lines +231 to +238
filePathBuilder, ok := rb.Field(0).(*array.StringBuilder)
if !ok {
return nil, errors.New("file path field is not a string")
}
posBuilder, ok := rb.Field(1).(*array.Int64Builder)
if !ok {
return nil, errors.New("pos field is not a int64")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we control the schema, and are making the builder, we don't need the type checks here, we should be able to just assume they are correct (unit tests can validate that the output schema is correct) and just do filePathBuilder, posBuilder := rb.Field(0).(*array.StringBuilder), rb.Field(1).(*array.Int64Builder)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm a bit paranoic here :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, a bit too paranoic.

@zeroshade zeroshade merged commit 52dbce5 into apache:main Mar 4, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants