Skip to content

fix(table): ensure partition path is deterministic#767

Open
ferhatelmas wants to merge 1 commit intoapache:mainfrom
ferhatelmas:partition-path
Open

fix(table): ensure partition path is deterministic#767
ferhatelmas wants to merge 1 commit intoapache:mainfrom
ferhatelmas:partition-path

Conversation

@ferhatelmas
Copy link
Contributor

@ferhatelmas ferhatelmas commented Mar 2, 2026

partition record order is expected to match partition spec
but maps.Values can change it and cause cross-partition writer reuse,
delete expected files, etc.

related to #721

@alexandre-normand
Copy link
Contributor

Ah, oops. Thanks for catching this and making the fix 🙇 .

Comment on lines +139 to +147
i := 0
for field := range spec.Fields() {
val, ok := partitionContext.partitionData[field.FieldID]
if !ok {
return "", fmt.Errorf("unexpected missing partition value for field id %d in spec id %d", field.FieldID, partitionContext.specID)
}
data[i] = val
i++
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we change the Fields() method of partition specs to follow the more common pattern of returning an iterator of itr.Seq2[int, PartitionField] so that this can be for i, field := range spec.Fields() ?

It would just be changing Fields() to use slices.All instead of slices.Values. Thoughts? It would let us avoid this separately defined i var

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would cause some churn and would prevent AppendSeq usage.

If we change, I think we should change SortOrder.Fields() as well and do it separately and then use here since it's a bigger scope. Happy to do if agreed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we utilize AppendSeq much at all? It still might be better making the change to simplify scenarios like this (and users get a more familiar interaction with range that returns index and field) than to optimize for the AppendSeq scenario.

If we change, I think we should change SortOrder.Fields() as well and do it separately and then use here since it's a bigger scope. Happy to do if agreed

I say go for it unless either of you can think of a reason not to do so. We can hold off on merging this until you make the change and then we can update this. Thanks for volunteering!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#772

I will update this when it's merged

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

ferhatelmas added a commit to ferhatelmas/apache-iceberg-go that referenced this pull request Mar 5, 2026
related to apache#767

Signed-off-by: ferhat elmas <elmas.ferhat@gmail.com>
zeroshade pushed a commit that referenced this pull request Mar 5, 2026
)

related to #767

Signed-off-by: ferhat elmas <elmas.ferhat@gmail.com>
partition record order is expected to match partition spec
but map.Values can change it and cause cross-partition
writer reuse, delete expected files, etc.

Signed-off-by: ferhat elmas <elmas.ferhat@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants