@gruuya gruuya commented Aug 13, 2024

This removes one pathological scenario: when the changeset contains very small old pks and very large new pks (or vice versa), the combined range encompasses essentially all partition files, so we end up rewriting everything.
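To illustrate the scenario, here is a minimal sketch in plain Rust. It assumes each partition file carries inclusive min/max pk statistics, and that one way to avoid the blow-up is to prune against the old and new pk ranges separately. That approach is an assumption made for illustration, and `FileStats`, `prune_combined`, `prune_granular` and `sample_files` are hypothetical names, not code from this PR.

```rust
/// Hypothetical per-file statistics: the inclusive min/max primary key
/// stored in one partition file.
struct FileStats {
    name: &'static str,
    min_pk: i64,
    max_pk: i64,
}

/// True if the file's [min_pk, max_pk] interval overlaps the inclusive range.
fn overlaps(file: &FileStats, range: (i64, i64)) -> bool {
    file.min_pk <= range.1 && file.max_pk >= range.0
}

/// Coarse pruning: a single range spanning the old and new pks together.
fn prune_combined(files: &[FileStats], old: (i64, i64), new: (i64, i64)) -> Vec<&'static str> {
    let combined = (old.0.min(new.0), old.1.max(new.1));
    files
        .iter()
        .filter(|f| overlaps(f, combined))
        .map(|f| f.name)
        .collect()
}

/// Granular pruning (assumed approach): keep a file only if it overlaps
/// the old pk range or the new pk range.
fn prune_granular(files: &[FileStats], old: (i64, i64), new: (i64, i64)) -> Vec<&'static str> {
    files
        .iter()
        .filter(|f| overlaps(f, old) || overlaps(f, new))
        .map(|f| f.name)
        .collect()
}

/// Four made-up partition files covering pks 0 through 1000.
fn sample_files() -> Vec<FileStats> {
    vec![
        FileStats { name: "part-0", min_pk: 0, max_pk: 99 },
        FileStats { name: "part-1", min_pk: 100, max_pk: 499 },
        FileStats { name: "part-2", min_pk: 500, max_pk: 899 },
        FileStats { name: "part-3", min_pk: 900, max_pk: 1000 },
    ]
}
```

With old pks in `(1, 5)` and new pks in `(995, 1000)`, `prune_combined` selects all four sample files, while `prune_granular` selects only `part-0` and `part-3`.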

In addition, it fixes a bug whereby the sync base scan does both partition and row pruning, but the former is inclusive (which we want) while the latter is exclusive (which we don't). The fix is to use a new API for creating a Delta TableProvider based on a particular set of files, which can then be converted to a TableSource to be used in logical planning.
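The difference between the two kinds of pruning can be sketched as follows. This is a toy model in plain Rust that does not call delta-rs or DataFusion. It rests on one reading of the bug: a file selected for the sync must be scanned in full, so a row-level filter would drop rows that still belong in it. All names (`PartitionFile`, `prune_files`, `scan_with_row_pruning`, `scan_files`, `sample_files`) are hypothetical.

```rust
/// Hypothetical partition file holding the primary keys of its rows.
struct PartitionFile {
    name: &'static str,
    pks: Vec<i64>,
}

/// Partition pruning: names of the files that hold at least one pk in the
/// inclusive range [lo, hi]. A selected file is kept as a whole.
fn prune_files(files: &[PartitionFile], lo: i64, hi: i64) -> Vec<&'static str> {
    files
        .iter()
        .filter(|f| f.pks.iter().any(|pk| (lo..=hi).contains(pk)))
        .map(|f| f.name)
        .collect()
}

/// Scan that additionally prunes rows: only pks inside [lo, hi] survive,
/// so the other rows of a selected file are silently excluded.
fn scan_with_row_pruning(files: &[PartitionFile], lo: i64, hi: i64) -> Vec<i64> {
    files
        .iter()
        .flat_map(|f| f.pks.iter().copied())
        .filter(|pk| (lo..=hi).contains(pk))
        .collect()
}

/// Scan over an explicit set of files with no row filter, mirroring the
/// idea of building the scan from a particular set of files.
fn scan_files(files: &[PartitionFile], names: &[&str]) -> Vec<i64> {
    files
        .iter()
        .filter(|f| names.contains(&f.name))
        .flat_map(|f| f.pks.iter().copied())
        .collect()
}

/// Two made-up partition files.
fn sample_files() -> Vec<PartitionFile> {
    vec![
        PartitionFile { name: "part-0", pks: vec![1, 2, 50] },
        PartitionFile { name: "part-1", pks: vec![100, 250] },
    ]
}
```

For the pk range `1..=2`, `prune_files` selects only `part-0`. `scan_with_row_pruning` then yields pks 1 and 2, whereas `scan_files` over the selected file yields 1, 2 and 50; pk 50 is the row the exclusive filter would leave out.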

@gruuya gruuya requested a review from mildbyte August 14, 2024 09:02
Also revise the deterministic UUID generator, since the previous one cycled after 16 values.
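The commit does not show either generator, so the following is a purely hypothetical sketch of how a deterministic generator might cycle after 16 values, next to a counter-based alternative that does not. Both functions produce UUID-shaped strings rather than RFC-compliant UUIDs, and `cycling_uuid`, `counter_uuid` and `format_uuid` are made-up names.

```rust
/// Split 32 hex characters into the usual 8-4-4-4-12 UUID layout.
fn format_uuid(hex: &str) -> String {
    format!(
        "{}-{}-{}-{}-{}",
        &hex[0..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..32]
    )
}

/// Hypothetical flawed generator: every hex digit is derived from
/// `counter % 16`, so the output repeats after 16 calls.
fn cycling_uuid(counter: u128) -> String {
    let digit = char::from_digit((counter % 16) as u32, 16).unwrap();
    let hex: String = std::iter::repeat(digit).take(32).collect();
    format_uuid(&hex)
}

/// Hypothetical alternative: encode the whole counter as 32 hex digits,
/// so distinct counters always give distinct strings.
fn counter_uuid(counter: u128) -> String {
    format_uuid(&format!("{:032x}", counter))
}
```

Here `cycling_uuid(0)` and `cycling_uuid(16)` are identical, while `counter_uuid(0)` and `counter_uuid(16)` differ.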
@gruuya gruuya merged commit b6a5926 into main Aug 14, 2024
@gruuya gruuya deleted the granular-sync-scan branch August 14, 2024 13:54
3 participants