
Optimistic writes: flush the last row group in all scenarios#14759

Merged
Mytherin merged 2 commits into duckdb:main from Mytherin:rowgroupflush
Nov 8, 2024

@Mytherin Mytherin commented Nov 8, 2024

When optimistically writing data to disk, there were a few scenarios in which we would not optimistically write row groups:

  • For batch insert, when the batches were approximately as large as our internal row group size, we would not always flush them, as the CollectionMerger would have a collection with a single ColumnDataCollection in it.
  • For regular insert, we would not flush the last row group in Combine.

For regular insertions, this would not have a large impact, as most data would still be written optimistically. For the optimistic WAL write added in #13372, however, we need all row groups written in sequence to be optimistically written out. Because not all row groups were flushed, large WAL files would still be created.
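The effect of skipping the last row group can be illustrated with a small toy model. This is only a sketch under stated assumptions, not DuckDB's implementation: `ToyOptimisticWriter`, `ROW_GROUP_SIZE`, `flush_last_on_combine` and `run` are hypothetical names, the row group size is tiny for readability, and "rows left for the WAL" is a simplification of what happens to data that was not written optimistically.

```python
from dataclasses import dataclass, field

ROW_GROUP_SIZE = 4  # toy value chosen for illustration, not DuckDB's real size


@dataclass
class ToyOptimisticWriter:
    """Toy model of a local insert buffer (hypothetical, not a DuckDB class)."""

    flush_last_on_combine: bool
    pending: list = field(default_factory=list)  # rows of the unfinished row group
    written: list = field(default_factory=list)  # row groups already written out

    def append(self, row):
        # Buffer the row; write the row group out as soon as it is full.
        self.pending.append(row)
        if len(self.pending) == ROW_GROUP_SIZE:
            self._flush()

    def _flush(self):
        # Move the buffered rows into the list of written row groups.
        if self.pending:
            self.written.append(self.pending)
            self.pending = []

    def combine(self):
        """Finish the insert and return the rows that were never flushed."""
        if self.flush_last_on_combine:
            self._flush()  # the behavior this PR describes: flush the last row group too
        leftover, self.pending = self.pending, []
        return leftover


def run(flush_last_on_combine, n_rows=10):
    # Insert n_rows and report (row groups written, rows left for the WAL).
    writer = ToyOptimisticWriter(flush_last_on_combine)
    for i in range(n_rows):
        writer.append(i)
    wal_rows = writer.combine()
    return len(writer.written), len(wal_rows)


if __name__ == "__main__":
    print("without final flush:", run(False))
    print("with final flush:   ", run(True))
```

In this model, inserting 10 rows without the final flush leaves the trailing partial row group unwritten, while flushing in `combine` writes every row group out in sequence.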

@Mytherin Mytherin merged commit fd5de06 into duckdb:main Nov 8, 2024
@Mytherin Mytherin deleted the rowgroupflush branch December 8, 2024 06:51
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Dec 21, 2024
Optimistic writes: flush the last row group in all scenarios (duckdb/duckdb#14759)
github-actions bot added a commit to duckdb/duckdb-r that referenced this pull request Dec 21, 2024
Optimistic writes: flush the last row group in all scenarios (duckdb/duckdb#14759)

Co-authored-by: krlmlr <krlmlr@users.noreply.github.com>
