This is an automated comment for commit 0e0716f with a description of existing statuses. It's updated for the latest CI running ✅
|
src/Core/FormatFactorySettings.h
| * 41 bits - 0.001% | ||
| )", 0) \ | ||
| DECLARE(UInt64, output_format_parquet_bloom_filter_flush_threshold_bytes, 128 * 1024 * 1024, R"( | ||
| Where in the parquet file to place the bloom filters. If 0, each row group's bloom filters are written immediately after the row group. If very big, bloom filters for all row groups are written together near the end of the file. If something in the middle, bloom filters are written in groups of about this size. |
"very big" and "something in the middle" are really not clear. I would prefer:
| Where in the parquet file to place the bloom filters. If 0, each row group's bloom filters are written immediately after the row group. If very big, bloom filters for all row groups are written together near the end of the file. If something in the middle, bloom filters are written in groups of about this size. | |
| Where in the parquet file to place the bloom filters. If 0, each row group's bloom filters are written immediately after the row group. Otherwise, bloom filters are written in groups of about this size. |
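To make the "groups of about this size" wording concrete, here is a minimal sketch of one way such a threshold could behave. It is an assumption-laden model, not the code from this PR: the function name planBloomFilterFlushes and the "flush once the buffered bytes reach the threshold, flush the rest at the end of the file" rule are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of the flush threshold (not the actual ClickHouse implementation).
// Input: serialized bloom filter size of each row group, in bytes.
// Output: for each flush, how many row groups' filters are written together.
std::vector<size_t> planBloomFilterFlushes(
    const std::vector<uint64_t> & filter_bytes_per_row_group,
    uint64_t flush_threshold_bytes)
{
    std::vector<size_t> flushes;
    uint64_t buffered_bytes = 0;
    size_t buffered_groups = 0;

    for (uint64_t bytes : filter_bytes_per_row_group)
    {
        // Buffer this row group's filters instead of writing them right away.
        buffered_bytes += bytes;
        ++buffered_groups;

        // With threshold 0 this condition always holds, so every row group
        // is followed immediately by its own filters.
        if (buffered_bytes >= flush_threshold_bytes)
        {
            flushes.push_back(buffered_groups);
            buffered_bytes = 0;
            buffered_groups = 0;
        }
    }

    // Whatever is still buffered is written near the end of the file.
    if (buffered_groups > 0)
        flushes.push_back(buffered_groups);

    return flushes;
}
```

Under this model, four row groups with 10 bytes of filters each give one flush per row group at threshold 0, a group of three plus a group of one at threshold 25, and a single flush of all four at a very large threshold.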
|
https://s3.amazonaws.com/clickhouse-test-reports/71681/d541dbf40da39bd0686d675fc531dc7630c54a38/stateless_tests__release_.html seems to say that parquet encoding thread got stuck forever, but there's no stack trace (#72819), and I couldn't reproduce it after running the test ~1000 times using the binary from CI :( |
|
The test failure looks unrelated: #72817 |
|
Dear @antonio2368, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself. |
Changelog category (leave one):
A pretty straightforward feature. Controlled by setting
output_format_parquet_write_bloom_filter. Adds ~2 bytes per row+column (can't be compressed). E.g. +10% file size on test.hits. Enabled by default.
Also refactored write_page_index; it had a couple of bugs.
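A rough back-of-the-envelope sketch of the size overhead, taking the ~2 bytes per row+column figure from the description at face value. The helper names estimateBloomFilterBytes and overheadFraction are hypothetical, and the real overhead depends on the data and on the bloom filter settings.

```cpp
#include <cstdint>

// Assumed cost from the PR description: about 2 bytes per (row, column) pair.
constexpr uint64_t APPROX_BYTES_PER_VALUE = 2;

// Approximate extra bytes added by bloom filters for a table of the given shape.
uint64_t estimateBloomFilterBytes(uint64_t rows, uint64_t columns)
{
    return APPROX_BYTES_PER_VALUE * rows * columns;
}

// Overhead relative to the file size without bloom filters.
double overheadFraction(uint64_t bloom_filter_bytes, uint64_t base_file_bytes)
{
    return static_cast<double>(bloom_filter_bytes) / static_cast<double>(base_file_bytes);
}
```

For example, 1,000,000 rows with 10 columns would add roughly 20,000,000 bytes. If the file were 200,000,000 bytes without the filters, that would be about a 10% increase.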