Unexpected uncompressed page size: read stage: MainData: column: bool - with parquet native reader v3 #87396
Closed
Labels
comp-formats: Input/output formats (CSV/JSON/Parquet/ORC/Arrow/Protobuf/etc.).
potential bug: To be reviewed by developers and confirmed/rejected.
Description
Company or project name
No response
Describe what's wrong
I get the following exception when using native reader v3:
Error on processing query: Code: 117. DB::Exception: Unexpected uncompressed page size: read stage: MainData: column: bool: (in file/uri /var/lib/clickhouse/user_files/file_6bb64ca7_97a4_11f0_88de_498af214c503.Parquet): While executing ParquetV3BlockInputFormat: While executing File: data for INSERT was parsed from file. (INCORRECT_DATA) (version 25.8.4.13 (official build))
I wasn't able to reduce this to a smaller reproduction, but the sequence is as follows:
- Create a table with a schema containing many columns of different data types
- Insert the data from that table into a FUNCTION file() to generate a Parquet file
- Insert the data from the generated Parquet file back into the table
Here is the gist with all the steps as the queries are too big to fit here: https://gist.github.com/Selfeer/1acfea443d1155b7fc834025b2c33eda
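For orientation, the sequence above can be sketched roughly as follows. This is only an illustration of the round trip, not the actual reproduction: the table name, file name, and three-column schema are hypothetical placeholders (the real schema in the gist is much wider), and reading the file back through the file() table function is an assumption about how the re-import was done. Populating the table with data is omitted here.

```python
# Hypothetical sketch of the round trip described in the steps above.
# The table name, file name, and schema are placeholders; the real
# reproduction (see the gist) uses a much wider schema.

def build_round_trip_queries(table: str, parquet_file: str) -> list[str]:
    """Return the three statements of the round trip, in order."""
    # Step 1: create a table (tiny placeholder schema; includes a column
    # named `bool`, matching the column named in the error message).
    create = (
        f"CREATE TABLE {table} (id UInt64, bool Bool, str String) "
        "ENGINE = MergeTree ORDER BY id"
    )
    # Step 2: export the table contents to a Parquet file via file().
    export = (
        f"INSERT INTO FUNCTION file('{parquet_file}', 'Parquet') "
        f"SELECT * FROM {table}"
    )
    # Step 3: read the Parquet file back into the table with the
    # native reader v3 enabled (assumed to be done via file()).
    reimport = (
        f"INSERT INTO {table} SELECT * FROM file('{parquet_file}', 'Parquet') "
        "SETTINGS input_format_parquet_use_native_reader_v3 = 1"
    )
    return [create, export, reimport]


if __name__ == "__main__":
    for query in build_round_trip_queries("repro_table", "repro.Parquet"):
        print(query + ";")
```

The printed statements can be run one by one in clickhouse-client after filling the table with data; a placeholder schema this small is not expected to trigger the exception by itself.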
Does it reproduce on the most recent release?
Yes
How to reproduce
ClickHouse version: 25.8.4.13
Settings: input_format_parquet_use_native_reader_v3 = 1
Expected behavior
No response
Error message and/or stacktrace
No response
Additional context
No response