Different decimal precision value when reading with parquet native reader v3 #87413
Closed
Labels
bug: Confirmed user-visible misbehaviour in official release
comp-formats: Input/output formats (CSV/JSON/Parquet/ORC/Arrow/Protobuf/etc.).
Description
Company or project name
No response
Describe what's wrong
When reading a Parquet file with a Decimal column, the reported precision differs depending on whether the v3 native reader is enabled:
input_format_parquet_use_native_reader_v3 = 0
DESCRIBE TABLE file('int32_decimal_1.parquet')
FORMAT TabSeparated
SETTINGS input_format_parquet_use_native_reader_v3 = 0
value Nullable(Decimal(4, 2))
input_format_parquet_use_native_reader_v3 = 1
DESCRIBE TABLE file('int32_decimal_1.parquet')
FORMAT TabSeparated
SETTINGS input_format_parquet_use_native_reader_v3 = 1
value Nullable(Decimal(9, 2))
Does it reproduce on the most recent release?
Yes
How to reproduce
ClickHouse Version: 25.8.4.13
Settings: input_format_parquet_use_native_reader_v3=1
Here is the Python code that generated the Parquet file:
from decimal import Decimal, ROUND_HALF_EVEN
import pyarrow as pa
import pyarrow.parquet as pq
import os
OUT_PATH = "test_decimal/int32_decimal.parquet"
os.makedirs(os.path.dirname(OUT_PATH), exist_ok=True)
scale = Decimal("0.01")
values = [Decimal(i).quantize(scale, rounding=ROUND_HALF_EVEN) for i in range(1, 25)]
dtype = pa.decimal128(4, 2)
arr = pa.array(values, type=dtype)
table = pa.Table.from_arrays(
    [arr], schema=pa.schema([pa.field("value", dtype, nullable=True)])
)
pq.write_table(table, OUT_PATH, version="1.0", compression=None, use_dictionary=False)
Expected behavior
No response
Error message and/or stacktrace
No response
Additional context
No response