DataType::Decimal Non-Compliant? #1779

@tustvold

Describe the bug

DataType::Decimal is defined as

/// Exact decimal value with precision and scale
///
/// * precision is the total number of digits
/// * scale is the number of digits past the decimal
///
/// For example the number 123.45 has precision 5 and scale 2.
Decimal(usize, usize),

This appears to be at odds with both the C++ and Python implementations (I can't actually find the specification).

These define it as

Arrow decimals are fixed-point decimal numbers encoded as a scaled integer. The precision is the number of significant digits that the decimal type can represent; the scale is the number of digits after the decimal point (note the scale can be negative).

i.e. unscaledValue * 10^(-scale)

In particular, with the current Rust definition it is unclear how to represent numbers with more than 38 digits, where the extra digits are only leading or trailing zeros.
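To make the scaled-integer reading concrete, here is a minimal sketch of how `unscaledValue * 10^(-scale)` could be rendered when the scale is allowed to be negative. The helper name `format_decimal` and the choice of `i128` for the unscaled value and `i32` for the scale are assumptions made for illustration; this is not the arrow crate's API.

```rust
/// Hypothetical helper (not part of the arrow crate): renders
/// `unscaled * 10^(-scale)` as a plain decimal string.
fn format_decimal(unscaled: i128, scale: i32) -> String {
    let negative = unscaled < 0;
    // Digits of the absolute value, without a sign.
    let digits = unscaled.unsigned_abs().to_string();

    let body = if scale <= 0 {
        // Zero or negative scale: the value is an integer with
        // |scale| trailing zeros appended to the unscaled digits.
        if unscaled == 0 {
            "0".to_string()
        } else {
            format!("{}{}", digits, "0".repeat(scale.unsigned_abs() as usize))
        }
    } else {
        let scale = scale as usize;
        // Left-pad with zeros so at least one digit precedes the point.
        let padded = format!("{:0>width$}", digits, width = scale + 1);
        let (int_part, frac_part) = padded.split_at(padded.len() - scale);
        format!("{}.{}", int_part, frac_part)
    };

    if negative {
        format!("-{}", body)
    } else {
        body
    }
}
```

Under this reading, `format_decimal(12345, 2)` gives `123.45` and `format_decimal(12345, -3)` gives `12345000`. A value such as `format_decimal(1, -40)` has 41 digits even though its unscaled value has only one significant digit, which is the case the "total number of digits" wording does not obviously cover.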

To Reproduce

Inspect code

Expected behavior

The Rust implementation should conform to the other Arrow implementations.

Additional context

Noticed whilst reviewing apache/datafusion#2680

The parquet logical type is similarly defined - https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal

Labels: arrow (Changes to the arrow crate), question (Further information is requested)
