Skip to content

Conversation

@theirix
Copy link
Contributor

@theirix theirix commented Oct 12, 2025

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

  • Additional unit tests
  • Existing SLT tests

Are there any user-facing changes?

No

@github-actions github-actions bot added the functions Changes to functions implementation label Oct 12, 2025
@theirix theirix changed the title feat: support decimal for math functions: pow feat: support decimal for math functions: power Oct 12, 2025
@AdamGS
Copy link
Contributor

AdamGS commented Oct 13, 2025

supporting decimal32 and 64 would also be nice

@theirix
Copy link
Contributor Author

theirix commented Oct 13, 2025

@AdamGS yes, it's a good idea, I saw an implementation in abs. I'll work on it

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Oct 13, 2025
@theirix theirix marked this pull request as ready for review October 20, 2025 21:44
@theirix
Copy link
Contributor Author

theirix commented Oct 20, 2025

Now it supports all four decimal types. The shared implementation in utils.rs is also updated to support for all math functions.

Unfortunately, it was not possible to make a generic implementation for DecimalType. I thought about splitting implementations for ArrowNumericType and DecimalType to include post-processing only for one of them, but it didn't work properly with types. Made an implementation with macro.

Also, I discovered that a default-constructed decimal type always has scale 10, which is not what we want for operations on ArrowPrimitiveType::Native. Added descaling to zero, leaving underlying data as-is.

Finally, ScalarUDFImpl::signature doesn't work great with parametrized types, so it's unclear how to specify decimals there. It is not only informational, but also used to coerce types, I suppose.

Copy link
Contributor

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a few more test cases (preferably in SLT) like so:

  • Test some of the overflow/compute errors
  • Decimal bases that aren't integer numbers like 2.2

Decimal256(DECIMAL256_MAX_PRECISION, 0),
Float64,
]),
Numeric(2), // Catch-all for all decimals
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this signature is a bit hard with decimals; it might be better off doing this as user_defined type and use coerce_types until we can get better support for arbitrary decimals in the type signature 🤔

Especially this last Numeric(2) which I think allows having a decimal type as the exponent which I don't think the implementation supports?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this link, it's helpful. Let me experiment with per-udf coercion rules and how to make them work in a generic way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Jefffrey thank you for telling me about that coercion feature, it is very helpful. I implemented an even more compact version using helpers. Also, handling of null-to-int coercion became much more simple.

Comment on lines 376 to 377
DataType::Float64 => rescale_decimal(
calculate_binary_math::<
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern of rescaling decimals I assume is because the output type of calculate_binary_math can lose the origin precision & scale? We should encompass this in another utility function since it's repeated multiple times here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. calculate_binary_math don't lose precision and scale, it just doesn't know about it. I tried to integrate rescaling to calculate_binary_math itself, but it became difficult to have rescaling enabled for decimal type and disabled for non-decimal types with the lack of specialisation in Rust.

Probably, it's worth too introduce a thin wrapper around it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduced a shared wrapper to utils, so it can be reused when moving other functions. Code looks nicer now

&self.aliases
}

fn coerce_types(&self, arg_types: &[DataType]) -> Result<Vec<DataType>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something to be concerned about here is does it allow having a base of int64 but exponent of float64 now? I think this was prevented before as it would either be both int64s or both float64s, but now it seems this coercion is done independently. I wonder if we have any test cases to catch this however 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good question. In the original code it was prevented at a signature level, so the evaluation operated on allowed types only. However, when I tried to ban this int-to-float with new coercion rules, the scalar.slt test failed, assuming we can do it (second expression):

SELECT power(i32, exp_i) as power_i32,
       power(i64, exp_f) as power_i64,
       pow(f32, exp_i) as power_f32,
       power(f64, exp_f) as power_f64,
       pow(2, 3) as power_int_scalar,
       power(2.5, 3.0) as power_float_scalar
FROM (select test.*, 3 as exp_i, 3.0 as exp_f from test) a

I am unsure if it worked before then.

With decimals, we allow float power only if it is convertible to int in runtime (a focus of all these pow_decimal_float checks). Maybe we can apply the same logic to the int-to-float.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mystery solved, it was always executing as float-to-float operator

Copy link
Contributor

@Jefffrey Jefffrey Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully #18519 can provide a nicer API for specifying the allowed arguments; we can look into refactoring in future 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the improvement! I'll try to reformulate possible combinations in a follow-up PR.

Also, I've added a null handling after merging your changes from #18525 , could you please recheck?

Comment on lines 87 to 92
/// ```text
/// (b*s) ^ e
/// ```
/// However, the result should be scaled back from scale 0 to scale `s`,
/// which is done by multiplying by `10^s`.
/// At the end, the formula is:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm finding this explanation a little confusing, specifically around (b*s) ^ e which implies base is multiplied by scale directly instead of via a power of 10

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a mistake, updated formula. Please tell me if it can be improved

@Jefffrey
Copy link
Contributor

Jefffrey commented Nov 7, 2025

Looks like some CI failures to address

Copy link
Contributor

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do refactors in followups

@Jefffrey Jefffrey added this pull request to the merge queue Nov 13, 2025
Merged via the queue into apache:main with commit d8c63ee Nov 13, 2025
28 checks passed
adriangb added a commit to pydantic/datafusion that referenced this pull request Dec 12, 2025
When calling pow(10, -2.0), the power function would fail with:
"Arrow error: Arithmetic overflow: Exponent -2 in integer computation is out of bounds."

This was a regression introduced in apache#18032 which added decimal support to power().
The issue was that the custom type coercion kept integer bases as Int64 while
float exponents became Float64, but the invoke_with_args() assumed Int64 exponent
for Int64 base, which doesn't support negative exponents.

The fix modifies coerce_types() to coerce integer bases to Float64 when the
exponent is Float64, since floating-point arithmetic is required for
negative/fractional exponents.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
adriangb added a commit to pydantic/datafusion that referenced this pull request Dec 13, 2025
When calling pow(10, -2.0), the power function would fail with:
"Arrow error: Arithmetic overflow: Exponent -2 in integer computation is out of bounds."

This was a regression introduced in apache#18032 which added decimal support to power().
The issue was that the custom type coercion kept integer bases as Int64 while
float exponents became Float64, but the invoke_with_args() assumed Int64 exponent
for Int64 base, which doesn't support negative exponents.

The fix modifies coerce_types() to coerce integer bases to Float64 when the
exponent is Float64, since floating-point arithmetic is required for
negative/fractional exponents.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
github-merge-queue bot pushed a commit that referenced this pull request Dec 15, 2025
#19303)

## Which issue does this PR close?

Closes #19314

Fixes a regression introduced in #18032 where `pow(10, -2.0)` fails
with:
```
Arrow error: Arithmetic overflow: Exponent -2 in integer computation is out of bounds.
```

## Rationale for this change

After #18032 added decimal support to the `power()` function, the custom
type coercion kept integer bases as `Int64` while float exponents became
`Float64`. However, `invoke_with_args()` assumed `Int64` exponent for
`Int64` base and used integer power computation, which doesn't support
negative exponents.

## What changes are included in this PR?

Modified `coerce_types()` in `datafusion/functions/src/math/power.rs` to
coerce integer bases to `Float64` when the exponent is `Float64`, since
floating-point arithmetic is required for negative/fractional exponents.

This matches the original behavior before #18032 where mixed types would
both be coerced to `Float64`.

## Are these changes tested?

Yes, added comprehensive tests:
- `test_power_coerce_types` - verifies type coercion behavior
- `test_power_int_base_negative_float_exp` - tests the specific
regression case
- `test_power_edge_cases_f64` - tests edge cases for float power (zero
exponent, one base, negative base, NaN)
- `test_power_edge_cases_i64` - tests edge cases for integer power
- `test_power_i64_negative_exp_fails` - verifies integer power with
negative exponent fails
- `test_power_zero_base_negative_exp` - tests infinity result

## Are there any user-facing changes?

No breaking changes. This restores the expected behavior of `pow()` with
integer base and float exponent.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

functions Changes to functions implementation sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: support decimal for math functions: pow

3 participants