ARROW-3329: [C++] Added casts Decimal128 to Decimal128 and Int64 #6427

JacekPliszka · 2020-02-14T18:51:50Z

Minimal implementation of casts:

from Decimal128 to Decimal128: the value is first rescaled to the output scale, then assigned
from Decimal128 to Int64: the fractional part is first truncated, then the value is cast to Int64
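
The two steps described above can be illustrated with a minimal sketch. It is not the code from this pull request: it uses a plain `int64_t` as the unscaled value instead of Decimal128, assumes non-negative scales, and the helper names (`PowerOfTen`, `Rescale`, `TruncateToInteger`) are hypothetical. Refusing a rescale that would drop non-zero digits is also an assumption made for the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: 10^exp for exponents that fit in int64_t.
inline int64_t PowerOfTen(int exp) {
  assert(exp >= 0 && exp <= 18);
  int64_t result = 1;
  for (int i = 0; i < exp; ++i) result *= 10;
  return result;
}

// Rescale an unscaled decimal value from in_scale to out_scale.
// Returns false if scaling up would overflow int64_t or if scaling
// down would drop non-zero digits (an assumption of this sketch).
inline bool Rescale(int64_t unscaled, int in_scale, int out_scale, int64_t* out) {
  assert(out != nullptr);
  if (out_scale >= in_scale) {
    const int64_t factor = PowerOfTen(out_scale - in_scale);
    if (unscaled > std::numeric_limits<int64_t>::max() / factor ||
        unscaled < std::numeric_limits<int64_t>::min() / factor) {
      return false;  // multiplication would overflow
    }
    *out = unscaled * factor;
    return true;
  }
  const int64_t factor = PowerOfTen(in_scale - out_scale);
  if (unscaled % factor != 0) return false;  // digits would be lost
  *out = unscaled / factor;
  return true;
}

// Drop the fractional part. Integer division in C++ truncates toward zero,
// so -123.45 becomes -123 rather than -124.
inline int64_t TruncateToInteger(int64_t unscaled, int scale) {
  return unscaled / PowerOfTen(scale);
}
```

For example, the unscaled value 12345 at scale 2 represents 123.45; rescaling it to scale 4 gives 1234500, and truncating it to an integer gives 123.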

github-actions · 2020-02-14T19:02:39Z

https://issues.apache.org/jira/browse/ARROW-3329

kou · 2020-02-15T11:11:31Z

Thanks for your first contribution.

Could you update the pull request description to describe this change?
We use the pull request description as commit message.

JacekPliszka · 2020-02-15T12:10:00Z

> Thanks for your first contribution.
>
> Could you update the pull request description to describe this change?
> We use the pull request description as commit message.

Could you tell me how to do it? I've pasted the description into the first comment.

But I cannot see any way to add a description to the PR itself. Normally I work with GitLab, where there is a button for it; I cannot find it here.

kou · 2020-02-17T05:21:35Z

You did update the description ("the first comment" is the description of this pull request). It's enough.
Thanks.

Someone will review this. Please wait for a while.

pitrou

Hi, and thanks for doing this. Here are some comments.

pitrou · 2020-02-17T14:39:55Z

cpp/src/arrow/compute/kernels/cast.cc

We shouldn't log errors but rather save them on the context. This is how it's done in another kernel:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/cast.cc#L295-L298

Corrected - hopefully correctly.
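
As an illustration of the pattern requested in this review comment, here is a minimal sketch of a kernel loop that records a failure on a context object instead of logging it. The `KernelContext` class and its `SetError` / `HasError` methods are hypothetical stand-ins written for this example, not Arrow's actual context API; `ScaleDownAll` is likewise a made-up function.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a kernel execution context that stores
// the first error instead of printing it.
class KernelContext {
 public:
  void SetError(std::string message) {
    if (!has_error_) {  // keep only the first error
      has_error_ = true;
      message_ = std::move(message);
    }
  }
  bool HasError() const { return has_error_; }
  const std::string& message() const { return message_; }

 private:
  bool has_error_ = false;
  std::string message_;
};

// Divide every value by factor. If a value is not an exact multiple,
// record the problem on the context and stop; the caller decides what to do.
inline void ScaleDownAll(KernelContext* ctx, const std::vector<int64_t>& in,
                         int64_t factor, std::vector<int64_t>* out) {
  assert(ctx != nullptr && out != nullptr && factor > 0);
  for (int64_t v : in) {
    if (v % factor != 0) {
      ctx->SetError("Rescaling would lose data: " + std::to_string(v));
      return;
    }
    out->push_back(v / factor);
  }
}
```

The caller checks `ctx.HasError()` after the kernel returns, so the failure reaches the user as a result status rather than as a log line.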

pitrou · 2020-02-17T14:42:34Z

cpp/src/arrow/compute/kernels/cast.cc

If options.allow_int_overflow is false, you should also check that the result value doesn't overflow out_type. You should be able to use Decimal::ToInteger for that.

Corrected - hopefully correctly.
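
The overflow check asked for here can be sketched as a range test against the output type before narrowing. This is an illustrative stand-alone version: `CheckedNarrow` is a hypothetical helper, it works on an `int64_t` input rather than a decimal value, and it does not use `Decimal::ToInteger`. Only the option name `allow_int_overflow` is taken from the comment above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Narrow a 64-bit value to a smaller signed integer type.
// When allow_int_overflow is false, values outside the range of OutT
// are rejected; when it is true, the value is converted unchecked.
template <typename OutT>
bool CheckedNarrow(int64_t value, bool allow_int_overflow, OutT* out) {
  static_assert(std::is_integral<OutT>::value && std::is_signed<OutT>::value,
                "OutT must be a signed integer type");
  assert(out != nullptr);
  if (!allow_int_overflow &&
      (value < std::numeric_limits<OutT>::min() ||
       value > std::numeric_limits<OutT>::max())) {
    return false;  // result would not fit in OutT
  }
  // Out-of-range conversion wraps on mainstream compilers
  // (guaranteed modular since C++20).
  *out = static_cast<OutT>(value);
  return true;
}
```

With this shape, the same template covers Int8, Int16, Int32 and Int64 outputs, and the caller turns a `false` return into an error on the context.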

pitrou · 2020-02-17T14:46:46Z

cpp/src/arrow/compute/kernels/cast.cc

Same here: should save the error on the context.

pitrou · 2020-02-17T14:47:25Z

cpp/src/arrow/compute/kernels/generated/codegen.py

Only Int64? I would expect all signed integer types.

Added other ints though they are not really needed as there is cast from int64 to them.

Added Int32 Int16 Int8

pitrou · 2020-02-17T14:49:52Z

cpp/src/arrow/type.h

Adding this include will worsen compile times. You should be able to add a forward declaration to type_fwd.h if needed, instead.

This depends on the above.

Moved to type_fwd.h
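
The forward-declaration technique mentioned in this exchange can be shown in a single file. The sketch below is generic: the `Decimal` class and `WholePart` function are hypothetical, and the three commented sections stand in for what would normally be three separate files.

```cpp
#include <cassert>
#include <cstdint>

// --- Forward-declaration header (in the spirit of type_fwd.h) ---
// Declares that the class exists without pulling in its definition.
class Decimal;

// --- Widely included header ---
// References and pointers to an incomplete type are fine in declarations,
// so this header does not need to include the full class definition.
int64_t WholePart(const Decimal& value);

// --- Implementation file ---
// Only here is the full definition required.
class Decimal {
 public:
  Decimal(int64_t unscaled, int64_t divisor)
      : unscaled_(unscaled), divisor_(divisor) {
    assert(divisor > 0);
  }
  int64_t unscaled() const { return unscaled_; }
  int64_t divisor() const { return divisor_; }

 private:
  int64_t unscaled_;
  int64_t divisor_;
};

int64_t WholePart(const Decimal& value) {
  return value.unscaled() / value.divisor();
}
```

Every translation unit that includes the widely used header then parses one line for `Decimal` instead of its whole definition and transitive includes, which is where the compile-time saving comes from.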

pitrou · 2020-02-17T14:50:39Z

cpp/src/arrow/compute/kernels/cast_test.cc

I would expect more test cases here, especially some examples where conversion fails because of overflow, but also examples including nulls.

Added more tests: nulls, overflow, truncation

pitrou · 2020-02-17T14:51:25Z

cpp/src/arrow/compute/kernels/cast_test.cc

Same remark here: I would expect more tricky conversion cases, failures (overflow) and nulls.

Added more tests: nulls, truncation

pitrou · 2020-02-17T14:51:59Z

cpp/src/arrow/compute/kernels/cast.h

You shouldn't add this if this isn't used anywhere.

Actually it is used now

pitrou · 2020-02-19T12:09:41Z

@JacekPliszka Did you forget to push your changes?

JacekPliszka · 2020-02-19T13:13:29Z

> @JacekPliszka Did you forget to push your changes?

No. Handling overflow and truncation requires more work, and at my C++ skill level it would take more time than I have.

nealrichardson · 2020-02-22T21:19:17Z

What about casting the other way (integer to decimal)? I was recently looking at adding some tests for decimal types in R but couldn't figure out how to make a decimal array to begin with.

JacekPliszka · 2020-02-23T17:20:38Z

> What about casting the other way (integer to decimal)? I was recently looking at adding some tests for decimal types in R but couldn't figure out how to make a decimal array to begin with.

I am not planning to do it - the scope is already larger than I had planned.
My goal is to have Decimal128 to Int64 (not finished) and Decimal128 to Decimal128 (finished [though may be incorrect], not yet pushed).

pitrou · 2020-02-24T16:46:39Z

> No. Handling overflow and truncation requires more work and it would take more time at my C++ skill level than I have.

Well, in this case, someone else will have to do it, but they may not make this a priority :-)

nealrichardson · 2020-03-11T20:35:20Z

@pitrou can we merge this and make a followup Jira for the outstanding questions?

pitrou · 2020-03-11T20:56:40Z

Why should we merge a deficient PR? There is no proper error handling here.

wesm · 2020-03-11T22:25:31Z

Yes if @JacekPliszka isn't able to complete it we should wait for someone to pick up the changes and add the necessary error handling etc.

jacek-pliszka · 2020-03-12T09:54:28Z

I will try to find some time during the weekend. I believe I have decimal to decimal done, with options and error handling. Decimal to int still needs some work.

jacek-pliszka · 2020-03-16T08:56:43Z

OK, I did decimal to int but still need some time for nulls handling in decimal to decimal - probably another day.

I have a question though about loop optimization - currently I move the loops inside ifs but maybe I can assume that compilers will do it for me - will they?

pitrou · 2020-03-16T11:20:23Z

> I have a question though about loop optimization - currently I move the loops inside ifs but maybe I can assume that compilers will do it for me - will they?

Ideally, they will. But optimization is always based on heuristics and you never know what the compiler will decide. So in some cases it makes sense to duplicate loops by hand.
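
To make the trade-off concrete, here is a small sketch of the two shapes being discussed: an option tested inside the loop versus the loop duplicated by hand under each branch (the transformation compilers call loop unswitching). The function names are hypothetical and the element types are simplified to `int64_t` and `int32_t`; this is not the code from the pull request.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

inline bool FitsInt32(int64_t v) {
  return v >= std::numeric_limits<int32_t>::min() &&
         v <= std::numeric_limits<int32_t>::max();
}

// Version A: the loop-invariant option is tested on every iteration.
// The compiler may or may not hoist the test out of the loop.
inline bool NarrowAllBranchInside(const std::vector<int64_t>& in, bool check,
                                  std::vector<int32_t>* out) {
  assert(out != nullptr);
  out->clear();
  for (int64_t v : in) {
    if (check && !FitsInt32(v)) return false;
    out->push_back(static_cast<int32_t>(v));
  }
  return true;
}

// Version B: the option is tested once and the loop is written twice,
// so the unchecked path has no per-element branch regardless of heuristics.
inline bool NarrowAllHoisted(const std::vector<int64_t>& in, bool check,
                             std::vector<int32_t>* out) {
  assert(out != nullptr);
  out->clear();
  if (check) {
    for (int64_t v : in) {
      if (!FitsInt32(v)) return false;
      out->push_back(static_cast<int32_t>(v));
    }
    return true;
  }
  for (int64_t v : in) {
    out->push_back(static_cast<int32_t>(v));
  }
  return true;
}
```

Both versions produce the same results; version B costs some code duplication in exchange for not depending on the optimizer's decision.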

JacekPliszka · 2020-03-17T13:06:27Z

OK, tests should be all green soon so please review.

I am not happy with all the solutions I've applied there, but I believe it is more or less correct.
If something needs correcting, let me know - I haven't programmed in C++ in years.

Notes:

it looks like handling of negative scales is missing in several places
decimal scaling operations could probably be optimized and vectorized, avoiding the need for separate function calls and reducing the code size

pitrou · 2020-03-17T16:00:32Z

Thank you @JacekPliszka. I'm taking a look now.

pitrou · 2020-03-17T16:35:27Z

So, everything was good functionally, thank you :-) I just pushed a number of simplifications in the code. Will merge if CI is green.

JacekPliszka changed the title ~~ARROW-3329 [C++] Added casts Decimal128 to Decimal128 and Decimal128 …~~ ARROW-3329 [C++] Added casts Decimal128 to Decimal128 and Int64 Feb 15, 2020

pitrou changed the title ~~ARROW-3329 [C++] Added casts Decimal128 to Decimal128 and Int64~~ ARROW-3329: [C++] Added casts Decimal128 to Decimal128 and Int64 Feb 17, 2020

pitrou requested changes Feb 17, 2020

pitrou approved these changes Mar 17, 2020

JacekPliszka and others added 5 commits March 18, 2020 11:27

ARROW-3329 [C++] Added casts Decimal128 to Decimal128 and Decimal128 …

15779bf

…to Int64

Some simplifications.

74f1c94

Also, regenerate generated code.

Fix ARROW_PREDICT_* macros (!)

c0cb306

Make sure all data is initialized

38b77ed

Also improve gcc / clang version of ARROW_PREDICT_FALSE

2c303ab

pitrou closed this in 76fd44c Mar 18, 2020

asfimport mentioned this pull request Jul 6, 2020

[Python] Error casting decimal(38, 4) to int64 #19664

Closed
