Skip to content

Conversation

@rok
Copy link
Member

@rok rok commented Jan 13, 2022

@github-actions
Copy link

@rok rok force-pushed the ARROW-14097 branch 2 times, most recently from a8888c1 to 3de6888 Compare January 26, 2022 21:48
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't match the PR title, and time makes more sense than duration as a result right?

Also, in that case, we need to validate that the result is not <0 or >=24h (see #12014).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah switching to subtract(time32/64, duration) -> time32/64. I suppose validation should a be part of the subtract kernel then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the validation. Still needs tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, it doesn't make sense to cast time to duration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. Removed.

@rok rok force-pushed the ARROW-14097 branch 4 times, most recently from bcd65bd to d8db736 Compare January 28, 2022 23:28
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, but is there any advantage to using the type matcher + computed output type vs the literal types?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we could subtract a negative duration, we should also check if the output is larger than the maximum acceptable value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively - maybe this arithmetic should wrap around? Does that make any sense?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. I wonder how expected this would be. But you're proposing something like this:

if (result < 0) {
  result = result % multiple;
}

Correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Though, I'm not sure if that really makes any sense to have.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels semantically correct but I'm going to check what Pandas and others do to see what conventions are.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I poked around stdlib datetime but it doesn't seem to implement this. However, I still think we need to check whether the result is in range on both ends.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this reference: https://bugs.python.org/issue1487389#msg54803

I think raising an error is OK. If we need wrap-around, we can add that as a separate kernel perhaps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should still raise an error if the result is >24:00 though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this! Pandas appears delegate time to Python and rather work with timedelta which is like our duration.
I made this raise on time < 0 and time >= 24:00. Lets revisit if/when desired.

@lidavidm
Copy link
Member

lidavidm commented Feb 4, 2022

Sorry, I'll take a look at this today or Monday

@rok
Copy link
Member Author

rok commented Feb 4, 2022

Sorry, I'll take a look at this today or Monday

No worries, this is on me anyway :)

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I just think we need to add a check for something like '23:59:59 - (-1s)"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this reference: https://bugs.python.org/issue1487389#msg54803

I think raising an error is OK. If we need wrap-around, we can add that as a separate kernel perhaps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should still raise an error if the result is >24:00 though.

@lidavidm
Copy link
Member

lidavidm commented Feb 8, 2022

Looks like the linter is unhappy here.

@rok rok requested a review from lidavidm February 8, 2022 15:09
@lidavidm lidavidm closed this in 11e9f26 Feb 8, 2022
@ursabot
Copy link

ursabot commented Feb 8, 2022

Benchmark runs are scheduled for baseline = 49d2a41 and contender = 11e9f26. 11e9f26 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.77% ⬆️0.0%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.65% ⬆️0.17%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants