Skip to content

Conversation

@paleolimbot
Copy link
Member

This PR implements the Duration type in the R bindings, mapped to "difftime", which is the equivalent base R type. Currently a draft PR because I'm unsure how to implement the Extend() method of the converter:

arrow/r/src/r_to_arrow.cpp

Lines 824 to 826 in 7240420

Status Extend(SEXP x, int64_t size, int64_t offset = 0) override {
// TODO: look in lubridate
return Status::NotImplemented("Extend");

I've also copy-and-pasted the converter for the TIME32/hms object conversion because I was unsure how to subclass it to avoid the repetition in the moment.

The gist of the PR is to allow this (which currently fails):

library(arrow, warn.conflicts = FALSE)
Array$create(as.difftime(123, units = "secs"))
#> Error: NotImplemented: Extend

Created on 2021-12-03 by the reprex package (v2.0.1)

@github-actions
Copy link

github-actions bot commented Dec 3, 2021

@github-actions
Copy link

github-actions bot commented Dec 3, 2021

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

Co-authored-by: Romain François <romain@rstudio.com>
@paleolimbot paleolimbot marked this pull request as ready for review December 10, 2021 16:06
@paleolimbot
Copy link
Member Author

Ok! This now works:

library(arrow, warn.conflicts = FALSE)
(arr <- Array$create(as.difftime(123, units = "secs")))
#> Array
#> <time32[s]>
#> [
#>   00:02:03
#> ]
as.vector(arr)
#> Time difference of 123 secs

Created on 2021-12-10 by the reprex package (v2.0.1)

I haven't implemented the lubridate::duration() translation because it seems out of scope for this set of changes but could also include it with this PR.

@paleolimbot
Copy link
Member Author

Just updating my reprex to show that it does, in fact work! (forgot to reinstall arrow from this branch before running reprex):

# remotes::insatll_github("paleolimbot/arrow/r@r-duration-type")
library(arrow, warn.conflicts = FALSE)
(arr <- Array$create(as.difftime(123, units = "secs")))
#> Array
#> <duration[s]>
#> [
#>   123
#> ]
as.vector(arr)
#> Time difference of 123 secs

@jonkeane jonkeane closed this in cf8d81d Dec 13, 2021
@nealrichardson
Copy link
Member

nealrichardson commented Dec 13, 2021

Thanks for this! Looks like there's one more entry in arrow.Rmd that needs to be edited, showing the mapping of Arrow Duration type to R, the table below the one that was.

@jonkeane
Copy link
Member

Thanks for catching that. I've made https://issues.apache.org/jira/browse/ARROW-15082

@ursabot
Copy link

ursabot commented Dec 13, 2021

Benchmark runs are scheduled for baseline = ab46725 and contender = cf8d81d. cf8d81d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.45% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.31% ⬆️0.04%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@thisisnic
Copy link
Member

I haven't implemented the lubridate::duration() translation because it seems out of scope for this set of changes but could also include it with this PR.

All good, though would you mind opening a new ticket for that?

@paleolimbot
Copy link
Member Author

Done! See ARROW-15098.

pitrou pushed a commit that referenced this pull request Dec 15, 2021
This PR is an attempt to fix the [sanitizer failure](https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=17171&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181&l=12824) caused by ARROW-14941 (#11850). @jonkeane could you remind me how to submit a job to the sanitizer checker?

Closes #11947 from paleolimbot/r-fix-duration

Authored-by: Dewey Dunnington <dewey@fishandwhistle.net>
Signed-off-by: Antoine Pitrou <antoine@python.org>
jonkeane pushed a commit that referenced this pull request Dec 16, 2021
The PR updates the 'Get Started' vignette that should have been updated to reflect the mapping of the duration type to R's difftime in #11850 / ARROW-14941 but wasn't.

Closes #11948 from paleolimbot/r-duration-in-vignette

Authored-by: Dewey Dunnington <dewey@fishandwhistle.net>
Signed-off-by: Jonathan Keane <jkeane@gmail.com>
@paleolimbot paleolimbot deleted the r-duration-type branch December 9, 2022 16:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants