-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-13684: [C++][Compute] Strftime kernel follow-up #10998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Not fully sure about this one. In #10647 (comment) I mentioned the options of showing no timezone information (what you did now), include a numeric offset (so adding Including a numeric offset of using literal Z (for UTC) both mean that the default format depends on the type (because for timestamp without timezone we shouldn't add an offset).
Timestamps without timezone should be formatted as is, showing the wall clock time they represent. So I think formatting as if they are in UTC but without any timezone indication is correct. Although we should not allow adding timezone information with
You could start with the compute.rst table notes, but personally I would also include it in the docstring. |
Right. And for the option where we would have UTC timestamps having "Z" at the end would require something like: Where we would have
Ok, let's return invalid when detecting
Sounds good. |
da3581a to
de9fa8b
Compare
|
@jorisvandenbossche I've pushed changes for second and third (printing timezoneless timestamps and docs) point. I left the first as was (not timezone data by default). |
westonpace
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor wording nits but overall the change seems good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I don't know if cannot print is quite right. The user might be running strftime to store or display on a GUI or send to some other tool. Perhaps cannot convert to string? Or cannot format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good point. I went with your first proposal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment seems to suggest that only microsecond and second are supported. I like the wording down below which makes it more clear that seconds through nanoseconds are supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. Changed the language. Could you check if it's good now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very clear. Thanks.
|
Thanks for the comments @westonpace ! |
|
@jorisvandenbossche @westonpace I've made a minor change in the text (decimal points -> decimal places). |
jorisvandenbossche
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also add something about "To obtain integer seconds, cast to timestamp with second resolution" to be explicit about the workaround.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer up to date
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems error description is outdated for all extract kernels. Thanks for pointing it out! Fixing.
docs/source/cpp/compute.rst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is accurate? It doesn't depend on the number of seconds present, but solely on the unit of the timestamp type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems you used some explanation about this from https://howardhinnant.github.io/date/date.html#to_stream_formatting, but I would maybe use the same text as you added above in strftime_doc, as I think that's clearer / more relevant from Arrow's perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense yeah. Done.
|
Thanks for the review @jorisvandenbossche @pitrou. |
b779d14 to
48ba3ca
Compare
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
pitrou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates @rok. I will merge now if CI is green.
|
Thanks @westonpace @jorisvandenbossche @pitrou ! |
This is to resolve [ARROW-13684](https://issues.apache.org/jira/browse/ARROW-13684). 1. Default strftime string is now `%Y-%m-%dT%H:%M:%S`. Perhaps `%Y-%m-%dT%H:%M:%S%z` would be better? 2. Timestamps without timezone are now strftime-ed as if they were in `UTC`. Not sure this is the way to go. What if the local time is really invalid but we can't tell? 3. Document `%S` behavior. What would be a good location to do that? Closes apache#10998 from rok/ARROW-13684 Lead-authored-by: Rok <rok@mihevc.org> Co-authored-by: Rok Mihevc <rok@mihevc.org> Co-authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>
This is to resolve ARROW-13684.
%Y-%m-%dT%H:%M:%S. Perhaps%Y-%m-%dT%H:%M:%S%zwould be better?UTC. Not sure this is the way to go. What if the local time is really invalid but we can't tell?%Sbehavior. What would be a good location to do that?