Add functionality to parse datetimes according C standard format codes.#165
Conversation
|
No f-string please |
|
@wimglenn done. |
|
Thanks. I just blackened the code in #166 which made some conflicts for you, sorry about that |
# Conflicts: # parse.py
|
@wimglenn fixed |
|
@wimglenn is merging this on the roadmap? Are there any more changes you would like to request? |
|
@bendichter This looks like a great idea to me, I'll give a careful review when I have a spare moment. |
|
Sounds good. I'd be happy to answer any questions |
|
I do have a question - I'd really like anything that works with strftime to work here, e.g. this should be able to roundtrip: So on line 616 where there is |
|
Yes, good catch! I'll make that change and add a test |
parse.py
Outdated
| "Y": "[0-9]{4}", | ||
| "H": "[0-9]{1,2}", | ||
| "B": "(?:January|February|March|April|May|June|July|August|September|October|November|December)", | ||
| "b": "(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)", |
There was a problem hiding this comment.
I wonder if these should be made locale-aware...
There was a problem hiding this comment.
yeah good idea I'll look into it
|
I gave it more flexibility for the %z directive. %Z gets complicated
(source) If using a timezone out of that range, datetime.strptime("2023-11-21 13:23:27 PST", "%Y-%m-%d %H:%M:%S %Z")ValueError: time data '2023-11-21 13:23:27 PST' does not match format '%Y-%m-%d %H:%M:%S %Z'If using a timezone within that range, datetime.strptime("2023-11-21 13:23:27 EST", "%Y-%m-%d %H:%M:%S %Z")datetime.datetime(2023, 11, 21, 13, 23, 27)This seems to me like a bug. And in any case things are getting messy so I'd rather not support this. |
|
Yes the zone stuff looks tricky due to Python itself (e.g. https://bugs.python.org/issue22377) |
|
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #165 +/- ##
==========================================
- Coverage 94.72% 94.59% -0.13%
==========================================
Files 1 1
Lines 512 537 +25
Branches 123 131 +8
==========================================
+ Hits 485 508 +23
- Misses 17 18 +1
- Partials 10 11 +1 ☔ View full report in Codecov by Sentry. |
parse.py
Outdated
|
|
||
|
|
||
| dt_format_to_regex = {symbol: "[0-9]{2}" for symbol in "ymdIMSUW"} | ||
| dt_format_to_regex.update({"-" + symbol: "[0-9]{1,2}" for symbol in "ymdIMS"}) |
There was a problem hiding this comment.
For some reason I get a problem with these:
>>> datetime(2023, 1, 1).strftime("%Y/%-m/%-d")
'2023/1/1'
But
>>> parse("{:%Y/%-m/%-d}", "2023/1/1")
ValueError: '-' is a bad directive in format '%Y/%-m/%-d'
There was a problem hiding this comment.
Hmmm I'll take a look tomorrow
There was a problem hiding this comment.
This is now fixed and tested. It turns out strptime does not use the negative sign (I'm not quite sure why I thought it did). In strftime, %d outputs a zero-padded number e.g. "01". For strptime, %d matches a zero-padded number and can also match a non-zero padded number e.g. "1".
There was a problem hiding this comment.
Now the following works:
>>> parse("{:%Y/%m/%d}", "2023/01/01")
<Result (datetime.datetime(2023, 1, 1, 0, 0),) {}>
>>> parse("{:%Y/%m/%d}", "2023/1/1")
<Result (datetime.datetime(2023, 1, 1, 0, 0),) {}>There was a problem hiding this comment.
That's better. %j should get the same treatment, because it has the same issue (i.e. "%-j" doesn't work with strptime, and %j doesn't care if the number is zero-padded or not).
After removing the mapping for "-j", you can also remove the re.escape since the remaining characters are all letters and don't need escaping.
…_flexible_dates # Conflicts: # parse.py
|
hold it, I think you want j to map to {1,3}. Let's also add a test for it |
|
I tried to see if I could replace some of the existing datetime logic, but most of them allow for variants of each format, e.g. using either slashes or dashes, so this approach does not work well. I was able to change "tc" to use this approach though. |
…_flexible_dates # Conflicts: # parse.py
|
Let's not change those, because there is little benefit, and I can't be sure it doesn't modify existing behaviour. I'm also not sure about the condition for returning a datetime vs a time. I think we should either remove that, and always return a datetime, or extend it to return a datetime, time, or date (depending on whether the directives are solely time-related or solely date-related). |
|
OK, I reverted to the old logic for "tc". The datetime vs. date vs. time thing is tricky. I would like to avoid situations like having Line 194 in 04314d8 but I still think that only makes sense when a month is present as it is in Lines 575 to 589 in 04314d8 I suppose the logic could be:
|
|
Maybe consider configuring black as a precommit hook, e.g. https://github.com/catalystneuro/neuroconv/blob/1d3d58d4b53570d20b5e81abde7ada7f9f02fa47/.pre-commit-config.yaml#L8-L12 |
|
Yes, that logic sounds good to me. |
If it contains day, month, or year, it's a date. Missing years default to the current year. If it contains hours, minutes, seconds, or milliseconds, its a time. If it is a date and a time it's a datetime.
|
@wimglenn Thanks for all the collaboration on this. I am happy with its current state. Feel free to merge when you are ready. |
|
Merged and released 1.20.0 to PyPI. Thank you for your contributions, Ben! |
| elif is_time: | ||
| return dt.time() | ||
| else: | ||
| ValueError("Datetime not a date nor a time?") |
There was a problem hiding this comment.
@bendichter Somehow we both missed a raise here (ref #196)
There was a problem hiding this comment.
oops. Thanks for catching that
fix #150