implement next_jump, prev_jump to fix first-century cases #883

jbrockmendel · 2019-02-25T03:58:33Z

Fixes (with tests) currently-failing cases "0031 Jun 11" and "0031Jun11".

next_jump and prev_jump are from a fork I made ages ago to handle some corner cases. If accepted, they can be used to clean up a bunch of existing checks in a follow-up.

pganssle · 2019-02-26T14:34:13Z

dateutil/parser/_parser.py

        return year, month, day


+def next_jump(tokens, idx, skip_comma=True):


I think this should be a method somewhere, no?

I definitely think it shouldn't be a public, top-level function yet. Can you move it onto either parser or parserinfo (whichever you think is best), and start it out as private (_next_jump)?

I think this should be a method somewhere, no? [...] Can you move it onto either parser or parserinfo

Downstream I actually have these as methods in a TokenStream class (subclasses list) and tokens = TokenStream(timestr) takes the place of l = _timelex(timestr). In #884 I'll describe some of the other virtues of the TokenStream class. If this is made into a method, I think TokenStream is the right abstraction for where it belongs.

For the time being I've privatized the function.
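For readers following along, the list-subclass idea could look roughly like the sketch below. This is not the downstream TokenStream; the regex split stands in for `_timelex`, and the `(None, None)` return at the end of the stream is an assumption made only for illustration.

```python
import re


class TokenStream(list):
    """Hypothetical sketch: a plain list of tokens plus navigation helpers."""

    def __init__(self, timestr):
        # Assumption: a crude regex split stands in for the real lexer.
        super().__init__(re.findall(r"\d+|[A-Za-z]+|\s+|.", timestr))

    def next_jump(self, idx, skip_comma=True):
        """Return (token, index) of the next non-whitespace token after idx."""
        for sib_idx in range(idx + 1, len(self)):
            sib = self[sib_idx]
            if sib.isspace() or (skip_comma and sib == ","):
                continue
            return sib, sib_idx
        # Assumption: signal "nothing left" with a pair of Nones.
        return None, None


tokens = TokenStream("0031 Jun, 11")
print(tokens[0])             # 0031
print(tokens.next_jump(2))   # ('11', 5)
```

Because the class subclasses list, existing code that indexes or slices the token list would keep working unchanged, which is the main appeal of putting the helper there rather than on parser or parserinfo.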

pganssle

I think I'm fine with the logic of this change, to the extent that I understand it, just have some comments about the specifics of the implementation.

pganssle · 2019-02-26T14:41:24Z

dateutil/test/test_parser.py

        res = parse(dstr)
        assert res.year == 2001, res

+    def test_first_century(self):


Can you make this a parametrized test? I prefer "one assert per test" wherever possible.

pganssle · 2019-02-26T14:43:30Z

dateutil/parser/_parser.py

+    Get the next sibling and its index, skipping over whitespace
+    and (if specified) commas.
+
+    Parameters


This is inconsistent with the rest of the documentation, where we're using the standard sphinx style. I'm not opposed to the idea of converting over the documentation to using this numpy-style, but I don't know if I want it to be a mishmash, which confuses new contributors.

That's fair, will try to convert (I'm much less familiar with the sphinx style, so it will bear review). Other than "I like it better", I'm not sure there is a compelling reason to convert to the numpy style. Regardless, that would be its own issue.

pganssle · 2019-02-26T14:44:02Z

dateutil/parser/_parser.py

+    return sib, sib_idx
+
+
+def prev_jump(tokens, idx, skip_comma=True):


It doesn't seem like you're actually using this function anywhere, other than in the function itself. Is it necessary to keep it?

We don't need it at the moment, but I thought I'd port it since it goes with next_jump. I'll remove it now and can re-introduce it if/when it becomes necessary.

pganssle · 2019-02-26T14:45:28Z

dateutil/parser/_parser.py


+def next_jump(tokens, idx, skip_comma=True):
+    """
+    Get the next sibling and its index, skipping over whitespace


An example of what this is looking for would probably be useful to future software archaeologists. I think you have one somewhere in the comments of the function itself.

Good idea, will do.

pganssle · 2019-02-26T14:46:57Z

dateutil/parser/_parser.py

+            #  dropping leading zeros, e.g. "0031" represents 31 AD,
+            #  not 2031
+            # Note: this check must come before the check for info.jump below
+            ymd.append(value_repr, 'Y')


I'm mildly surprised that we can put strings in the ymd tuple like this without it breaking anything.

_ymd.append starts with a check for hasattr(val, '__len__'), which I guess is intended to catch strings (not sure why it doesn't just check for strings directly...)
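To make the point above concrete, here is a simplified stand-in for that check. The class name `YMD`, the `labels` list, and the exact thresholds are assumptions for illustration and should not be read as the library's actual `_ymd` implementation.

```python
class YMD(list):
    """Hypothetical, simplified stand-in for the parser's internal _ymd."""

    def __init__(self):
        super().__init__()
        self.century_specified = False
        self.labels = []  # assumption: one label per stored value

    def append(self, val, label=None):
        if hasattr(val, "__len__"):
            # Strings have __len__ and ints do not, so this branch sees the
            # original text and can notice leading zeros such as "0031".
            if val.isdigit() and len(val) > 2:
                self.century_specified = True
                label = "Y"
        elif val > 100:
            # A bare integer this large can only be a year.
            self.century_specified = True
            label = "Y"
        self.labels.append(label)
        # Values are stored as ints either way, so a string does not leak
        # into the rest of the year/month/day resolution.
        super().append(int(val))


ymd = YMD()
ymd.append("0031", "Y")
ymd.append(11)
print(ymd)                    # [31, 11]
print(ymd.century_specified)  # True
```

Passing the string form keeps the leading zeros visible long enough to mark the century as explicit, which an already-converted `31` could not do.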

pganssle reviewed Feb 26, 2019

pganssle requested changes Feb 26, 2019

This was referenced Feb 26, 2019

POC: de-duplicate code via Result setter methods #884

Open

POC: TokenStream #887

Open

jbrockmendel force-pushed the send_upstream2 branch from 08b0774 to 58fea3b Compare March 1, 2019 02:15

jbrockmendel mentioned this pull request Mar 1, 2019

Add several parser test cases #892

Merged

jbrockmendel added 6 commits March 13, 2019 07:07

implement next_jump, prev_jump to fix first-century cases

9606a8c

parametrize test cases

be08dff

change docstring format

95d5e64

Docstring examples

d484edd

remove prev_jump

ef7edff

privatize next_jump

74db519

jbrockmendel force-pushed the send_upstream2 branch from 58fea3b to 74db519 Compare March 13, 2019 14:11
