Skip to content

Conversation

@jbrockmendel
Copy link
Contributor

Fixes (with tests) currently-failing cases "0031 Jun 11" and "0031Jun11".

next_jump and prev_jump are from a fork I made ages ago to handle some corner cases. If accepted, they can be used to clean up a bunch of existing checks in a follow-up.

return year, month, day


def next_jump(tokens, idx, skip_comma=True):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be a method somewhere, no?

I definitely think it shouldn't be a public, top-level function yet. Can you move it onto either parser or parserinfo (whichever you think is best), and start it out as private (_next_jump)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be a method somewhere, no? [...] Can you move it onto either parser or parserinfo

Downstream I actually have these as methods in a TokenStream class (subclasses list) and tokens = TokenStream(timestr) takes the place of l = _timelex(timestr). In #884 I'll describe some of the other virtues of the TokenStream class. If this is made into a method, I think TokenStream is the right abstraction for where it belongs.

For the time being I've privatized the function.

Copy link
Member

@pganssle pganssle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm fine with the logic of this change, to the extent that I understand it, just have some comments about the specifics of the implementation.

res = parse(dstr)
assert res.year == 2001, res

def test_first_century(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this a parametrized test? I prefer "one assert per test" wherever possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Get the next sibling and its index, skipping over whitespace
and (if specified) commas.
Parameters
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inconsistent with the rest of the documentation, where we're using the standard sphinx style. I'm not opposed to the idea of converting over the documentation to using this numpy-style, but I don't know if I want it to be a mishmash, which confuses new contributors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair, will try to convert (I'm much less familiar with the sphinx style, so will bear review). Other than "I like it better" Im not sure if there is a compelling reason to convert to the numpy-style. Regardless, that would be its own Issue.

return sib, sib_idx


def prev_jump(tokens, idx, skip_comma=True):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem like you're actually using this function anywhere, other than in the function itself. Is it necessary to keep it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need it at the moment, but I thought I'd port it since it goes with next_jump. I'll remove it now and can re-introduce it if/when it becomes necessary.


def next_jump(tokens, idx, skip_comma=True):
"""
Get the next sibling and its index, skipping over whitespace
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An example of what this is looking for would probably be useful to future software archaeologists. I think you have one somewhere in the comments of the function itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, will do.

# dropping leading zeros, e.g. "0031" represents 31 AD,
# not 2031
# Note: this check must come before the check for info.jump below
ymd.append(value_repr, 'Y')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mildly surprised that we can put strings in the ymd tuple like this without it breaking anything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_ymd.append starts with a check for hasattr(val, '__len__') which I guess is intended to catch strings (not sure why it doesnt just check for strings directly...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants