Implement intermediate representation ParseState #525

jbrockmendel · 2017-11-15T00:41:24Z

Implement the intermediate representation as a full-fledged (easily-subclassable) class.

A subset of parser methods are focused on taking this intermediate representation and building the final representation. These methods are moved into the new class. As a result, they actually use self! (Well a couple don't, but I kept them as-is as a token of goodwill).

Note: tokens could pretty reasonably be an attribute of the intermediate representation.

pganssle · 2017-11-15T13:50:34Z

What exactly is subclassable about this? It has no public methods and exists only in private modules.

jbrockmendel · 2017-11-15T17:25:01Z

Good catch. _build_result should be public. Will update.

The bit that's relevant for subclass-ability is that _result is made (or kept) an attribute of the parser class. So if you want to subclass ParseState you just can tell the parser to use your subclass by overriding parser._result.

jbrockmendel · 2017-11-15T17:25:40Z

BTW as a follow-up I'll make a PR to ensure we get coverage for the unhit lines in _build_taware and _build_tzinfo

pganssle · 2017-11-15T17:46:35Z

For one thing, _result is also a private attribute, so I would not suggest anyone messing with it in a subclass, though I can imagine what you meant was we'd have a parser.ResultClass class or instance attribute.

That said, if the only thing that matters is that a class implements build_result, then ParseState doesn't need to be public (and probably shouldn't be), and shouldn't be subclassed from. We would just advertise "pass us a class that implements this public interface".

I think some of this is already being done in the parserinfo class, where are actively encouraged to implement your methods. Not quite sure if it's worth making a distinction between parserinfo and parsestate here.

pganssle · 2017-11-15T17:50:21Z

One thing I'll note is that the reason why I wanted all these broken up methods to be private wasn't because they should be private forever, but because I want to give a hard think about how the parser interface works and whether we can design something with an elegant way of implementing custom parsing behaviors, and I'm not sure a simple refactoring of the existing logic will work for that. As a result, I don't really want to commit to any significant changes to the public interface until we have an idea of what the final version should look like.

I don't think this really solves the problem that I see with customization. You either have to rely on private interfaces or completely re-implement them, neither of which is ideal.

jbrockmendel · 2017-11-15T23:48:55Z

Several small comments instead of one long one.

_result is also a private attribute, so I would not suggest anyone messing with it in a subclass,

_result is private in the PR mainly because it is private in the status quo. The current class-in-a-class definition of _result is not a pattern I see often; I can only guess that it is defined there so that it is accessible as res = self._result() instead of res = _result().

Even if we don't put post-parsing methods into the _result/ParseState class, defining it at the normal level of indentation and pinning it to the class will be an aesthetic improvement.

jbrockmendel · 2017-11-15T23:49:20Z

Setting aside subclassing, the methods _build_naive and _build_tzaware are nice functions (i.e. I would advocate breaking versions of them them out of parse even if we keep them in parser). Based on the ways in which these _build_foo methods actually use self, they make a lot more sense as methods of _result/ParseState than of parser.

jbrockmendel · 2017-11-15T23:49:53Z

I think some of this is already being done in the parserinfo class, where are actively encouraged to implement your methods

Not sure what you mean here. The _build_foo methods aren't part of parserinfo... That said, I do like the idea of centralizing customizable parsing behavior in parserinfo.

jbrockmendel · 2017-11-15T23:54:16Z

It's probably premature to separate responsibilities out of parser. But suppose for the moment that you decided it was worthwhile (say a bunch of new parsing methods are added and the class gets larger than you like). In this context, a reasonable separation of responsibilities if "everything directly dealing with parsing the string" and "everything after that".

(Note: acknowledging that the refactoring is premature is part of an effort to meet you halfway on this)

jbrockmendel · 2017-11-15T23:55:49Z

I don't really want to commit to any significant changes to the public interface until we have an idea of what the final version should look like.

Totally on board, and this has the added benefit of not feeling like the goal posts keep moving.

jbrockmendel · 2017-11-16T00:03:37Z

I think focusing on subclassability was a mistake on my part: it's a red herring not really central here.

So how about the following:

_construction.ParseState --> _parser._ParseState, so there is no new module and nothing new in the public namespace. The nice qualities of _build_foo are retained.

Keep _build_result private for the time being to keep options open for the future.

Make skipped_idxs an attribute of _ParseState so its one less variable to keep around, and then self._recombine_tokens makes even more sense as a method.

jbrockmendel · 2017-12-09T23:53:20Z

Closing for now, may revisit later. #539 implements the less controversial bits of this.

Implement subclassable ParseState

5cdd189

jbrockmendel changed the title ~~Implement subclassable ParseState~~ Implement intermediate representation ParseState Nov 17, 2017

jbrockmendel mentioned this pull request Dec 5, 2017

separate out build_naive and build_aware methods #539

Merged

jbrockmendel closed this Dec 9, 2017

pganssle added this to the wontfix milestone Mar 11, 2018

pganssle added the parser label Mar 11, 2018

Implement intermediate representation ParseState #525

Implement intermediate representation ParseState #525

Uh oh!

Conversation

jbrockmendel commented Nov 15, 2017

Uh oh!

pganssle commented Nov 15, 2017

Uh oh!

jbrockmendel commented Nov 15, 2017

Uh oh!

jbrockmendel commented Nov 15, 2017

Uh oh!

pganssle commented Nov 15, 2017

Uh oh!

pganssle commented Nov 15, 2017

Uh oh!

jbrockmendel commented Nov 15, 2017

Uh oh!

jbrockmendel commented Nov 15, 2017

Uh oh!

jbrockmendel commented Nov 15, 2017

Uh oh!

jbrockmendel commented Nov 15, 2017

Uh oh!

jbrockmendel commented Nov 15, 2017

Uh oh!

jbrockmendel commented Nov 16, 2017

Uh oh!

jbrockmendel commented Dec 9, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants