Skip to content

Add functions to ignore subtrees & result-streaming (yield) parsers#40

Closed
ulikoehler wants to merge 28 commits intosnoyberg:masterfrom
ulikoehler:ignoretree
Closed

Add functions to ignore subtrees & result-streaming (yield) parsers#40
ulikoehler wants to merge 28 commits intosnoyberg:masterfrom
ulikoehler:ignoretree

Conversation

@ulikoehler
Copy link
Copy Markdown
Contributor

This is basically a draft that adds three sets of functions:

  1. Result-streaming functions (i.e. your yieldWhileJust from this SO issue)
  2. Functions to ignore empty tags (i.e. no children)
  3. Functions to ignore tags that are not neccessarily empty (i.e. subtrees), solves this SO issue)

For most projects I've been using HXT, mostly because of the lack of these functions (especially 2) and 3): You had to parse absolutely everything or nothing would work).

Just as for tag and related functions, more flexible functions (using predicates) and easier-to-use functions (using a single Name) have been added.

From the library user's standpoint, this pull request mainly provides the new function many' that ignores anything (i.e. content, attributes and entire subtrees) if the consumer does not match.
Therefore it is now possible to parse this XML document:

<people>
    <person age="25">Michael</person>
    <tagImNotInterestedIn></tagImNotInterestedIn>
    <foobar>
        <foo foobar="123">abc</foo>
        <bar></bar>
    </foobar>
    <person age="2">Eliezer</person>
</people>

by only replacing many with many' in parsePeople = tagNoAttr "people" $ many parsePerson in the example. Additionally, a result-streaming counterpart manyYield' exists.

I think the implementation is somewhat incomplete and could probably be improved significantly but I'd like to hear your opinion on it.

Please feel free not to merge if this doesn't fit your needs or you'd like something to be done differently (hints will be appreciated)! I might also be unaware of some characteristics of the new functions.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar: "if any exist"

@snoyberg
Copy link
Copy Markdown
Owner

snoyberg commented Aug 1, 2014

It looks good to me, I didn't see any issues. Would you be able to add a few test cases to the test suite to cover the new functions?

@ulikoehler
Copy link
Copy Markdown
Contributor Author

Sure, I'll add some unit tests

@ulikoehler
Copy link
Copy Markdown
Contributor Author

So far I added tests for many' and manyYield. I'm still trying to figure out how to properly add more unit tests.

@snoyberg
Copy link
Copy Markdown
Owner

Just pinging the issue.

@ulikoehler
Copy link
Copy Markdown
Contributor Author

Still working on it, albeit slowly. I'm currently trying to add unit tests that test if certain parsers fail for certain XMLs.

@pavelkogan
Copy link
Copy Markdown
Contributor

Any chance of this being merged sometime soon?

@snoyberg
Copy link
Copy Markdown
Owner

snoyberg commented Mar 6, 2015

Currently not, as the branch no longer merges. If @ulikoehler or someone else wants to clean this up and get it ready for merge, I have no objection to reviewing again.

@ulikoehler
Copy link
Copy Markdown
Contributor Author

I still have this on my radar and intend to fix it, however I'm currently occupied with my Bachelor's thesis (probably until May).

If someone else can spare some time for writing more unit tests or other fixes, I'll be happy to assist.

@tkvogt tkvogt mentioned this pull request May 1, 2015
@pavelkogan
Copy link
Copy Markdown
Contributor

I have a branch containing the commits here that merges cleanly with master (but not this branch). Is there some way to add it to this PR, or should I open a new one?

Btw, could you give some idea of what level of unit test coverage is needed to get this merged?

@snoyberg
Copy link
Copy Markdown
Owner

I'd say new PR is best. There's no specific standard on test coverage, but of course the more the merrier

@ulikoehler
Copy link
Copy Markdown
Contributor Author

I agree with @snoyberg . You could technically submit a PR to my repository on the ignoretree branch with which I could update this PR but I believe that would only make things more complicated.

@ulikoehler
Copy link
Copy Markdown
Contributor Author

Closing as #58 provided a mergeable solution for the issues discussed here.

Special thanks to @pavelkogan for finishing this feature.

@ulikoehler ulikoehler closed this Jul 16, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants