Added lookahead combinators that allow conditional consumption of tags. #96
merijn wants to merge 1 commit into snoyberg:master
Conversation
|
Actually, this patch as-is is broken, since it depends on |
|
It makes sense, I don't think there's a way to do that with the existing parsers/combinators. I will just try and find a different name for the new functions, as |
|
After thinking about it, I intend to implement the following:

    -- Consume-and-yield events of a single tag tree, as long as it matches given name and attribute parsers
    takeTreesContent :: MonadThrow m => NameMatcher a -> AttrParser b -> ConduitM Event Event m (Maybe ())
    takeAllTreesContent = takeTreesContent anyName ignoreAttrs

(I've introduced
This should cover your use case, as follows:

    debugParser :: MonadThrow m => Name -> ConduitM Event o m ()
    debugParser name = do
      t <- takeTreesContent (matching (/= name)) ignoreAttrs .| renderBytes def .| foldC
      if null t
        then return Nothing
        else putStrLn t >> return (Just ())

    parseAll = do
      manyYield $ choose [fooParser, barParser, debugParser "quux"]
      tagName "quux" {- ... -}

(I didn't typecheck it, but you get the idea) If you're okay with that approach, I'll implement it and make a release out of everything you've contributed. |
|
Yeah, that looks fine for my use case(s). |
|
Implemented in release 1.5.0 (not yet on Hackage). Thank you! |
|
@merijn Release 1.5.0 is now available on Hackage. You can start working with it by setting it explicitly as a dependency of your project; otherwise cabal will select a version <1.5 by default. |
When dealing with messier XML scraped from the web I occasionally don't know exactly which tags I need to accept, so I find it useful to write some catch-all combinators that output "unexpected" tags. Now, there is a takeAllTreesContent, but that consumes anything and as a result makes a rather poor fallback in, for example, uses of many.
The newly added lookahead and friends allow me to conditionally consume an entire tree/tag. One example would be something like:
Here debugParser consumes any tag that is not "quux" and plugs nicely into renderText from Text.XML.Stream.Render to display the offending tags.
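To illustrate why a conditional take works as a fallback under many while an unconditional one does not, here is a minimal sketch over a plain list of events. It uses only base and is not the xml-conduit API: the Ev type and the functions takeTree, takeTreeIf and manyTrees are hypothetical stand-ins invented for this example.

```haskell
module Main where

-- Toy event type standing in for a streaming XML event (hypothetical, simplified).
data Ev = Open String | Close String | Text String
  deriving (Eq, Show)

-- | Split one complete element (opening tag through its matching closing tag)
-- off the front of the stream. Returns Nothing if the stream does not start
-- with an opening tag or if the input is unbalanced.
takeTree :: [Ev] -> Maybe ([Ev], [Ev])
takeTree (Open n : rest) = go (1 :: Int) [Open n] rest
  where
    go 0 acc xs = Just (reverse acc, xs)  -- depth back to zero: tree complete
    go _ _ [] = Nothing                   -- ran out of input inside a tree
    go d acc (e : xs) = case e of
      Open _  -> go (d + 1) (e : acc) xs
      Close _ -> go (d - 1) (e : acc) xs
      Text _  -> go d (e : acc) xs
takeTree _ = Nothing

-- | Lookahead variant: peek at the first event and consume the tree only if
-- its tag name satisfies the predicate; otherwise the input is left untouched.
takeTreeIf :: (String -> Bool) -> [Ev] -> Maybe ([Ev], [Ev])
takeTreeIf p evs@(Open n : _)
  | p n = takeTree evs
takeTreeIf _ _ = Nothing

-- | Repeatedly take matching trees, stopping at the first tree that fails the
-- predicate. An unconditional take would swallow that tree as well.
manyTrees :: (String -> Bool) -> [Ev] -> ([[Ev]], [Ev])
manyTrees p evs = case takeTreeIf p evs of
  Just (t, rest) -> let (ts, rest') = manyTrees p rest in (t : ts, rest')
  Nothing        -> ([], evs)

main :: IO ()
main = do
  let input =
        [ Open "foo", Text "a", Close "foo"
        , Open "bar", Close "bar"
        , Open "quux", Close "quux"
        ]
      (unexpected, rest) = manyTrees (/= "quux") input
  mapM_ print unexpected  -- the "unexpected" trees, one per line
  print rest              -- the quux element, still available to the next parser
```

Running main prints the foo and bar trees as unexpected and leaves the quux element in the remaining input, which is the behaviour the catch-all debug parser relies on.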