Skip to content

[core,swift] Refactor antlr implementation#2463

Merged
adangel merged 32 commits into
pmd:pmd/7.0.xfrom
oowekyala:antlr-touchups
Jun 21, 2020
Merged

[core,swift] Refactor antlr implementation#2463
adangel merged 32 commits into
pmd:pmd/7.0.xfrom
oowekyala:antlr-touchups

Conversation

@oowekyala

@oowekyala oowekyala commented May 2, 2020

Copy link
Copy Markdown
Member

Describe the PR

  • Reorganize the antlr implementation to avoid extending both antlr base classes and PMD interfaces. This avoids API clashes like [core] Conflicting contracts of getChild between PMD & Antlr #2235
    • This is explained on BaseAntlrNode and AntlrGeneratedParserBase
  • Make the ant script more generic, like the javacc wrapper. I don't want to end up in a situation like before [core] Merge JavaCC build scripts #2211 for antlr modules
  • Fully flesh out the implementation of Node for Antlr, including
    • implement getIndexInParent, which is strongly relied upon by node streams
    • implement getXPathNodeName, including for terminals. See AntlrNameDictionary
    • add a SwiftNode interface, which implements GenericNode<SwiftNode>, take measures to enforce this is safe (eg add SwiftErrorNode, SwiftTerminalNode, add some runtime checks to make sure that antlr only adds children from our implementation).
  • Add some parser tests for swift to ensure this is working properly
    • I just used the test files that were here for CPD. The parser actually fails on them, as evidenced by the 3000 error nodes on the BTree example....
  • Rewrite some of the swift rules to not be so reliant on getText (which is the worst way to implement a rule), that method is renamed to joinTokenText and deprecated, which we can do since we don't inherit it from ParserRuleContext anymore

It's still unclear how we can add new attributes to antlr nodes, since they're all generated into the same file.... For the same reason, we might run into trouble if we try to add functionality to the RootNode interface. I guess we can deal with that later

Of note, that the generated tree is not an abstract syntax tree in any way, it's a parse tree. There's a node for every token, or every error, which makes gigantic trees. Antlr recommends building a separate AST if we want one, but then we would have to define all the nodes and visitor interface ourselves... This is a fundamental problem with using antlr, possibly it should be mentioned somewhere in the doc.

Related issues

Ready?

  • Added unit tests for fixed bug/feature
  • Passing all unit tests
  • Complete build ./mvnw clean verify passes (checked automatically by travis)
    • I only did a partial build so will let travis take over
  • Added (in-code) documentation (if needed)

@ghost

ghost commented May 2, 2020

Copy link
Copy Markdown
1 Message
📖 No java rules are changed!

Generated by 🚫 Danger

@adangel adangel self-requested a review June 20, 2020 13:16
@adangel adangel self-assigned this Jun 21, 2020
@adangel adangel mentioned this pull request Jun 21, 2020
2 tasks

@adangel adangel left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oowekyala Many thanks for this refactoring!

I don't fully understand yet, what you did and how the new nodes/abstractions etc. play together.
We need to make sure, to provide a proper documentation at the end: for developers, who want to
add a new antlr based language (see #2500 and #2501) but also for developers, who want to contribute
to the inners of PMD (like us) -> see #2517 . Otherwise I fear, this won't be maintainable if you
are not an expert...

I assume, that the API right now is not fully fleshed out yet (since we have also already some legacy code
in PMD 7, like joinTokenText) and will mature (also with #2589 which is on my TODO list).
I've started some small parts on https://github.com/pmd/pmd/wiki/PMD-7.0.0-API
(section "AST Visitors").

Please be patient, while I walk through your other PRs.

adangel added a commit that referenced this pull request Jun 21, 2020
[swift] Refactor antlr implementation #2463
@adangel adangel merged commit fbc93e4 into pmd:pmd/7.0.x Jun 21, 2020
@oowekyala oowekyala deleted the antlr-touchups branch June 21, 2020 14:29
@oowekyala

Copy link
Copy Markdown
Member Author

Thanks for merging! Picking up #1881 is in my todo list.

Otherwise I fear, this won't be maintainable if you are not an expert...

I feel the same way, it's very messy. But what we're doing in the first place is shady, we're fooling antlr parsers into building PMD nodes. This looks more and more like a bad idea tbh.

@adangel adangel changed the title [swift] Refactor antlr implementation [core,swift] Refactor antlr implementation Jan 13, 2023
@adangel adangel mentioned this pull request Jan 23, 2023
55 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants