[core] Use annotations to resolve rule dependencies to frameworks by oowekyala · Pull Request #437 · pmd/pmd

oowekyala · 2017-06-11T16:04:53Z

Changes description:

Rules written in Java can now use annotations to specify their dependencies to a specific framework (@UsesDfa, @UsesTypeResolution, @UsesMetrics). The concept can be extended to other languages.

Rationale:

Atm these dependencies must be specified as an xml attribute in the ruleset. However, the dependencies really are part of the implementation of the rules, and the rules can't work properly without them. For example, DataflowAnomalyAnalysisRule throws hundreds of NPEs if the dfa="true" attribute is ommited in the xml. Correct me if I'm wrong, but that's not a feature, that's a bug.

I think it's safer that those dependencies be specified with the implementation of the rule (and annotations are an elegant way to do so). That way, the user need not worry about them (the implementation is none of their concern), and their rulesets can be a little bit cleaner. And of course old rulesets still work.

Possible improvements and discussion

One thing could justify the need for a dfa attribute, and it would be precisely to set it to false: should a user not want to use DFA (for performance issues, or benchmarking), setting the attribute to false should deactivate the rule (to prevent its malfunction). I could implement that if you agree.
Do you think the class attribute of a <rule> tag is necessary? I understand it helps the user find the class if they want to modify it, but wouldn't rulesets be easier to make and read if we could identify the class of the rule from its name?

jsotuyod · 2017-06-12T21:56:33Z

@oowekyala this is really interesting. I have to confess I've been thinking about this myself (actually, I even spent some time pondering on this earlier today!).

Pros of this approach:

ease of use
self containment of the rule (keep all implementation details in a single place)

Cons of this approach:

the startup penalty of using reflection (not necessarily significant)
the possible inconsistencies (have the XML say it's false, and the annotation it's true, and you end up with bugs, misconfigured rules enabling features they don't really need or a very confused developer)
this alone doesn't guarantee self-containment yet. Messages, description, and rule setup (class reference / site url) are all still done in the ruleset.xml
there is no way to kill the XML tags AFAIK, since XPath rules wouldn't be able to use annotations; and I don't think we can tell ahead if an XPath expression uses a function short of awefull and hacky regular expressions; so we are stuck with 2 different ways to do the same, and therefore loose consistency

I think this is very interesting, but needs a deep discussion and a proper planning on how such feature should work / rollout plan shall we decide to proceed with it.

oowekyala · 2017-06-13T15:48:57Z

@jsotuyod Thanks for your input!

I don't think the startup penalty would be significant. Besides, we already use reflection through instantiating rules with newInstance ^^

I agree we must think more deeply about how to tackle inconsistencies between the ruleset and the implementation of the rules. Dependencies are part of the implementation of a rule, so it makes sense that Java rules declare theirs inside the class and not the ruleset. However XPath rules have their implementation contained in the rulesets files, so we could make it compulsory for xpath rules to declare their dependencies, something like declaring an xpath rule like the following:

<rule name="xxx"
      description="xxxx" >
      <xpath>
          <dependencies typeResolution="true" /> <!-- the others are implicitly false -->
          <expression>
 <![CDATA[
//ForStatement (...)
 ]]>
          </expression>
      </xpath>
</rule>

This would make it implicit that the class is net.sourceforge.pmd.lang.rule.XPathRule, but make the dependencies explicit (the dependencies tag would be required) and warn programmers to declare them. Moreover it would encapsulate the actual implementation of XPath rules (instances of XPathRule with an xpath property), which might do us some good should we tweak it oneday.

Then we could drop the metrics, dfa and typeResolution attributes from the rule tag itself, instead specifying the dependencies only in the class for Java rules and only in the xpath/dependencies tag for XPath rules.

We could enforce stricter validation assertions in the XML schema, for example making sure that

no class attribute or xpath tag are specified in a rule that has a ref attribute.
no class attribute is specified when there is an xpath tag and conversely

Custom java rules would still use the class attribute and annotations.

this alone doesn't guarantee self-containment yet. Messages, description, and rule setup (class reference / site url) are all still done in the ruleset.xml

Message, description and site URL are metadata, so I don't think they should stick with the implementation of a rule. We could however broaden the contract of the Rule interface to include methods that return description and default message as strings ---but then xpath rules would have to declare them in the xpath tag.

All in all, this approach would make implementation stick with implementation, and metadata with metadata, which I think is great. Plus, that would be a best effort to ensure that rule dependencies can't be misconfigured in the ruleset.
...
Such a change in the schema would be IMO for the better, but obviously not retrocompatible (although old rulesets could be ported somewhat easily to the new version). This would only be appropriate for a major release. It would imply modifying the schema, parse XPath rules separately in the RuleSetFactory, and refactoring all rulesets to conform to the new schema.

(Besides, RuleSetFactory could benefit from a little refreshing ---the study the other day saying it was a god class is probably not so wrong.)

adangel · 2017-06-16T08:53:30Z

Nice idea, indeed!

Just my 2 cents:

Users of rules shouldn't be bothered by the implementation - and they aren't right now: When using a PMD-provided rule, you just reference the rule (and you should not define a new rule referencing the same class). Say, you want to use "CollapsibleIfStatements" from the java basic ruleset, you'd use in your custom ruleset:
```
  <rule ref="rulesets/java/basic.xml#CollapsibleIfStatements" />
```
You can override some parts, like priority, description, message, properties ... . I'm not sure, whether we currently prevent the users from overriding the class attribute, dfa attribute, typeresolution attributes. This should not be possible.
In terms of possible inconsistencies - the annotation would always need to be the "truth". Disabling typeresolution for a rule, which depends on typeresolution will make the rule failing (that would good) or run the rule and yield many false positives (bad).
Could we somehow detect automatically for XPath rules, whether they depend on type resolution? E.g. they could call specific helper XPath functions or access specific attributes of nodes (type). That way, we could move this entirely to the implementation side out of the meta data.

jsotuyod · 2017-06-16T21:47:52Z

Could we somehow detect automatically for XPath rules, whether they depend on type resolution? E.g. they could call specific helper XPath functions or access specific attributes of nodes (type). That way, we could move this entirely to the implementation side out of the meta data.

I've been researching this a bit and there may be a way to do this...

For Saxon we can write a visitor, start from xpathExpression.getInternalExpression() and recursively iterateSubExpressions(), looking for expressions of type FunctionCall (or one of it's subtypes more accurate?), cast it, and we may check the function being referenced before evaluation.

For Jaxen it's a little more complicated. We would need to take the BaseXPath.getExpr(), and recursively ask if it's an instance of BinaryExpression or whatever to cast it and be able to transverse it, and then again check for FunctionCallExpr. Upon finding one, we may get the prefix and function name to know which one we are referring to.

Of course we would want all of this to be done just once upon creation.

Also of note, during this research I found that xpath expressions are recreated (and therefore re-parsed) each time they are evaluated. I'm not certain this is actually needed.

oowekyala · 2017-11-01T17:56:17Z

@jsotuyod Out of curiosity, I wonder why we still support Jaxen. I mean, from what I understand, XPath 2.0 is a superset of XPath 1.0 so maybe there's no point in maintaining support for both separately? This would make the change you describe easier to implement in case we decide to proceed with it

jsotuyod · 2017-11-01T18:12:23Z

@oowekyala I'm not sure why, but our current implementation of XPath 2.0 is not necessarily compatible with XPath 1.0. All XPath rules are currently defaulting to 1.0 (Jaxen), and most of the time, changing them to 2.0 (or sometimes even 1.0 compatibility mode) will result in failure to match nodes. This may be an issue on our implementation (specially due to the shortcuts taken to use the Rule Chain), but at this point I'm not certain.

For future releases we may check on those difference, check performance differences between both implementations, and deprecate whichever one we want to drop (also, we use a very old Saxon version, but updating is non-trivial). For the time being we are stuck with supporting both.

oowekyala · 2017-11-01T18:43:36Z

Interesting, thanks!

oowekyala · 2018-11-02T07:43:43Z

Closing this because of clutter. The discussion may be continued here or on an RFC when the time comes

oowekyala added 3 commits June 10, 2017 17:32

Added annotations to tag rule dependencies

bf724ec

Fix formatter chaos

bdb2c99

Tagged Typeres rules

dac1ef6

jsotuyod added the a:suggestion An idea, with little analysis on feasibility, to be considered label Jun 12, 2017

Merge branch 'master' into annotations

7045cbd

adangel mentioned this pull request Jun 19, 2017

[core] RFC: RuleSet Reorganization / Refactoring #450

Closed

oowekyala mentioned this pull request Jul 17, 2018

[core] Abstract XPath rulechain conversion using our own XPath parser #1243

Closed

oowekyala closed this Nov 2, 2018

oowekyala deleted the annotations branch November 2, 2018 07:43

oowekyala mentioned this pull request Nov 8, 2018

[core] Abstract away optional AST traversals (first step) #1426

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[core] Use annotations to resolve rule dependencies to frameworks#437

[core] Use annotations to resolve rule dependencies to frameworks#437
oowekyala wants to merge 4 commits into
pmd:masterfrom
oowekyala:annotations

oowekyala commented Jun 11, 2017 •

edited

Loading

Uh oh!

jsotuyod commented Jun 12, 2017

Uh oh!

oowekyala commented Jun 13, 2017 •

edited

Loading

Uh oh!

adangel commented Jun 16, 2017

Uh oh!

jsotuyod commented Jun 16, 2017

Uh oh!

oowekyala commented Nov 1, 2017

Uh oh!

jsotuyod commented Nov 1, 2017

Uh oh!

oowekyala commented Nov 1, 2017

Uh oh!

oowekyala commented Nov 2, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

oowekyala commented Jun 11, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes description:

Rationale:

Possible improvements and discussion

Uh oh!

jsotuyod commented Jun 12, 2017

Pros of this approach:

Cons of this approach:

Uh oh!

oowekyala commented Jun 13, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

adangel commented Jun 16, 2017

Uh oh!

jsotuyod commented Jun 16, 2017

Uh oh!

oowekyala commented Nov 1, 2017

Uh oh!

jsotuyod commented Nov 1, 2017

Uh oh!

oowekyala commented Nov 1, 2017

Uh oh!

oowekyala commented Nov 2, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

oowekyala commented Jun 11, 2017 •

edited

Loading

oowekyala commented Jun 13, 2017 •

edited

Loading