[core] Token scheme generalization by gibarsin · Pull Request #679 · pmd/pmd

gibarsin · 2017-10-21T23:12:00Z

Summary

The goal is a language-independent token interface, so that tokens can be accessed in a generic way when making auto-fixes or comment-based CPD suppressions.

Any Token should have access to the next token according to the input stream which created it, as well as to the comment-type tokens used by each language. Furthermore, these tokens should have a text representation and the limits of the region they occupy in the file. These limits are represented by the tuple (beginLine, beginColumn, endLine, endColumn).

These ideas are reflected by changing GenericToken from a class to an interface that declares these methods, and by modifying the JavaCC Token classes to implement this interface.
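The shape described in the summary can be sketched as follows. This is a minimal illustration and not the code from the PR: the method names (getNext, getPreviousComment, getImage and the four limit getters) are assumptions based on names floated in this conversation, and JavaccStyleToken and TokenWalker are hypothetical classes that only mimic the public fields of a JavaCC-generated Token.

```java
// Hypothetical sketch: method names are assumptions drawn from this discussion.
interface GenericToken {
    GenericToken getNext();            // next token from the same input stream, or null at the end
    GenericToken getPreviousComment(); // comment token preceding this one, or null
    String getImage();                 // the token's text
    int getBeginLine();
    int getBeginColumn();
    int getEndLine();
    int getEndColumn();
}

// Shaped like a JavaCC-generated Token (public fields), adapted to the interface.
class JavaccStyleToken implements GenericToken {
    public int beginLine, beginColumn, endLine, endColumn;
    public String image;
    public JavaccStyleToken next;
    public JavaccStyleToken specialToken;

    JavaccStyleToken(String image, int beginLine, int beginColumn, int endLine, int endColumn) {
        this.image = image;
        this.beginLine = beginLine;
        this.beginColumn = beginColumn;
        this.endLine = endLine;
        this.endColumn = endColumn;
    }

    @Override public GenericToken getNext() { return next; }
    @Override public GenericToken getPreviousComment() { return specialToken; }
    @Override public String getImage() { return image; }
    @Override public int getBeginLine() { return beginLine; }
    @Override public int getBeginColumn() { return beginColumn; }
    @Override public int getEndLine() { return endLine; }
    @Override public int getEndColumn() { return endColumn; }
}

final class TokenWalker {
    // Lists every token reachable through getNext() as "text@line:column".
    static String describe(GenericToken first) {
        StringBuilder sb = new StringBuilder();
        for (GenericToken t = first; t != null; t = t.getNext()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(t.getImage()).append('@')
              .append(t.getBeginLine()).append(':').append(t.getBeginColumn());
        }
        return sb.toString();
    }
}
```

Code such as TokenWalker only depends on the interface, so it would work unchanged for any language whose Token class implements it.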

Changes

Changed GenericToken from a class to an interface type and declared the methods to be implemented in every language-specific Token class
Updated the Token classes, after they are generated by JavaCC, using Ant tasks so that they implement GenericToken. The addressed languages are Java, JSP, VF, C/C++, Javascript, Matlab, Objective-C, PL/SQL, Python and VM.

jsotuyod · 2017-10-22T00:17:36Z

Thanks for the PR. This looks very interesting, and is something I've been looking forward to for quite some time, in order to bring comment-based CPD suppression to other languages (see #250)

Current implementations of GenericToken (Java, JSP, VisualForce) only return a new instance of the RegionByLine when needed, which addresses the following issues:

I can't help but wonder if this is actually a good idea. Both JavaCC and Antlr expose the token's location on the text directly. What are we seeing they are not? These guys have been working with tokens for ages, and I don't presume to know better than they do on this topic.

Something to take into account: this is a breaking API change, meaning it has to make it into 6.0.0, or wait until 7.0.0. Moreover, whatever form this change takes (either a new object or not), it will have to remain supported for the whole major version it gets into. I'm not so comfortable with the "let's do this and measure memory usage once the feature is ready" approach unless we are certain of an issue we can avoid this way.

jsotuyod · 2017-10-22T00:23:57Z

the concept of special token is a little too JavaCC-related... Antlr just sends the tokens to a different channel.

Maybe we should be a little more semantic here on what we expect from grammars for PMD. Maybe something such as getPrevComment() or something of the sort?

Of course the docs would need to be updated

Ok so in the cases of auto-fixes or CPD we want to preserve comments so what you say fits in perfectly.

However, if we go that way and a case comes up where a non-comment, "non-regular" token is needed, the solution would be either to add a specific method for that case or to deprecate this method and make a more general one.

I currently cannot foresee whether PMD will ever need to address these cases, so I'm just speculating.

A special token (JavaCC) is a token that:

can appear at any point in the grammar

is not part of the AST itself

I can think of nothing other than comments that may fit these criteria from a programming language perspective. Remember, what we define here is just what we expect from PMD grammars. I feel comfortable with such semantics.

Perfect, I have just updated the interface's method. Next I will update the PR's comment.
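The comment-only semantics agreed on above can be illustrated with a small sketch of comment-based suppression. Everything here is hypothetical: Tok is a stand-in token type, the previousComment field plays the role of JavaCC's specialToken chain, and the CPD-OFF / CPD-ON markers are assumed as the suppression comments. It is not the code CPD actually runs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical minimal token type for illustration; not PMD's actual class.
final class Tok {
    final String image;
    Tok next;            // next regular token, or null
    Tok previousComment; // closest comment before this token, or null

    Tok(String image) {
        this.image = image;
    }
}

final class CpdSuppressionSketch {
    // Returns the text of the regular tokens that are not between
    // a CPD-OFF comment and the following CPD-ON comment.
    static List<String> visibleTokens(Tok first) {
        List<String> visible = new ArrayList<>();
        boolean suppressed = false;
        for (Tok t = first; t != null; t = t.next) {
            // The chain links backwards, so collect it first and then
            // visit the comments in source order.
            Deque<Tok> comments = new ArrayDeque<>();
            for (Tok c = t.previousComment; c != null; c = c.previousComment) {
                comments.push(c);
            }
            for (Tok c : comments) {
                if (c.image.contains("CPD-OFF")) {
                    suppressed = true;
                } else if (c.image.contains("CPD-ON")) {
                    suppressed = false;
                }
            }
            if (!suppressed) {
                visible.add(t.image);
            }
        }
        return visible;
    }
}
```

Because the loop only needs "next token" and "preceding comment", the same logic would apply to any grammar that can provide those two links, whether it models comments as JavaCC special tokens or as a separate Antlr channel.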

jsotuyod · 2017-10-22T00:24:15Z

I'd keep it simple, getNext()

jsotuyod · 2017-10-22T00:24:57Z

Gets the token's text

jsotuyod · 2017-10-22T00:27:26Z

this seems a little too generic... remember this will replace all occurrences of specialToken; with your contents

jsotuyod · 2017-10-22T00:27:45Z

jsotuyod · 2017-10-22T00:27:58Z

gibarsin · 2017-10-22T00:29:30Z

Let me address this by doing more research in ANTLR's code into why they made the design decision of exposing those values directly.

jsotuyod · 2017-10-22T08:19:44Z

I see the change has been applied to Java, JSP and VF, but not to C/C++, Javascript, Matlab, Objective-C, PL/SQL, Python or VM. Any reason for this?

gibarsin · 2017-10-22T13:39:37Z

The reason at the time was that those languages had no dependency on the GenericToken class. However, the whole point of this generalization is to make it language-independent, so it doesn't make sense for those languages not to implement the GenericToken interface. I'll get to work on applying these changes to those languages. Thanks for noticing.

gibarsin · 2017-10-23T00:44:15Z

For the record, the decision to have a separate class for storing the tuple (beginLine, beginColumn, endLine, endColumn) of a Token has been dropped, because there is no evidence either way on whether creating these instances affects memory/CPU performance when running the analysis. The solution is to have a separate method for each of the fields in this tuple, exposed directly in the GenericToken interface.
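One consequence of exposing the four limits directly is that a caller who still wants them bundled can build a region object on demand, without the token API committing to one. The sketch below uses hypothetical names (TokenLimits, RegionSketch); RegionSketch merely stands in for the dropped RegionByLine idea and is not the class from the earlier commits.

```java
// Hypothetical names throughout; this only illustrates the trade-off discussed above.
interface TokenLimits {
    int getBeginLine();
    int getBeginColumn();
    int getEndLine();
    int getEndColumn();
}

// Immutable value object built by the caller only when a bundled region is needed.
final class RegionSketch {
    final int beginLine;
    final int beginColumn;
    final int endLine;
    final int endColumn;

    RegionSketch(int beginLine, int beginColumn, int endLine, int endColumn) {
        this.beginLine = beginLine;
        this.beginColumn = beginColumn;
        this.endLine = endLine;
        this.endColumn = endColumn;
    }

    static RegionSketch of(TokenLimits token) {
        return new RegionSketch(token.getBeginLine(), token.getBeginColumn(),
                                token.getEndLine(), token.getEndColumn());
    }

    // True if the position lies within the region, both ends inclusive.
    boolean contains(int line, int column) {
        boolean afterBegin = line > beginLine || (line == beginLine && column >= beginColumn);
        boolean beforeEnd = line < endLine || (line == endLine && column <= endColumn);
        return afterBegin && beforeEnd;
    }
}
```

With this arrangement no region instance is allocated unless some caller asks for one, which sidesteps the unmeasured performance question while keeping the interface small.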

jsotuyod · 2017-10-23T04:36:58Z

@adangel I'm having trouble figuring out why Travis fails with an IncompatibleClassChangeError analyzing pmd-visualforce here https://travis-ci.org/pmd/pmd/jobs/291306568#L8386

The change to GenericToken is the same for all modules, yet only pmd-visualforce is failing. Also, Maven works if we do mvn clean pmd:pmd, but fails if the code is compiled before pmd runs. Is this maybe an issue with how the Maven plugin populates the auxclasspath?

jsotuyod · 2017-10-23T15:56:26Z

@adangel upon further review, I believe this is down to the parent-first classloading strategy (which may cause trouble when analyzing PMD itself AND when analyzing a third-party app that uses a different version of a dependency we use). However, I'm still not certain why this doesn't happen on some modules such as pmd-java...
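For readers unfamiliar with the term: under parent-first loading, a class found by the parent loader (here, the PMD runtime's own GenericToken) shadows a same-named class on the analyzed project's classpath, which is how an old class definition can collide with a new interface. The sketch below shows the opposite, child-first order using only standard java.net and java.lang APIs. ChildFirstClassLoader is a hypothetical name, and this is not presented as the change made in #680.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical child-first loader: looks in its own URLs before asking the parent.
final class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            // Core JDK classes must always come from the parent chain.
            if (c == null && !name.startsWith("java.")) {
                try {
                    c = findClass(name); // our own URLs first
                } catch (ClassNotFoundException ignored) {
                    // not on our URLs: fall back to the parent below
                }
            }
            if (c == null) {
                c = super.loadClass(name, false); // regular parent-first lookup
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

A loader like this would let the analyzed code's version of a class win over the version visible to the tool itself. Whether that is desirable depends on which side is supposed to own the class in question.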

jsotuyod · 2017-10-25T04:47:55Z

Depends on #680

jsotuyod · 2017-10-28T21:09:40Z

@adangel in my opinion this is ready to be merged. However, this means Travis will now fail on master, and all new PRs until we can bump the PMD version we use to analyze PMD from 5.8.1 to a 6.0.0 version.

Unfortunately, doing so is non-trivial, and we can't even use a snapshot in the meantime. The maven plugin 3.8 creates instances of RuleSetFactory directly, and those constructors were changed as part of #680 which is the PR that changed classloading strategy for auxclasspath.

I see 3 alternatives:

we hold this PR until we are ready to publish PMD 6.0.0 (with ruleset reorganization being the biggest pending issue)
we merge as is and deal with broken Travis until we release 6.0.0
any chance to update the PMD plugin to work against the latest PMD snapshot, and publish a snapshot of that plugin we may use until we finally release 6.0.0?

adangel · 2017-10-29T10:07:43Z

@jsotuyod
I vote against merging right now and leaving master broken.

One alternative would be to temporarily skip the pmd plugin (-Dpmd.skip=true). Given that we don't enforce PMD on PMD anyway (see #361), this sounds doable.

any chance to update the PMD plugin to work against the latest PMD snapshot, and publish a snapshot of that plugin we may use until we finally release 6.0.0?

Possibly: The current snapshots are available here: https://repository.apache.org/content/groups/snapshots/org/apache/maven/plugins/maven-pmd-plugin/3.9.0-SNAPSHOT/
However, I'm not sure, whether this will work: I just tried to set the property pmdVersion in our pom.xml to 6.0.0-SNAPSHOT (which should be available at least in my local repo, since I ran mvn clean install before), but maven sees now a cyclic dependency... So, I didn't even run into the changed API problem yet...

Btw. I still don't know, why the problem only occurs with the visualforce module and not earlier (e.g. java). It also appears with plsql.

jsotuyod · 2017-10-29T15:57:07Z

@adangel sounds reasonable, I'll merge disabling PMD check on Travis then.

However, I'm not sure, whether this will work: I just tried to set the property pmdVersion in our pom.xml to 6.0.0-SNAPSHOT (which should be available at least in my local repo, since I ran mvn clean install before), but maven sees now a cyclic dependency... So, I didn't even run into the changed API problem yet...

Yes, you would have to fix it at a particular snapshot build for it to work.

Possibly: The current snapshots are available here: https://repository.apache.org/content/groups/snapshots/org/apache/maven/plugins/maven-pmd-plugin/3.9.0-SNAPSHOT/

is this already compatible with the latest 6.0.0?

jsotuyod · 2017-10-30T19:56:54Z

Merged as part of 6.0.0, PMD checks temporarily disabled from Travis.

jsotuyod self-assigned this Oct 22, 2017

jsotuyod reviewed Oct 22, 2017


gibarsin force-pushed the tokenGeneralization branch 2 times, most recently from 60b8875 to e6487f6 Compare October 22, 2017 15:31

jsotuyod added the is:WIP For PRs that are not fully ready, or issues that are actively being tackled label Oct 23, 2017

jsotuyod removed the is:WIP For PRs that are not fully ready, or issues that are actively being tackled label Oct 23, 2017

adangel mentioned this pull request Oct 23, 2017

[core] Isolate classloaders for runtime and auxclasspath #680

Merged

jsotuyod added this to the 6.0.0 milestone Oct 25, 2017

gibarsin added 14 commits October 28, 2017 14:42

Convert GenericToken from class to interface && Add RegionByLine

2d01332

Change ant tasks over (Java) Token class

490b34f

Change ant tasks over (JSP) Token class

020abaf

[Not Working] Change ant tasks over (VisualForce) Token class

00ce16c

Add RegionByLineImpl javadoc

00ed0a1

Update getter in GenericToken

95835b8

Simplify method name in GenericToken && improve ant task replacetoken

26461e5

Change ant tasks over (Ecmascript5) Token class

647a17f

Change ant tasks over (CPP) Token class

f29e3f0

Change ant tasks over (Matlab) Token class

026576e

Change ant tasks over (Objective-C) Token class

da902d6

Change ant tasks over (PL/SQL) Token class

1c0d762

Change ant tasks over (Python) Token class

c606d40

Change ant tasks over (VM) Token class

0e01176

gibarsin added 2 commits October 28, 2017 14:42

Update GenericToken interface to expose directly region methods

e8fdbdb

Update GenericToken specialToken method to obtain only comment tokens

91b8a22

gibarsin force-pushed the tokenGeneralization branch from 99f8e22 to 91b8a22 Compare October 28, 2017 17:44

jsotuyod added an:enhancement An improvement on existing features / rules in:pmd-internals Affects PMD's internals labels Oct 28, 2017

jsotuyod mentioned this pull request Oct 28, 2017

[core] Extend comment-based suppression to all JavaCC languages #695

Closed

jsotuyod merged commit 91b8a22 into pmd:master Oct 30, 2017

jsotuyod added a commit that referenced this pull request Oct 30, 2017

Update changelog, refs #679

eb526f5

MatiasComercio mentioned this pull request Oct 30, 2017

[core] Autofixes feature development #693

Open

14 tasks

gibarsin deleted the tokenGeneralization branch October 30, 2017 20:58

jsotuyod mentioned this pull request Jan 12, 2018

[core] build problem: org.apache.maven.plugins:maven-pmd-plugin:3.8:pmd: java.lang.IncompatibleClassChangeError #843

Closed

adangel added the in:autofixes Affects the autofixes framework label Feb 7, 2018
