Skip to content

Issue #13167: Fix NPE in UnusedLocalVariable check#13543

Merged
romani merged 1 commit intocheckstyle:masterfrom
Vyom-Yadav:fixNullPointerExInUnusedLocalVar
May 25, 2024
Merged

Issue #13167: Fix NPE in UnusedLocalVariable check#13543
romani merged 1 commit intocheckstyle:masterfrom
Vyom-Yadav:fixNullPointerExInUnusedLocalVar

Conversation

@Vyom-Yadav
Copy link
Copy Markdown
Member

Resolves #13167

Copy link
Copy Markdown
Contributor

@nrmancuso nrmancuso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Item:

boolean result = false;
DetailAST parentAst = ast.getParent();
while (parentAst != null) {
if (TokenUtil.isOfType(parentAst, CONTAINERS_FOR_ANON_INNERS)
Copy link
Copy Markdown
Contributor

@nrmancuso nrmancuso Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename this collection to something like CLASS_BODY_DECLARATION_TOKENS or even ANONYMOUS_INNER_CLASS_SCOPES in a separate commit first, container is nonsense in this context. Then update the method naming, etc. appropriately.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rdiachenko , please step in to help review.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #14090 to change the variable name.

@nrmancuso nrmancuso self-assigned this Aug 9, 2023
Copy link
Copy Markdown
Member

@romani romani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Items

@romani romani requested a review from rdiachenko August 10, 2023 02:45
Copy link
Copy Markdown
Contributor

@nrmancuso nrmancuso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another item as you push changes for collection:

Comment on lines +475 to +411
* Whether the ast is nested under tokens of type
* {@link UnusedLocalVariableCheck#CONTAINERS_FOR_ANON_INNERS} or {@link TokenTypes#LAMBDA}.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove mentions of specific tokens, such a list is bound to get outdated/ignored, and is an implementation detail that the caller should not have to worry about.

Comment on lines +479 to +416
* @return {@code true} if ast is nested under tokens of type
* {@link UnusedLocalVariableCheck#CONTAINERS_FOR_ANON_INNERS}
* or {@link TokenTypes#LAMBDA}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Comment on lines +402 to +419
if (currentAst.getType() == TokenTypes.LAMBDA) {
topMostLambdaAst = currentAst;
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a time-complexity mutation. The reason I have added this to the suppressions list is because removing this condition will make the entire COMPILATION_UNIT to be processed again. And having many nested lambdas can make the entire COMPILATION_UNIT be processed multiple times.

If the penalty was small, I would have inclined towards pitest, but processing the entire COMPILATION_UNIT multiple times isn't a wise option IMO.

@rnveach
Copy link
Copy Markdown
Contributor

rnveach commented Apr 27, 2024

Regression for this PR: https://rveach.no-ip.org/checkstyle/regression/345/

1 NPE is removed, no other changes.

@rnveach rnveach force-pushed the fixNullPointerExInUnusedLocalVar branch from 64e6248 to a5057f3 Compare April 28, 2024 17:03
@rnveach rnveach assigned romani and unassigned rnveach Apr 28, 2024
@rnveach
Copy link
Copy Markdown
Contributor

rnveach commented Apr 28, 2024

As discussed, analysis is right at #13543 (comment) . This would be a very complex check, which I am unfamilar with, to fix pitest.

If we are good, I can write a reasoning for the suppression and create a new issue to resolve pitest suppressions. Maybe it can be done as part of the GSoC project.

@rnveach rnveach force-pushed the fixNullPointerExInUnusedLocalVar branch 3 times, most recently from 9e050eb to c2a3217 Compare April 28, 2024 18:40
@rnveach
Copy link
Copy Markdown
Contributor

rnveach commented Apr 28, 2024

It seems to me this check has a check inside itself. We have the first set of visits which is collecting information, then we have a custom tree walker with its own visits using the information collected from the first. It is not re-walking the entire tree in both times, but this is what makes this check complex by itself. The mutation is in the first "check", and its usage is being done in the 2nd.

Violations are picked and logged by exact ASTs, and we hide duplicate same violations, so I don't believe this can be killed, from what I am seeing so far.

I can confirm this on my local by printing all violations and ignoring duplicates. Running on InputUnusedLocalVariableGenericAnonInnerClasses, a violation for 34,12 is printed 4 times with the code here. Applying the mutation (removing the affected code), 34,12 is printed 6 times. We are doing the same work 2 more times if we follow pitest in this one example. It is already a concern, IMO, this check is logging the same violation 4 times.

These mutations may be a complex form of a time saver.

@Vyom-Yadav
Copy link
Copy Markdown
Member Author

Yeah, I remember some of these mutations to be like that. One mutation was particularly like processing the whole COMPILATION_UNIT again, so the statement that pitest suggests is actually to prevent that from happening. It's a shame I never got the time to complete this 😞.

It seems to me this check has a check inside itself. We have the first set of visits which is collecting information, then we have a custom tree walker with its own visits using the information collected from the first. It is not re-walking the entire tree in both times, but this is what makes this check complex by itself.

Yeah, it does that. The reason for two traversals is that a class may be defined further down the AST but may be used before, so scoping, etc., becomes difficult in that scenario, particularly with anon inner classes.

@rdiachenko
Copy link
Copy Markdown
Member

@Vyom-Yadav

And having many nested lambdas can make the entire COMPILATION_UNIT be processed multiple times.

Do you have any real examples with many nested lambdas? I'm interested in whether this is a genuine concern or just something we think is problematic when, in practice, it isn't.

If the penalty was small, I would have inclined towards pitest, but processing the entire COMPILATION_UNIT multiple times isn't a wise option IMO.

What's the penalty? Do you have some numbers at hand?

@Vyom-Yadav
Copy link
Copy Markdown
Member Author

Do you have any real examples with many nested lambdas? I'm interested in whether this is a genuine concern or just something we think is problematic when, in practice, it isn't.

I don't know if there is some pattern using nested lambdas. It is not a rare case, that I'm sure.

What's the penalty? Do you have some numbers at hand?

Penalty is directly proportional to the number of nodes in the ast. It will traverse the entire ast again (maybe multiple times too).

@rdiachenko rdiachenko assigned rdiachenko and unassigned romani May 15, 2024
@rdiachenko
Copy link
Copy Markdown
Member

I didn't find a quick and clear way to test Checkstyle performance for the fix with/without optimisation, so I ended up running both versions against the following list of projects via regression diff report:

checkstyle|git|https://github.com/checkstyle/checkstyle.git|master|...|
sevntu-checkstyle|git|https://github.com/sevntu-checkstyle/sevntu.checkstyle|master||
checkstyle-sonar|git|https://github.com/checkstyle/sonar-checkstyle|master||
guava|git|https://github.com/google/guava|v28.2||

spotbugs|git|https://github.com/spotbugs/spotbugs|3.1.2||
pmd|git|https://github.com/pmd/pmd|pmd_releases/6.21.0|**/pmd/pmd-java/src/test/**/*,**/pmd/cpd/files/*
spoon|git|https://github.com/INRIA/spoon.git|spoon-core-10.1.0|**/src/test/resources/**/*
lombok-ast|git|https://github.com/rzwitserloot/lombok.ast|v0.2|**/lombok-ast/test/**/*

spring-framework|git|https://github.com/spring-projects/spring-framework|v4.1.6.RELEASE||
hibernate-orm|git|https://github.com/hibernate/hibernate-orm|4.2.19.Final|**/hibernate-orm/documentation/**/*
elasticsearch|git|https://github.com/elastic/elasticsearch|v1.5.2||
java-design-patterns|git|https://github.com/iluwatar/java-design-patterns|dd855a376bc025aa61f6816584f79eb9854fe5d7||
MaterialDesignLibrary|git|https://github.com/navasmdc/MaterialDesignLibrary|1.3||
Hbase|git|https://github.com/apache/hbase|1.1.0.1||
Orekit|git|https://github.com/CS-SI/Orekit|8.0.1||

# Projects which contain a lot of labmda expressions
infinispan|git|https://github.com/infinispan/infinispan|7.2.5.Final||
protonpack|git|https://github.com/poetix/protonpack|protonpack-1.7||
jOOL|git|https://github.com/jOOQ/jOOL|version-0.9.7||
RxJava|git|https://github.com/ReactiveX/RxJava|v1.0.9||
Vavr|git|https://github.com/vavr-io/vavr|v0.9.0||

For this check's config:

...
    <module name="TreeWalker">
         <module name="UnusedLocalVariable">
         </module>
    </module>
...

I thought if we had a performance problem, the unoptimised fix would give x-times higher processing time. Here are the actual numbers (I ran the regression diff report 3 times for each build and choose the median):

# optimised fix
time groovy diff.groovy -r /var/tmp/checkstyle \
-b master -p fixNullPointerExInUnusedLocalVar -c my_check.xml -l projects-to-test-on.properties

1434.43s user 71.09s system 225% cpu 11:08.89 total

# not optimised fix
time groovy diff.groovy -r /var/tmp/checkstyle \
-b master -p fixNullPointerExInUnusedLocalVar-wo-optim -c my_check.xml -l projects-to-test-on.properties

1417.58s user 70.16s system 224% cpu 11:03.38 total

There is no difference. I suggest we remove optimisation and use a simplified fix that avoids two if (currentAst.getType() == TokenTypes.LAMBDA) mutations.

@romani
Copy link
Copy Markdown
Member

romani commented May 22, 2024

I would love to have no pitest suppression, rather than nano optimization. In long run no pitest will be more beneficial.

@romani
Copy link
Copy Markdown
Member

romani commented May 22, 2024

@nrmancuso , @rnveach , please share your opinion.

@romani romani assigned rnveach and nrmancuso and unassigned rdiachenko May 22, 2024
@rnveach
Copy link
Copy Markdown
Contributor

rnveach commented May 22, 2024

I am ok for this PR as it is now or closing it.

I don't agree with applying the pitest mutations to the code since that would degrade this check more than it is. We should already look into the issue that it is reporting the same token 2x already.

@nrmancuso
Copy link
Copy Markdown
Contributor

@nrmancuso , @rnveach , please share your opinion.

My opinion is that we have already spent too much effort on this pitest suppression, let’s either merge or close this. I think it is personally ridiculous to traverse the entire AST up to the root node when we know that we can (and should) stop when we reach the enclosing anonymous class.

We are humans that can think and reason, not robots that only deal in pitest logic.

@romani romani force-pushed the fixNullPointerExInUnusedLocalVar branch from c2a3217 to 4861a81 Compare May 23, 2024 13:19
@romani
Copy link
Copy Markdown
Member

romani commented May 23, 2024

All suppression(pitest + spotbugs) in this PR smells badly.
#14894
we will unblock users from NPE, and will deal with better code later on.

@romani romani assigned romani and unassigned rnveach May 23, 2024
Copy link
Copy Markdown
Member

@romani romani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok to merge.
We will improve in separate PR

@romani romani merged commit c15242f into checkstyle:master May 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Null pointer Exception in UnusedLocalVariableCheck

5 participants