Issue #10848: Support patterns in switch#11100
Conversation
849d36f to
e9d5578
Compare
| | | | | |--LAND -> && [47:27] | ||
| | | | | | |--PATTERN_VARIABLE_DEF -> PATTERN_VARIABLE_DEF [47:17] | ||
| | | | | | | |--MODIFIERS -> MODIFIERS [47:17] | ||
| | | | | | | |--TYPE -> TYPE [47:17] | ||
| | | | | | | | `--IDENT -> Integer [47:17] | ||
| | | | | | | `--IDENT -> i [47:25] | ||
| | | | | | `--GT -> > [47:32] | ||
| | | | | | |--IDENT -> i [47:30] | ||
| | | | | | `--NUM_INT -> 40 [47:34] |
There was a problem hiding this comment.
Corresponding code:
case Integer i && i > 40:
System.out.println("Integer");Should we add a new token, PATTERN_DEF ?
Example:
|-PATTERN_DEF -> [47:27]
| |--LAND -> && [47:27]
| | |--PATTERN_VARIABLE_DEF -> PATTERN_VARIABLE_DEF [47:17]
| | | |--MODIFIERS -> MODIFIERS [47:17]
| | | |--TYPE -> TYPE [47:17]
| | | | `--IDENT -> Integer [47:17]
| | | `--IDENT -> i [47:25]
| | `--GT -> > [47:32]
| | |--IDENT -> i [47:30]
| | `--NUM_INT -> 40 [47:34]
This would apply only to applications of the pattern production rule, and would have no effect on existing o instanceof String s definitions (causing no changes to existing AST structure). As pattern definitions become more prevalent (and complex) in Java grammar, it may prove helpful to have a dedicated PATTERN_DEF node.
Pattern:
PrimaryPattern
GuardedPattern
GuardedPattern:
PrimaryPattern && ConditionalAndExpression
PrimaryPattern:
TypePattern
( Pattern )
There was a problem hiding this comment.
Am I got it right,
case Integer i && i < 40:
System.out.println("Small integer");
break;will be
|-PATTERN_DEF -> [47:27]
| |--LAND -> && [47:27]
| | |--PATTERN_VARIABLE_DEF -> PATTERN_VARIABLE_DEF [47:17]
| | | |--MODIFIERS -> MODIFIERS [47:17]
| | | |--TYPE -> TYPE [47:17]
| | | | `--IDENT -> Integer [47:17]
| | | `--IDENT -> i [47:25]
| | `--LT -> > [47:32]
| | |--IDENT -> i [47:30]
| | `--NUM_INT -> 40 [47:34]
and
case Integer i:
System.out.println("Any integer");
break;will be also a PATTERN_DEF:
|-PATTERN_DEF -> [47:27]
| `--PATTERN_VARIABLE_DEF -> PATTERN_VARIABLE_DEF [47:17]
| |--MODIFIERS -> MODIFIERS [47:17]
| |--TYPE -> TYPE [47:17]
| | `--IDENT -> Integer [47:17]
| `--IDENT -> i [47:25]
?
There was a problem hiding this comment.
will be also a PATTERN_DEF:
?
We can do this.
Or, we can have a simple PATTERN_VARIABLE_DEF with no PATTERN_DEF parent node for switch, just like we do for instanceof. I am not sure what will prove useful from a static analysis viewpoint, because I can't really think of how any current checks would benefit from more easily accessing patterns in the AST (by adding a new token). I can work up some more examples if we want to consider adding a new token.
@pbludov can you think of any existing checks (or new check ideas) that would benefit from having a PATTERN_DEF node?
There was a problem hiding this comment.
I think having PATTERN_DEF over LAND makes sense. This will allow some checks to distinguish between && for patterns and conditions.
Adding PATTERN_DEF as the parent of PATTERN_VARIABLE_DEF will be a breaking change for some third-party checks.
I like simple PATTERN_VARIABLE_DEF for switch.
There was a problem hiding this comment.
One more thought: the if has no PATTERN_DEF:
if (o instanceof Integer i && i < 40) { // LAND, LITERAL_INSTANCEOF, PATTERN_VARIABLE_DEF, LT
...
}
switch (o) {
case Integer i && i < 40: // LAND, PATTERN_VARIABLE_DEF, LT ?
break;
}Should there be an extra node for case?
There was a problem hiding this comment.
why we think that it is not CASE specific syntax for pattern
It is not. This syntax can be used anywhere enhanced instance of with pattern matching can be used now, as long as expression is parenthesized.
Click for example
public class Test {
void method1(Object o) { // should have violation
if (o instanceof String s) {
// ^ pattern variable introduced (simple type pattern)
}
if (o instanceof String s && s.length() > 6) {
// ^ pattern variable introduced (simple type pattern),
// then evaluated in boolean expression
// 's.length() > 6' is usual boolean expression, not part of pattern
}
if (o instanceof (String s && s.length() > 4)) {
// ^ pattern variable introduced in guarded pattern; this refines the pattern itself
// 's.length() > 4` is part of the pattern
}
int x = o instanceof (String s && s.length() > 6) ? 4 : 5;
int y = o instanceof String s && s.length() > 4 ? 2 : 3;
int z = o instanceof String s ? 1 : 2;
while (o instanceof (String s && s.length() > 6)) {
}
while (o instanceof String s && s.length() > 4) {
}
while (o instanceof String s) {
}
for (int i = 0; o instanceof Integer myInt && myInt > 5;) {
}
for (int i = 0; o instanceof (Integer myInt && myInt > 5);) {
}
for (int i = 0; o instanceof Integer myInt; ) {
}
}
}
to keep it specific...
To me, specificity is more important for the difference between a simple pattern variable introduction, like:
if (o instanceof String s) {
// ^ pattern variable introduced (simple type pattern)
}
And
if (o instanceof (String s && s.length() > 4)) {
// ^ pattern variable introduced in guarded pattern; this refines the pattern itself
// 's.length() > 4` is part of the pattern
}
In simple pattern variable introduction, whatever surrounding context we need can already be gathered by a check (and probably already is). This does not require (or benefit from) a new parent PATTERN_DEF (and would cause breaking changes to AST.
In guarded and parenthesized patterns, we should be able to know exactly all that this pattern encompasses easily. Why? Because pattern definitions are going to get much more complicated soon due to pattern matching for records and arrays, in JEP405:
static void printXCoordOfUpperLeftPointWithPatterns(Rectangle r) {
if (r instanceof Rectangle(ColoredPoint(Point(var x, var y), var c), var lr)) {
System.out.println("Upper-left corner: " + x);
}
}
static void printSumOfFirstTwoXCoords(Object o) {
if (o instanceof Point[] { Point(var x1, var y1), Point(var x2, var y2), ... }) {
System.out.println(x1 + x2);
}
}
In this case, we can easily have entire AST for Rectangle(ColoredPoint(Point(var x, var y), var c), var lr) in one go, and have a parent to look for when analyzing scopes of pattern variables, etc.
There was a problem hiding this comment.
One more thought: the if has no PATTERN_DEF
Without parenthesized pattern, it would not. It's not really necessary, since && i < 40 is not part of the pattern. If a parenthesized pattern was used (o instanceof (Integer i && i < 40)), then it would now be a PATTERN_DEF by my proposal, because && i < 40 is now part of the pattern.
There was a problem hiding this comment.
How about this: I will make a few small/simple input files and printed AST files, with examples from above for evaluation and add to this PR using proposed implementation. This will make it easier to visualize exact AST for certain syntax. Thumbs up on this for agreement.
TLDR of proposed implementation:
Integer i -> no PATTERN_DEF parent, in any case.
All other patterns have PATTERN_DEF parent.
No breaking changes to AST.
There was a problem hiding this comment.
OK, implementation and tests updated. I am marking this ready for review. If we are good with AST structure, I will begin generating reports.
e9d5578 to
87fa76b
Compare
| System.out.println("The rest (including null)"); | ||
| } | ||
|
|
||
| // example of parenthesized pattern in instanceof |
There was a problem hiding this comment.
From https://openjdk.java.net/jeps/406:
We also change the grammar for instanceof expressions to:
InstanceofExpression:
RelationalExpression instanceof ReferenceType
RelationalExpression instanceof PrimaryPattern
This means that we can have a parenthesized pattern here now:
PrimaryPattern:
TypePattern
( Pattern )
There was a problem hiding this comment.
this is not breacking changes. parenthesized pattern not supported in jdk less jdk17. So this updates comes with "switch" of jdk17. I tested it.
This comment has been minimized.
This comment has been minimized.
2e59dec to
9b493c5
Compare
| value="[\\/]src[\\/](it|test)[\\/]resources(-noncompilable?)[\\/]"/> | ||
| <property name="fileNamePattern" | ||
| value="^(package-info.java)|(Input\w+\.java)|(.*\.properties)$"/> | ||
| value="^(package-info.java)|(Input\w+\.java)|(Input\w+\.txt)|(.*\.properties)$"/> |
There was a problem hiding this comment.
Existing pattern did not allow for expected AST output files in non-compilable. I think it is better to keep input files and expected output together.
8c034a9 to
4ead50d
Compare
8cf1d52 to
8831197
Compare
005fde1 to
7b017a2
Compare
pbludov
left a comment
There was a problem hiding this comment.
Please do regression tests. PMD already has some Java17 test sources:
https://github.com/pmd/pmd/blob/master/pmd-java/src/test/resources/net/sourceforge/pmd/lang/java/ast/jdkversiontests/java17p/EnhancedTypeCheckingSwitch.java
5f0cb64 to
cca37aa
Compare
|
Github, generate report |
|
@romani all related PR's/ issues opened. |
79c2c03 to
1321ae2
Compare
1321ae2 to
66dcaf8
Compare
Am I read this correctly that it can be: in Inputs I see code like: please add (if compilable): |
By JLS and our grammar, yes. By compiler, no.
Done. |
66dcaf8 to
40e73d0
Compare
romani
left a comment
There was a problem hiding this comment.
Ok to merge as item above to extend doc example is done
40e73d0 to
de09b32
Compare
|
Github, generate web site |
|
Github, generate site |
de09b32 to
7f5c147
Compare
Closes #10848.
Relevant grammar from https://docs.oracle.com/javase/specs/jls/se17/preview/specs/patterns-switch-jls.html:
Relevant grammar for patterns:
Also, minor update to
instanceof:From https://openjdk.java.net/jeps/406:
Just a test here, I noticed that PMD release in projects files is not up to date.
Diff Regression projects: https://gist.githubusercontent.com/nmancus1/ea9cd53c4dc4be8be3e776d059542f00/raw/6eb067547d67808e3bbab428a4e2dad7a031676d/pmd.properties
Diff Regression config: https://gist.githubusercontent.com/nmancus1/3db3e23a5063857160d7d92cbdfa1bfe/raw/a200e0fb1c4700ecd084f9e0b03bf7dbb9694832/config.xml
ANTLR regression report
https://nmancus1.github.io/issue-10848/issue-10848-antlr-diff-pmd-checkstyle/index.html
All diffs are exceptions vs files we can parse now.
OpenJDK16 Check Regression Reports
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_30_openjdk16/diff_checks-only-javadoc-error/index.html
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_30_openjdk16/diff_part1/index.html
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_30_openjdk16/diff_part2/index.html
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_30_openjdk16/diff_part3/index.html
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_30_openjdk16/diff_part4/index.html
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_30_openjdk16/diff_part5/index.html
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_30_openjdk16/diff_part6/index.html
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_30_openjdk16/diff_sevntu-check-regression_part_1/index.html
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_30_openjdk16/diff_sevntu-check-regression_part_2/index.html
All reports for openjdk16 are clean.
Check regression reports
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_31/diff_checks-only-javadoc-error/index.html
Diff in line number for exception messages, and diff in ability to parse new files in this PR branch (my-checkstyle).
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_31/diff_part1/index.html
New violations in files that we could not parse before; all violations are legit.
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_31/diff_part2/index.html
New violations in files that we could not parse before; all violations are legit.
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_31/diff_part3/index.html
Need to update
MissingSwitchDefault.Examples: https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_31/diff_part3/my-checkstyle/xref/home/nick/development/contribution/checkstyle-tester/repositories/my-checkstyle/src/test/resources-noncompilable/com/puppycrawl/tools/checkstyle/grammar/antlr4/InputAntlr4AstRegressionPatternMatchingInSwitch.java.html#L118
Issue opened at #11220
All other new violations in files that we could not parse before.
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_31/diff_part4/index.html
New violations in files that we could not parse before; all violations are legit.
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_31/diff_part5/index.html
New violations in files that we could not parse before; all violations are legit.
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_31/diff_part6/index.html
New violations in files that we could not parse before; all violations are legit.
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_31/diff_sevntu-check-regression_part_1/index.html
New violations in files that we could not parse before; all violations are legit.
https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_31/diff_sevntu-check-regression_part_2/index.html
Regression in
LogicConditionNeedOptimization: https://nmancus1.github.io/issue-10848/issue-10848_check_diff_reports_2021_12_31/diff_sevntu-check-regression_part_2/my-checkstyle/xref/home/nick/development/contribution/checkstyle-tester/repositories/my-checkstyle/src/test/resources-noncompilable/com/puppycrawl/tools/checkstyle/grammar/antlr4/InputAntlr4AstRegressionPatternsInFor.java.html#L6PR at sevntu-checkstyle/sevntu.checkstyle#881