Skip to content

[kotlin] Fix #6608: Improve kotlin parser error handling#6650

Merged
adangel merged 5 commits into
pmd:mainfrom
stokpop:issue-6608-improve-kotlin-parser-error-handling
May 22, 2026
Merged

[kotlin] Fix #6608: Improve kotlin parser error handling#6650
adangel merged 5 commits into
pmd:mainfrom
stokpop:issue-6608-improve-kotlin-parser-error-handling

Conversation

@stokpop

@stokpop stokpop commented May 7, 2026

Copy link
Copy Markdown
Contributor

improve kotlin parser error handling

Report lexer and parser errors via regular logging instead to std err.
Add reference to file and location.

Makes a difference between lexer and parser errors.

Throws ParserError to be used in reports and testing.

Related issues

Ready?

  • Added unit tests for fixed bug/feature
  • Passing all unit tests
  • Complete build ./mvnw clean verify passes (checked automatically by github actions)
  • Added (in-code) documentation (if needed)

@pmd-actions-helper

pmd-actions-helper Bot commented May 7, 2026

Copy link
Copy Markdown
Contributor

Documentation Preview

Compared to main:
This changeset changes 0 violations,
introduces 0 new violations, 0 new errors and 0 new configuration errors,
removes 0 violations, 0 errors and 0 configuration errors.
There are 0 changed duplications, 0 new duplications and 0 removed duplications.
There are 0 changed CPD errors, 0 new CPD errors and 0 removed CPD errors.

Regression Tester Report

(comment created at 2026-05-22 19:13:48+00:00 for 169fd69)

@adangel adangel changed the title [kotlin] improve kotlin parser error handling [kotlin] Fix #6608: Improve kotlin parser error handling May 8, 2026

@adangel adangel left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

I would really throw a LexException when there are lexer errors. And fail fast.

Maybe we should add a note to

* We provide a [`AntlrBaseParser`](https://github.com/pmd/pmd/blob/main/pmd-core/src/main/java/net/sourceforge/pmd/lang/ast/impl/antlr4/AntlrBaseParser.java)
how to handle lex/parse errors.

The Swift Example (PmdSwiftParser) seems to be incomplete, we should probably also add there an error listener to the lexer.
Note: As you see in the TODO comment, we don't throw an exception yet, as the test sample swift code we have in our test resources cannot be parsed by the grammar...

Note 2: When this is merged, we'll get visibility of errors when parsing Kotlin code, that we didn't have before. There might be an increase on kotlin related issues in PMD...

public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol,
int line, int charPositionInLine,
String msg, RecognitionException e) {
LOG.warn("Lexer error at {}:{}:{}: {}", task.getFileId().getOriginalPath(), line, charPositionInLine, msg);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, we should throw a LexException here as well. If the lexer failed, the tokens are not correct, and so the parse tree won't be correct and so anything on the analyzed file is unknown.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now throws LexException to fail fast

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should either log the error or throw an exception, but do not both - because when throwing an exception, its likely that this exception will be handled in one way or the other - we would end up in logging the same error multiple times.

-> Please remove the LOG.warn call here

@adangel adangel added the needs:user-input Maintainers are waiting for feedback from author label May 8, 2026
@stokpop

stokpop commented May 9, 2026

Copy link
Copy Markdown
Contributor Author

Sounds good to fail fast. I will add the LexException.

Also added implementation notes in the md docs for adding a new ANTLR base language.

Updating the Swift code to behave similar can be in another issue/PR?

Collect all errors idea

The thing missing might be that currently all errors are at least reported, and with failing fast on first lexer issue only the first one is reported. That might give multiple cycles to devs to solve issues (like we saw with the $ replacement case).

We can also collect all Lex and Parser exceptions for one file and report all the issues in one ParseException without processing the file any further.

For example:

/src/main/kotlin/nl/stokpop/BadCode.kt	-	ParseException: Parse exception in file '/src/main/kotlin/nl/stokpop/BadCode.kt' at line 3, column 30: 
Lexer error: /src/main/kotlin/nl/stokpop/BadCode.kt:3:30: token recognition error at: '^'
Parser error: /src/main/kotlin/nl/stokpop/BadCode.kt:3:32: extraneous input 'b'
Lexer error: /src/main/kotlin/nl/stokpop/BadCode.kt:4:30: token recognition error at: '^'
Parser error: /src/main/kotlin/nl/stokpop/BadCode.kt:4:32: extraneous input 'b'
Parser error: /src/main/kotlin/nl/stokpop/BadCode.kt:5:12: mismatched input '{'

for BadCode.kt:

package nl.stokpop

fun xor1(a: Int, b: Int) = (a ^ b)
fun xor2(a: Int, b: Int) = (a ^ b)
fun broken( { }

Good idea to bundle the errors? If so I can provide a second PR for this when this is merged.

- Add custom ANTLR error listeners in PmdKotlinParser
- Lexer errors: log WARN with file location and throw LexException
- Parser errors: log WARN with file location and throw ParseException
- Short WARN message (strip verbose expected-token list), full at DEBUG
- Simplify KotlinParsingHelper: remove duplicate custom parseImpl override
- Add tests: syntaxErrorThrowsParseException, lexerErrorThrowsLexException
@stokpop stokpop force-pushed the issue-6608-improve-kotlin-parser-error-handling branch from 19db0e6 to ff6d2f4 Compare May 11, 2026 12:28
- Expand step 6 with error handling guidance: LexException for lexer
  errors, ParseException for parser errors
- Replace PmdSwiftParser reference with PmdKotlinParser as the
  reference implementation
@stokpop stokpop requested a review from adangel May 11, 2026 12:59
@adangel adangel removed the needs:user-input Maintainers are waiting for feedback from author label May 15, 2026

@adangel adangel left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update.

However, we should not both log and throw an exception. The exception will end up in the report as processing error.

Good idea to bundle the errors?

I like the idea. Be aware, that some errors reported that way might be consequential errors.

implementation that you need to extend to create your own adapter. See
[`PmdKotlinParser`](https://github.com/stokpop/pmd/blob/main/pmd-kotlin/src/main/java/net/sourceforge/pmd/lang/kotlin/ast/PmdKotlinParser.java)
as the reference implementation.
* **Error handling**: ANTLR by default swallows errors silently. You must register custom

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* **Error handling**: ANTLR by default swallows errors silently. You must register custom
* **Error handling**: ANTLR by default prints errors to stderr only and otherwise ignores them. You must register custom

as the reference implementation.
* **Error handling**: ANTLR by default swallows errors silently. You must register custom
`BaseErrorListener`s on both the lexer and the parser to surface errors properly:
* **Lexer errors** (unrecognized tokens): log a `WARN` and throw a

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should either LOG or throw an exception, but not both...

public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol,
int line, int charPositionInLine,
String msg, RecognitionException e) {
LOG.warn("Lexer error at {}:{}:{}: {}", task.getFileId().getOriginalPath(), line, charPositionInLine, msg);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should either log the error or throw an exception, but do not both - because when throwing an exception, its likely that this exception will be handled in one way or the other - we would end up in logging the same error multiple times.

-> Please remove the LOG.warn call here

Comment on lines +73 to +74
LOG.warn("Parser error at {}:{}:{}: {}", task.getFileId().getOriginalPath(), line, charPositionInLine, shortMessage(msg));
LOG.debug("Parser error at {}:{}:{}: {}", task.getFileId().getOriginalPath(), line, charPositionInLine, msg);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, we should not both log and throw. Please remove the loggings.
Also: I would provide the full message to the parse exception and not remove any information fro the message.

@adangel adangel added the needs:user-input Maintainers are waiting for feedback from author label May 15, 2026
@adangel adangel removed the needs:user-input Maintainers are waiting for feedback from author label May 22, 2026
@adangel adangel added this to the 7.25.0 milestone May 22, 2026
@adangel

adangel commented May 22, 2026

Copy link
Copy Markdown
Member

@stokpop I've updated the PR to collect the errors and throw only one exception per file (and remove the extra logging).
If this proves useful, then we can think about moving the (private inner) class "AntlrErrorListener" to pmd-core.

@adangel adangel left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR.

I've updated the PR to collect the errors and throw only one exception per file (and remove the extra logging).
If this proves useful, then we can think about moving the (private inner) class "AntlrErrorListener" to pmd-core.

@adangel adangel merged commit 53c2b95 into pmd:main May 22, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[kotlin] Lexer or parse errors are reported to stderr only without file context

2 participants