[kotlin] Parser hangs on complex files due to unbounded ATN prediction loop #6659

@stokpop

Problem

When parsing complex Kotlin files (e.g. large Jetpack Compose files with deeply nested lambda expressions), the ANTLR parser enters an exponential ATN prediction loop in ParserATNSimulator.closureCheckingStopState. This loop has no interruption points, so the thread spins indefinitely and PMD hangs until the process is killed.

Root cause: the ANTLR 4 ATN closure algorithm performs recursive PredictionContext.merge calls that can grow without bound on adversarial input. The generated parser also uses static shared DFA[] and PredictionContextCache fields, so ATN state from one file accumulates and affects subsequent files.

Solution

1. InterruptibleParserATNSimulator (in pmd-core)

New class overriding closureCheckingStopState to check Thread.interrupted() at every recursion level. When interrupted, throws ParseCancelledException to unwind the ATN stack immediately. Placed in pmd-core so it is reusable by all ANTLR-based languages (Kotlin, Swift, PL/SQL, etc.).
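The idea of a per-level interruption check can be illustrated without ANTLR on the classpath. The following is a minimal sketch, not the actual InterruptibleParserATNSimulator: the class name, the explore method, and the use of java.util.concurrent.CancellationException are stand-ins invented for illustration. In the real class the same check would presumably sit at the top of the closureCheckingStopState override before delegating to super.

```java
import java.util.concurrent.CancellationException;

public class InterruptibleRecursionSketch {

    /** Hypothetical stand-in for the recursive closure computation. */
    static long explore(int depth) {
        // Cooperative cancellation point, checked on every recursive call.
        // Thread.interrupted() also clears the flag, so the exception is
        // the only signal that travels up the stack.
        if (Thread.interrupted()) {
            throw new CancellationException("interrupted at depth " + depth);
        }
        if (depth == 0) {
            return 1;
        }
        // Two recursive branches per level: the work grows as 2^depth.
        return explore(depth - 1) + explore(depth - 1);
    }

    /** Starts explore on a daemon thread, interrupts it after delayMillis, and reports whether it unwound. */
    static boolean cancelledAfter(int depth, long delayMillis) {
        boolean[] cancelled = {false};
        Thread worker = new Thread(() -> {
            try {
                explore(depth);
            } catch (CancellationException e) {
                cancelled[0] = true;
            }
        }, "sketch-worker");
        worker.setDaemon(true);
        worker.start();
        try {
            Thread.sleep(delayMillis);
            worker.interrupt();
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("caller interrupted", e);
        }
        return cancelled[0];
    }

    public static void main(String[] args) {
        // Depth 60 would take far too long to finish; the interrupt ends it early.
        System.out.println("cancelled: " + cancelledAfter(60, 100));
    }
}
```

Because the check runs at every level, the worker notices the interrupt within one recursive call, however deep the stack is at that moment.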

2. Per-file ATN state isolation

PmdKotlinParser creates fresh DFA[] and PredictionContextCache per file, preventing unbounded state accumulation across files.
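The effect of isolating cached state per file can be shown with a toy model. This sketch does not use ANTLR: ToyParser and its token-count cache are hypothetical stand-ins for the generated parser and its DFA[] / PredictionContextCache fields. With the ANTLR runtime, the corresponding step would presumably be to construct a ParserATNSimulator with freshly allocated arrays and cache and install it through setInterpreter.

```java
import java.util.HashMap;
import java.util.Map;

public class PerFileCacheSketch {

    /** Hypothetical stand-in for a parser that memoizes state while parsing. */
    static final class ToyParser {
        // Analogue of the static shared fields: one map for every parser instance.
        static final Map<String, Integer> SHARED_CACHE = new HashMap<>();

        private final Map<String, Integer> cache;

        ToyParser(boolean isolate) {
            // Isolation: a fresh cache per parser, garbage-collected with it.
            this.cache = isolate ? new HashMap<>() : SHARED_CACHE;
        }

        void parse(String source) {
            // Record every whitespace-separated token in the cache.
            for (String token : source.split("\\s+")) {
                cache.merge(token, 1, Integer::sum);
            }
        }

        int cacheSize() {
            return cache.size();
        }
    }

    /** Parses two "files" with separate parser instances and returns the second parser's cache size. */
    static int cacheSizeAfterTwoFiles(boolean isolate) {
        ToyParser.SHARED_CACHE.clear();
        new ToyParser(isolate).parse("a b");
        ToyParser second = new ToyParser(isolate);
        second.parse("c d");
        return second.cacheSize();
    }

    public static void main(String[] args) {
        System.out.println("shared:   " + cacheSizeAfterTwoFiles(false));
        System.out.println("isolated: " + cacheSizeAfterTwoFiles(true));
    }
}
```

In the shared case the second parser inherits entries from the first file; in the isolated case it only ever holds state derived from its own input. The trade-off is that nothing learned on one file speeds up the next.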

3. Parse timeout

A daemon ExecutorService runs each parse with a configurable timeout (default 30s, system property pmd.kotlin.parseTimeoutSeconds). On timeout the parse thread is interrupted, the ATN simulator throws ParseCancelledException, and the failure is reported as a clear ParseException together with a WARN log entry.
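The timeout mechanism can be sketched with the standard java.util.concurrent API alone. This is a sketch under stated assumptions, not the code from the change: the class name ParseTimeoutSketch, the runWithTimeout helper, and the use of IllegalStateException in place of PMD's ParseException are all assumptions. Only the property name, the 30s default, and the kotlin-parser thread name come from the description above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ParseTimeoutSketch {

    // Daemon threads, so a parse that ignores the interrupt cannot keep the JVM alive.
    private static final ExecutorService EXECUTOR = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "kotlin-parser");
        t.setDaemon(true);
        return t;
    });

    /** Reads the timeout from the system property, falling back to 30 seconds. */
    static long timeoutSeconds() {
        return Long.getLong("pmd.kotlin.parseTimeoutSeconds", 30L);
    }

    /** Runs the task on the executor and gives up after the given timeout. */
    static <T> T runWithTimeout(Callable<T> parse, long timeout, TimeUnit unit) {
        Future<T> future = EXECUTOR.submit(parse);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // cancel(true) interrupts the worker; the parse must check the flag to stop.
            future.cancel(true);
            throw new IllegalStateException("Parse timed out after " + timeout + " " + unit, e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Parse failed", e.getCause());
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for parse", e);
        }
    }

    public static void main(String[] args) {
        String result = runWithTimeout(() -> "parsed", timeoutSeconds(), TimeUnit.SECONDS);
        System.out.println(result);
    }
}
```

Future.cancel(true) only delivers an interrupt. It stops the parse only because the simulator checks the interrupt flag, which is why the timeout depends on the interruptible simulator described in step 1.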

Verification

Validated on a real-world Android project (413 Kotlin files). jstack confirmed that the kotlin-parser thread was deep inside the recursive closureCheckingStopState loop and that the override fires at every level. With a 5s timeout, 86 files triggered clean timeout warnings; with the default 30s, all files parsed successfully.
