Skip to content

fix: LexicalPreservingPrinter can't print PrimitiveType#4817

Open
seokjun7410 wants to merge 1 commit into
javaparser:masterfrom
seokjun7410:fix/4781
Open

fix: LexicalPreservingPrinter can't print PrimitiveType#4817
seokjun7410 wants to merge 1 commit into
javaparser:masterfrom
seokjun7410:fix/4781

Conversation

@seokjun7410

@seokjun7410 seokjun7410 commented Aug 18, 2025

Copy link
Copy Markdown
Contributor

fixes. #4781

Issue Analysis

When adding primitive type fields via LexicalPreservingPrinter, keywords like int, boolean get omitted:

// Expected: class Foo { int foo; }
// Actual:   class Foo { foo; }    // 'int' missing

Root Cause

Dynamically created PrimitiveType nodes lack proper NodeText initialization. The existing prettyPrintingTextNode() relied on node.toString() which returns empty strings for uninitialized nodes.

My Solution: CSM-Based Architecture Approach

Removed Special Case Handling: Eliminated the hardcoded PrimitiveType switch statement in prettyPrintingTextNode() (38 lines removed).

Enhanced getOrCreateNodeText(): Added proper handling for dynamically created PrimitiveType nodes:

if (!node.hasRange() && node instanceof PrimitiveType) {
    interpret(node, ConcreteSyntaxModel.forClass(node.getClass()), nodeText);
} else {
    prettyPrintingTextNode(node, nodeText);
}

Made Primitive enum implement Stringable: For consistency with JavaParser's design patterns.

nstead of special-casing PrimitiveType, I leveraged JavaParser's existing CSM (ConcreteSyntaxModel) infrastructure which already knows how to render primitive types correctly.

Question

Does this CSM-based approach align well with JavaParser's lexical preservation architecture? I chose this over the simpler hardcoded string approach because it maintains consistency with how other AST nodes are handled.

@jlerbsc

jlerbsc commented Aug 19, 2025

Copy link
Copy Markdown
Collaborator

Your PR contains two separate changes. One is an interesting improvement to the existing code, and the other tries to fix a bug.

It would be more practical to separate the two types of changes, as I don't think the bug fix is very relevant.

My observation is that when propagating the change in the list of members (here the fields) of the class declaration, we try to modify the parent node of the list of fields (more precisely in the Node.setParentNode(...) method). According to my debugger, this is when the data field of the node to be added, which is supposed to contain the text form of the field declaration, is modified. Before the assignment ‘parentNode = newParentNode;’, my debugger indicates that the data field (of the node to be added) is null, and after the assignment, the data field is initialised, which subsequently disrupts lexical preservation. I don't know if this is a bug in the debugger or if it's a side effect related to changing the parent node of the node to be added. Perhaps you could confirm this analysis in your work environment.

@seokjun7410

Copy link
Copy Markdown
Contributor Author

Thank you for the insightful review. I'll debug the setParentNode() issue first and separate the changes into two PRs as you suggested.

@seokjun7410

seokjun7410 commented Jun 8, 2026

Copy link
Copy Markdown
Contributor Author

Follow-up note on #4781

I revisited the setParentNode() path mentioned in the review.

My current conclusion is that Node.setParentNode(...) itself does not appear to initialize the node’s data field. The behavior observed around parentNode = newParentNode; was likely caused by debugger/display evaluation implicitly triggering Node.toString().

When I inspected the relevant nodes without triggering implicit toString() evaluation, their data fields remained null before and after the parent change. NODE_TEXT_DATA appeared only after an explicit LexicalPreservingPrinter.print(...) / Node.toString() path was executed.

The actual failing flow seems to be that getOrCreateNodeText(PrimitiveType) stores an empty NodeText in NODE_TEXT_DATA, and then the PrimitiveType special case calls node.toString(). With lexical preservation enabled, this re-enters LexicalPreservingPrinter, which reads the already-present but still-empty NodeText. As a result, the primitive keyword is printed as an empty string.

So the issue was not that setParentNode() directly modified data, but that Node.toString() was called while NodeText was still being initialized.

This also explains why the CSM-based refactoring fixes #4781. After switching PrimitiveType to the existing ConcreteSyntaxModel path, the printer no longer re-enters Node.toString() during NodeText initialization. The primitive keyword is now produced from the syntax model via attribute(ObservableProperty.TYPE).

I think this fixes the concrete #4781 bug. However, a broader reentrancy risk may still remain in LPP, because NODE_TEXT_DATA currently does not distinguish between “NodeText is being initialized” and “NodeText is fully initialized”.

Possible follow-up directions would be to add a focused regression test, delay publishing NODE_TEXT_DATA until NodeText is fully populated, or introduce an explicit LexicalPrintContext.

These changes seem broader than the focused #4781 fix, so I would appreciate your thoughts on this direction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants