Skip to content

[modelica] Normalize invalid node ranges#2195

Merged
adangel merged 4 commits into
pmd:masterfrom
atrosinenko:modelica-node-range-fixup
Jan 6, 2020
Merged

[modelica] Normalize invalid node ranges#2195
adangel merged 4 commits into
pmd:masterfrom
atrosinenko:modelica-node-range-fixup

Conversation

@atrosinenko

@atrosinenko atrosinenko commented Jan 3, 2020

Copy link
Copy Markdown
Contributor

Some files were parsed into ASTs with nodes having negative ranges.
For example:

package TestPackage
  package EmptyPackage
  end EmptyPackage;
end TestPackage;

has subtree:

  SimpleLongClassSpecifier "EmptyPackage"
    SimpleName "EmptyPackage"
    Composition
      ElementList <-- start = 3:3, end = 2:22
    SimpleName "EmptyPackage"

Some files were parsed into ASTs with nodes having negative ranges.
For example:

package TestPackage
  package EmptyPackage
  end EmptyPackage;
end TestPackage;

has subtree:

  SimpleLongClassSpecifier "EmptyPackage"
    SimpleName "EmptyPackage"
    Composition
      ElementList <-- start = 3:3, end = 2:22
    SimpleName "EmptyPackage"
@ghost

ghost commented Jan 3, 2020

Copy link
Copy Markdown
1 Message
📖 No java rules are changed!

Generated by 🚫 Danger

endLine = parser.token.endLine;
endColumn = parser.token.endColumn;

if (endLine < beginLine) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could avoid doing any of that (in jjtOpen/Close) if instead of using the fields of AbstractNode, you delegated the method calls to the tokens, eg:

@Override
public int getBeginLine() {
return jjtGetFirstToken().getBeginLine();
}
@Override
public int getBeginColumn() {
return jjtGetFirstToken().getBeginColumn();
}
@Override
public int getEndLine() {
return jjtGetLastToken().getEndLine();
}
@Override
public int getEndColumn() {
return jjtGetLastToken().getEndColumn();
}

@atrosinenko atrosinenko Jan 5, 2020

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this definitely looks significantly less hackish. Unfortunately, some nodes end up not containing any tokens (like in the example above), leading to invalid ranges. Are such grammars invalid or maybe it is a minor issue in common code?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The range [3:3, 2:22] for ElemementList suggests that [FirstToken, LastToken] takes the form [t, t-1]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd first create a unit test, that verifies the locations of the nodes in the given code example. Then you can be sure, that a) the bug is fixed, b) it is not reintroduced later on.

The grammar creates the node ElementList, even though, there are no Element children. You can avoid creating the ElementList node by using a conditional node in the grammar:

-void ElementList(Visibility v): {}
+void ElementList(Visibility v) #ElementList(>1): {}

(see line 353 in Modelica.jjt)

Whether this makes sense here, I can't say. You seem to set the visibility on the node, although it is always "UNSPEC". The same would need to be applied for the node Composition and possibly other nodes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought on making this node conditional. On one hand, it cleans up the AST and in this particular case it may be the best solution. On the other hand, it is interesting, whether either some generic solution exists or such token-free subtrees are explicitly considered an issue.

Related to previous, on adding a test: it do exists, but in SCM repo. I agree, it is worth moving it here in slightly reduced form. It is still interesting, is this condition expected to hold for every language module?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You seem to set the visibility on the node, although it is always "UNSPEC".

Visibility is public, private, etc. modifiers set in source file:

package TestPackage
public
  package EmptyPackage
  end EmptyPackage;
end TestPackage;

This creates two ElementLists, one of them is PUBLIC. But, frankly speaking, I have never used such modifiers and not yet handle them in type resolution.

@oowekyala oowekyala Jan 5, 2020

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Token free subtrees are not a bug, they're useful to make the parsed AST more regular. This is used for example in the Scala AST, and I'd like to introduce that in the Java AST in 7.0. It's probably just an oversight of JJTree not to handle this case, but we shouldn't limit PMD to it.

I've pushed a solution, that uses an implicit zero-length token. I expect this is how we'll handle it in PMD 7 for the Java tree, though at that point we'll share the solution between language modules, and not do this ad-hoc in jjtClose (eg fixing JJTree).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI - we are using Javacc 5.0, which is a old version. It might be different with a newer javacc version.

@atrosinenko

Copy link
Copy Markdown
Contributor Author

Thank you!

@adangel adangel added this to the 6.21.0 milestone Jan 6, 2020
adangel added a commit that referenced this pull request Jan 6, 2020
@adangel adangel merged commit 394de28 into pmd:master Jan 6, 2020
@atrosinenko atrosinenko deleted the modelica-node-range-fixup branch February 6, 2022 09:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants