Skip to content

[java] About operator nodes #1661

@oowekyala

Description

@oowekyala

This relates to expression grammar changes. #1019

  • We currently use a special node for the operator of an assignment expression (AssignmentOperator). This is not needed because an assignment expression may only have one assignment operator so we could as well set the image of the parent. The Expression node is currently used to represent assignment expressions, but it also encloses all other types of expressions because the JJTree descriptor doesn't mention (>1) as a condition. That adds an unnecessary layer of nesting. I think it would be better to have Expression #void, and a new node AssignmentExpression with no child operator node, instead the operator is on the image. This is consistent with how we don't push e.g. additive expression nodes if there is no additive operation mentioned -> it makes no sense to push ASTExpression if there is no assignment operator.
  • So AssignmentOperator is unnecessary. On the other hand, Additive and multiplicative expressions throw away all but the last of their operators:
    void AdditiveExpression() #AdditiveExpression(>1):
    {}
    {
    MultiplicativeExpression() ( LOOKAHEAD(2) ( "+" {jjtThis.setImage("+");} | "-" {jjtThis.setImage("-");} ) MultiplicativeExpression() )*
    }
    void MultiplicativeExpression() #MultiplicativeExpression(>1):
    {}
    {
    UnaryExpression() ( LOOKAHEAD(2) ( "*" {jjtThis.setImage("*");} | "/" {jjtThis.setImage("/");} | "%" {jjtThis.setImage("%");}) UnaryExpression() )*
    }

    The way this is written, an expression such as a + b + c - 1 would be parsed as an ASTAdditiveExpression with an image equal to "-", throwing away the fact we saw "+" twice. The same happens with MultiplicativeExpression.

I see three possible options:

  1. Add a node for each operator, while continuing to parse these expressions flatly. This is very likely going to get unwieldy, bloating the AST, and the structure of the expression is lost and unpractical to recover.
    With this option, we'd have
AdditiveExpression
+ a (...)
+ AdditiveOperator "+"
+ b (...)
+ AdditiveOperator "+"
+ c (...)
+ AdditiveOperator "-"
+ 1 (...)

Matching additive expressions with eg AdditiveExpression[@Operator="+"] wouldn't be possible anymore.

  1. Left-recursive parsing. This would increase the depth of the AST very much, and especially for eg long string concatenation expressions. With this option, we'd have
AdditiveExpression "-"
+ AdditiveExpression "+"
  + AdditiveExpression "+"
    + a (...)
    + b (...)
  + c (...)
+ 1 (...)
  1. "Reduced" left-recursive parsing. With this option, we only push new nodes when the operator is different:
AdditiveExpression "-"
+ AdditiveExpression "+"
  + a (...)
  + b (...)
  + c (...)
+ 1 (...)

IMO that's a good compromise. String concatenations, even big, would be flat, because they use only "+". The increased nesting is in fact consistent with how nesting of expressions of different precedences is done (eg a + b * c has two levels -> why not a + b - c?).

Metadata

Metadata

Assignees

No one assigned

    Labels

    in:astAbout the AST structure or API, the parsing step

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions