Change precedence of Range #36514

@gafter

Below is an excerpt from an unedited version of today's LDM notes specifying a change to the syntax for Range. This needs to be implemented and tested.

Range

There are some syntactic ambiguities in the Range specification and implementation. See #34483. Principal among them is the fact that the specification gives no meaning to a * .. b, yet the compiler accepts it and produces a tree with precedence inversion. One fix is to simply produce a syntax error in these cases. However, I believe this is likely to be confusing to the user.

These issues can be resolved by the following proposed precedence change:

  1. e1 .. e2 should be at a precedence between shift and additive (as currently specified).
  2. .. e should be at unary precedence, like all the other prefix operators.
  3. e .. should be at primary precedence, like all the other postfix operators.

Note that a single token of look-ahead is required to distinguish between cases 1 and 3. Even so, this may result in some confusingly inconsistent parsing behavior:

_ = a + b .. c; // (a + b) .. c
_ = a + b ..;   // a + (b ..)

This confusion could be eliminated by moving the precedence of the binary form from where it is (between shift and additive) to between unary and multiplicative:

_ = a + b .. c; // a + (b .. c)
_ = a + b ..;   // a + (b ..)
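
The effect of moving the binary form can be sketched with a small precedence-climbing routine. This is only an illustration, not the compiler's code: the function `group` and the two precedence tables are hypothetical, the numeric levels are arbitrary (only their relative order matters), and the sketch handles the binary form of .. only.

```python
def group(tokens, prec):
    """Fully parenthesize a flat operand/operator list by precedence climbing.

    `tokens` alternates operands and binary operators; `prec` maps each
    operator to a level (higher binds tighter). All operators are treated
    as left-associative. Assumes well-formed input.
    """
    pos = 0

    def parse(min_prec):
        nonlocal pos
        left = tokens[pos]
        pos += 1
        while pos < len(tokens) and prec[tokens[pos]] >= min_prec:
            op = tokens[pos]
            pos += 1
            right = parse(prec[op] + 1)  # +1 gives left associativity
            left = f"({left} {op} {right})"
        return left

    return parse(0)


# Hypothetical tables: range between shift and additive (as specified) ...
AS_SPECIFIED = {"<<": 1, "..": 2, "+": 3, "*": 4}
# ... versus range between multiplicative and unary (as proposed).
AS_PROPOSED = {"<<": 1, "+": 2, "*": 3, "..": 4}

print(group("a + b .. c".split(), AS_SPECIFIED))  # ((a + b) .. c)
print(group("a + b .. c".split(), AS_PROPOSED))   # (a + (b .. c))
```

With the proposed table, the binary form groups the same way as the postfix form in a + b .., which is the consistency the proposal is after.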

This final syntactic proposal is summarized by the following grammar:

multiplicative_expression
    : range_expression
    | multiplicative_expression '*' range_expression
    | multiplicative_expression '/' range_expression
    | multiplicative_expression '%' range_expression
    ;

range_expression
    : unary_expression
    | range_expression '..' unary_expression
    ;

unary_expression
    : prefix_range
    ; // plus all the other forms

prefix_range
    : '..' unary_expression
    ;

primary_no_array_creation_expression
    : primary_range
    ; // plus all the other forms

primary_range
    : primary_expression '..'
    | '..'
    ;

There remain some small ambiguities (which admittedly are caused by using different precedence levels).

  • Is .. .. to be parsed as .. (..) or (..) ..?
  • Is .. .. e to be parsed as .. (.. e) or (..) .. (e)?
  • Is .. e .. to be parsed as .. (e ..) or (.. e) ..?
  • Is e1 .. .. e2 to be parsed as e1 .. (.. e2) or (e1 ..) .. e2?

Note that these are all semantic errors (as there is no form of the range operator that takes a range as an operand).

I suggest the former in all cases, which lets the parser get by with limited look-ahead (a different resolution would need further look-ahead to distinguish the parses of .. .. e from .. ..).

An alternative approach is to simply move the precedence to the suggested level for all forms of .. and give an error for precedence inversions, which should now be much less likely.

Another ambiguity arises from the combination with other operators that have both a unary and a binary form. What is the meaning of .. + a or .. - a?

  1. .. (+ a)
  2. (..) + a

Do we have some systematic way of answering such questions?

Resolution: Make the proposed precedence changes per the syntax above. The disambiguation rule is that if the token following .. could start an expression, then it is the right-hand-side operand of the range operator.
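
As a rough way to test expectations against this resolution, here is a minimal recursive-descent sketch of the grammar above with the disambiguation rule applied. It is not the Roslyn implementation: the names `Parser`, `parse`, and `can_start_expression` are hypothetical, the token set is reduced to identifiers, parentheses, .., and a few arithmetic operators, and the result is a fully parenthesized string rather than a syntax tree.

```python
import re

# Toy lexer: '..', identifiers, and a handful of operators. Other characters are skipped.
TOKEN = re.compile(r"\.\.|[A-Za-z_]\w*|[-+*/%()]")


def is_identifier(tok):
    return tok is not None and (tok[0].isalpha() or tok[0] == "_")


def can_start_expression(tok):
    """Toy form of the rule: could `tok` begin an expression?"""
    return is_identifier(tok) or tok in ("..", "+", "-", "(")


class Parser:
    def __init__(self, text):
        self.toks = TOKEN.findall(text)
        self.pos = 0

    def peek(self, offset=0):
        i = self.pos + offset
        return self.toks[i] if i < len(self.toks) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def additive(self):
        left = self.multiplicative()
        while self.peek() in ("+", "-"):
            op = self.take()
            left = f"({left} {op} {self.multiplicative()})"
        return left

    def multiplicative(self):
        left = self.range_expr()
        while self.peek() in ("*", "/", "%"):
            op = self.take()
            left = f"({left} {op} {self.range_expr()})"
        return left

    def range_expr(self):
        # Binary form: range_expression '..' unary_expression
        left = self.unary()
        while self.peek() == "..":
            self.take()
            left = f"({left} .. {self.unary()})"
        return left

    def unary(self):
        tok = self.peek()
        if tok == "..":
            self.take()
            # Disambiguation rule: an expression-start after '..' is its operand.
            if can_start_expression(self.peek()):
                return f"(.. {self.unary()})"
            return "(..)"
        if tok in ("+", "-"):
            self.take()
            return f"({tok}{self.unary()})"
        return self.primary()

    def primary(self):
        tok = self.take()
        if tok == "(":
            expr = self.additive()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
        elif is_identifier(tok):
            expr = tok
        else:
            raise SyntaxError(f"unexpected token {tok!r}")
        # Postfix form: only when the token after '..' cannot start an expression.
        while self.peek() == ".." and not can_start_expression(self.peek(1)):
            self.take()
            expr = f"({expr} ..)"
        return expr


def parse(text):
    parser = Parser(text)
    tree = parser.additive()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


for source in ["a + b .. c", "a + b ..", "a * .. b", ".. + a",
               ".. ..", ".. .. e", ".. e ..", "e1 .. .. e2"]:
    print(f"{source:12} => {parse(source)}")
```

Under these assumptions the sketch groups a + b .. c as (a + (b .. c)), .. + a as (.. (+a)), and takes the former reading for each of the four ambiguous cases listed above, for example e1 .. .. e2 as (e1 .. (.. e2)).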
