Below is an excerpt from an unedited version of today's LDM notes specifying a change to the syntax for Range. This needs to be implemented and tested.
Range
There are some syntactic ambiguities in the Range specification and implementation. See #34483. Principal among them is the fact that the specification gives no meaning to a * .. b but the compiler accepts it, producing a tree with precedence inversion. One fix is to simply produce a syntax error in these cases. However, I believe this is likely to be confusing to the user.
These issues can be resolved by the following proposed precedence change:
1. e1 .. e2 should be at a precedence between shift and additive (as currently specified).
2. .. e should be at unary precedence, like all the other prefix operators.
3. e .. should be at primary precedence, like all the other postfix operators.
Note that a single token of look-ahead is required to distinguish between cases 1 and 3. Even with that look-ahead, this may result in some confusingly inconsistent parsing behavior:
_ = a + b .. c; // (a + b) .. c
_ = a + b ..; // a + (b ..)
This confusion could be eliminated by moving the precedence of the binary form from where it is (between shift and additive) to between unary and multiplicative:
_ = a + b .. c; // a + (b .. c)
_ = a + b ..; // a + (b ..)
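The effect of moving the binary form can be seen with a small precedence-climbing sketch. This is a toy model in Python, not compiler code: the function name `parse_binary` and the numeric levels are assumptions made for illustration, and it handles only identifiers and well-formed binary expressions (no prefix or postfix `..`).

```python
import re

def parse_binary(src, range_level):
    """Return a fully parenthesized rendering of a binary-only expression.

    range_level is a hypothetical binding strength for '..'. Additive
    operators bind at 2 and multiplicative operators at 4, so 1 puts
    '..' below additive and 5 puts it above multiplicative.
    """
    levels = {"+": 2, "-": 2, "*": 4, "/": 4, "%": 4, "..": range_level}
    tokens = re.findall(r"\.\.|[A-Za-z_]\w*|[-+*/%]", src)
    pos = 0

    def climb(min_level):
        nonlocal pos
        left = tokens[pos]  # operand: assumed to be an identifier
        pos += 1
        # Keep absorbing operators that bind at least as tightly as min_level.
        while pos < len(tokens) and levels[tokens[pos]] >= min_level:
            op = tokens[pos]
            pos += 1
            right = climb(levels[op] + 1)  # +1 makes operators left-associative
            left = f"({left} {op} {right})"
        return left

    return climb(0)

print(parse_binary("a + b .. c", 1))  # ((a + b) .. c)
print(parse_binary("a + b .. c", 5))  # (a + (b .. c))
```

With the lower level the range swallows the whole sum, and with the higher level only `b` becomes the left operand, which is the grouping the postfix form `b ..` already gets.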
This final syntactic proposal is summarized by the following grammar:
multiplicative_expression
: range_expression
| multiplicative_expression '*' range_expression
| multiplicative_expression '/' range_expression
| multiplicative_expression '%' range_expression
;
range_expression
: unary_expression
| range_expression '..' unary_expression
;
unary_expression
: prefix_range
; // plus all the other forms
prefix_range
: '..' unary_expression
;
primary_no_array_creation_expression
: primary_range
; // plus all the other forms
primary_range
: primary_expression '..'
| '..'
;
There remain some small ambiguities (which admittedly are caused by using different precedence levels).
- Is .. .. to be parsed as .. (..) or (..) ..?
- Is .. .. e to be parsed as .. (.. e) or (..) .. (e)?
- Is .. e .. to be parsed as .. (e ..) or (.. e) ..?
- Is e1 .. .. e2 to be parsed as e1 .. (.. e2) or (e1 ..) .. e2?
Note that these are all semantic errors (as there is no form of the range operator that takes a range as an operand).
I suggest the former in all cases, which would permit limited look-ahead in the parser (look-ahead would be needed to distinguish the parses of .. .. e from .. .. for a different resolution).
An alternative approach is to simply move the precedence to the suggested level for all forms of .. and give an error for precedence inversions, which should now be much less likely.
Another ambiguity arises from the combination with other operators that have both a unary and a binary form. What is the meaning of .. + a or .. - a?
.. (+ a)
(..) + a
Do we have some systematic way of answering such questions?
Resolution: Make the proposed precedence changes per the syntax above. The disambiguation rule is that if the token following .. could start an expression, then that expression is the right-hand-side operand of the range operator.
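One possible reading of the grammar and the resolution can be sketched as a toy recursive-descent parser. Everything here is an assumption made for illustration rather than the compiler's implementation: the names `Parser` and `parse`, the parenthesized string output, and the simplified test for "could start an expression" (an identifier, `(`, unary `+` or `-`, or `..`). Shift and lower-precedence operators are omitted.

```python
import re

TOKEN = re.compile(r"\.\.|[A-Za-z_]\w*|[-+*/%()]")

class Parser:
    def __init__(self, src):
        self.toks = TOKEN.findall(src)
        self.pos = 0

    def peek(self, offset=0):
        i = self.pos + offset
        return self.toks[i] if i < len(self.toks) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    @staticmethod
    def starts_expression(tok):
        # Simplified stand-in for "could start an expression".
        return tok is not None and (tok in ("(", "+", "-", "..") or tok.isidentifier())

    def additive(self):
        left = self.multiplicative()
        while self.peek() in ("+", "-"):
            op = self.next()
            right = self.multiplicative()
            left = f"({left} {op} {right})"
        return left

    def multiplicative(self):
        left = self.range_expr()
        while self.peek() in ("*", "/", "%"):
            op = self.next()
            right = self.range_expr()
            left = f"({left} {op} {right})"
        return left

    def range_expr(self):
        # Binary form: any '..' still pending here has an operand after it.
        left = self.unary()
        while self.peek() == "..":
            self.next()
            left = f"({left} .. {self.unary()})"
        return left

    def unary(self):
        if self.peek() in ("+", "-"):
            op = self.next()
            return f"({op} {self.unary()})"
        # Prefix form: '..' followed by something that can start an operand.
        if self.peek() == ".." and self.starts_expression(self.peek(1)):
            self.next()
            return f"(.. {self.unary()})"
        return self.primary()

    def primary(self):
        tok = self.next()
        if tok == "(":
            node = self.additive()
            if self.next() != ")":
                raise SyntaxError("expected ')'")
        elif tok == "..":
            node = "(..)"  # bare range with no operands
        elif tok is not None and tok.isidentifier():
            node = tok
        else:
            raise SyntaxError(f"unexpected token {tok!r}")
        # Postfix form: '..' is taken here only if no operand can follow it.
        while self.peek() == ".." and not self.starts_expression(self.peek(1)):
            self.next()
            node = f"({node} ..)"
        return node

def parse(src):
    parser = Parser(src)
    node = parser.additive()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return node

print(parse("a + b .. c"))   # (a + (b .. c))
print(parse("a + b .."))     # (a + (b ..))
print(parse(".. + a"))       # (.. (+ a))
print(parse("e1 .. .. e2"))  # (e1 .. (.. e2))
```

Under these assumptions a single token of look-ahead past `..` is enough, and the four ambiguous cases listed above all come out as the first of the two candidate parses.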