Skip to content

Extended indexing operators#1064

Merged
damiendoligez merged 7 commits intoocaml:trunkfrom
Octachron:ocaml-indexop-last
Sep 15, 2017
Merged

Extended indexing operators#1064
damiendoligez merged 7 commits intoocaml:trunkfrom
Octachron:ocaml-indexop-last

Conversation

@Octachron
Copy link
Copy Markdown
Member

After #69, #622 and the last developper meeting, this PR proposes to add a new family of user-definable indexing operators, that does not interfere nor can be confused with any existing indexing operators.

Extended indexing operators names start with a leading dot . followed by an operator character, except .(to avoid confusion with .. which has a different meaning ) or < (used as an opening bracket at the type level) , and a possibly empty sequence of operator characters. It must be closed with a couple of enclosing brackets ( i.e. (), [], {} ) and optionaly an assignment operator <-:

let (  .%( ) ) p (x,y) = p.(x).(y)
let (  .$?${ } ) p (x,y,z) = p.(x).(y).(z)
let (  .~[ ] <- ) p (x,y,z,t) v = p.(x).(y).(z).(t) <- v 

Expressions of the form expr_1.∘∘∘(expr_2) are then desugared to ( .∘∘∘( )) expr_1 expr_2.
Similarly for assignment, expr_1.∘∘∘(expr_2) <- expr_3 is desugared to ( .∘∘∘( )<-) expr_1 expr_2 expr_3. The same syntactic sugar is applied to all bracket variants (i.e. {, (, [ ).

Expression Desugaring
Access expr_1.∘∘∘⟨expr_2⟩ ( .∘∘∘⟨ ⟩ ) expr_1 expr_2
Assignment expr_1.∘∘∘⟨expr_2⟩ <- expr_3 ( .∘∘∘⟨ ⟩ <- ) expr_1 expr_2 expr_3

The compulsory operator character should help to mark these operators as user-defined, and decrease the confusion potential of these operators. Moreover, the use of a qualified dot respects the use of unqualified . as a projection primitive.

@gasche
Copy link
Copy Markdown
Member

gasche commented Feb 24, 2017

I like it! .$?$(...) is terribly ugly, but .~[ ] and .%( ) both make sense, and I agree it's a good way out of your conflict with type-directed resolution.

(In general I also like the idea of having operator names mirror the operator syntax, similar to Agda's mixfixes, which has first been given to me, in an OCaml contet, by Nicolas Pouillard in 2010. But this was already part of your previous proposal.)

@gasche
Copy link
Copy Markdown
Member

gasche commented Feb 24, 2017

If I understand correctly, the testsuite currently checks the parsed AST. Would it be possible to also test that these programs are pretty-printed back correctly (that is, in a way that can be parsed back to the same thing)?

@Octachron
Copy link
Copy Markdown
Member Author

The parsetree/source.ml test already checks that the pretty-printed ast can be parsed back.

mkexp @@ Pexp_apply(id , [Nolabel, $1; Nolabel, $4; Nolabel, $7]) }
| simple_expr DOTOP LBRACE expr RBRACE
{ let id = mkexp @@ Pexp_ident( ghloc @@ Lident ("." ^ $2 ^ "{}")) in
mkexp @@ Pexp_apply(id, [Nolabel, $1; Nolabel, $4]) }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is orthogonal, but it might be beneficial to add rules of the form simple_expr DOTOP LBRACE expr error to give better error messages.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, fixed.

@Drup
Copy link
Copy Markdown
Contributor

Drup commented Feb 25, 2017

The patch looks good to me (and I like the proposed solution).

I wonder if maybe it would be time to change the representation of (non module) identifier to a sum type. That would make the printing implementation slightly less ridiculous.

@Octachron
Copy link
Copy Markdown
Member Author

Extending on @Drup comment, I have added a comment to clarify the cryptic string manipulation in parsing/pprintast.ml which extract operator informations from its name.

@damiendoligez
Copy link
Copy Markdown
Member

Looks good but I don't much like the restrictions on characters:

  • for < : is there a real reason to disallow it (i.e. a conflict in the grammar) ?
  • for . : of course .. is a keyword, but I don't think that's a reason to disallow all "operators" that begin with ...

@Octachron
Copy link
Copy Markdown
Member Author

With the current scheme (i.e. using a new DOTOP token in the lexer) allowing either . or < as starting characters would require modification in unrelated part of the parser due to the use of < and > as brackets for object type:

  • For <, in let f: 'a .< x:'a> -> 'a= … , .< will be read as a DOTOP(<) token.
  • For ., in let f: <x:'a; ..> -> 'a = …, ..> will be read as a DOTOP(.>) token.

There is potentially two ways to get around this limitation: restrict the syntax to merge .%( and friends in a single token rather than two (.% () (a solution that was suggested to me by @Drup earlier) or generalize the syntax by removing the lexer modification and implementing it as a grammar modification only.

The first solution will slightly break the symmetry between classical indexing operators and the extended ones: a . (b) will be valid, but not a .% (b). The last solution will allow more whitespace at strange positions: a .. . .. % (b) .

@damiendoligez
Copy link
Copy Markdown
Member

OK, those are good reasons.

This commit adds extended indexing operators to the parser.
For instance,

```
let (.%()) p (x,y) = p.( x ).( y )
;; p.%(0,0)

```
Extended indexing operators name starts with a leading
dot "." followed by an operator symbol, except "." or "<"
and a sequence of any operators and must be closes with a
couple of enclosing parentheses ( i.e. "()", "[]", "{}" )
and then an optional assignment operator "<-":

  * '.' dotsymbolchar symbolchar* '(' ')' ['<-']
  * '.' dotsymbolchar symbolchar* '[' ']' ['<-']
  * '.' dotsymbolchar symbolchar* '{' '}' ['<-']

Similarly, expressions of the form

* expr_1 '.' dotsymbolchar symbolchar* '(' expr_2 ')'
* expr_1 '.' dotsymbolchar symbolchar* '[' expr_2 ']'
* expr_1 '.' dotsymbolchar symbolchar* '{' expr_2 '}'

and

* expr_1 '.' dotsymbolchar symbolchar* '(' expr_2 ')' '<-' expr_3
* expr_1 '.' dotsymbolchar symbolchar* '[' expr_2 ']' '<-' expr_3
* expr_1 '.' dotsymbolchar symbolchar* '{' expr_2 '}' '<-' expr_3

are desugared to

* ('.' dotsymbolchar symbolchar* '(' ')' ) expr_1 expr_2
* ('.' dotsymbolchar symbolchar* '[' ']' ) expr_1 expr_2
* ('.' dotsymbolchar symbolchar* '{' '}' ) expr_1 expr_2

and

* ('.' dotsymbolchar symbolchar* '(' ')' '<-' ) expr_1 expr_2 expr_3
* ('.' dotsymbolchar symbolchar* '[' ']' ) expr_1 expr_2 expr_3
* ('.' dotsymbolchar symbolchar* '{' '}' '<-' ) expr_1 expr_2 expr_3
@Octachron
Copy link
Copy Markdown
Member Author

What is the status of this PR for 4.0{6,7}?

If the restriction on the operators character is deemed too complicated, a coarser and simpler restriction may be to forbid the characters <> with the motivation that they are too close typographically from the left and right angle brackets ⟨⟩ and thus confusing in for instance x.<?>.(y).

@damiendoligez
Copy link
Copy Markdown
Member

I think the restriction is OK. It will always be possible to write unreadable programs so we shouldn't spend too much time on this "issue".

Please rebase and I'll merge.

@lpw25
Copy link
Copy Markdown
Contributor

lpw25 commented Sep 15, 2017

I just noticed that this doesn't support things like x.Foo.%(...) -- analogous to x.Foo.label. This is a shame. I guess its too late to get that in before the freeze. Maybe Damien will generously consider fixing this omission a bug fix?

@mshinwell
Copy link
Copy Markdown
Contributor

I took the liberty of rebasing this myself. Will be ready to merge once Travis etc finishes.

@damiendoligez
Copy link
Copy Markdown
Member

damiendoligez commented Sep 15, 2017

Maybe Damien will generously consider fixing this omission a bug fix?

Depending on the complexity of the fix and the proximity of the release, maybe.

@damiendoligez damiendoligez merged commit 201ee17 into ocaml:trunk Sep 15, 2017
@Octachron
Copy link
Copy Markdown
Member Author

@lpw25 , I will send a PR adding back qualified operators tomorrow morning.

user-defined indexing operators, obtained by adding at least
one operator character after the dot symbol to the standard indexing
operators: e,g ".%()", ".?[]", ".@{}<-"
(Florian Angeletti, review by Damien Doligez and Gabriel Radanne)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Octachron I think this PR breaks backwards compatibility in some edge cases: see https://caml.inria.fr/mantis/view.php?id=7637.

@yallop
Copy link
Copy Markdown
Member

yallop commented Oct 19, 2017

Not necessarily a problem, but worth noting here: this change uses up the syntactic space that was considered for the succinct version of overriding local open, i.e.

M.!(e)

as a shorthand for

let open! M in e

With this PR, M.!(e) is instead interpreted as (.!()) M e.

@Drup
Copy link
Copy Markdown
Contributor

Drup commented Oct 19, 2017

@yallop You mean, the one that was rejected (#218) ? :)

@yallop
Copy link
Copy Markdown
Member

yallop commented Oct 13, 2018

Another issue with this change (which I should have spotted earlier): it reserves .~, which has been used by MetaOCaml for splicing for a long time.

However, it appears that none of the code available via OPAM is yet using .~, so it's not too late to restore compatibility. I'll open a new pull request with more details (update: #2106).

stedolan pushed a commit to stedolan/ocaml that referenced this pull request Mar 21, 2023
)

I've also regenerated the test to be sure that printing now works (in the one
very simple example). In the long run:

* The primitives should probably all be made lowercase for simplicity and for
  consistency with the IR (currently the "constructor-like" ones get initial
  caps).
* We should be testing fexpr printing as well as parsing (roundtrip test).
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Co-authored-by: cuihtlauac <cuihtlauac@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants