Optimize desugaring of `with` expressions by Gabriella439 · Pull Request #1050 · dhall-lang/dhall-lang

Gabriella439 · 2020-08-02T16:07:27Z

The motivation for this change is:

dhall-lang/dhall-haskell#1960

… where a deeply nested and chained with expression caused a blow-up
in the size of the desugared syntax tree.

This alternative desugaring was actually an idea we contemplated in the
original change that standardized with:

… but at the time we elected to go with the simpler desugaring, reasoning
that users could work around performance issues with explicit let
bindings:

#923 (comment)

However, for chained with expressions that work-around is not
ergonomic, as illustrated in
dhall-lang/dhall-haskell#1960. Instead,
this adds language support for the let optimization.

The change to the WithMultiple test demonstrates that this leads to
a more compact desugared expression. Before this change the
expression { a.b = 1, c.d = 2 } with a.b = 3 with c.e = 4 would
desugar to:

  { a.b = 1, c.d = 2 }
⫽ { a = { a.b = 1, c.d = 2 }.a ⫽ { b = 3 } }
⫽ { c =
        ({ a.b = 1, c.d = 2 } ⫽ { a = { a.b = 1, c.d = 2 }.a ⫽ { b = 3 } }).c
      ⫽ { e = 4 }
  }

… and now the same expression desugars to:

let _ = let _ = { a.b = 1, c.d = 2 } in _ ⫽ { a = _.a ⫽ { b = 3 } }

in  _ ⫽ { c = _.c ⫽ { e = 4 } }

The motivation for this change is: dhall-lang/dhall-haskell#1960 … where a deeply nested and chained `with` expression caused a blow-up in the size of the desugared syntax tree. This alternative desugaring was actually an idea we contemplated in the original change that standardized `with`: #923 … but at the time we elected to go with the simpler desugaring, reasoning that users could work around performance issues with explicit `let` binding: #923 (comment) However, for chained `with` expressions that work-around is not ergonomic, as illustrated in dhall-lang/dhall-haskell#1960. Instead, this adds language support for the `let` work-around. The change to the `WithMultiple` test demonstrates that this leads to a more compact desugared expression. Before this change the expression `{ a.b = 1, c.d = 2 } with a.b = 3 with c.e = 4` would desugar to: ```dhall { a.b = 1, c.d = 2 } ⫽ { a = { a.b = 1, c.d = 2 }.a ⫽ { b = 3 } } ⫽ { c = ({ a.b = 1, c.d = 2 } ⫽ { a = { a.b = 1, c.d = 2 }.a ⫽ { b = 3 } }).c ⫽ { e = 4 } } ``` … and now the same expression desugars to: ```dhall let _ = let _ = { a.b = 1, c.d = 2 } in _ ⫽ { a = _.a ⫽ { b = 3 } } in _ ⫽ { c = _.c ⫽ { e = 4 } } ```

Gabriella439 · 2020-08-02T16:07:56Z

Here is the matching change to the Haskell implementation: dhall-lang/dhall-haskell#1964

Nadrieril · 2020-08-02T17:38:15Z

From what I remember, the decisions we made #923 should have allowed an implementation to desugar using let while remaining standard-compliant. Did we overlook something?
I'm not a fan of this change because it forces the manipulation of variables at parse time. I would want this kind of thing to happen later on, when I represent variables with something more clever than text, and where I can avoid traversing the whole subtree when I introduce a fresh variable.
I do however agree that the let desugaring is the better one.
I guess I would prefer to have a node for with in the binary representation, and do the desugaring just before tck instead of during parsing.

Gabriella439 · 2020-08-02T18:03:57Z

@Nadrieril: The main difference is the change to one test suite file, which specifies how to encode multiple withs.

I would be fine changing things to make with a node in the syntax tree, though, if nobody else objects. I only took this route because I thought it would be simpler for implementations.

philandstuff · 2020-08-03T05:21:51Z

I’m going to try implementing in dhall-golang and will report back.

sjakobi · 2020-08-03T10:41:04Z

@Gabriel439 Could you give an example (and a test?) of the desugaring with a field-path that is longer than two fields?

E.g.

r with a.b.c = x

Currently I'm unsure whether a with will result in exactly one let-binding or in n-1 let-bindings, where n is the length of the field-path.

Gabriella439 · 2020-08-03T16:08:26Z

@sjakobi: r with a.b.c = x desugars to:

let _ = r in _ ⫽ { a = let _ = _.a in _ ⫽ { b = _.b ⫽ { c = x } } }

I'll hold off on the test until we sort out whether with expressions should be syntactic sugar or not.

sjakobi · 2020-08-04T04:01:48Z

@sjakobi: r with a.b.c = x desugars to:

let _ = r in _ ⫽ { a = let _ = _.a in _ ⫽ { b = _.b ⫽ { c = x } } }

What would be the performance implications if we'd produce at most one let per with? In this case

let _ = r in _ ⫽ { a = _.a ⫽ { b = _.a.b ⫽ { c = x } } }

Gabriella439 · 2020-08-04T04:11:26Z

@sjakobi: The difference would probably be negligible in practice, but I'd expect the one with more lets to be more efficient in the extreme case of a large number of fields because the number of field accesses is linear in the depth of the with expression instead of quadratic.

Gabriella439 · 2020-08-04T14:57:13Z

@Nadrieril: So I took a stab at your suggestion and the main issue I ran into is that if we have a separate phase for desugaring (e.g. in between parsing and type-checking), then that would entail other changes to the AST, too. For example, we'd have to be able to represent the dotted field syntax (e.g. { x.y = 1 }) or duplicate record fields (e.g. { x = 1, x = 1 }) in the AST and standardize how to serialize that. Same thing for multi-line string literals.

Another possible option is to keep the current desugaring treatment, but instead specify the behavior of with at β-normalization time (which implies we'd also need a type-checking rule for with expressions), but I'd be fine with that if people didn't mind the implementation cost. If anything, it would likely improve error messages.

Nadrieril · 2020-08-04T15:12:15Z

Ok, thanks for trying, I do see the issue. with in normalization and typechecking seems like a good option, I'd be ok with that

Gabriella439 · 2020-08-04T16:14:07Z

@Nadrieril: Alright, I'll put up a parallel pull request soon with the non-desugaring route. I don't expect it will be too complicated: it will probably just be something like "to type-check/normalize a with expression, type-check/normalize it as if it were desugared, but feel free to customize error messages and the evaluation strategy"

Gabriella439 · 2020-08-12T02:54:08Z

@Nadrieril: So I implemented what you suggested in the Haskell implementation and realized that there's one downside of doing things the way you suggest, which is that we don't have a good way to verify within our standard test suite that the desugaring was done correctly if it's always done "just in time" as part of type-checking and/or normalization. The parsing tests would only show that the with expression is faithfully encoded (without desugaring) and the normalization tests would miss the intermediate desugaring step in their expected output.

Also, I don't think it's that bad to add the dependency on shift at desugaring time, given that shift is the one function in the standard that has no dependencies on other functions within the standard, so it's cheap to implement just for this purpose.

Gabriella439 · 2020-08-12T02:58:27Z

Also, for reference, here is the Haskell code for the alternative approach:

https://github.com/dhall-lang/dhall-haskell/compare/gabriel/with_encoding

It's not much different because the Haskell implementation was already using a separate constructor internally for With, even if it didn't show up in the binary encoding

Gabriella439 · 2020-08-12T14:45:31Z

@Nadrieril: Actually, I take that back. The fact that the exact encoding doesn't show up in the test output is actually a feature, because the idea was that implementations would now be free to use different encodings internally so long as they produce the same final result.

Gabriella439 · 2020-08-13T01:37:35Z

Alright, the alternative approach is ready for review here: #1052

If that is approved then it will supersede this proposal

Nadrieril · 2020-08-13T02:48:49Z

For me, shift is an artifact of the way the standard is specified. I expect non-toy implementations to never use such an operation, and instead use environments, de Bruijn indices, and this reverse-deBruijn indices that your NbE evaluator uses. Since that machinery is probably not yet in place just after parsing, that's my reason for wanting to avoid pre-tck shifts

Nadrieril · 2020-08-13T04:06:10Z

we don't have a good way to verify within our standard test suite that the desugaring was done correctly if it's always done "just in time" as part of type-checking and/or normalization

I think we do: a normalization test should do the trick

\(x: {}) -> x with a.b.c = 42

As you mention later, we indeed can't check desugaring to happen before nztion/tck, but that's the point. Hopefully the test above and a few tck tests should be enough

philandstuff

I know this will likely be superseded, but I spotted a couple of issues

@philandstuff

… as caught by @philandstuff Co-authored-by: Philip Potter <philip.g.potter@gmail.com>

@philandstuff

… as caught by @philandstuff Co-authored-by: Philip Potter <philip.g.potter@gmail.com>

Gabriella439 · 2020-08-19T16:18:20Z

This is superseded by #1052

Gabriella439 mentioned this pull request Aug 13, 2020

Enable with optimizations #1052

Merged

philandstuff reviewed Aug 15, 2020

View reviewed changes

Comment thread standard/record.md Outdated

Comment thread standard/record.md Outdated

Gabriella439 and others added 2 commits August 15, 2020 17:56

Fix desugaring example

6ea9e43

… as caught by @philandstuff Co-authored-by: Philip Potter <philip.g.potter@gmail.com>

Fix judgment

9fb9652

… as caught by @philandstuff Co-authored-by: Philip Potter <philip.g.potter@gmail.com>

Gabriella439 closed this Aug 19, 2020

Gabriella439 deleted the gabriel/scale_with_desugar branch August 19, 2020 16:18

Uh oh!

Conversation

Gabriella439 commented Aug 2, 2020

Uh oh!

Gabriella439 commented Aug 2, 2020

Uh oh!

Nadrieril commented Aug 2, 2020

Uh oh!

Gabriella439 commented Aug 2, 2020

Uh oh!

philandstuff commented Aug 3, 2020

Uh oh!

sjakobi commented Aug 3, 2020

Uh oh!

Gabriella439 commented Aug 3, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sjakobi commented Aug 4, 2020

Uh oh!

Gabriella439 commented Aug 4, 2020

Uh oh!

Gabriella439 commented Aug 4, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Nadrieril commented Aug 4, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Gabriella439 commented Aug 4, 2020

Uh oh!

Gabriella439 commented Aug 12, 2020

Uh oh!

Gabriella439 commented Aug 12, 2020

Uh oh!

Gabriella439 commented Aug 12, 2020

Uh oh!

Gabriella439 commented Aug 13, 2020

Uh oh!

Nadrieril commented Aug 13, 2020

Uh oh!

Nadrieril commented Aug 13, 2020

Uh oh!

philandstuff left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Gabriella439 commented Aug 19, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Gabriella439 commented Aug 3, 2020 •

edited

Loading

Gabriella439 commented Aug 4, 2020 •

edited

Loading

Nadrieril commented Aug 4, 2020 •

edited

Loading