Optimize desugaring of with expressions#1050
Conversation
The motivation for this change is: dhall-lang/dhall-haskell#1960 … where a deeply nested and chained `with` expression caused a blow-up in the size of the desugared syntax tree. This alternative desugaring was actually an idea we contemplated in the original change that standardized `with`: #923 … but at the time we elected to go with the simpler desugaring, reasoning that users could work around performance issues with explicit `let` binding: #923 (comment) However, for chained `with` expressions that work-around is not ergonomic, as illustrated in dhall-lang/dhall-haskell#1960. Instead, this adds language support for the `let` work-around. The change to the `WithMultiple` test demonstrates that this leads to a more compact desugared expression. Before this change the expression `{ a.b = 1, c.d = 2 } with a.b = 3 with c.e = 4` would desugar to: ```dhall { a.b = 1, c.d = 2 } ⫽ { a = { a.b = 1, c.d = 2 }.a ⫽ { b = 3 } } ⫽ { c = ({ a.b = 1, c.d = 2 } ⫽ { a = { a.b = 1, c.d = 2 }.a ⫽ { b = 3 } }).c ⫽ { e = 4 } } ``` … and now the same expression desugars to: ```dhall let _ = let _ = { a.b = 1, c.d = 2 } in _ ⫽ { a = _.a ⫽ { b = 3 } } in _ ⫽ { c = _.c ⫽ { e = 4 } } ```
|
Here is the matching change to the Haskell implementation: dhall-lang/dhall-haskell#1964 |
|
From what I remember, the decisions we made #923 should have allowed an implementation to desugar using |
|
@Nadrieril: The main difference is the change to one test suite file, which specifies how to encode multiple I would be fine changing things to make |
|
I’m going to try implementing in dhall-golang and will report back. |
|
@Gabriel439 Could you give an example (and a test?) of the desugaring with a field-path that is longer than two fields? E.g. Currently I'm unsure whether a |
|
@sjakobi: let _ = r in _ ⫽ { a = let _ = _.a in _ ⫽ { b = _.b ⫽ { c = x } } }I'll hold off on the test until we sort out whether |
What would be the performance implications if we'd produce at most one let _ = r in _ ⫽ { a = _.a ⫽ { b = _.a.b ⫽ { c = x } } } |
|
@sjakobi: The difference would probably be negligible in practice, but I'd expect the one with more |
|
@Nadrieril: So I took a stab at your suggestion and the main issue I ran into is that if we have a separate phase for desugaring (e.g. in between parsing and type-checking), then that would entail other changes to the AST, too. For example, we'd have to be able to represent the dotted field syntax (e.g. Another possible option is to keep the current desugaring treatment, but instead specify the behavior of |
|
Ok, thanks for trying, I do see the issue. |
|
@Nadrieril: Alright, I'll put up a parallel pull request soon with the non-desugaring route. I don't expect it will be too complicated: it will probably just be something like "to type-check/normalize a |
|
@Nadrieril: So I implemented what you suggested in the Haskell implementation and realized that there's one downside of doing things the way you suggest, which is that we don't have a good way to verify within our standard test suite that the desugaring was done correctly if it's always done "just in time" as part of type-checking and/or normalization. The parsing tests would only show that the Also, I don't think it's that bad to add the dependency on |
|
Also, for reference, here is the Haskell code for the alternative approach: https://github.com/dhall-lang/dhall-haskell/compare/gabriel/with_encoding It's not much different because the Haskell implementation was already using a separate constructor internally for |
|
@Nadrieril: Actually, I take that back. The fact that the exact encoding doesn't show up in the test output is actually a feature, because the idea was that implementations would now be free to use different encodings internally so long as they produce the same final result. |
|
Alright, the alternative approach is ready for review here: #1052 If that is approved then it will supersede this proposal |
|
For me, |
I think we do: a normalization test should do the trick \(x: {}) -> x with a.b.c = 42As you mention later, we indeed can't check desugaring to happen before nztion/tck, but that's the point. Hopefully the test above and a few tck tests should be enough |
philandstuff
left a comment
There was a problem hiding this comment.
I know this will likely be superseded, but I spotted a couple of issues
… as caught by @philandstuff Co-authored-by: Philip Potter <philip.g.potter@gmail.com>
… as caught by @philandstuff Co-authored-by: Philip Potter <philip.g.potter@gmail.com>
|
This is superseded by #1052 |
The motivation for this change is:
dhall-lang/dhall-haskell#1960
… where a deeply nested and chained
withexpression caused a blow-upin the size of the desugared syntax tree.
This alternative desugaring was actually an idea we contemplated in the
original change that standardized
with:#923
… but at the time we elected to go with the simpler desugaring, reasoning
that users could work around performance issues with explicit
letbindings:
#923 (comment)
However, for chained
withexpressions that work-around is notergonomic, as illustrated in
dhall-lang/dhall-haskell#1960. Instead,
this adds language support for the
letoptimization.The change to the
WithMultipletest demonstrates that this leads toa more compact desugared expression. Before this change the
expression
{ a.b = 1, c.d = 2 } with a.b = 3 with c.e = 4woulddesugar to:
{ a.b = 1, c.d = 2 } ⫽ { a = { a.b = 1, c.d = 2 }.a ⫽ { b = 3 } } ⫽ { c = ({ a.b = 1, c.d = 2 } ⫽ { a = { a.b = 1, c.d = 2 }.a ⫽ { b = 3 } }).c ⫽ { e = 4 } }… and now the same expression desugars to: