Skip to content

Conversation

@radrow
Copy link
Contributor

@radrow radrow commented Feb 2, 2021

Description of the change

In short:

When you dfs through a tree and launch the same dfs on each node then you go exponential.

In long:

I have spotted that function definitions which are big enough (see the added testcase) tend to blow up the memory and take extreme amount of time to handle.

The issue was triggered particularly in the collapseBindingGroups as it was running a traversal through the whole AST of each declaration that on every let was calling the collapseBindingGroups which was running a traversal through the whole AST of each declaration that on every let was calling the collapseBindingGroups which was running a traversal through the whole AST of each declaration that on every let was calling the collapseBindingGroups which was running a traversal(...)

More clearly, collapseBindingGroups used to run collapseBindingGroupsForValue on each AST node which was running collapseBindingGroups on every let expression. This was effectively forcing all inner lets to be visited twice which, by the fact, that each of these duplicated evaluations was running next two ones, was amplifying the problem leading to exponential complexity over the depth of a tree.


Since the purpose of this function is to flatten the binding groups of the declarations across the whole AST, I changed the folding function for everywhereOnValues to simply flatten a single layer of declarations. It works perfectly as going deeper is already handled by everywhereOnValues.

I have prepared a testcase that fails on the master branch but is successfully solved after the fixes.


Work on this PR was funded by aeternity

@hdgarrood
Copy link
Contributor

Great work! This looks sensible to me. Does this new test fail without this fix, at least on machines where you don’t have an enormous amount of memory?

@radrow
Copy link
Contributor Author

radrow commented Feb 2, 2021

I tested it on my PC and it failed on master with 32GB of RAM and i7 9k CPU. Last words of stack test:

purescript    >       'BigFunction.purs' should compile and run without error:                                                                                                                                                           
purescript    > Test suite tests failed
Completed 3 action(s).  
Test suite failure for package purescript-0.14.0
    tests:  exited with: ExitFailure (-9)
Logs printed to console

Copy link
Contributor

@hdgarrood hdgarrood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks again

Copy link
Member

@rhendric rhendric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

@kl0tl
Copy link
Member

kl0tl commented Feb 5, 2021

The new test takes ~3 minutes to compile on my machine. This isn’t noticeable with the absurdly long build times we get with Travis these days but this will be more noticeable when we’ll switch to GitHub actions. Is that ok or should we try to exhibit a test case that compiles faster?

@natefaubion
Copy link
Contributor

The new test takes ~3 minutes to compile on my machine. This isn’t noticeable with the absurdly long build times we get with Travis these days but this will be more noticeable when we’ll switch to GitHub actions. Is that ok or should we try to exhibit a test case that compiles faster?

I think it would be preferable, least of all because this makes running tests in development quite an ordeal. I wouldn't care as much if this was only run in CI (or with a CLI flag).

@hdgarrood
Copy link
Contributor

I suppose since this was exponential before, the smallest program that's big enough to be likely to cause an OOM could potentially be a fair bit smaller than this test.

@radrow
Copy link
Contributor Author

radrow commented Feb 8, 2021

Okay, I have shrinked the test case to just two "repetitions" as it still breaks the build on my PC on master and passes within 16 seconds on my branch. Is that acceptable?

@hdgarrood
Copy link
Contributor

I think 16 seconds is a bit of a drag, but just about manageable. Ideally we'd have a slightly more sophisticated mechanism for detecting performance regressions than just writing test cases that are likely to fail with OOM errors if things regress, but that's a whole project in of itself. Any objections to merging this as-is?

@hdgarrood hdgarrood merged commit b7a8616 into purescript:master Mar 19, 2021
@radrow radrow deleted the exp-bind-fix branch March 19, 2021 14:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants