
Optimize large CPAs #164

Merged

sim642 merged 15 commits into traces from traces-cpa-optimize on Apr 8, 2021

Conversation

@sim642
Member

@sim642 sim642 commented Feb 3, 2021

In the traces branch, some new privatization implementations (mutex-oplus, mine, mine-nothread) keep a copy of everything in the local domain. Timing shows that this causes an unreasonable amount of time spent on equal, compare and hash of CPA, which are very naïve. We have physical equality and hashcons optimizations for the entire Spec, but this does nothing to improve the performance of Set operations in path-sensitivity, which I suspect is the culprit here.

Right now I made two changes, which notably improved the performance of those privatization implementations on some large benchmarks:

  1. Add physical equality check to equal.
  2. Add caching of hash. I did this by storing a lazy hash computation with the map.
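The two changes can be sketched roughly like this; `OptHash` here is an illustrative reconstruction based on the description above, not the actual code in this PR:

```ocaml
(* Hypothetical sketch of the two optimizations described above. *)
module OptHash (M : Hashtbl.HashedType) = struct
  type t = { value: M.t; hash: int Lazy.t }

  (* wrap a value, deferring the (expensive) hash computation *)
  let lift (v : M.t) : t =
    { value = v; hash = lazy (M.hash v) }

  (* 1. physical equality short-circuits the slow structural equal *)
  let equal x y = x == y || M.equal x.value y.value

  (* 2. the hash is computed at most once per lifted value *)
  let hash x = Lazy.force x.hash
end
```

The `lazy` thunk means values that are never hashed never pay for hashing, while repeated `hash` calls on the same lifted value are free after the first.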

Initially I thought using another hashcons more inside would leak a lot of memory by keeping lots of intermediate CPAs in the hashcons hashtable, so I didn't go that route. I later looked into BatHashcons and realized it uses Weak to avoid leaking memory that way, so it might be worth going that way after all.
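For reference, weak-table hashconsing in the style of BatHashcons looks roughly like the following sketch (illustrative only; names like `hobj` are my own):

```ocaml
(* Minimal weak-table hashconsing sketch, BatHashcons-style. *)
module Hashcons (H : Hashtbl.HashedType) = struct
  type hobj = { obj: H.t; tag: int; hcode: int }

  module W = Weak.Make (struct
    type t = hobj
    let equal a b = H.equal a.obj b.obj
    let hash a = a.hcode
  end)

  let table = W.create 1024
  let counter = ref 0

  (* Return the canonical representative. Because the table holds its
     entries weakly, unreferenced intermediate values (e.g. intermediate
     CPAs) can still be collected by the GC instead of leaking. *)
  let hashcons (x : H.t) : hobj =
    let tentative = { obj = x; tag = !counter; hcode = H.hash x } in
    let canonical = W.merge table tentative in
    if canonical == tentative then incr counter;
    canonical
end
```

After `hashcons`, structurally equal values are physically equal, so `equal` becomes `(==)` and `hash` can just read the precomputed `hcode` or `tag`.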

For now, this is just a draft PR on top of the traces branch, where it really matters. The current implementation is ugly and changes all of MapDomain actually, not just CPA. A cleaner implementation would be to have the hash caching/hashconsing as a MapDomain functor to just apply it to CPA, but the leaky abstractions and lacking interfaces in MapDomain make it difficult.

@sim642
Member Author

sim642 commented Feb 3, 2021

I now enforced MapDomain signatures and extracted OptHash into a functor applied just to CPA. To my surprise, this can make CPA.hash a few times slower, which isn't great news.

Considering our heavy use of functors, maybe flambda would help a lot, especially if one level of functor can make such a difference.

Commit: This is very slow because every add/remove in a fold-loop relifts.
@sim642
Member Author

sim642 commented Feb 4, 2021

I now tried hashconsing instead of my hash-caching. Obviously it makes equal and hash instant even on large maps, but all the slow hash computation is just shifted to the lift. Overall, this is much slower due to patterns like using add in a fold-loop to construct a new map: every intermediate add recalculates the hash slowly to lift the result. So hashconsing isn't actually the answer here!

Well, it may be the answer if we didn't use the add in a fold-loop pattern. I think we have often done that to combine filtering and mapping while avoiding the intermediate result between filter and map, whereas filter_map (if it were available on MapDomain) might be more appropriate. Although changes like this likely won't solve all the cases: sometimes such loops are over other data structures.
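To illustrate the pattern in question (a made-up example, not code from this PR): each `add` in the fold produces a fresh intermediate map, and under hashconsing each one would be re-lifted, i.e. re-hashed:

```ocaml
(* Combining filter + map via fold-with-add: every add rebuilds (and,
   under hashconsing, re-lifts) an intermediate map. *)
module M = Map.Make (String)

let keep_positive_doubled (m : int M.t) : int M.t =
  M.fold
    (fun k v acc -> if v > 0 then M.add k (2 * v) acc else acc)
    m M.empty
  (* With OCaml >= 4.11, a filter_map on the map type expresses this in
     one pass without the explicit fold:
       M.filter_map (fun _ v -> if v > 0 then Some (2 * v) else None) m *)
```

A `filter_map` on MapDomain could rebuild the map wholesale and lift only the final result once, instead of once per `add`.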

Another idea might be to combine the two approaches to get some kind of lazy hashconsing, which would only hashcons maps that hash is calculated on. It would avoid the slowdown of add in fold-loops as the map itself isn't hashed in such case intermediately. But it might further improve equal performance by replacing the still slow equal comparison (when not physically equal) with a comparison of tags. There are more cases there though, e.g. when calling equal with one hashconsed and one non-hashconsed map.
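A very rough sketch of that lazy-hashconsing idea (hypothetical; a plain Hashtbl stands in for the Weak table for brevity, so this version would leak):

```ocaml
(* Values are interned only when hash is first requested, so add-in-fold
   loops never pay for interning intermediate maps. *)
module LazyHC (H : Hashtbl.HashedType) = struct
  type t = { value: H.t; mutable tag: int option }

  module T = Hashtbl.Make (H)
  let table : int T.t = T.create 64
  let next = ref 0

  let lift v = { value = v; tag = None }  (* no interning, no hashing yet *)

  (* assign a tag on first use; structurally equal values share a tag *)
  let intern x =
    match x.tag with
    | Some t -> t
    | None ->
      let t =
        match T.find_opt table x.value with
        | Some t -> t
        | None -> let t = !next in incr next; T.add table x.value t; t
      in
      x.tag <- Some t;
      t

  let hash x = intern x  (* the tag doubles as a cheap hash *)

  (* tags are comparable only when both sides are interned; otherwise
     fall back to the (still slow) structural equality *)
  let equal x y =
    x == y
    || match x.tag, y.tag with
       | Some tx, Some ty -> tx = ty
       | _ -> H.equal x.value y.value
end
```

This shows both behaviours mentioned above: fast tag comparison once both sides have been hashed, and the structural fallback when one side is not yet interned.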

@sim642
Member Author

sim642 commented Apr 6, 2021

This has been sitting on the sidelines for now, but I think I'll have to see how this affects the newer heavy privatizations. I suppose they can still keep quite big CPAs around.

My biggest worry with this PR is that the stricter interfacing of MapDomains broke the compilation of PathSensitive3 in witness generation, because it used the internals of a lifted map. Although I fixed it to at least compile, it's very probable that things there no longer behave correctly and witness generation would be broken. And for the time being, I don't really want to spend time trying to test and possibly fix it.

@sim642 sim642 marked this pull request as ready for review April 6, 2021 14:26
@sim642 sim642 added the performance Analysis time, memory usage label Apr 6, 2021
@sim642
Member Author

sim642 commented Apr 6, 2021

Here's a comparison on some benchmarks from goblint/bench that shows how more involved privatizations (global-history, mine-W, mine-lazy, mine-global) get around 50% time improvement just from this change.

Before

[Screenshot: Benchmarks on goblint-new]

After

[Screenshot: Benchmarks on goblint-new (1)]

@sim642 sim642 merged commit a6d0a0b into traces Apr 8, 2021
@sim642 sim642 deleted the traces-cpa-optimize branch April 8, 2021 12:48