Skip to content

Improve performance#103

Closed
lehins wants to merge 5 commits intointerface-to-performancefrom
improve-performance
Closed

Improve performance#103
lehins wants to merge 5 commits intointerface-to-performancefrom
improve-performance

Conversation

@lehins
Copy link
Copy Markdown
Collaborator

@lehins lehins commented Apr 21, 2020

This PR fixes:

  • Performance issue that I discovered when benchmarking generation of integral values in a range
  • Order of arguments in uniformR function was inconsistent with other range functions.

@curiousleo
Copy link
Copy Markdown
Collaborator

* Performance issue that I discovered when benchmarking generation of integral values in a range

Nice! How big is the performance boost?

@lehins
Copy link
Copy Markdown
Collaborator Author

lehins commented Apr 21, 2020

@curiousleo Pretty significant ~x20. Problem is when a function is recursive, then it cannot be inlined, which can hurt the performance really bad when that function is being called many many times.
In case of [un]signedBitmaskWithRejectionRM there was no need for them to be recursive.

Haven't tried the internal benchmarks, but running with criterion difference in performance was obvious.

@lehins lehins mentioned this pull request Apr 21, 2020
@lehins
Copy link
Copy Markdown
Collaborator Author

lehins commented Apr 21, 2020

@curiousleo Applied the same fix of removing recursion to Integer generation, got quite a bit of performance boost as well ;)

Benchmarks for ranges are available in the same place: https://github.com/lehins/haskell-benchmarks/tree/new-random/new-random-benchmarks

@curiousleo curiousleo force-pushed the improve-performance branch from 4055a8e to bb7b99a Compare April 23, 2020 11:34
@curiousleo
Copy link
Copy Markdown
Collaborator

Fixed a semantic merge conflict: 068aa36#diff-5b60458897b8b814de6d596bdc71e75cL210-R210

@curiousleo
Copy link
Copy Markdown
Collaborator

curiousleo commented Apr 23, 2020

According to the new benchmarks, this is an improvement for some types (e.g. Word64) and a performance regression for others (e.g. Word16).

                                         mean BEFORE    AFTER     error BEFORE     AFTER
pure/uniformR/full/Word16                mean 28.35 μs  17.96 ms  ( +- 1.383 μs  ) ( +- 1.369 ms  )
pure/uniformR/full/Word64                mean 1.552 ms  131.3 μs  ( +- 103.9 μs  ) ( +- 3.496 μs  )

Full comparison in compare.txt.

improve-performance-bb7b99a.txt
interface-to-performance-d03e325.txt

Copy link
Copy Markdown
Collaborator

@curiousleo curiousleo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amongst performance improvements, this also contains performance regressions: #103 (comment)

@curiousleo
Copy link
Copy Markdown
Collaborator

Ping @lehins - I've cherry-picked most performance improvements from here onto #116, which is now merged. I've played around with the optimisation you made to uniformIntegerM (making it non-recursive). This has strange side-effects: in my benchmarks, it makes other generators run slower. I won't pursue this particular optimisation further for now.

@lehins
Copy link
Copy Markdown
Collaborator Author

lehins commented May 8, 2020

I am closing this, since most of improvements were cherry picked

@lehins lehins closed this May 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants