@PragTob PragTob commented Dec 13, 2023

Should combat memory consumption issues caused by sending too much data to processes.

edit: To highlight the severity of this:

  • there is a benchmark I'm working on that I almost couldn't run before this, as my 32GB of RAM would not be enough
  • it also hung forever before calculating statistics and running the formatters
  • in that benchmark I used :save to compare performance across Elixir versions; before this PR each individual save file was 200+MB. With this PR, it's 1MB (as it removes the data from formatters, and :save is a formatter under the hood). The file size isn't the major thing, but it illustrates how much memory we save and how much time copying memory we avoid.

I wanna write up all of this in like 2 blog posts but as per usual my speed is abysmal 😁
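A minimal sketch of why sending less helps (hypothetical map shapes, not Benchee's actual structs): BEAM messages are deep-copied, so any large field still attached to a scenario gets duplicated into every process that receives it, while a stripped-down map travels cheaply.

```elixir
# Hypothetical data shapes for illustration only.
big_input = String.duplicate("x", 1_000_000)
scenario = %{name: "flat_map", input: big_input, average: 4.02}

# Sending `scenario` to another process would copy the ~1MB input along.
# Dropping the large field first means only the small numbers travel.
lean = %{scenario | input: nil}

task = Task.async(fn -> lean.average end)
Task.await(task)
```

Dropping the large field before the send is the whole trick; the receiving process only ever needed the small numbers it crunches.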

Done to avoid copying potentially huge inputs and functions to
the processes when all they need to do is crunch some silly small
numbers. HUGE performance gain while reducing memory usage a lot
for benchmarks with bigger data in use.

Also use Task.async_stream instead of our good old `Parallel.map`,
as with the dawn of `inputs` I have seen people do 30+ scenarios
(10 benchmarks with 3 inputs will do that to you), where if we're
running on a 4 core system we might be doing too much in parallel,
also potentially skyrocketing memory consumption.
This includes a slight workaround to preserve the names of inputs
as people may be relying on those. Doesn't feel great but...
somehow reasonable?
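The Task.async_stream switch can be sketched like this (illustrative workload only, not Benchee's actual code): `max_concurrency` defaults to the number of schedulers online, which is what keeps 30+ scenarios from all running at once on a 4 core box.

```elixir
# Bounded-concurrency mapping: at most `schedulers_online` items are
# processed at a time, and results come back in input order.
results =
  1..30
  |> Task.async_stream(fn n -> n * n end,
    max_concurrency: System.schedulers_online(),
    ordered: true
  )
  |> Enum.map(fn {:ok, squared} -> squared end)
```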

PragTob commented Dec 14, 2023

Mental notes:

  1. We can be even more efficient for the most common case of just 1 formatter if we just don't do any parallel processing in that case; the difference is small but exists and should be an easy fix:
```
Name                        ips        average  deviation         median         99th %
sequential_output        240.79        4.15 ms     ±5.70%        4.13 ms        4.34 ms
format & write           240.00        4.17 ms     ±3.71%        4.13 ms        4.82 ms
Formatter.output         228.23        4.38 ms    ±16.58%        4.07 ms        6.34 ms

Comparison:
sequential_output        240.79
format & write           240.00 - 1.00x slower +0.0138 ms
Formatter.output         228.23 - 1.06x slower +0.23 ms
```
  2. might want to provide "input_names" to use instead of the broken list 👀

  3. adjust docs & Changelog
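Note 1 could look roughly like this (hypothetical module and formatter shape, not the actual implementation): pattern-match on a single-formatter list and run it inline, falling back to bounded parallelism otherwise.

```elixir
defmodule FormatterFastPath do
  # With a single formatter there is nothing to parallelize:
  # run it inline, skipping process spawning and data copying.
  def output_all(suite, [formatter]), do: [formatter.(suite)]

  # With several formatters, format in parallel with bounded concurrency.
  def output_all(suite, formatters) do
    formatters
    |> Task.async_stream(fn formatter -> formatter.(suite) end)
    |> Enum.map(fn {:ok, output} -> output end)
  end
end
```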


PragTob commented Dec 14, 2023

🥳

Now, that scenario is finally fast/fastest in that lil benchmark:

```
Name                        ips        average  deviation         median         99th %
Formatter.output         248.71        4.02 ms     ±4.58%        4.03 ms        4.59 ms
sequential_output        248.38        4.03 ms     ±3.16%        4.00 ms        4.53 ms
format & write           247.77        4.04 ms     ±6.76%        4.02 ms        4.35 ms

Comparison:
Formatter.output         248.71
sequential_output        248.38 - 1.00x slower +0.00539 ms
format & write           247.77 - 1.00x slower +0.0153 ms
```

PragTob commented Dec 14, 2023

I definitely should get exguard working again... the number of force pushes due to minor credo things is embarrassing ._.


PragTob commented Dec 14, 2023

👋 Hey @hrzndhrn @NickNeck, as you have some of the most Benchee plugins I'd appreciate it if you took a look here, mainly from the PoV of the impact on plugins. If you want to review the PR as a whole, of course I'm happy too 😁

TL;DR:

  • sending a lot of data between processes: VERY BAD
  • formatters lose access to Scenario.function and Scenario.input; Configuration.inputs will only retain the names, not the values

Thanks!
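For plugin authors, the practical change can be sketched like this (the map literal is illustrative, and treat the exact `input_name` field as an assumption mirroring the retained-names idea): read the retained name instead of the now-absent value.

```elixir
# Roughly what a scenario arriving at a formatter looks like after this
# change: the heavy fields are gone, only names and stats remain.
scenario = %{name: "flat_map", function: nil, input: nil, input_name: "Medium"}

# A plugin should label its output via the retained name, not the value:
label = "#{scenario.name} (#{scenario.input_name})"
```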

@NickNeck
Contributor

@PragTob thanks for the hint. Thanks for Benchee and all your work. I will take a look during the weekend.


PragTob commented Dec 15, 2023

@NickNeck thanks for all your great OSS work, and specifically for creating Benchee plugins. It was designed to facilitate and allow this, and seeing all the things people do warms my heart! 💚


@NickNeck NickNeck left a comment


LGTM 👍 I think the necessary changes for plugins are not too hard and the benefit is worth it.


PragTob commented Dec 17, 2023

@NickNeck thanks a lot! 💚


@PragTob PragTob merged commit 59f9886 into main Dec 17, 2023
@PragTob PragTob deleted the memory branch December 17, 2023 08:30
PragTob added a commit that referenced this pull request Dec 19, 2023
This reverts commit dfa6987.

It turned out that the parallel processing wasn't the issue
with memory consumption we faced but instead copying data
to processes.

See:
* #408
* #414
* https://pragtob.wordpress.com/2023/12/18/careful-what-data-you-send-or-how-to-tank-your-performance-with-task-async/