|
Started a benchmark since this is very much in the hot path: https://github.com/shadow/benchmark/actions/runs/4306832877 Locally it seems this PR might be a slight perf gain: main: this pr: |
|
A possible reason for a performance gain - the C++ implementation ultimately uses |
|
Unfortunately it looks like it got slightly slower on the full benchmark: https://github.com/shadow/benchmark-results/blob/master/tor/2023-03-01-T18-40-39/plots/run_time.png I think the most likely suspect is SelfContainedMutex, and there is some room for optimization there.
For now I'd lean towards filing an issue to explore those optimizations, adding a comment in SelfContainedMutex referencing the issue, and merging the current form. |
|
I tried a couple optimizations.
I have a couple more things to try:
|
|
After rebasing and adding a large alignment to I'm on the fence whether to keep the large alignment; maybe I'll run one more benchmark without it. In the meantime I think this can be reviewed and merged with or without it. |
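For readers unfamiliar with what "a large alignment" looks like in Rust, here is a minimal sketch. The comment above doesn't say which type was aligned or by how much, so everything here is an assumption: `CacheAligned`, `Flags`, and `field_distance` are hypothetical names, 128 bytes is an illustrative value, and the guess is that the goal is to keep two hot atomics from sharing a cache line.

```rust
use std::sync::atomic::AtomicU32;

/// Hypothetical wrapper (not from this PR): `repr(align(128))` raises the
/// alignment of the wrapped value to 128 bytes, which also rounds its size
/// up to a multiple of 128.
#[repr(align(128))]
pub struct CacheAligned<T>(pub T);

/// Hypothetical stand-in for two independently updated pieces of shared state.
pub struct Flags {
    pub a: CacheAligned<AtomicU32>,
    pub b: CacheAligned<AtomicU32>,
}

/// Byte distance between the two fields of a `Flags` value.
pub fn field_distance(f: &Flags) -> usize {
    let a = &f.a as *const CacheAligned<AtomicU32> as usize;
    let b = &f.b as *const CacheAligned<AtomicU32> as usize;
    a.abs_diff(b)
}
```

The cost is size: each 4-byte atomic now occupies 128 bytes, which is one reason to benchmark with and without it.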
|
Also see #2791 - none of the candidate optimizations appear to help in microbenchmarking. |
Notably this removes the user-facing experimental options `preload_spin_max` and `use_explicit_block_message`, and hard-wires the current defaults of not spinning and not using an explicit block message (which would be redundant with spinning disabled). We've stopped using these options, and I haven't bothered reimplementing them in the Rust version of `IPCData`.
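As a rough illustration of what hard-wiring "not spinning" means, here is a generic spin-then-block wait. This is a sketch only, not the code in this PR: `wait_ready`, `WaitPath`, and the `block` callback are hypothetical, and the explicit block message is not modeled. With a spin limit of 0 the loop body never runs and the waiter goes straight to blocking.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Which path the waiter took (hypothetical, for illustration only).
#[derive(Debug, PartialEq)]
pub enum WaitPath {
    /// The flag was observed set while spinning; no blocking call was made.
    Spun,
    /// The spin budget ran out (or was zero), so the blocking path was taken.
    Blocked,
}

/// Poll `ready` up to `spin_max` times, then fall back to `block`.
/// `block` stands in for whatever blocking primitive the caller uses.
pub fn wait_ready(ready: &AtomicBool, spin_max: u32, block: impl FnOnce()) -> WaitPath {
    for _ in 0..spin_max {
        if ready.load(Ordering::Acquire) {
            return WaitPath::Spun;
        }
        // Hint to the CPU that this is a busy-wait loop.
        std::hint::spin_loop();
    }
    // With spin_max == 0 the loop above is skipped entirely.
    block();
    WaitPath::Blocked
}
```

Calling `wait_ready(&flag, 0, ...)` always takes the `Blocked` path, even if the flag is already set, which corresponds to the default behavior that is now fixed.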