-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Collection of minor performance optimizations #2855
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… modulus with FastMod 9aac9f9 replace modulus with FastMod (Martin Ankerl) Pull request description: Not sure if this is optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3: ``` RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod ``` Be aware that this changes the internal data of the filter, so this should probably not be used for CBloomFilter because of interoperability problems. Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a
In one of my profiling sessions with many InstantSend transactions happening, calls into CSporkManager added up to about 1% of total CPU time. This is easily avoidable by using unordered maps.
Due to the batched pruning, there is no need to maintain an ordered map of values anymore. Only when nPruneAfterSize, there is a need to create a temporary ordered vector of values to figure out what can be removed.
… on demand CNode::AskFor will now push entries into an initially unordered vector instead of an ordered multimap. Only when we later want to use vecAskFor in SendMessages, we sort the vector. The vector will actually be mostly sorted in most cases as insertion order usually mimics the desired ordering. Only the last few entries might need some shuffling around. Doing the sort on-demand should be less wasteful then trying to maintain correct order all the time.
UdjinM6
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK
I'm surprised that tiny sporks and args map had such a noticeable impact tbh, sounds very strange to me.
|
The sporks is a different thing. We now perform a lot of |
This ensures that future backports that depends on limitedmap's ordering conflict so that we are made aware of needed action.
|
Added a few commits to fix Travis and also ensure that future backports don't go wrong. |
UdjinM6
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK
See individual commits. Most of these are related to using unordered maps/sets instead of ordered ones and are the result of multiple profiling sessions. All these small optimizations add up to about 10% of CPU usage on the message handler thread.
This PR also contains a backport of bitcoin#13176 as I've also noticed unexpected high CPU use for
CNode::AddInventoryKnown.