Conversation

@codablock

See individual commits. Most of these are related to using unordered maps/sets instead of ordered ones and are the result of multiple profiling sessions. All these small optimizations add up to about 10% of CPU usage on the message handler thread.

This PR also contains a backport of bitcoin#13176, as I've also noticed unexpectedly high CPU usage in CNode::AddInventoryKnown.

laanwj and others added 8 commits April 11, 2019 09:00
… modulus with FastMod

9aac9f9 replace modulus with FastMod (Martin Ankerl)

Pull request description:

  Not sure if this optimization is necessary, but anyway I have some spare time so here it is. This replaces the slow modulo operation with a much faster 64-bit multiplication & shift. This works when the hash is uniformly distributed between 0 and 2^32-1. This speeds up the benchmark by a factor of about 1.3:

  ```
  RollingBloom, 5, 1500000, 3.73733, 4.97569e-07, 4.99002e-07, 4.98372e-07 # before
  RollingBloom, 5, 1500000, 2.86842, 3.81630e-07, 3.83730e-07, 3.82473e-07 # FastMod
  ```

  Be aware that this changes the internal data of the filter, so this should probably
  not be used for CBloomFilter because of interoperability problems.

Tree-SHA512: 04104f3fb09f56c9d14458a6aad919aeb0a5af944e8ee6a31f00e93c753e22004648c1cd65bf36752b6addec528d19fb665c27b955ce1666a85a928e17afa47a
In one of my profiling sessions with many InstantSend transactions
happening, calls into CSporkManager added up to about 1% of total CPU time.
This is easily avoidable by using unordered maps.
Due to the batched pruning, there is no need to maintain an ordered map
of values anymore. Only when the map grows beyond nPruneAfterSize is there
a need to create a temporary ordered vector of values to figure out what
can be removed.
… on demand

CNode::AskFor will now push entries into an initially unordered vector
instead of an ordered multimap. Only when we later want to use vecAskFor in
SendMessages, we sort the vector.

The vector will actually be mostly sorted in most cases, as insertion order
usually mimics the desired ordering. Only the last few entries might need
some shuffling around. Doing the sort on demand should be less wasteful
than trying to maintain correct order all the time.
@codablock codablock added this to the 14.0 milestone Apr 11, 2019
UdjinM6
UdjinM6 previously approved these changes Apr 11, 2019
@UdjinM6 UdjinM6 left a comment

utACK

I'm surprised that tiny sporks and args map had such a noticeable impact tbh, sounds very strange to me.

@codablock

mapArgs was not that bad, I just did it while I was in optimization mode :)

The sporks are a different thing. We now perform a lot of IsNewInstantSendEnabled calls, which summed up to 1% of total time spent. I assume the actual reason is the temporary mapValueCounts map, which we should try to avoid in an upcoming refactoring. Replacing the maps with unordered maps just looked like the fastest/cheapest way to optimize it a tiny little bit.

@UdjinM6 UdjinM6 dismissed their stale review April 11, 2019 07:53

travis fails

This ensures that future backports that depend on limitedmap's ordering
conflict, so that we are made aware of needed action.
@codablock

Added a few commits to fix Travis and also ensure that future backports don't go wrong.

@UdjinM6 UdjinM6 left a comment

utACK

@UdjinM6 UdjinM6 merged commit 241f76f into dashpay:develop Apr 11, 2019
@codablock codablock deleted the pr_optimizations branch October 10, 2019 15:14