@wesm (Member) commented May 27, 2020

Local benchmarks suggest that there is 3-4 ns of per-call overhead for manipulating small bitmaps when SetBitsTo is not inlined.

inline (argument is byte size of bitmap):

--------------------------------------------------------
Benchmark                 Time           CPU Iterations
--------------------------------------------------------
SetBitsTo/2               3 ns          3 ns  271439203   733.294MB/s
SetBitsTo/16              2 ns          2 ns  308813758   6.49485GB/s
SetBitsTo/1024            9 ns          9 ns   79821710   109.078GB/s
SetBitsTo/131072       2029 ns       2029 ns     325563   60.1566GB/s

non-inline:

--------------------------------------------------------
Benchmark                 Time           CPU Iterations
--------------------------------------------------------
SetBitsTo/2               6 ns          6 ns  129335891    334.62MB/s
SetBitsTo/16              6 ns          6 ns  122741527   2.53134GB/s
SetBitsTo/1024           11 ns         11 ns   64547137   87.1395GB/s
SetBitsTo/131072       2010 ns       2010 ns     332558   60.7215GB/s

If it can be demonstrated that inlining meaningfully improves the macro-level performance of some function, then we could lift the implementation into a SetBitsToInline, but until then I think it makes sense to keep it in bit_util.cc.
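For readers unfamiliar with the operation being measured, here is a minimal sketch of what a header-defined, inlinable bit-range setter might look like. It is not the code in bit_util.cc: the name `SetBitsToSketch`, its signature, and the LSB-first bit numbering are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical sketch, not the actual implementation: set `length` bits,
// starting at bit offset `start`, to `value`. Bits are assumed to be numbered
// LSB-first within each byte.
inline void SetBitsToSketch(uint8_t* bits, int64_t start, int64_t length,
                            bool value) {
  if (length <= 0) return;
  const int64_t end = start + length;  // one past the last bit to set
  const uint8_t fill = value ? 0xFF : 0x00;
  const int64_t first_byte = start / 8;
  const int64_t last_byte = (end - 1) / 8;
  // Bits at or above `start` within the first byte.
  const uint8_t head_mask = static_cast<uint8_t>(0xFF << (start % 8));
  // Bits below `end` within the last byte.
  const uint8_t tail_mask = static_cast<uint8_t>(0xFF >> (7 - (end - 1) % 8));

  if (first_byte == last_byte) {
    // The whole range lives in one byte: combine both masks.
    const uint8_t mask = head_mask & tail_mask;
    bits[first_byte] =
        static_cast<uint8_t>((bits[first_byte] & ~mask) | (fill & mask));
    return;
  }
  // Partial first byte, whole bytes in the middle, partial last byte.
  bits[first_byte] =
      static_cast<uint8_t>((bits[first_byte] & ~head_mask) | (fill & head_mask));
  std::memset(bits + first_byte + 1, fill,
              static_cast<std::size_t>(last_byte - first_byte - 1));
  bits[last_byte] =
      static_cast<uint8_t>((bits[last_byte] & ~tail_mask) | (fill & tail_mask));
}
```

For bitmaps of a few bytes the work is a handful of mask operations, which is why a fixed function-call cost of a few nanoseconds shows up in the 2- and 16-byte rows but disappears into the memset cost at 131072 bytes.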

@wesm (Member, Author) commented May 28, 2020

+1
