Add complexity estimation of iterating over HashSet and HashMap#97215
Add complexity estimation of iterating over HashSet and HashMap#97215bors merged 1 commit intorust-lang:masterfrom AngelicosPhosphoros:add_hashtable_iteration_complexity_note
Conversation
|
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
|
|
r? @thomcc (rust-highfive has picked a reviewer for you, use r? to override) |
|
I also tested what happens if limit amount of iteration by length of hash set and discovered that this slightly improve performance because hashbrown still searches items in empty buckets even after yielding all elements already. I don't think that this should be mentioned because
Benchmarks code
Criterion output
I don't think that there is any reason to include this benchmarks to our codebase because they don't have big value themselves. |
It is not obvious (at least for me) that complexity of iteration over hash tables depends on capacity and not length. Especially comparing with other containers like Vec or String. I think, this behaviour is worth mentioning. I run benchmark which tests iteration time for maps with length 50 and different capacities and get this results: ``` capacity - time 64 - 203.87 ns 256 - 351.78 ns 1024 - 607.87 ns 4096 - 965.82 ns 16384 - 3.1188 us ``` If you want to dig why it behaves such way, you can look current implementation in [hashbrown code](https://github.com/rust-lang/hashbrown/blob/f3a9f211d06f78c5beb81ac22ea08fdc269e068f/src/raw/mod.rs#L1933). Benchmarks code would be presented in PR related to this commit.
|
@bors r+ rollup |
|
📌 Commit de97d73 has been approved by |
…askrgr Rollup of 7 pull requests Successful merges: - rust-lang#97109 (Fix misleading `cannot infer type for type parameter` error) - rust-lang#97187 (Reverse condition in Vec::retain_mut doctest) - rust-lang#97201 (Fix typo) - rust-lang#97203 (Minor tweaks to rustc book summary formatting.) - rust-lang#97208 (Do not emit the lint `unused_attributes` for *inherent* `#[doc(hidden)]` associated items) - rust-lang#97215 (Add complexity estimation of iterating over HashSet and HashMap) - rust-lang#97220 (Add regression test for#81827) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
Add shortcircuit in iteration if we yielded all elements Current implementation works little slower than `set.iter().take(set.len())`. See my comment [here](rust-lang/rust#97215 (comment)). So why not avoid extra integer which added by `Iterator::take` if we can add limiting logic into our iterator itself? I don't really know how this change affects [reflect_toogle_full](https://github.com/rust-lang/hashbrown/blob/f3a9f211d06f78c5beb81ac22ea08fdc269e068f/src/raw/mod.rs#L2019) and implementation of [FusedIterator](https://github.com/rust-lang/hashbrown/blob/f3a9f211d06f78c5beb81ac22ea08fdc269e068f/src/raw/mod.rs#L2150). Maybe I should make inner iterator "jump" to the end of its memory block?
It is not obvious (at least for me) that complexity of iteration over hash tables depends on capacity and not length. Especially comparing with other containers like Vec or String. I think, this behaviour is worth mentioning.
I run benchmark which tests iteration time for maps with length 50 and different capacities and get this results:
If you want to dig why it behaves such way, you can look current implementation in hashbrown code.
Benchmarks code would be presented in PR related to this commit.