Skip to content

Replace pair_list with hash table#1128

Merged
asvetlov merged 8 commits intomasterfrom
ht
Jun 17, 2025
Merged

Replace pair_list with hash table#1128
asvetlov merged 8 commits intomasterfrom
ht

Conversation

@asvetlov
Copy link
Member

@asvetlov asvetlov commented Apr 5, 2025

No description provided.

@codspeed-hq
Copy link

codspeed-hq bot commented Apr 7, 2025

CodSpeed Performance Report

Merging #1128 will degrade performances by 77.96%

Comparing ht (b2f8227) with master (bdbbc48)

Summary

⚡ 129 improvements
❌ 36 (👁 35) regressions
✅ 79 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
👁 test_cimultidict_add_istr[c-extension-module] 2.7 ms 3.5 ms -23.87%
👁 test_cimultidict_add_istr[pure-python-module] 26.7 ms 99.7 ms -73.25%
test_cimultidict_delitem_istr[c-extension-module] 88.6 µs 64.9 µs +36.6%
test_cimultidict_delitem_istr[pure-python-module] 1,600.2 µs 950.7 µs +68.32%
👁 test_cimultidict_extend_istr[c-extension-module] 2.4 ms 2.9 ms -17.11%
👁 test_cimultidict_extend_istr[pure-python-module] 87.7 ms 161.3 ms -45.63%
👁 test_cimultidict_extend_istr_with_kwargs[c-extension-module] 6.3 ms 7.5 ms -15.95%
👁 test_cimultidict_extend_istr_with_kwargs[pure-python-module] 110.2 ms 269.1 ms -59.04%
test_cimultidict_fetch_istr[c-extension-module] 61.6 µs 47.4 µs +29.94%
test_cimultidict_fetch_istr[pure-python-module] 869.7 µs 502 µs +73.24%
test_cimultidict_get_istr_hit[c-extension-module] 72.7 µs 59.5 µs +22.25%
test_cimultidict_get_istr_hit[pure-python-module] 867.8 µs 498.4 µs +74.12%
test_cimultidict_get_istr_hit_with_default[c-extension-module] 74.7 µs 61.6 µs +21.28%
test_cimultidict_get_istr_hit_with_default[pure-python-module] 868.9 µs 500.3 µs +73.68%
test_cimultidict_get_istr_miss[c-extension-module] 74.6 µs 48.1 µs +54.99%
test_cimultidict_get_istr_miss[pure-python-module] 1,368.5 µs 484.6 µs ×2.8
test_cimultidict_get_istr_with_default_miss[c-extension-module] 76.7 µs 50.4 µs +52.2%
test_cimultidict_get_istr_with_default_miss[pure-python-module] 1,370.5 µs 485.8 µs ×2.8
👁 test_cimultidict_getall_istr_hit[pure-python-module] 54.5 µs 101.6 µs -46.35%
test_cimultidict_getall_istr_miss[c-extension-module] 16.4 µs 14.8 µs +10.61%
... ... ... ... ...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

@asvetlov
Copy link
Member Author

asvetlov commented Apr 7, 2025

Heh. Appending new values to the multidict is more expensive, all other ops are faster.

We have a tradeoff, as usual. I think that the lookup is a much more often operation than the multidict filling.
Also, pls keep in mind that the item replacement/deletion also requires a lookup.

@asvetlov
Copy link
Member Author

asvetlov commented Apr 7, 2025

The PR is more-or-less done but I'd like to make some polishing and self-review later.
Please don't merge it, I'll be only partially available next month, I cannot commit that I'll have enough time for fixing issues when the new version will be released.

Careful testing is appreciated!

New multidict is close to Python's dict except for multiple keys, of course.

It starts from the empty hashtable, which grows by a power of 2 starting from 8: 8, 16, 32, 64, 128, ...
The amount of items is 2/3 of the hashtable size (1/3 of the table is never allocated).

The table is resized if needed, and bulk updates (extend(), update(), and constructor calls) pre-allocate many items at once, reducing the amount of potential hashtable resizes.

Item deletion puts DKIX_DUMMY special index in the hashtable. In opposite to the standard dict, DKIX_DUMMY is never replaced with an index of the new entry except by hashtable indices rebuild. It allows to keep the insertion order for multiple equal keys.

The iteration for operations like getall() is a little tricky. The next index calculation could return the already visited index before reaching the end. To eliminate duplicates, the code marks already visited entries by entry->hash = -1. -1 hash is an invalid hash value that could be used as a marker. After the iteration finishes, all marked entries are restored.
Double iteration over the indices still has O(1) amortized time, it is ok.

.add(), val = md[key], md[key] = val, md.setdefault() all have O(1).
.getall() / .popall() have O(N) where N is the amount of returned items.
.update() / extend() have O(N+M) where N and M are amount of items in the left and right arguments (we had quadratic time here before).
popitem() is slightly slower because the function should calculate the index of the last (deleted) entry. Since the method is really rare; I don't care too much.

.copy() is super fast; construction multidict from another multidict could be optimized to reuse .copy() approach.

The performance is okay, multidict creation is slightly slower because the hashtable should be recalculated and the indices table rebuilds, but all other operations are faster.

Multidict is still slightly slower than the regular dict because I don't want to use the private API for accessing internal Python structures, the most notable is the string's hash.

TODO:

  1. Optimize MultiDict(md) construction.
  2. GC untrack the multidict if all keys/values are not tracked. In aiohttp, values are usually untracked str instanced.

Open question: Should we use HT_PERTURB_SHIFT in the index calculation? It is crucial for storing integers where hash(num) == num, but I doubt if it decreases the number of collisions for str keys. During the work on PR I saw many times when idx = next(idx) didn't change the current idx or it was like 1, 2, 1, 3 with a relatively high chance of duplication.
It may not be worth changing. I have no idea how to prove this except by calculating the collision statistics on a large enough number of keys.

If anybody wants to play with the code, two things could help debugging.

  1. Uncomment # CFLAGS = ["-O0", "-g3", "-UNDEBUG"] line in setup.py to make a debugging build. It enables asserts and ASSERT_CONSISTENT(...) self-consistency check.
  2. _ht_dump(...) function prints the current hashtable structure in a form useful for analyzing the internal structure.

Please feel free to experiment with and ask any questions.

@codecov
Copy link

codecov bot commented Apr 8, 2025

Codecov Report

❌ Patch coverage is 96.56652% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 98.31%. Comparing base (bdbbc48) to head (b2f8227).
⚠️ Report is 66 commits behind head on master.

Files with missing lines Patch % Lines
multidict/_multidict_py.py 96.00% 1 Missing and 15 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1128      +/-   ##
==========================================
- Coverage   98.62%   98.31%   -0.32%     
==========================================
  Files          27       27              
  Lines        3566     3851     +285     
  Branches      561      700     +139     
==========================================
+ Hits         3517     3786     +269     
- Misses         17       18       +1     
- Partials       32       47      +15     
Flag Coverage Δ
CI-GHA 98.31% <96.56%> (-0.32%) ⬇️
MyPy 80.35% <36.65%> (-2.83%) ⬇️
pytest 99.85% <98.85%> (-0.15%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Vizonex
Copy link
Member

Vizonex commented Jun 8, 2025

Is there anything I can help out with in this pull request? I can always look into cloning this branch if anybody wants.

@Vizonex
Copy link
Member

Vizonex commented Jun 9, 2025

So I tested this fork on my computer and it all runs correctly. I guess the only real issue now is how we get code coverage to cover the missing lines. maybe adding a few more tests could do it?

@Vizonex
Copy link
Member

Vizonex commented Jun 9, 2025

https://app.codecov.io/gh/aio-libs/multidict/pull/1128/blob/multidict/_multidict_py.py#L819
@asvetlov I wrote a test that passes through line 819 I wonder if this was the last thing needed for code-coverage other than that the think I can think of that you didn't do is adding a summary of the changes made in the changes folder
You'll have to put this down into test_update.py since I tried deleting mine and then attempted forking this one to make changes which didn't work.

def test_update_with_second_md(any_multidict_class: _MD_Classes) -> None:
    obj1 = any_multidict_class()
    obj2 = any_multidict_class([("a", 2)])
    obj1.update(obj2)
    assert obj1 == obj2

@Vizonex
Copy link
Member

Vizonex commented Jun 10, 2025

Ignore what I sent before, I sent you a PR. asvetlov#1

@asvetlov
Copy link
Member Author

@Vizonex thanks!

@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Jun 10, 2025
@asvetlov asvetlov marked this pull request as ready for review June 10, 2025 10:38
@asvetlov
Copy link
Member Author

asvetlov commented Jun 10, 2025

I think the PR is ready for review.
Performance degradation for bare .add() and .extend() is acceptable, as it requires a hashtable lookup instead of simply adding to the end of the array.
In real life, most multidict operations require a lookup, which is now 25-50% faster.

The change is backward compatible and shouldn't affect the library users.

@asvetlov asvetlov requested review from Dreamsorcerer and bdraco June 10, 2025 10:42
@Dreamsorcerer
Copy link
Member

I'm not too familiar with this, so I'll leave it with the others. Certainly looking forward to the performance improvement though.

@Vizonex
Copy link
Member

Vizonex commented Jun 10, 2025

How is code coverage not hitting the same line I thought I had covered? I'm gonna have to re-patch it. This is confusing 😕

@asvetlov
Copy link
Member Author

Now the test coverage is better.
The only missed things are checks like if identity == entry.identity: ...
They are essential in case of hash collisions: two different strings have the same hash.
Unfortunately, I have no idea how to generate this pair of strings, especially for 8-byte hash and the python random hash seed.
I suggest marking these lines as # pragma: no branch and that's it.

@Dreamsorcerer
Copy link
Member

They are essential in case of hash collisions: two different strings have the same hash.

Create a str subclass with a hash implementation?

class CollisionStr(str):
    def __hash__(self):
        return 42

>>> hash(CollisionStr("foo"))
42

Or something similar.

@asvetlov
Copy link
Member Author

Create a str subclass with a hash implementation?

I doubt if it is acceptable; multidict uses exact str instances as identities.
If I pass a custom string as a key, the library always converts it to an exact string for future processing.

Perhaps I could mock _CSMixin._identity() and _CIMixin._identity(); but again, it is not the best idea.
Our test suite is designed in a way that each test is used to check both Python and C implementations.
The second one is much more important.
Perhaps, I could write a helper that explicitly rewrites the calculated hash for an existing Python string using direct memory access to the corresponding C structure.
It should work with CPython, but really, do we need such a hack in our tests?

@Dreamsorcerer
Copy link
Member

Hmm, yeah, sounds tricky.

@Vizonex
Copy link
Member

Vizonex commented Jun 12, 2025

I love the changes being done to multidict, anything I can do this time to keep things moving forward?

@Vizonex
Copy link
Member

Vizonex commented Jun 12, 2025

@asvetlov I think the debug workflows you put in all stopped. I'm wondering if 40 minutes would be better?

@asvetlov
Copy link
Member Author

asvetlov commented Jun 16, 2025

@Vizonex test_update_large_dict test added for explicit check large (>=2^17) multidict stops the world on CI.
Locally, it takes only 8 secs on my old laptop, the modern one executes it even faster.
I think I should cut it out and add an extra pragma: no cover to keep the formal coverage metrics.

@Vizonex
Copy link
Member

Vizonex commented Jun 16, 2025

@Vizonex test_update_large_dict test added for explicit check large (>=2^17) multidict stops the world on CI. Locally, it takes only 8 secs on my old laptop, the modern one executes it even faster. I think I should cut it out and add an extra pragma: no cover to keep the formal coverage metrics.

Agreed

@asvetlov asvetlov merged commit 9d3c53f into master Jun 17, 2025
62 of 64 checks passed
@asvetlov asvetlov deleted the ht branch June 17, 2025 12:38
@Vizonex
Copy link
Member

Vizonex commented Jun 17, 2025

@asvetlov Great Job on getting this passed I'll get started on giving multidict cython support very soon.

@bdraco
Copy link
Member

bdraco commented Jun 17, 2025

@Vizonex
Copy link
Member

Vizonex commented Jun 18, 2025

I triggered a manual dependabot run:

yarl performance changes: https://codspeed.io/aio-libs/yarl/branches/dependabot%2Fpip%2Fmultidict-6.5.0 aiohttp performance changes: https://codspeed.io/aio-libs/aiohttp/branches/dependabot%2Fpip%2F3.13%2Fmultidict-6.5.0

That's amazing I hope adding multidict to cython will further enhance aiohttp it as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bot:chronographer:provided There is a change note present in this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants