Replace pair_list with hash table by asvetlov · Pull Request #1128 · aio-libs/multidict

asvetlov · 2025-04-05T19:27:14Z

No description provided.

codspeed-hq · 2025-04-07T14:39:33Z

CodSpeed Performance Report

Merging #1128 will degrade performances by 77.96%

_{Comparing ht (b2f8227) with master (bdbbc48)}

Summary

⚡ 129 improvements
❌ 36 (👁 35) regressions
✅ 79 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

	Benchmark	`BASE`	`HEAD`	Change
👁	`test_cimultidict_add_istr[c-extension-module]`	2.7 ms	3.5 ms	-23.87%
👁	`test_cimultidict_add_istr[pure-python-module]`	26.7 ms	99.7 ms	-73.25%
⚡	`test_cimultidict_delitem_istr[c-extension-module]`	88.6 µs	64.9 µs	+36.6%
⚡	`test_cimultidict_delitem_istr[pure-python-module]`	1,600.2 µs	950.7 µs	+68.32%
👁	`test_cimultidict_extend_istr[c-extension-module]`	2.4 ms	2.9 ms	-17.11%
👁	`test_cimultidict_extend_istr[pure-python-module]`	87.7 ms	161.3 ms	-45.63%
👁	`test_cimultidict_extend_istr_with_kwargs[c-extension-module]`	6.3 ms	7.5 ms	-15.95%
👁	`test_cimultidict_extend_istr_with_kwargs[pure-python-module]`	110.2 ms	269.1 ms	-59.04%
⚡	`test_cimultidict_fetch_istr[c-extension-module]`	61.6 µs	47.4 µs	+29.94%
⚡	`test_cimultidict_fetch_istr[pure-python-module]`	869.7 µs	502 µs	+73.24%
⚡	`test_cimultidict_get_istr_hit[c-extension-module]`	72.7 µs	59.5 µs	+22.25%
⚡	`test_cimultidict_get_istr_hit[pure-python-module]`	867.8 µs	498.4 µs	+74.12%
⚡	`test_cimultidict_get_istr_hit_with_default[c-extension-module]`	74.7 µs	61.6 µs	+21.28%
⚡	`test_cimultidict_get_istr_hit_with_default[pure-python-module]`	868.9 µs	500.3 µs	+73.68%
⚡	`test_cimultidict_get_istr_miss[c-extension-module]`	74.6 µs	48.1 µs	+54.99%
⚡	`test_cimultidict_get_istr_miss[pure-python-module]`	1,368.5 µs	484.6 µs	×2.8
⚡	`test_cimultidict_get_istr_with_default_miss[c-extension-module]`	76.7 µs	50.4 µs	+52.2%
⚡	`test_cimultidict_get_istr_with_default_miss[pure-python-module]`	1,370.5 µs	485.8 µs	×2.8
👁	`test_cimultidict_getall_istr_hit[pure-python-module]`	54.5 µs	101.6 µs	-46.35%
⚡	`test_cimultidict_getall_istr_miss[c-extension-module]`	16.4 µs	14.8 µs	+10.61%
...	...	...	...	...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

asvetlov · 2025-04-07T14:50:14Z

Heh. Appending new values to the multidict is more expensive, all other ops are faster.

We have a tradeoff, as usual. I think that the lookup is a much more often operation than the multidict filling.
Also, pls keep in mind that the item replacement/deletion also requires a lookup.

asvetlov · 2025-04-07T20:38:26Z

The PR is more-or-less done but I'd like to make some polishing and self-review later.
Please don't merge it, I'll be only partially available next month, I cannot commit that I'll have enough time for fixing issues when the new version will be released.

Careful testing is appreciated!

New multidict is close to Python's dict except for multiple keys, of course.

It starts from the empty hashtable, which grows by a power of 2 starting from 8: 8, 16, 32, 64, 128, ...
The amount of items is 2/3 of the hashtable size (1/3 of the table is never allocated).

The table is resized if needed, and bulk updates (extend(), update(), and constructor calls) pre-allocate many items at once, reducing the amount of potential hashtable resizes.

Item deletion puts DKIX_DUMMY special index in the hashtable. In opposite to the standard dict, DKIX_DUMMY is never replaced with an index of the new entry except by hashtable indices rebuild. It allows to keep the insertion order for multiple equal keys.

The iteration for operations like getall() is a little tricky. The next index calculation could return the already visited index before reaching the end. To eliminate duplicates, the code marks already visited entries by entry->hash = -1. -1 hash is an invalid hash value that could be used as a marker. After the iteration finishes, all marked entries are restored.
Double iteration over the indices still has O(1) amortized time, it is ok.

.add(), val = md[key], md[key] = val, md.setdefault() all have O(1).
.getall() / .popall() have O(N) where N is the amount of returned items.
.update() / extend() have O(N+M) where N and M are amount of items in the left and right arguments (we had quadratic time here before).
popitem() is slightly slower because the function should calculate the index of the last (deleted) entry. Since the method is really rare; I don't care too much.

.copy() is super fast; construction multidict from another multidict could be optimized to reuse .copy() approach.

The performance is okay, multidict creation is slightly slower because the hashtable should be recalculated and the indices table rebuilds, but all other operations are faster.

Multidict is still slightly slower than the regular dict because I don't want to use the private API for accessing internal Python structures, the most notable is the string's hash.

TODO:

~~Optimize MultiDict(md) construction.~~
~~GC untrack the multidict if all keys/values are not tracked. In aiohttp, values are usually untracked str instanced.~~

Open question: Should we use HT_PERTURB_SHIFT in the index calculation? It is crucial for storing integers where hash(num) == num, but I doubt if it decreases the number of collisions for str keys. During the work on PR I saw many times when idx = next(idx) didn't change the current idx or it was like 1, 2, 1, 3 with a relatively high chance of duplication.
It may not be worth changing. I have no idea how to prove this except by calculating the collision statistics on a large enough number of keys.

If anybody wants to play with the code, two things could help debugging.

Uncomment # CFLAGS = ["-O0", "-g3", "-UNDEBUG"] line in setup.py to make a debugging build. It enables asserts and ASSERT_CONSISTENT(...) self-consistency check.
_ht_dump(...) function prints the current hashtable structure in a form useful for analyzing the internal structure.

Please feel free to experiment with and ask any questions.

codecov · 2025-04-08T09:49:38Z

Codecov Report

❌ Patch coverage is 96.56652% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 98.31%. Comparing base (bdbbc48) to head (b2f8227).
⚠️ Report is 66 commits behind head on master.

Files with missing lines	Patch %	Lines
multidict/_multidict_py.py	96.00%	1 Missing and 15 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1128      +/-   ##
==========================================
- Coverage   98.62%   98.31%   -0.32%     
==========================================
  Files          27       27              
  Lines        3566     3851     +285     
  Branches      561      700     +139     
==========================================
+ Hits         3517     3786     +269     
- Misses         17       18       +1     
- Partials       32       47      +15

Flag	Coverage Δ
CI-GHA	`98.31% <96.56%> (-0.32%)`	⬇️
MyPy	`80.35% <36.65%> (-2.83%)`	⬇️
pytest	`99.85% <98.85%> (-0.15%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

multidict/_multidict_py.py

Vizonex · 2025-06-08T17:56:22Z

Is there anything I can help out with in this pull request? I can always look into cloning this branch if anybody wants.

Vizonex · 2025-06-09T20:42:04Z

So I tested this fork on my computer and it all runs correctly. I guess the only real issue now is how we get code coverage to cover the missing lines. maybe adding a few more tests could do it?

Vizonex · 2025-06-09T22:12:37Z

https://app.codecov.io/gh/aio-libs/multidict/pull/1128/blob/multidict/_multidict_py.py#L819
@asvetlov I wrote a test that passes through line 819 I wonder if this was the last thing needed for code-coverage other than that the think I can think of that you didn't do is adding a summary of the changes made in the changes folder
You'll have to put this down into test_update.py since I tried deleting mine and then attempted forking this one to make changes which didn't work.

def test_update_with_second_md(any_multidict_class: _MD_Classes) -> None:
    obj1 = any_multidict_class()
    obj2 = any_multidict_class([("a", 2)])
    obj1.update(obj2)
    assert obj1 == obj2

Vizonex · 2025-06-10T02:19:23Z

Ignore what I sent before, I sent you a PR. asvetlov#1

asvetlov · 2025-06-10T10:16:28Z

@Vizonex thanks!

asvetlov · 2025-06-10T10:41:58Z

I think the PR is ready for review.
Performance degradation for bare .add() and .extend() is acceptable, as it requires a hashtable lookup instead of simply adding to the end of the array.
In real life, most multidict operations require a lookup, which is now 25-50% faster.

The change is backward compatible and shouldn't affect the library users.

multidict/_multidict_py.py

Dreamsorcerer · 2025-06-10T13:20:56Z

I'm not too familiar with this, so I'll leave it with the others. Certainly looking forward to the performance improvement though.

multidict/_multilib/hashtable.h

… Set-Cookie test for #1128

Vizonex · 2025-06-10T22:05:59Z

How is code coverage not hitting the same line I thought I had covered? I'm gonna have to re-patch it. This is confusing 😕

asvetlov · 2025-06-11T10:55:10Z

Now the test coverage is better.
The only missed things are checks like if identity == entry.identity: ...
They are essential in case of hash collisions: two different strings have the same hash.
Unfortunately, I have no idea how to generate this pair of strings, especially for 8-byte hash and the python random hash seed.
I suggest marking these lines as # pragma: no branch and that's it.

Dreamsorcerer · 2025-06-11T13:13:41Z

They are essential in case of hash collisions: two different strings have the same hash.

Create a str subclass with a hash implementation?

class CollisionStr(str):
    def __hash__(self):
        return 42

>>> hash(CollisionStr("foo"))
42

Or something similar.

asvetlov · 2025-06-11T16:37:22Z

Create a str subclass with a hash implementation?

I doubt if it is acceptable; multidict uses exact str instances as identities.
If I pass a custom string as a key, the library always converts it to an exact string for future processing.

Perhaps I could mock _CSMixin._identity() and _CIMixin._identity(); but again, it is not the best idea.
Our test suite is designed in a way that each test is used to check both Python and C implementations.
The second one is much more important.
Perhaps, I could write a helper that explicitly rewrites the calculated hash for an existing Python string using direct memory access to the corresponding C structure.
It should work with CPython, but really, do we need such a hack in our tests?

Dreamsorcerer · 2025-06-11T17:07:36Z

Hmm, yeah, sounds tricky.

Vizonex · 2025-06-12T16:09:43Z

I love the changes being done to multidict, anything I can do this time to keep things moving forward?

Vizonex · 2025-06-12T17:05:13Z

@asvetlov I think the debug workflows you put in all stopped. I'm wondering if 40 minutes would be better?

asvetlov · 2025-06-16T14:42:06Z

@Vizonex test_update_large_dict test added for explicit check large (>=2^17) multidict stops the world on CI.
Locally, it takes only 8 secs on my old laptop, the modern one executes it even faster.
I think I should cut it out and add an extra pragma: no cover to keep the formal coverage metrics.

Vizonex · 2025-06-16T15:47:30Z

@Vizonex test_update_large_dict test added for explicit check large (>=2^17) multidict stops the world on CI. Locally, it takes only 8 secs on my old laptop, the modern one executes it even faster. I think I should cut it out and add an extra pragma: no cover to keep the formal coverage metrics.

Agreed

Vizonex · 2025-06-17T13:29:12Z

@asvetlov Great Job on getting this passed I'll get started on giving multidict cython support very soon.

bdraco · 2025-06-17T14:54:40Z

I triggered a manual dependabot run:

yarl performance changes: https://codspeed.io/aio-libs/yarl/branches/dependabot%2Fpip%2Fmultidict-6.5.0
aiohttp performance changes: https://codspeed.io/aio-libs/aiohttp/branches/dependabot%2Fpip%2F3.13%2Fmultidict-6.5.0

Vizonex · 2025-06-18T02:35:27Z

I triggered a manual dependabot run:

yarl performance changes: https://codspeed.io/aio-libs/yarl/branches/dependabot%2Fpip%2Fmultidict-6.5.0 aiohttp performance changes: https://codspeed.io/aio-libs/aiohttp/branches/dependabot%2Fpip%2F3.13%2Fmultidict-6.5.0

That's amazing I hope adding multidict to cython will further enhance aiohttp it as well

asvetlov mentioned this pull request Apr 8, 2025

Split array of pair_t structures into array of hashes and array of everything else #415

Closed

asvetlov mentioned this pull request Apr 25, 2025

Support bytes as key type #1154

Closed

github-advanced-security bot found potential problems Apr 26, 2025

View reviewed changes

multidict/_multidict_py.py Fixed Show fixed Hide fixed

webknjaz reviewed May 29, 2025

View reviewed changes

multidict/_multidict_py.py Show resolved Hide resolved

bdraco mentioned this pull request Jun 8, 2025

Implement C-API Capsule For Future CPython & Cython Support #1171

Closed

3 tasks

psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Jun 10, 2025

asvetlov marked this pull request as ready for review June 10, 2025 10:38

asvetlov requested review from Dreamsorcerer and bdraco June 10, 2025 10:42

Dreamsorcerer reviewed Jun 10, 2025

View reviewed changes

multidict/_multidict_py.py Outdated Show resolved Hide resolved