Use module state and heap types for C Extension by asvetlov · Pull Request #1125 · aio-libs/multidict

asvetlov · 2025-04-03T19:34:30Z

Module state is crucial for running multiple module instances in sub-interpreters. It is the recommended way to write C Extensions, and CPython built-in modules were rewritten in this way.
It is also recommended that heap types be used over static types for C Extensions. Let's do it.

The PR eliminates global variables except for constant module definitions as a side effect.

codspeed-hq · 2025-04-03T21:20:17Z

CodSpeed Performance Report

Merging #1125 will not alter performance

_{Comparing state (4addc79) with master (018643a)}

Summary

✅ 244 untouched benchmarks

asvetlov · 2025-04-03T21:22:44Z

Okay, module state finding is not for free.
Let's see what can we do with it.

asvetlov · 2025-04-04T08:30:22Z

performance is good again after adding a pointer to the module state to pair_list_t

bdraco · 2025-04-04T08:31:47Z

multidict_tp_init looks a bit slower. 6.3.2 showed up being a bit slower in yarl in aio-libs/yarl#1492 (comment). Not sure if anything can be done there or it is what it is and we should ack and merge the bump in yarl.

asvetlov · 2025-04-04T08:36:54Z

multidict_tp_init looks a bit slower. 6.3.2 showed up being a bit slower in yarl in aio-libs/yarl#1492 (comment). Not sure if anything can be done there or it is what it is and we should ack and merge the bump in yarl.

Yeah, tp_init has no direct access to the module state and should do a lookup over the class's hierarchy.
Hopefully, if a user doesn't derive from MultiDict / CIMultiDict, the lookup ends on the topmost class.
But it takes some small time, still.
It is expected, and there is no space to do anything here.

bdraco · 2025-04-04T09:07:40Z

Testing

Production 1
Production 2
aiohttp master
aiohttp 3.11

asvetlov · 2025-04-04T09:14:22Z

@bdraco for yarl/aiohttp needs, I could imagine exposing public C API for use from Cython.
It is possible, but it should be done carefully. Something like datetime C API is exposed.

Another thing that definitely could improve the performance is switching from pair_list_t to hashtable. It should be pretty close to the structure used for cpython dict itself with the following changes:

Keys are always a PyUnicode_Type identity objects.
No shared keys.
Multiple equal keys could be supported by the hashtable. To achieve it, DKIX_DUMMY should never migrate back to ACTIVE but the table should be rebalanced if there are too many deleted items.

I'd like to experiment with hashtable, O(N) vs O(1) for lookup operations looks interesting.

bdraco · 2025-04-04T09:27:50Z

@bdraco for yarl/aiohttp needs, I could imagine exposing public C API for use from Cython. It is possible, but it should be done carefully. Something like datetime C API is exposed.

These areas are warm and I see them in production profiles. I'm not sure if they could benefit from a C API

https://github.com/aio-libs/aiohttp/blob/eff1dd754c526aae967fe80075aca1df4c28666b/aiohttp/payload.py#L167
https://github.com/aio-libs/aiohttp/blob/eff1dd754c526aae967fe80075aca1df4c28666b/aiohttp/client_reqrep.py#L390
https://github.com/aio-libs/aiohttp/blob/eff1dd754c526aae967fe80075aca1df4c28666b/aiohttp/client_reqrep.py#L415

Might be enough to implement the above patterns as multidict functions but with the flexibility of LooseHeaders it may not result in much real world improvement.

Another thing that definitely could improve the performance is switching from pair_list_t to hashtable. It should be pretty close to the structure used for cpython dict itself with the following changes:

Keys are always a PyUnicode_Type identity objects.

No shared keys.

Multiple equal keys could be supported by the hashtable. To achieve it, DKIX_DUMMY should never migrate back to ACTIVE but the table should be rebalanced if there are too many deleted items.

I'd like to experiment with hashtable, O(N) vs O(1) for lookup operations looks interesting.

O(N) sounds really nice. I'm guessing this probably makes more difference than a C API

asvetlov · 2025-04-04T09:28:06Z

Please keep the PR open for a while, I'd like to apply minor improvements.

bdraco

Looked over the code. LGTM. I don't have my head fully wrapped around the Py_TPFLAGS_MANAGED_WEAKREF but it looks good AFAICT.

Testing went well. No regressions or obvious performance issues observed

bdraco · 2025-04-04T09:37:37Z

Near midnight for me (UTC-10) so off to bed. Let me know if anything should be retested on prod and I'm happy to give it a spin tomorrow when I get up

asvetlov · 2025-04-04T10:18:26Z

auto-headers processing needs #292 implemented for good performance.
Creation CIMultiDict from C could give use a few ms but I don't it worth the effort.
When I'm speaking about public C API I'm mostly thinking about .get()/.getall(), .add() operations, and a fast iteration over keys/values.
All these are micro-optimizations, sure. Who knows the speedup on macro-benchmarks? I guess it is not too high, except for request parsing and response building.

asvetlov · 2025-04-04T10:25:34Z

@bdraco Py_TPFLAGS_MANAGED_WEAKREF is py3.12 feature for making multidicts weak-referenceable.
tp_weaklistoffset doesn't work with heap types (and with multiple inheritance, this is another beast).
For py <3.12 I use a fallback with __weaklistoffset__ member. It works everywhere but Py_TPFLAGS_MANAGED_WEAKREF should be used instead if possible.

bdraco · 2025-04-04T19:27:01Z

auto-headers processing needs #292 implemented for good performance. Creation CIMultiDict from C could give use a few ms but I don't it worth the effort. When I'm speaking about public C API I'm mostly thinking about .get()/.getall(), .add() operations, and a fast iteration over keys/values. All these are micro-optimizations, sure. Who knows the speedup on macro-benchmarks? I guess it is not too high, except for request parsing and response building.

Yeah. I explored it a bit in aio-libs/aiohttp#10696 and it doesn't seem like there is much to gain there. Maybe some potential saving on iterating but even than usually its two or three keys being iterated.

asvetlov added 11 commits April 3, 2025 21:19

It works with non-managed weakrefs

cb79826

Make it work

e0de2ad

Merge remote-tracking branch 'origin/master' into state

9a4996a

Add missing file

d0ecf17

Merge branch 'master' into state

d27cd2c

Fix

63388f7

Fix

c623433

fix

0b6f4de

fix

4bdf23e

fix

925478d

Correct py version check

b2a7d97

asvetlov added 4 commits April 3, 2025 23:35

optimize

1456b3f

revert

0fe16b4

store state in pair_list

78f816d

polish

d881829

changelog

743241a

psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Apr 4, 2025

asvetlov marked this pull request as ready for review April 4, 2025 08:46

asvetlov requested a review from webknjaz as a code owner April 4, 2025 08:46

asvetlov added 2 commits April 4, 2025 10:47

Revert basetype for istr

ba21822

properly support module state lookup for istr-inherited classes

a58617c

bdraco approved these changes Apr 4, 2025

View reviewed changes

Support derived classes

4addc79

asvetlov merged commit 17f08a1 into master Apr 4, 2025
46 of 47 checks passed

asvetlov deleted the state branch April 4, 2025 10:25

Uh oh!

Conversation

asvetlov commented Apr 3, 2025

Uh oh!

codspeed-hq bot commented Apr 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CodSpeed Performance Report

Merging #1125 will not alter performance

Summary

Uh oh!

asvetlov commented Apr 3, 2025

Uh oh!

asvetlov commented Apr 4, 2025

Uh oh!

bdraco commented Apr 4, 2025

Uh oh!

asvetlov commented Apr 4, 2025

Uh oh!

bdraco commented Apr 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

asvetlov commented Apr 4, 2025

Uh oh!

bdraco commented Apr 4, 2025

Uh oh!

asvetlov commented Apr 4, 2025

Uh oh!

bdraco left a comment

Choose a reason for hiding this comment

Uh oh!

bdraco commented Apr 4, 2025

Uh oh!

asvetlov commented Apr 4, 2025

Uh oh!

asvetlov commented Apr 4, 2025

Uh oh!

Uh oh!

bdraco commented Apr 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codspeed-hq bot commented Apr 3, 2025 •

edited

Loading

bdraco commented Apr 4, 2025 •

edited

Loading