Improve benchmarks scaling for sub-benchmarks #7431
Closed
scoder wants to merge 1 commit into cython:master
…st benchmark instead of the slowest. Scale back the timings of sub-benchmarks to the outer scale count to report comparable timings.
Merged as part of #7454
scoder added a commit that referenced this pull request on Jan 22, 2026
Python semantics dictate that we first try the mapping protocol and then the sequence protocol for subscripting. When the index is a C integer, we can optimise perfectly for list/tuple, but all other sequences suffer from having to build a Python `int` object for the index to pass it through the mapping lookup if they implement that (e.g. to support extended slicing, like NumPy arrays).

Python 3.10 added type markers (for pattern matching) for explicitly declaring a type as sequence or mapping, called `Py_TPFLAGS_SEQUENCE` and `Py_TPFLAGS_MAPPING`, which can now be checked for quite quickly. If a type is marked as sequence but still implements mapping lookups for slicing, and it supports sequence subscripting, we can avoid the Python `int` creation of the mapping protocol and go straight through the sequence index lookup.

With this change, indexing into Python's `array.array` and `memoryview` types is ~60% faster in a micro-benchmark. Using a C integer as dict key got slightly slower but is resolved by adding a separate up-front special case.

Future NumPy versions are expected to set the sequence flag and should therefore benefit from this change as well.
See numpy/numpy#30519

Benchmark is based on #7431

See https://docs.python.org/3/c-api/typeobj.html#c.Py_TPFLAGS_SEQUENCE
#1807
pandas-dev/pandas#55915
pandas-dev/pandas#55179 (comment)
… by scaling to the fastest benchmark instead of the slowest.
Scale back the timings of sub-benchmarks to the outer scale count to report comparable timings (which will be divided by the outer scale for reporting the per-loop runtime).
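The rescaling described above can be sketched roughly as follows. This is not the code from the PR: the function names and the `inner_scale`/`outer_scale` parameters are hypothetical, and the sketch assumes that a sub-benchmark's measured time grows linearly with its own loop count.

```python
def rescale_timings(timings, inner_scale, outer_scale):
    """Scale timings measured over `inner_scale` loops to what
    `outer_scale` loops would have taken (assumes linear scaling)."""
    factor = outer_scale / inner_scale
    return [t * factor for t in timings]


def per_loop(timings, outer_scale):
    """Reporting step: divide by the outer scale to get per-loop runtimes."""
    return [t / outer_scale for t in timings]


if __name__ == "__main__":
    # A sub-benchmark ran 8 loops in 4.0 seconds; the outer scale is 64.
    scaled = rescale_timings([4.0], inner_scale=8, outer_scale=64)
    print(scaled)                # time as if 64 loops had been run
    print(per_loop(scaled, 64))  # per-loop time, equal to 4.0 / 8
```

Under these assumptions, dividing the rescaled timings by the outer scale gives the same per-loop runtime as dividing the raw timing by the sub-benchmark's own loop count, which is what makes the reported numbers comparable.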