Fix Cython 3.0 regression with time_loc_dups#55915

Merged
mroeschke merged 2 commits into pandas-dev:main from WillAyd:fix-dup-perf on Nov 10, 2023


@WillAyd WillAyd commented Nov 10, 2023

re #55179

From the discussion in cython/cython#1807 (comment), it looks like Cython prior to 3.0 always used the sequence protocol when indexing with an integral value. Python, however, tries the mapping protocol first if it is available, and Cython 3.0 switched to match that logic.

NumPy arrays implement both the sequence and the mapping protocol. Where we have untyped arrays that fall back to Python calls, we will see a performance regression, since integer indexing now routes through the mapping protocol.
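To make the difference in lookup order concrete, here is a toy Python model of the two dispatch strategies. It is only a sketch: `ToyArray`, `BoxedInt`, and the two `getitem_*` functions are hypothetical names invented for illustration, and the method names merely mirror CPython's C-level `mp_subscript` and `sq_item` slots. Nothing here calls into Cython or NumPy.

```python
class BoxedInt:
    """Stand-in for a Python int object that has to be allocated for a C integer."""
    allocations = 0

    def __init__(self, value):
        BoxedInt.allocations += 1
        self.value = value


class ToyArray:
    """Hypothetical container that exposes both lookup slots."""

    def __init__(self, data):
        self._data = list(data)

    def mp_subscript(self, key):
        # Mapping slot: receives an object key.
        return self._data[key.value]

    def sq_item(self, i):
        # Sequence slot: receives a raw integer index.
        return self._data[i]


def getitem_sequence_first(obj, i):
    """Models the pre-3.0 behaviour: go straight to the sequence slot."""
    return obj.sq_item(i)


def getitem_mapping_first(obj, i):
    """Models Python's order: mapping slot first if present, else sequence slot."""
    if hasattr(obj, "mp_subscript"):
        return obj.mp_subscript(BoxedInt(i))  # the index must be boxed first
    return obj.sq_item(i)


arr = ToyArray([10, 20, 30])

before = BoxedInt.allocations
a = getitem_sequence_first(arr, 1)
mid = BoxedInt.allocations
b = getitem_mapping_first(arr, 1)
after = BoxedInt.allocations

print(a, b, mid - before, after - mid)
```

Both paths return the same element, but only the mapping-first path allocates an index object. In this model that allocation is the extra per-lookup cost paid by untyped array accesses.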

The changes in this PR are not meant to be an exhaustive review of the codebase; they are just a quick POC to reset the time_loc_dups benchmark.

@jbrockmendel
Member

LGTM, bummer that it's necessary though.

@mroeschke mroeschke added this to the 2.2 milestone Nov 10, 2023
@mroeschke mroeschke added the Performance (memory or execution speed performance) and Internals (related to non-user accessible pandas implementation) labels Nov 10, 2023
@mroeschke mroeschke merged commit d650212 into pandas-dev:main Nov 10, 2023
@mroeschke
Member

Thanks @WillAyd

@WillAyd WillAyd deleted the fix-dup-perf branch November 11, 2023 00:07
@rhshadrach rhshadrach mentioned this pull request Nov 12, 2023
scoder added a commit to cython/cython that referenced this pull request Jan 22, 2026
Python semantics dictate that we first try the mapping protocol and then
the sequence protocol for subscripting. When the index is a C integer,
we can optimise perfectly for list/tuple, but all other sequences suffer
from having to build a Python `int` object for the index to pass it
through the mapping lookup if they implement that (e.g. to support
extended slicing, like NumPy arrays).

Python 3.10 added type markers (for pattern matching) for explicitly
declaring a type as sequence or mapping, called `Py_TPFLAGS_SEQUENCE`
and `Py_TPFLAGS_MAPPING`, which can now be checked for quite quickly.
If a type is marked as sequence but still implements mapping lookups for
slicing, and it supports sequence subscripting, we can avoid the Python
`int` creation of the mapping protocol and go straight through the
sequence index lookup.
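The type markers mentioned above can be inspected from Python through `type.__flags__`. The following is a minimal sketch, not the C code from the commit: the bit values are copied from CPython's `object.h` for 3.10+ (the standard library does not export them), and `is_marked_sequence` and `is_marked_mapping` are hypothetical helper names. It only reads the flags. It does not check whether the type also fills the sequence subscript slot, which the commit message additionally requires.

```python
import sys
from array import array

# Assumed values, taken from CPython's object.h (Python 3.10 and later).
PY_TPFLAGS_SEQUENCE = 1 << 5
PY_TPFLAGS_MAPPING = 1 << 6


def is_marked_sequence(tp):
    """True if the type explicitly declares itself a sequence."""
    return bool(tp.__flags__ & PY_TPFLAGS_SEQUENCE)


def is_marked_mapping(tp):
    """True if the type explicitly declares itself a mapping."""
    return bool(tp.__flags__ & PY_TPFLAGS_MAPPING)


if sys.version_info >= (3, 10):
    # Report the markers for a few built-in and stdlib container types.
    for tp in (list, tuple, dict, array, memoryview):
        print(tp.__name__, is_marked_sequence(tp), is_marked_mapping(tp))
else:
    print("Py_TPFLAGS_SEQUENCE / Py_TPFLAGS_MAPPING require Python 3.10+")
```

A type for which `is_marked_sequence` is true is a candidate for the fast path described above. A type marked as a mapping, such as `dict`, still has to go through the mapping lookup.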

With this change, indexing into Python's `array.array` and `memoryview`
types is ~60% faster in a micro-benchmark.
Using a C integer as a dict key got slightly slower, which is resolved by
adding a separate up-front special case.

Future NumPy versions are expected to set the sequence flag and should
therefore benefit from this change as well.
See numpy/numpy#30519

Benchmark is based on #7431

See
https://docs.python.org/3/c-api/typeobj.html#c.Py_TPFLAGS_SEQUENCE
#1807
pandas-dev/pandas#55915
pandas-dev/pandas#55179 (comment)
