Skip to content

Inline slice_memviewslice#7557

Closed
da-woods wants to merge 1 commit intocython:masterfrom
da-woods:slice_memviewslice-inline
Closed

Inline slice_memviewslice#7557
da-woods wants to merge 1 commit intocython:masterfrom
da-woods:slice_memviewslice-inline

Conversation

@da-woods
Copy link
Copy Markdown
Contributor

@da-woods da-woods commented Mar 8, 2026

Partial fix for #7521.

This came up in a particular scikit-image microbenchmark so almost certainly isn't the only thing. But it looks like this particular function relies heavily on unused branches being eliminated so really wants to be inlined.

On the scikit-image microbenchmark, it's now actually faster with the shared utility code than without. I'm not sure why but possibly just the effect of the inline keyword.

The shared utility module itself doesn't change size at all with this fix.

I don't really like the code duplication but I think it's worth it at least for now.

Partial fix for cython#7521.

This came up in a particular scikit-image microbenchmark so
almost certainly isn't the only thing. But it looks like this
particular function relies heavily on unused branches being
eliminated so really wants to be inlined.

On the scikit-image microbenchmark, it's now actually faster
with the shared utility code than without. I'm not sure why
but possibly just the effect of the `inline` keyword.

The shared utility module itself doesn't change size at all
with this fix.

I don't really like the code duplication but I think it's
worth it at least for now.
@da-woods
Copy link
Copy Markdown
Contributor Author

da-woods commented Mar 8, 2026

I don't really like the code duplication but I think it's worth it at least for now.

Thinking about this more, I wonder if this is better solved by moving this function to the C utility code. It's all very simple and were aren't really using Cython to do any Python object handling or anything. So moving it to C would just give better control of how it's included.

da-woods added a commit to da-woods/cython that referenced this pull request Mar 9, 2026
Partial fix for cython#7521.

This came up in a particular scikit-image microbenchmark so almost certainly isn't the only thing. But it looks like this particular function relies heavily on unused branches being eliminated so really wants to be inlined.

On the scikit-image microbenchmark, it's now actually faster with the shared utility code than without. I'm not sure why but possibly just the effect of the inline keyword.

The shared utility module itself doesn't change size at all with this fix (despite moving it to C)

Alternative to cython#7557
@da-woods
Copy link
Copy Markdown
Contributor Author

da-woods commented Mar 9, 2026

#7558 is an alternative version of this translated to C rather than duplicated. Performance-wise it's basically identical I think

@scoder
Copy link
Copy Markdown
Contributor

scoder commented Mar 9, 2026

I wonder if this is better solved by moving this function to the C utility code

Agreed. Closing this PR since it's superseded by #7558

@scoder scoder closed this Mar 9, 2026
da-woods added a commit that referenced this pull request Mar 9, 2026
Partial fix for #7521.

This came up in a particular scikit-image microbenchmark so almost
certainly isn't the only thing. But it looks like this particular
function relies heavily on unused branches being eliminated so really
wants to be inlined.

On the scikit-image microbenchmark, it's now actually faster with the
shared utility code than without. I'm not sure why but possibly just the
effect of the inline keyword.

The shared utility module itself shrinks by 32bytes with this
fix.

Alternative to #7557. I prefer this
version because it has less duplication but there's some risk of errors
in the translation to C
@da-woods da-woods deleted the slice_memviewslice-inline branch March 9, 2026 18:20
scoder pushed a commit that referenced this pull request Mar 21, 2026
Partial fix for #7521

Backport to 3.2.x of #7558

This came up in a particular scikit-image microbenchmark so almost
certainly isn't the only thing. But it looks like this particular
function relies heavily on unused branches being eliminated so really
wants to be inlined.

On the scikit-image microbenchmark, it's now actually faster with the
shared utility code than without. I'm not sure why but possibly just the
effect of the inline keyword.

The shared utility module itself shrinks by 32bytes with this fix.

Alternative to #7557. I prefer this
version because it has less duplication but there's some risk of errors
in the translation to C.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance sharedlib Cython module for code sharing between multiple modules.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants