
BUG: fix incorrect bytes to stringdtype coercion #28282

Merged
charris merged 1 commit into numpy:maintenance/2.2.x from charris:backport-28276
Feb 5, 2025

Conversation

charris (Member) commented Feb 5, 2025

Backport of #28276.

Fixes #28269.

It turns out `test_scalars_string_conversion` was testing the old buggy conversion 🙃.

Is it maybe problematic to assume the bytes are UTF-8? Before this change we were doing something completely nonsensical, so we're free to make a choice here. I think the built-in NumPy bytes dtype assumes everything is ASCII, which is maybe less useful than letting people pass in arbitrary UTF-8?
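To make the trade-off concrete, here is a minimal pure-Python sketch of the two interpretations. It does not call NumPy and is not the code from this PR; `decode_utf8` and `decode_ascii` are hypothetical helper names used only for illustration.

```python
def decode_utf8(raw: bytes) -> str:
    """Interpret raw bytes as UTF-8 text (the assumption discussed above)."""
    return raw.decode("utf-8")


def decode_ascii(raw: bytes):
    """Interpret raw bytes as ASCII; return None if any byte is >= 0x80."""
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError:
        return None


# One pure-ASCII sample and one sample containing a non-ASCII character.
samples = [b"abc", "café".encode("utf-8")]
results = [(raw, decode_utf8(raw), decode_ascii(raw)) for raw in samples]

for raw, as_utf8, as_ascii in results:
    print(raw, as_utf8, as_ascii)
```

For pure-ASCII input the two interpretations agree, since ASCII is a subset of UTF-8. They only differ once a byte outside the ASCII range shows up, which is where a UTF-8 assumption accepts input that an ASCII-only one would reject.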

We could also probably do this faster without going through the Python C API, but that can be a future pass if anyone notices.

charris added the 00 - Bug, 08 - Backport, and component: numpy.strings labels Feb 5, 2025
charris added this to the 2.2.3 release milestone Feb 5, 2025
charris added and removed the 08 - Backport label Feb 5, 2025
charris merged commit 2cc5acf into numpy:maintenance/2.2.x Feb 5, 2025
charris deleted the backport-28276 branch February 5, 2025 22:18
