BUG: Trouble with r/l/strip when stripping a non-ASCII character from a StringDType #26915

@WarrenWeckesser


Describe the issue:

It seems rstrip can't strip a Unicode character such as 'μ' from the string "λμ":

In [89]: a = np.array(["abc", "λμ"], dtype=np.dtype('T'))

In [90]: a
Out[90]: array(['abc', 'λμ'], dtype=StringDType())

Try to rstrip 'μ':

In [91]: np.strings.rstrip(a, 'μ')
Out[91]: array(['abc', 'λμ'], dtype=StringDType())

It is still there.

Pass a itself as chars; every element should then be stripped to an empty string:

In [92]: np.strings.rstrip(a, a)
Out[92]: array(['', 'λμ'], dtype=StringDType())

Nope: 'abc' is stripped to '', but 'λμ' is unchanged.

Try with strip instead of rstrip; 'λ' is removed, but 'μ' remains:

In [93]: np.strings.strip(a, a)
Out[93]: array(['', 'μ'], dtype=StringDType())
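For comparison, here is a minimal sketch of the results I would expect, computed with Python's built-in str.rstrip and str.strip. It assumes the np.strings functions are meant to match the str methods element-wise; the variable names are just for illustration.

```python
# Reference results from Python's built-in str methods, which
# np.strings.rstrip / np.strings.strip are assumed to match element-wise.
values = ["abc", "λμ"]

# Strip trailing 'μ' from each element (mirrors np.strings.rstrip(a, 'μ')).
expected_rstrip_mu = [s.rstrip("μ") for s in values]

# Use each element as its own `chars` argument
# (mirrors np.strings.rstrip(a, a) and np.strings.strip(a, a)).
expected_rstrip_self = [s.rstrip(s) for s in values]
expected_strip_self = [s.strip(s) for s in values]

print(expected_rstrip_mu)    # ['abc', 'λ']
print(expected_rstrip_self)  # ['', '']
print(expected_strip_self)   # ['', '']
```

In all three cases the non-ASCII characters are removed, unlike the StringDType outputs above.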

Python and NumPy Versions:

In [102]: import sys, numpy; print(numpy.__version__); print(sys.version)
2.1.0.dev0+git20240711.abeca76
3.12.4 (main, Jul  4 2024, 20:00:06) [GCC 11.4.0]

Runtime Environment:

[{'numpy_version': '2.1.0.dev0+git20240711.abeca76',
  'python': '3.12.4 (main, Jul  4 2024, 20:00:06) [GCC 11.4.0]',
  'uname': uname_result(system='Linux', node='pop-os', release='6.9.3-76060903-generic', version='#202405300957~1718348209~22.04~7817b67 SMP PREEMPT_DYNAMIC Mon J', machine='x86_64')},
 {'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
                      'found': ['SSSE3',
                                'SSE41',
                                'POPCNT',
                                'SSE42',
                                'AVX',
                                'F16C',
                                'FMA3',
                                'AVX2'],
                      'not_found': ['AVX512F',
                                    'AVX512CD',
                                    'AVX512_KNL',
                                    'AVX512_KNM',
                                    'AVX512_SKX',
                                    'AVX512_CLX',
                                    'AVX512_CNL',
                                    'AVX512_ICL']}},
 {'architecture': 'Zen',
  'filepath': '/usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so',
  'internal_api': 'openblas',
  'num_threads': 24,
  'prefix': 'libopenblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.20'}]
