
fix TypeError when passing bytearray to from_bytes #703

Merged
Ousret merged 1 commit into jawah:master from ArmaanjeetSandhu:fix/bytearray-typeerror
Mar 6, 2026

@ArmaanjeetSandhu
Contributor

This PR fixes a bug where passing a valid bytearray to from_bytes crashes the application with a TypeError. The api.from_bytes type hints explicitly support bytearray, but the underlying helper function utils.any_specified_encoding did not.

What actually happens:
When from_bytes is called with a bytearray and preemptive_behaviour=True (the default), utils.any_specified_encoding raises a TypeError because its guard, if not isinstance(sequence, bytes):, rejects bytearray.

What I expected to happen:
The function should successfully process the bytearray sequence and return a CharsetMatches object, just as it does for standard bytes objects.

How to reproduce the issue:

from charset_normalizer import from_bytes

# Create a bytearray object
my_bytearray = bytearray(b"Hello, world!")

# Raises TypeError on master before this fix
results = from_bytes(my_bytearray)

Changes Made:

  • Updated utils.py around line 260 to check for isinstance(sequence, (bytes, bytearray)) instead of strictly bytes.
  • This fix ensures backward compatibility while aligning the runtime behavior with the documented type hints.
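The effect of the changed check can be illustrated in isolation. The sketch below does not import charset_normalizer; `strict_guard` and `relaxed_guard` are hypothetical stand-ins that reproduce only the `isinstance` test described above, not the rest of `any_specified_encoding`.

```python
def strict_guard(sequence):
    """Stand-in for the pre-fix check: only bytes (or a subclass) passes."""
    if not isinstance(sequence, bytes):
        raise TypeError
    return len(sequence)


def relaxed_guard(sequence):
    """Stand-in for the post-fix check: bytes and bytearray both pass."""
    if not isinstance(sequence, (bytes, bytearray)):
        raise TypeError
    return len(sequence)


payload = bytearray(b"Hello, world!")

# The strict check rejects the bytearray even though it holds valid byte data.
try:
    strict_guard(payload)
    strict_rejected = False
except TypeError:
    strict_rejected = True

# The relaxed check accepts both types and still rejects everything else.
relaxed_length = relaxed_guard(payload)
```

Because `bytes` remains in the accepted tuple, callers that already pass `bytes` see no change, which is the backward-compatibility argument made in the list above.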

Checklist:

  • I have read the CONTRIBUTING.md document.
  • I have verified that this does not break backward compatibility.
  • I have successfully run nox -s test locally.
  • I have successfully run nox -s lint locally.
  • I have successfully run nox -s coverage locally.

The `from_bytes` function in `api.py` accepts both
`bytes` and `bytearray` as valid input. However,
when `preemptive_behaviour` is enabled (the
default), the input is passed to
`any_specified_encoding` in `utils.py`, which
strictly checked for `isinstance(sequence, bytes)`.
This caused an immediate `TypeError` when a
`bytearray` was provided.

This commit updates the type check in
`any_specified_encoding` to
`isinstance(sequence, (bytes, bytearray))` to
properly support `bytearray` inputs without
breaking backward compatibility.
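
Widening the check is safe only if the code after it treats `bytearray` like `bytes`. The following is a minimal sketch of that reasoning, assuming the helper does nothing more than slice its input, decode it as ASCII, and search for a declared encoding. `declared_encoding_sketch` and its pattern are hypothetical simplifications, not the library's implementation.

```python
import re
from typing import Optional, Union

# Hypothetical, simplified pattern; the library's real pattern may differ.
DECLARED_ENCODING = re.compile(
    r"""(?:charset|encoding|coding)\s*[=:]\s*["']?([A-Za-z0-9_-]+)""",
    re.IGNORECASE,
)


def declared_encoding_sketch(
    sequence: Union[bytes, bytearray], search_zone: int = 8192
) -> Optional[str]:
    """Return an encoding name declared near the start of the payload, if any."""
    if not isinstance(sequence, (bytes, bytearray)):
        raise TypeError("expected bytes or bytearray")

    # Slicing and .decode() behave the same way on bytes and bytearray,
    # so nothing past the guard needs to change.
    head = sequence[:search_zone].decode("ascii", errors="ignore")

    match = DECLARED_ENCODING.search(head)
    return match.group(1).lower() if match else None


html = b'<meta charset="utf-8">'
from_bytes_type = declared_encoding_sketch(html)
from_bytearray_type = declared_encoding_sketch(bytearray(html))
```

Both calls take the same path through the function, so the two results are identical regardless of which byte type the caller supplies.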
Member

@Ousret Ousret left a comment


lgtm

@Ousret Ousret merged commit 209f9ff into jawah:master Mar 6, 2026
1 check passed
@Ousret Ousret mentioned this pull request Mar 6, 2026
Ousret added a commit that referenced this pull request Mar 6, 2026
## [3.4.5](3.4.4...3.4.5) (2026-03-06)

### Changed
- Update `setuptools` constraint to `setuptools>=68,<=82`.
- Raised the upper bound of mypyc for the optional pre-built extension to v1.19.1.

### Fixed
- Add explicit link to lib math in our optimized build. (#692)
- Logger level not restored correctly for empty byte sequences. (#701)
- TypeError when passing bytearray to from_bytes. (#703)

### Misc
- Applied safe micro-optimizations in both our noise detector and language detector.
- Rewrote the `query_yes_no` function (inside CLI) to avoid using ambiguously licensed code.
- Added `cd.py` submodule into mypyc optional compilation to further reduce the performance impact.
@ArmaanjeetSandhu ArmaanjeetSandhu deleted the fix/bytearray-typeerror branch March 6, 2026 05:29