Releases: jawah/charset_normalizer
Version 3.4.5
3.4.5 (2026-03-06)
Changed
- Update `setuptools` constraint to `setuptools>=68,<=82`.
- Raised upper bound of mypyc for the optional pre-built extension to v1.19.1
Fixed
- Add explicit link to lib math in our optimized build. (#692)
- Logger level not restored correctly for empty byte sequences. (#701)
- TypeError when passing bytearray to from_bytes. (#703)
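
The last two fixes are easier to picture with a small stand-alone sketch. It is not the library's code: `probe`, the logger name, and the use of `logging.DEBUG` are assumptions made for illustration. It shows two ideas: accepting `bytearray` alongside `bytes`, and restoring the logger level in a `finally` block so that an early return on an empty payload cannot leave the level changed.

```python
import logging
from typing import Optional, Union

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("detector_sketch")


def probe(payload: Union[bytes, bytearray], explain: bool = False) -> Optional[str]:
    """Return "utf_8" if the payload decodes as UTF-8, else None (illustrative only)."""
    if not isinstance(payload, (bytes, bytearray)):
        raise TypeError(f"expected bytes or bytearray, got {type(payload).__name__}")

    previous_level = logger.level
    if explain:
        # Temporarily make the logger verbose for this call.
        logger.setLevel(logging.DEBUG)

    try:
        if len(payload) == 0:
            logger.debug("Empty payload, nothing to detect.")
            return None  # the finally block still runs on this early return

        data = bytes(payload)  # normalize bytearray to an immutable bytes object
        try:
            data.decode("utf_8")
        except UnicodeDecodeError:
            return None
        return "utf_8"
    finally:
        if explain:
            # Restore whatever level the caller had configured.
            logger.setLevel(previous_level)
```

Putting the restore step in `finally` keeps every exit path, including the empty-input shortcut, from leaking the temporary level.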
Misc
- Applied safe micro-optimizations in both our noise detector and language detector.
- Rewrote the `query_yes_no` function (inside CLI) to avoid using ambiguous licensed code.
- Added `cd.py` submodule into mypyc optional compilation to further reduce the performance impact.
Warning
mypyc changed the usual binary output for the optimized wheel. Beware, especially if using PyInstaller or alike. See #714
Version 3.4.4
3.4.4 (2025-10-13)
Changed
- Bound `setuptools` to a specific constraint `setuptools>=68,<=81`.
- Raised upper bound of mypyc for the optional pre-built extension to v1.18.2
Removed
- `setuptools-scm` as a build dependency.
Misc
- Enforced hashes in `dev-requirements.txt` and created `ci-requirements.txt` for security purposes.
- Additional pre-built wheels for riscv64, s390x, and armv7l architectures.
- Restore `multiple.intoto.jsonl` in GitHub releases in addition to individual attestation file per wheel.
Version 3.4.3
3.4.3 (2025-08-09)
Changed
- mypy(c) is no longer a required dependency at build time if `CHARSET_NORMALIZER_USE_MYPYC` isn't set to `1`. (#595) (#583)
- Automatically lower confidence on small bytes samples that are not Unicode in `detect` output legacy function. (#391)
Added
- Custom build backend to overcome inability to mark mypy as an optional dependency in the build phase.
- Support for Python 3.14
Fixed
- sdist archive contained useless directories.
- Automatically fall back on valid UTF-16 or UTF-32 even if the mess detector (md) says it's noisy. (#633)
Misc
- SBOMs are automatically published to the relevant GitHub release to comply with regulatory changes. Each published wheel comes with its SBOM. We chose CycloneDX as the format.
- Prebuilt optimized wheels are no longer distributed by default for CPython 3.7 due to a change in cibuildwheel.
Version 3.4.2
3.4.2 (2025-05-02)
Fixed
- Addressed the DeprecationWarning in our CLI regarding `argparse.FileType` by backporting the target class into the package. (#591)
- Improved the overall reliability of the detector with CJK Ideographs. (#605) (#587)
Changed
- Optional mypyc compilation upgraded to version 1.15 for Python >= 3.9
Version 3.4.1
🚀 We're still raising awareness around HTTP/2, and HTTP/3!
Did you know that Internet Explorer 11 shipped with an optional HTTP/2 support back in 2013? also libcurl did ship it in 2014[...]
Using Requests today is the rough equivalent of using EOL Windows 8! We promptly invite Python developers to look at the first drop-in replacement for Requests, namely Niquests. Ship with native WebSocket, SSE, Happy Eyeballs, DNS over HTTPS, and so on[...] All of this while remaining compatible with all Requests prior plug-ins / add-ons.
It leverages charset-normalizer in a better way! Check it out: you can gain up to a 3X speedup and get real, respectable support with it.
3.4.1 (2024-12-24)
Changed
- Project metadata are now stored using `pyproject.toml` instead of `setup.cfg`, using setuptools as the build backend.
- Enforce annotation delayed loading for simpler and more consistent types in the project.
- Optional mypyc compilation upgraded to version 1.14 for Python >= 3.8
Added
- pre-commit configuration.
- noxfile.
Removed
- `build-requirements.txt` as per using `pyproject.toml` native build configuration.
- `bin/integration.py` and `bin/serve.py` in favor of downstream integration test (see noxfile).
- `setup.cfg` in favor of `pyproject.toml` metadata configuration.
- Unused `utils.range_scan` function.
Fixed
- Converting content to Unicode bytes may insert `utf_8` instead of preferred `utf-8`. (#572)
- Deprecation warning "'count' is passed as positional argument" when converting to Unicode bytes on Python 3.13+
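
Both fixes concern rewriting a declared charset in the content. The following standard-library sketch illustrates the two ideas and is not the project's implementation: the regular expression and the helper names `preferred_name` and `rewrite_declaration` are hypothetical. It relies on two documented behaviours: `codecs.lookup(...).name` reports the codec's canonical name, and `re.sub` accepts `count` as a keyword argument, which avoids the Python 3.13+ deprecation warning quoted above.

```python
import codecs
import re

# Hypothetical pattern: the first `encoding=...` or `charset=...` declaration.
DECLARATION = re.compile(r"""((?:encoding|charset)\s*=\s*["']?)([\w-]+)""", re.IGNORECASE)


def preferred_name(python_codec: str) -> str:
    """Map a Python codec name such as "utf_8" to the name the codec reports for itself."""
    return codecs.lookup(python_codec).name


def rewrite_declaration(text: str, python_codec: str = "utf_8") -> str:
    """Replace the first declared charset in `text` with the preferred spelling."""
    target = preferred_name(python_codec)
    # `count` is passed by keyword; passing it positionally is deprecated on Python 3.13+.
    return DECLARATION.sub(lambda m: m.group(1) + target, text, count=1)
```

For instance, `rewrite_declaration('<?xml version="1.0" encoding="iso-8859-1"?>')` yields a header that declares `utf-8` with a hyphen.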
Version 3.4.0
🚀 charset-normalizer is raising awareness around HTTP/2, and HTTP/3!
Did you know that Internet Explorer 11 shipped with an optional HTTP/2 support back in 2013? also libcurl did ship it in 2014[...]
All of this while our community is still struggling to make a firm advancement in HTTP clients. Many of you use Requests
as the de facto HTTP client, yet Requests has been frozen for many years now. Being left in a vegetative state and not evolving,
it has blocked millions of developers from using more advanced features.
We promptly invite Python developers to look at the drop-in replacement for Requests, namely Niquests.
It leverages charset-normalizer in a better way! Check it out, you will be positively surprised! Don't wait another decade.
We are thankful to @microsoft and involved parties for funding our work through the Microsoft FOSS Fund program.
3.4.0 (2024-10-08)
Added
- Argument `--no-preemptive` in the CLI to prevent the detector from searching for hints.
- Support for Python 3.13 (#512)
Fixed
- Relax the TypeError exception thrown when trying to compare a CharsetMatch with anything other than a CharsetMatch.
- Improved the general reliability of the detector based on user feedback. (#520) (#509) (#498) (#407) (#537)
- Declared charset in content (preemptive detection) not changed when converting to utf-8 bytes. (#381)
Version 3.3.2
3.3.2 (2023-10-31)
Fixed
- Unintentional memory usage regression when using large payloads that match several encodings (#376)
- Regression on some detection cases showcased in the documentation (#371)
Added
- Noise (md) probe that identifies malformed Arabic representation due to the presence of letters in isolated form (credit to my wife, thanks!)
Version 3.3.1
3.3.1 (2023-10-22)
Changed
- Optional mypyc compilation upgraded to version 1.6.1 for Python >= 3.8
- Improved the general detection reliability based on reports from the community
Release 3.3.0
3.3.0 (2023-09-30)
Added
- Allow to execute the CLI (e.g. normalizer) through `python -m charset_normalizer.cli` or `python -m charset_normalizer`
- Support for 9 forgotten encodings that are supported by Python but unlisted in `encoding.aliases` as they have no alias (#323)
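
To see how a codec can ship with Python yet be missing from the alias table, here is a standard-library sketch. The function name `codecs_without_alias` is made up, and this is not how the project built its list. It walks the modules of the `encodings` package and keeps those that load as codecs but never appear as a target in `encodings.aliases.aliases`. The result depends on the Python version and also includes codecs that are not text encodings, so it will not necessarily match the 9 encodings mentioned above.

```python
import codecs
import encodings
import pkgutil
from encodings.aliases import aliases


def codecs_without_alias() -> list[str]:
    """List codec modules that Python ships but that no alias points to."""
    aliased = set(aliases.values())
    found = []
    for module in pkgutil.iter_modules(encodings.__path__):
        name = module.name
        # Skip private helpers, the alias table itself, and anything already aliased.
        if name.startswith("_") or name == "aliases" or name in aliased:
            continue
        try:
            codecs.lookup(name)  # raises LookupError if this is not a usable codec
        except LookupError:
            continue
        found.append(name)
    return sorted(found)
```

Calling `codecs_without_alias()` on a given interpreter shows which codecs can only be reached by their module name.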
Removed
- (internal) Redundant utils.is_ascii function and unused function is_private_use_only
- (internal) charset_normalizer.assets is moved inside charset_normalizer.constant
Changed
- (internal) Unicode code blocks in constants are updated using the latest v15.0.0 definition to improve detection
- Optional mypyc compilation upgraded to version 1.5.1 for Python >= 3.8
Fixed
- Unable to properly sort CharsetMatch when both chaos/noise and coherence were close due to an unreachable condition in __lt__ (#350)
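
As a rough illustration of this kind of ordering, the sketch below defines a hypothetical `Candidate` class, not the real `CharsetMatch`. The `0.01` tolerance and the field names are assumptions. When two candidates have nearly equal chaos, the one with higher coherence sorts first; otherwise lower chaos wins. The tie-break is checked before the general rule so that both branches are reachable.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical stand-in for a detection result (illustrative only)."""

    encoding: str
    chaos: float      # lower is better
    coherence: float  # higher is better

    def __lt__(self, other: object) -> bool:
        if not isinstance(other, Candidate):
            return NotImplemented
        # Tie-break first: with near-identical chaos, prefer higher coherence.
        if abs(self.chaos - other.chaos) < 0.01:  # assumed tolerance
            return self.coherence > other.coherence
        # General rule: the less noisy candidate sorts first.
        return self.chaos < other.chaos


ranked = sorted(
    [
        Candidate("a", chaos=0.020, coherence=0.1),
        Candidate("b", chaos=0.021, coherence=0.8),
        Candidate("c", chaos=0.500, coherence=0.9),
    ]
)
```

Here `b` ranks ahead of `a` despite its slightly higher chaos, because the two are within the tolerance and `b` is more coherent; `c` comes last.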
Version 3.2.0
3.2.0 (2023-06-07)
Changed
- Typehint for function `from_path` no longer enforces `PathLike` as its first argument
- Minor improvement over the global detection reliability
Added
- Introduce function `is_binary` that relies on main capabilities, and is optimized to detect binaries
- Propagate `enable_fallback` argument throughout `from_bytes`, `from_path`, and `from_fp` that allows a deeper control over the detection (default True)
- Explicit support for Python 3.12
Fixed
- Edge case detection failure where a file would contain 'very-long' camel-cased word (Issue #289)