Skip to content

Releases: jawah/charset_normalizer

Version 3.4.5

06 Mar 04:17
7411396

Choose a tag to compare

3.4.5 (2026-03-06)

Changed

  • Update setuptools constraint to setuptools>=68,<=82.
  • Raised upper bound of mypyc for the optional pre-built extension to v1.19.1

Fixed

  • Add explicit link to lib math in our optimized build. (#692)
  • Logger level not restored correctly for empty byte sequences. (#701)
  • TypeError when passing bytearray to from_bytes. (#703)

Misc

  • Applied safe micro-optimizations in both our noise detector and language detector.
  • Rewrote the query_yes_no function (inside CLI) to avoid using ambiguous licensed code.
  • Added cd.py submodule into mypyc optional compilation to reduce further the performance impact.

Warning

mypyc changed the usual binary output for the optimized wheel. Beware, especially if using PyInstaller or alike. See #714

Version 3.4.4

14 Oct 03:27
b30ffdc

Choose a tag to compare

3.4.4 (2025-10-13)

Changed

  • Bound setuptools to a specific constraint setuptools>=68,<=81.
  • Raised upper bound of mypyc for the optional pre-built extension to v1.18.2

Removed

  • setuptools-scm as a build dependency.

Misc

  • Enforced hashes in dev-requirements.txt and created ci-requirements.txt for security purposes.
  • Additional pre-built wheels for riscv64, s390x, and armv7l architectures.
  • Restore multiple.intoto.jsonl in GitHub releases in addition to individual attestation file per wheel.

Version 3.4.3

09 Aug 06:42
46f662d

Choose a tag to compare

3.4.3 (2025-08-09)

Changed

  • mypy(c) is no longer a required dependency at build time if CHARSET_NORMALIZER_USE_MYPYC isn't set to 1. (#595) (#583)
  • automatically lower confidence on small bytes samples that are not Unicode in detect output legacy function. (#391)

Added

  • Custom build backend to overcome inability to mark mypy as an optional dependency in the build phase.
  • Support for Python 3.14

Fixed

  • sdist archive contained useless directories.
  • automatically fallback on valid UTF-16 or UTF-32 even if the md says it's noisy. (#633)

Misc

  • SBOM are automatically published to the relevant GitHub release to comply with regulatory changes.
    Each published wheel comes with its SBOM. We choose CycloneDX as the format.
  • Prebuilt optimized wheel are no longer distributed by default for CPython 3.7 due to a change in cibuildwheel.

Version 3.4.2

02 May 06:21
6422af1

Choose a tag to compare

3.4.2 (2025-05-02)

Fixed

  • Addressed the DeprecationWarning in our CLI regarding argparse.FileType by backporting the target class into the package. (#591)
  • Improved the overall reliability of the detector with CJK Ideographs. (#605) (#587)

Changed

  • Optional mypyc compilation upgraded to version 1.15 for Python >= 3.9

Version 3.4.1

24 Dec 16:14
ffdf7f5

Choose a tag to compare

🚀 We're still raising awareness around HTTP/2, and HTTP/3!

Did you know that Internet Explorer 11 shipped with an optional HTTP/2 support back in 2013? also libcurl did ship it in 2014[...]
Using Requests today is the rough equivalent of using EOL Windows 8! We promptly invite Python developers to look at the first drop-in replacement for Requests, namely Niquests. Ship with native WebSocket, SSE, Happy Eyeballs, DNS over HTTPS, and so on[...] All of this while remaining compatible with all Requests prior plug-ins / add-ons.

It leverages charset-normalizer in a better way! Check it out, you will gain up to being 3X faster and get a real/respectable support with it.

3.4.1 (2024-12-24)

Changed

  • Project metadata are now stored using pyproject.toml instead of setup.cfg using setuptools as the build backend.
  • Enforce annotation delayed loading for a simpler and consistent types in the project.
  • Optional mypyc compilation upgraded to version 1.14 for Python >= 3.8

Added

  • pre-commit configuration.
  • noxfile.

Removed

  • build-requirements.txt as per using pyproject.toml native build configuration.
  • bin/integration.py and bin/serve.py in favor of downstream integration test (see noxfile).
  • setup.cfg in favor of pyproject.toml metadata configuration.
  • Unused utils.range_scan function.

Fixed

  • Converting content to Unicode bytes may insert utf_8 instead of preferred utf-8. (#572)
  • Deprecation warning "'count' is passed as positional argument" when converting to Unicode bytes on Python 3.13+

Version 3.4.0

09 Oct 06:14

Choose a tag to compare

🚀 charset-normalizer is raising awareness around HTTP/2, and HTTP/3!

Did you know that Internet Explorer 11 shipped with an optional HTTP/2 support back in 2013? also libcurl did ship it in 2014[...]
All of this while our community is still struggling to make a firm advancement in HTTP clients. Now, many of you use Requests
as the defacto http client, now, and for many years now, Requests has been frozen. Being left in a vegetative state and not evolving,
this blocked millions of developers from using more advanced features.

We promptly invite Python developers to look at the drop-in replacement for Requests, namely Niquests.
It leverage charset-normalizer in a better way! Check it out, you will be positively surprised! Don't wait another decade.

We are thankful to @microsoft and involved parties for funding our work through the Microsoft FOSS Fund program.

3.4.0 (2024-10-08)

Added

  • Argument --no-preemptive in the CLI to prevent the detector to search for hints.
  • Support for Python 3.13 (#512)

Fixed

  • Relax the TypeError exception thrown when trying to compare a CharsetMatch with anything else than a CharsetMatch.
  • Improved the general reliability of the detector based on user feedbacks. (#520) (#509) (#498) (#407) (#537)
  • Declared charset in content (preemptive detection) not changed when converting to utf-8 bytes. (#381)

Version 3.3.2

31 Oct 20:22
79dce48

Choose a tag to compare

3.3.2 (2023-10-31)

Fixed

  • Unintentional memory usage regression when using large payloads that match several encodings (#376)
  • Regression on some detection cases showcased in the documentation (#371)

Added

  • Noise (md) probe that identifies malformed Arabic representation due to the presence of letters in isolated form (credit to my wife, thanks!)

Version 3.3.1

22 Oct 15:07
5208644

Choose a tag to compare

3.3.1 (2023-10-22)

Changed

  • Optional mypyc compilation upgraded to version 1.6.1 for Python >= 3.8
  • Improved the general detection reliability based on reports from the community

Release 3.3.0

30 Sep 06:08
165211a

Choose a tag to compare

3.3.0 (2023-09-30)

Added

  • Allow to execute the CLI (e.g. normalizer) through python -m charset_normalizer.cli or python -m charset_normalizer
  • Support for 9 forgotten encodings that are supported by Python but unlisted in encoding.aliases as they have no alias (#323)

Removed

  • (internal) Redundant utils.is_ascii function and unused function is_private_use_only
  • (internal) charset_normalizer.assets is moved inside charset_normalizer.constant

Changed

  • (internal) Unicode code blocks in constants are updated using the latest v15.0.0 definition to improve detection
  • Optional mypyc compilation upgraded to version 1.5.1 for Python >= 3.8

Fixed

  • Unable to properly sort CharsetMatch when both chaos/noise and coherence were close due to an unreachable condition in __lt__ (#350)

Version 3.2.0

07 Jul 18:02
0424c80

Choose a tag to compare

3.2.0 (2023-06-07)

Changed

  • Typehint for function from_path no longer enforce PathLike as its first argument
  • Minor improvement over the global detection reliability

Added

  • Introduce function is_binary that relies on main capabilities, and is optimized to detect binaries
  • Propagate enable_fallback argument throughout from_bytes, from_path, and from_fp that allow a deeper control over the detection (default True)
  • Explicit support for Python 3.12

Fixed

  • Edge case detection failure where a file would contain 'very-long' camel-cased word (Issue #289)