Crashes with chardet 7.0.0: is_binary_string does not handle None encoding #634

@wesleybl

Description

binaryornot 0.4.4 crashes when used with chardet 7.0.0. The new version of chardet can return {'encoding': None, 'confidence': 0.99}: high confidence, but no encoding. The guard condition in is_binary_string does not check for a None encoding, so None is passed directly to .decode(), which raises a TypeError.

Traceback

File ".../binaryornot/helpers.py", line 103, in is_binary_string
    bytes_to_check.decode(encoding=detected_encoding['encoding'])
TypeError: decode() argument 'encoding' must be str, not None

During handling of the above exception, another exception occurred:

File ".../binaryornot/helpers.py", line 106, in is_binary_string
    unicode(bytes_to_check, encoding=detected_encoding['encoding'])
NameError: name 'unicode' is not defined

Root Cause

In helpers.py, the guard condition before decoding is:

if (detected_encoding['confidence'] > 0.9 and
        detected_encoding['encoding'] != 'ascii'):

With chardet 7.0.0, detect() can return:

{'encoding': None, 'confidence': 0.99, 'language': ''}

Since None != 'ascii' evaluates to True, the code enters the if block and calls:

bytes_to_check.decode(encoding=None)  # TypeError

The except TypeError block then falls back to:

unicode(bytes_to_check, encoding=None)  # NameError: Python 2 only

So the error handler itself crashes instead of recovering gracefully.
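The chain above can be reproduced without installing chardet. The following is a minimal sketch, not the library's code: the detection result is copied by hand from this report, and the sample bytes and variable names (guard, error_name, has_unicode) are illustrative assumptions.

```python
import builtins

# Detection result copied from this report; chardet itself is not called here.
detected_encoding = {'encoding': None, 'confidence': 0.99, 'language': ''}
bytes_to_check = b'\x00\x01\x02'  # arbitrary sample bytes (assumption)

# Same guard as quoted from helpers.py: None != 'ascii' is True,
# so the decode branch is entered even though there is no encoding.
guard = (detected_encoding['confidence'] > 0.9 and
         detected_encoding['encoding'] != 'ascii')

error_name = None
if guard:
    try:
        bytes_to_check.decode(encoding=detected_encoding['encoding'])
    except TypeError as exc:
        # This is the first exception shown in the traceback.
        error_name = type(exc).__name__

# The fallback calls unicode(), which only exists as a builtin on Python 2.
has_unicode = hasattr(builtins, 'unicode')
```

On Python 3, guard ends up True, the decode call raises TypeError, and has_unicode is False, which is why the fallback in the except block fails with NameError instead of recovering.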

Suggested Fix

Add detected_encoding['encoding'] is not None to the guard condition:

if (detected_encoding['confidence'] > 0.9 and
        detected_encoding['encoding'] is not None and
        detected_encoding['encoding'] != 'ascii'):

This is a minimal, surgical fix that preserves the existing logic while handling the new chardet behavior correctly.
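To illustrate how the proposed guard behaves, here is a small sketch that isolates the condition in a standalone function. The name should_try_decode is hypothetical and does not exist in binaryornot; the input dicts are hand-written stand-ins for what chardet's detect() might return.

```python
def should_try_decode(detected_encoding):
    """Hypothetical helper mirroring the proposed guard condition."""
    return (detected_encoding['confidence'] > 0.9 and
            detected_encoding['encoding'] is not None and
            detected_encoding['encoding'] != 'ascii')


# Hand-written detection results (assumptions, not real chardet output).
none_case = {'encoding': None, 'confidence': 0.99, 'language': ''}
utf8_case = {'encoding': 'utf-8', 'confidence': 0.99, 'language': ''}
ascii_case = {'encoding': 'ascii', 'confidence': 1.0, 'language': ''}

results = [should_try_decode(case)
           for case in (none_case, utf8_case, ascii_case)]
```

With the extra check, the None case is skipped before .decode() is ever called, while the utf-8 and ascii cases are handled exactly as before.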

Environment

  • binaryornot: 0.4.4
  • chardet: 7.0.0
  • Python: 3.12.3
