Detect Encoding of a Text file with Python

Last Updated : 8 Apr, 2026

Python provides the chardet library, which can automatically detect a file’s encoding. It works by analyzing the statistical patterns of byte sequences to estimate the most likely encoding.

Why Detect File Encoding

  • Different systems and applications save text files in different encodings (like UTF-8, ISO-8859-1, etc.).
  • Reading a file with the wrong encoding can lead to errors or garbled text.
  • Detecting encoding helps ensure smooth file reading and processing.

How to detect the encoding of a text file with Python?

Below, are the step-by-step implementation of How to detect the encoding of a text file with Python.

Step 1: Create a Virtual Environment

First, create the virtual environment using the below commands

python -m venv env
.\env\Scripts\activate.ps1

Step 2:Install the library chardet

First, you need to install the chardet library. Open your terminal or command prompt and run the following command:

pip install chardet

img1

Step 3: Implement the Logic

Below Python code defines a function, 'detect_encoding(file_path), that uses the 'chardet' library to automatically determine the encoding of a text file specified by its path. It reads the file in binary mode, feeds each line to a universal detector from 'chardet', and stops when the detector is done or the file ends. The function then returns the detected encoding extracted from the detector's result, facilitating proper handling of diverse character sets during file processing.

Python
import chardet

def detect_encoding(file_path):
    with open(file_path, 'rb') as file:
        detector = chardet.universaldetector.UniversalDetector()
        for line in file:
            detector.feed(line)
            if detector.done:
                break
        detector.close()
    return detector.result['encoding']

Note: chardet provides a best-effort guess and may not always be accurate, especially for small files or files with limited character variation. It is recommended to verify results using confidence scores or fallback encodings.

Step 4: Add the File Path

Finally, let us use our function to identify the coding of a sample text file. Change the file path in the code below to match where your text file is stored.

Python
file_path = 'path/to/your/textfile.txt'
encoding = detect_encoding(file_path)
print(f'The encoding of the file is: {encoding}')

Step 5: Run the server

Save the whole script in a Python file (such as detect_encoding.py) and run it with your preferred Python interpreter, make sure to replace detect_encoding.py by the name of your actual script.

python detect_encoding.py

Output:

Recording-2024-01-25-181038

Comment