Hi!
I'm using the following script with chardet 7.0.1:
from pathlib import Path
import chardet
for file in ('a.txt', 'b.txt'):
    enc = chardet.detect(Path(file).read_bytes())['encoding']
    print(file, enc)
Where a.txt contains:
and b.txt contains:
<?xml encoding="Windows-1252"?>
<tag></tag>
The script prints:
a.txt Windows-1252
b.txt windows-1252
Setting aside whether these files should perhaps be classified as ASCII, why does one encoding name start with a capital W and the other with a lowercase w?
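
For what it's worth, here is a minimal sketch of how I compare the two results in the meantime. It assumes both spellings are aliases that Python's own codec registry recognises, and it uses only the standard library's `codecs.lookup`, not anything from chardet. The helper name `canonical_name` is my own, purely for illustration.

```python
import codecs


def canonical_name(encoding_name):
    """Return Python's canonical codec name for an encoding label.

    codecs.lookup() resolves aliases case-insensitively, so labels that
    differ only in capitalisation map to the same codec name.
    """
    return codecs.lookup(encoding_name).name


# Both spellings reported by the script resolve to the same codec.
print(canonical_name("Windows-1252") == canonical_name("windows-1252"))
```

This only hides the difference on my side; it doesn't explain why the two spellings are returned in the first place.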