Hello,
I have a situation where I'm trying to detect the encoding of an ISO-8859-2 file.
In [1]: import chardet
In [2]: chardet.__version__
Out[2]: '2.2.1'
In [3]: chardet.detect(file('iso_file.csv', mode='rb').read())
Out[3]: {'confidence': 0.8727101643152726, 'encoding': 'ISO-8859-2'}
As you can see, it's properly detected.
But after running pip install -U chardet, the same call returns a different result:
In [16]: import chardet
In [17]: chardet.__version__
Out[17]: '2.3.0'
In [18]: chardet.detect(file('iso_file.csv', mode='rb').read())
Out[18]: {'confidence': 1.0, 'encoding': 'UTF-8-SIG'}
Can you provide some details on what changed in the new version that would trigger this incorrect behaviour, and what I can do on my side to help the library recognize the encoding better?
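In case it helps with diagnosis, here is a minimal sketch of a check I can run on my side. It rests on an assumption I have not confirmed: that a result of UTF-8-SIG with confidence 1.0 comes from the file starting with the UTF-8 byte order mark (EF BB BF). The helper name describe_leading_bytes is hypothetical and not part of chardet; it only uses the standard codecs module.

```python
import codecs


def describe_leading_bytes(data):
    """Report whether raw bytes start with the UTF-8 BOM and decode as strict UTF-8.

    Hypothetical diagnostic helper, not part of chardet.
    """
    # codecs.BOM_UTF8 is the three-byte sequence b'\xef\xbb\xbf'.
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        data.decode('utf-8')
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {'has_utf8_bom': has_bom, 'valid_utf8': valid_utf8}
```

Calling describe_leading_bytes(open('iso_file.csv', 'rb').read()) would show whether the file carries a BOM while the rest of the content is not valid UTF-8. If so, that combination might explain the change, and stripping the first three bytes before calling chardet.detect could be one workaround to try.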