Wrong detection UTF-8 with ö symbol #288
Closed
Description
Hi! I'm not an expert in encodings. Can someone please tell me what I'm doing wrong? The example seems quite simple, but the detected encoding is incorrect. Is the library perhaps intended for larger texts?
import chardet

b = b'Sch\xc3\xb6ne gesunde Pflanzen'
chardet.detect(b) # {'encoding': 'ISO-8859-9', 'confidence': 0.6294978352301421, 'language': 'Turkish'}
b.decode(chardet.detect(b)['encoding']) # Result: 'SchÃ¶ne gesunde Pflanzen' (wrong)
b.decode("utf-8") # Result: 'Schöne gesunde Pflanzen' (expected)
Versions: chardet 5.2.0 / Python 3.11.7
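One possible workaround, offered as a sketch rather than as chardet's recommended usage: the bytes above are valid UTF-8, so a strict UTF-8 decode can be tried first and statistical detection used only when that fails. Short inputs give a detector very little evidence to work with, which is probably why the guess goes wrong here. The helper name `decode_bytes` and its `fallback_detector` parameter are hypothetical, not part of chardet; the only assumption about the detector is that it returns a dict with an `'encoding'` key, as `chardet.detect` does.

```python
from typing import Callable, Optional


def decode_bytes(data: bytes,
                 fallback_detector: Optional[Callable[[bytes], dict]] = None) -> str:
    """Decode bytes, preferring strict UTF-8 over statistical detection.

    `fallback_detector` is a hypothetical hook: any callable that returns
    a dict with an 'encoding' key (for example, chardet.detect).
    """
    # Strict UTF-8 decoding fails fast on most non-UTF-8 input,
    # so a successful decode is strong evidence for UTF-8.
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        pass

    # Only consult the detector when the bytes are not valid UTF-8.
    if fallback_detector is not None:
        guess = fallback_detector(data).get("encoding")
        if guess:
            return data.decode(guess, errors="replace")

    # Last resort: Latin-1 maps every byte to a character and never fails.
    return data.decode("latin-1")


b = b'Sch\xc3\xb6ne gesunde Pflanzen'
text = decode_bytes(b)  # valid UTF-8, so the detector is never needed
```

With chardet installed, the fallback can be wired in as `decode_bytes(b, chardet.detect)`. The trade-off is that a non-UTF-8 input that happens to be valid UTF-8 would be decoded as UTF-8, which is rare for text containing non-ASCII characters.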