Original bug ID: 6692
Reporter: @whitequark
Status: closed (set by @gasche on 2014-12-13T20:58:57Z)
Resolution: suspended
Priority: normal
Severity: feature
Category: ~DO NOT USE (was: OCaml general)
Related to: #6695 #6697 #6704
Monitored by: lelf @hcarty @dbuenzli @yakobowski
Bug description
Any modern language obviously should handle source code in non-Latin scripts.
I think it would be possible to change only OCaml's lexer to properly parse UTF-8, possibly behind a command-line flag, and without any stdlib API changes for now.
I volunteer to write a patch if there is some consensus on how to bootstrap it; embedding sedlex is the simplest solution in my view.