Invalid location with utf-8 encoded files #11899

@hhugo

(Not really a bug, since the compiler doesn't officially support UTF-8 encoded sources. But...)

Reading #11736 and working on UTF-8 support in jsoo made me realize that, while one can already use UTF-8 in string literals and comments, the computed locations will be wrong: the computed column is a number of bytes instead of a number of code points.

I didn't read #11736 carefully but I suspect it would also result in bad locations.

Here is a small example:

We currently have

$ cat c.ml
let (s : string) = "npiπππ" ^ 2
$ ocamlc -c c.ml
File "c.ml", line 1, characters 33-34:
1 | let (s : string) = "npiπππ" ^ 2
                                     ^
Error: This expression has type int but an expression was expected of type
         string

where it should be

$ cat c.ml
let (s : string) = "npiπππ" ^ 2
$ ocamlc -c c.ml
File "c.ml", line 1, characters 30-31:
1 | let (s : string) = "npiπππ" ^ 2
                                  ^
Error: This expression has type int but an expression was expected of type
         string
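
To make the difference between the two outputs concrete, here is a minimal sketch of converting a byte column into a code point column. `codepoint_column` is a hypothetical helper, not something that exists in the compiler, and it assumes the line is valid UTF-8. Each `π` takes two bytes, so the three of them shift the byte column by 3.

```ocaml
(* Hypothetical helper, not part of the compiler: convert a byte offset
   within a line into a code point offset, assuming valid UTF-8. *)
let codepoint_column (line : string) (byte_col : int) : int =
  let n = ref 0 in
  for i = 0 to min byte_col (String.length line) - 1 do
    (* Continuation bytes have the form 0b10xxxxxx; every other byte
       starts a new code point. *)
    if Char.code line.[i] land 0xC0 <> 0x80 then incr n
  done;
  !n

(* The source line from the example above. *)
let line = "let (s : string) = \"npiπππ\" ^ 2"

let () =
  (* Prints: bytes 33-34 -> code points 30-31 *)
  Printf.printf "bytes 33-34 -> code points %d-%d\n"
    (codepoint_column line 33) (codepoint_column line 34)
```

This only counts code points. It does not try to account for display width (wide or combining characters), which would be a separate question for placing the `^` marker.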
