Skip to content

Attempt at finding the 'real' capitalization of files on windows#1462

Merged
voodoos merged 1 commit intoocaml:masterfrom
mlasson:mlasson-capitalized-module-windows
May 31, 2022
Merged

Attempt at finding the 'real' capitalization of files on windows#1462
voodoos merged 1 commit intoocaml:masterfrom
mlasson:mlasson-capitalized-module-windows

Conversation

@mlasson
Copy link
Copy Markdown
Contributor

@mlasson mlasson commented Apr 28, 2022

Description

When merlin is locating module files, it first tries the capitalized filename (eg. "Module.ml") before the uncapitalized one ("module.ml"). On windows, filenames are case insensitive by default so the first filename matches even if the "real" case is uncapitalized.

This is not a big problem (because... well it is case insensitive) but it still may confuse text editors that are interacting with merlin (eg. vscode through ocaml-lsp). Even if it is mostly cosmetics (you end up having tabs name that are capitalized when opened first after "Go To Definition", or "recently opened files" appears twice...).

What does this PR do ?

It uses "FindFirstFile" that acts as a kind of "dir/ls" that returns the first filename that matches the inputs query. Here the query is an absolute path without wildcard; so there is at most one match which return as a string option.

Let's say I have a document "Document.txt" in my home directory, then FindFirstFile("C:\UsErS\MlAsSon\dOcUmEnT.txt") would return "Document.txt".

This "exact case basename" is then returned to the caml code which compares it with the input basename to check if they are the same (not the same in my example, since "Document.txt" is not "dOcUmEnT.txt"). May be an alternative implementation could have been to to this comparison in the C code. 🤷

Note: this method does not "fix" the case for "whole path" only for the filename (contrary to what has been done
in commit bf79586 for macos).

I guess it is related with issues #425 ( and #453) (I guess these "Fixed need test" issues could be closed since it has been tested for 7 years now :)).

@mlasson
Copy link
Copy Markdown
Contributor Author

mlasson commented May 9, 2022

I don't think that the "ocaml-ci" failure is related to this PR.

@nojb
Copy link
Copy Markdown
Contributor

nojb commented May 16, 2022

For info, I also observed this issue in emacs as well.

@nojb
Copy link
Copy Markdown
Contributor

nojb commented May 16, 2022

cc @MisterDA who has worked on previous Windows PRs

@voodoos
Copy link
Copy Markdown
Collaborator

voodoos commented May 24, 2022

Anyone wants to review this ? @MisterDA, @nojb, @let-def ?

@MisterDA
Copy link
Copy Markdown
Contributor

Without testing the Windows specific code, one could also test that Merlin supports well case-insensitive filesystems on Linux. Maybe the issue can be reproduced there for non-Windows devs.

@MisterDA
Copy link
Copy Markdown
Contributor

I wonder if it wouldn't be better to open the file anyway, and get the "real" filename from the handle. Also, if we just compare the filenames, there might be problems with Unicode normalization.
https://docs.microsoft.com/en-us/windows/win32/memory/obtaining-a-file-name-from-a-file-handle

@MisterDA
Copy link
Copy Markdown
Contributor

Current master:

$ file=src/frontend/ocamlmerlin/ocamlmerlin_server.ml
$ dune exec -- ocamlmerlin single locate -position '95:11' -filename "$file" < "$file"
{"class":"return","value":{"file":"C:\\Users\\antonin\\Tarides\\merlin\\src\\utils\\Std.ml","pos":{"line":52,"col":6}},"notifications":[],"timing":{"clock":47,"cpu":31,"query":0,"pp":0,"reader":0,"ppx":0,"typer":31,"error":0}}

With this patch applied:

$ file=src/frontend/ocamlmerlin/ocamlmerlin_server.ml
$ dune exec -- ocamlmerlin single locate -position '95:11' -filename "$file" < "$file"
{"class":"return","value":{"file":"C:\\Users\\antonin\\Tarides\\merlin\\src\\utils\\std.ml","pos":{"line":52,"col":6}},"notifications":[],"timing":{"clock":31,"cpu":31,"query":0,"pp":0,"reader":0,"ppx":0,"typer":31,"error":0}}

So it seems to be working as expected (the std.ml file is lower-case in the returned json).

Copy link
Copy Markdown
Contributor

@nojb nojb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nojb
Copy link
Copy Markdown
Contributor

nojb commented May 24, 2022

I wonder if it wouldn't be better to open the file anyway, and get the "real" filename from the handle.

Yes, this is another possibility but then you need to parse the returned "real" path to extract the basename. I feel the current approach is simpler.

Also, if we just compare the filenames, there might be problems with Unicode normalization.

In theory this may also happen already under macOS that also has a special logic in this function, right? But in practice I wonder if it will ever happen that the basename contains non-ASCII characters. (And these would not be valid OCaml module names anyway.)

@voodoos
Copy link
Copy Markdown
Collaborator

voodoos commented May 24, 2022

Thank you @mlasson, could you add a changelog entry and maybe squash the irrelevant commits?

And thanks @MisterDA and @nojb for the review and testing!

@mlasson mlasson force-pushed the mlasson-capitalized-module-windows branch from bcc9ad5 to 4797b6b Compare May 24, 2022 17:45
@mlasson
Copy link
Copy Markdown
Contributor Author

mlasson commented May 24, 2022

Done !

@voodoos voodoos merged commit 88c0c2e into ocaml:master May 31, 2022
voodoos added a commit to voodoos/merlin that referenced this pull request Jun 29, 2022
from mlasson/mlasson-capitalized-module-windows
voodoos added a commit to voodoos/merlin that referenced this pull request Jun 30, 2022
from mlasson/mlasson-capitalized-module-windows
voodoos added a commit to voodoos/opam-repository that referenced this pull request Jun 30, 2022
CHANGES:

Thu Jun 30 14:51:42 CEST 2022

  + merlin binary
    - make most library public and split merlin in two packages: the
      `merlin-lib` package that exposes merlin's internals and the `merlin`
      package with the frontend. (ocaml/merlin#1448, ocaml/merlin#1455, ocaml/merlin#1457, ocaml/merlin#1497, @rgrinberg,
      @tmattio, @kit-ty-kate)
    - Type printing: use best_module_path for paths from Mty_alias (ocaml/merlin#1470)
    - Attempt at finding the 'real' capitalization of files on windows (ocaml/merlin#1462 by
      @mlasson)
    - Use newer `Seq`-based API of Yojson 2.0, avoiding the need for the
      deprecated `Stream` module (ocaml/merlin#1475 by @Leonidas-from-XIV)
    - unify parsing of `MERLIN_LOG` (ocaml/merlin#1480 by @ulugbekna)
    - Fix type deduplication in `type-enclosing` results (ocaml/merlin#1483, fixes ocaml/merlin#1477)
    - Only weakly reduce the shapes to speed up the new Merlin locate
      implementation. (ocaml/merlin#1488)
    - Ignore unknown configuration tags from dune configuration provider but not
      from dot-merlin-reader (ocaml/merlin#1486)
    - typing recovery: recover at the granularity of `core_type` (ocaml/merlin#1484)
  + editor modes
    - add method imenu items for emacs (ocaml/merlin#1481, @mndrix)
    - emacs: Make the prefix argument to `merlin-locate` optional, both for
      consistency with Emacs convention and for backwards compatibility. (ocaml/merlin#1476,
      @antalsz)
    - emacs: fix duplicated prefix path in imenu entries (ocaml/merlin#1495, @bcc32)
voodoos added a commit to voodoos/opam-repository that referenced this pull request Jul 1, 2022
CHANGES:

Fri Jul  1 12:51:42 CEST 2022

  + merlin binary
    - Type printing: use `best_module_path` for paths from `Mty_alias` (ocaml/merlin#1470)
    - Attempt at finding the 'real' capitalization of files on windows (ocaml/merlin#1462 by
      @mlasson)
    - Use newer `Seq`-based API of Yojson 2.0, avoiding the need for the
      deprecated `Stream` module (ocaml/merlin#1475 by @Leonidas-from-XIV)
    - unify parsing of `MERLIN_LOG` (ocaml/merlin#1480 by @ulugbekna)
    - Fix type deduplication in `type-enclosing` results (ocaml/merlin#1483, fixes ocaml/merlin#1477)
    - Ignore unknown configuration tags from dune configuration provider but not
      from dot-merlin-reader (ocaml/merlin#1486)
    - typing recovery: recover at the granularity of `core_type` (ocaml/merlin#1484)
  + editor modes
    - Fix `merlin-locate-in-new-window` is ignored (ocaml/merlin#1461 by @emturner,
      fixes ocaml/merlin#1460)
    - add method imenu items for emacs (ocaml/merlin#1481, @mndrix)
    - emacs: Make the prefix argument to `merlin-locate` optional, both for
      consistency with Emacs convention and for backwards compatibility. (ocaml/merlin#1476,
      @antalsz)
voodoos added a commit to voodoos/opam-repository that referenced this pull request Jul 1, 2022
CHANGES:

Fri Jul  1 12:51:42 CEST 2022

  + merlin binary
    - Type printing: use `best_module_path` for paths from `Mty_alias` (ocaml/merlin#1470)
    - Attempt at finding the 'real' capitalization of files on windows (ocaml/merlin#1462 by
      @mlasson)
    - Use newer `Seq`-based API of Yojson 2.0, avoiding the need for the
      deprecated `Stream` module (ocaml/merlin#1475 by @Leonidas-from-XIV)
    - unify parsing of `MERLIN_LOG` (ocaml/merlin#1480 by @ulugbekna)
    - Fix type deduplication in `type-enclosing` results (ocaml/merlin#1483, fixes ocaml/merlin#1477)
    - Ignore unknown configuration tags from dune configuration provider but not
      from dot-merlin-reader (ocaml/merlin#1486)
    - typing recovery: recover at the granularity of `core_type` (ocaml/merlin#1484)
  + editor modes
    - Fix `merlin-locate-in-new-window` is ignored (ocaml/merlin#1461 by @emturner,
      fixes ocaml/merlin#1460)
    - add method imenu items for emacs (ocaml/merlin#1481, @mndrix)
    - emacs: Make the prefix argument to `merlin-locate` optional, both for
      consistency with Emacs convention and for backwards compatibility. (ocaml/merlin#1476,
      @antalsz)
    - emacs: fix duplicated prefix path in imenu entries (ocaml/merlin#1495, @bcc32)
voodoos added a commit to voodoos/opam-repository that referenced this pull request Jul 1, 2022
CHANGES:

Fri Jul  1 12:51:42 CEST 2022

  + merlin binary
    - Type printing: use `best_module_path` for paths from `Mty_alias` (ocaml/merlin#1470)
    - Attempt at finding the 'real' capitalization of files on windows (ocaml/merlin#1462 by
      @mlasson)
    - Use newer `Seq`-based API of Yojson 2.0, avoiding the need for the
      deprecated `Stream` module (ocaml/merlin#1475 by @Leonidas-from-XIV)
    - unify parsing of `MERLIN_LOG` (ocaml/merlin#1480 by @ulugbekna)
    - Fix type deduplication in `type-enclosing` results (ocaml/merlin#1483, fixes ocaml/merlin#1477)
    - Ignore unknown configuration tags from dune configuration provider but not
      from dot-merlin-reader (ocaml/merlin#1486)
    - typing recovery: recover at the granularity of `core_type` (ocaml/merlin#1484)
  + editor modes
    - Fix `merlin-locate-in-new-window` is ignored (ocaml/merlin#1461 by @emturner,
      fixes ocaml/merlin#1460)
    - add method imenu items for emacs (ocaml/merlin#1481, @mndrix)
    - emacs: Make the prefix argument to `merlin-locate` optional, both for
      consistency with Emacs convention and for backwards compatibility. (ocaml/merlin#1476,
      @antalsz)
    - emacs: fix duplicated prefix path in imenu entries (ocaml/merlin#1495, @bcc32)
voodoos added a commit to voodoos/opam-repository that referenced this pull request Jul 1, 2022
CHANGES:

Fri Jul  1 12:51:42 CEST 2022

  + merlin binary
    - Type printing: use `best_module_path` for paths from `Mty_alias` (ocaml/merlin#1470)
    - Attempt at finding the 'real' capitalization of files on windows (ocaml/merlin#1462 by
      @mlasson)
    - Use newer `Seq`-based API of Yojson 2.0, avoiding the need for the
      deprecated `Stream` module (ocaml/merlin#1475 by @Leonidas-from-XIV)
    - unify parsing of `MERLIN_LOG` (ocaml/merlin#1480 by @ulugbekna)
    - Fix type deduplication in `type-enclosing` results (ocaml/merlin#1483, fixes ocaml/merlin#1477)
    - Ignore unknown configuration tags from dune configuration provider but not
      from dot-merlin-reader (ocaml/merlin#1486)
    - typing recovery: recover at the granularity of `core_type` (ocaml/merlin#1484)
  + editor modes
    - Fix `merlin-locate-in-new-window` is ignored (ocaml/merlin#1461 by @emturner,
      fixes ocaml/merlin#1460)
    - add method imenu items for emacs (ocaml/merlin#1481, @mndrix)
    - emacs: Make the prefix argument to `merlin-locate` optional, both for
      consistency with Emacs convention and for backwards compatibility. (ocaml/merlin#1476,
      @antalsz)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants