A document type for error messages#13169
Conversation
gasche
left a comment
There was a problem hiding this comment.
This is nice and overall quite reasonable. I asked a few questions. I don't understand the very fine details of the last commit, which I think is the most subtle one. (The rest I believe are correct.)
The fact that the testsuite passes mostly unchanged is a very good test for the PR, so I am confident in its overall correctness.
utils/format_doc.ml
Outdated
| | Break of { fits : string * int * string as 'a; breaks : 'a } | ||
| | Flush of { newline:bool } | ||
| | Newline | ||
| | If_newline |
There was a problem hiding this comment.
I was surprised by If_newline which is an odd construct (very imperative). @Octachron tells me that it is not actually used in the compiler so we could get rid of it if we wanted -- same with With_size which is also weird. I don't know if we should rather follow the oddities of Format to the letter, or remove the stuff we don't like and don't need.
There was a problem hiding this comment.
IIRC If_newline predates the form of Break here, and I wouldn't be surprised if If_newline is not really needed.
If you ever want to include static unicode text/symbols in diagnostics, then I expect With_size could be useful. If the unicode you want to emit is known ahead of time, the complicated question of computing its width can be worked around by specifying the size. Format itself works well in this way.
|
|
||
| end | ||
|
|
||
| (** {1 Compatibility API} *) |
There was a problem hiding this comment.
We could have the functions that are not so nice and mostly for compatibility in a Compat submodule. But I am okay with the current approach as well, the functions in fact tend to make sense already.
| The universal variable "'a" would escape its scope | ||
| The method "f" has type "'a -> int", but the expected method type was | ||
| "'a0. 'a0 -> int" | ||
| The universal variable "'a0" would escape its scope |
There was a problem hiding this comment.
Note to self: I liked seeing 'a and ´b better than 'a and 'a0 (personally I am fine with any order of occurrence, but I like the variables to be easy to distinguish). But maybe in some other examples reusing the same radical helps readability?
750d606 to
2032886
Compare
|
|
||
| (** {1 Compatibility API} *) | ||
|
|
||
| (** The functions and types below provides source compatibility with format |
There was a problem hiding this comment.
Opening this module adds the document types, fold, etc.. into the environment. What do you think of moving the compatibility API into a Compat submodule ?
Also, highlighting that it's at the same level as the Core API.
There was a problem hiding this comment.
One my design goal was to make the replacement of Format by Format_doc as invisible as possible.
Typically, the Core submodule is only exposed because I though it would be a shame to have a simple immutable API available and hide it away. I am not sure I would encourage people to switch to this API.
From that perspective, I think it is better for now to keep the most used features (the compatibility functions) at the toplevel of the module. But I could hide the core definitions inside the Core submodule?
There was a problem hiding this comment.
I have moved the type definitions and core functions inside the Core submodule, and renamed it to Doc. Like this, an open Format_doc only opens the compatibility API and the Doc submodule.
utils/format_doc.ml
Outdated
| | Flush {newline=false} -> Format.pp_print_flush ppf () | ||
| | Newline -> Format.pp_force_newline ppf () | ||
| | If_newline -> Format.pp_print_if_newline ppf () | ||
| | With_size _ -> () |
There was a problem hiding this comment.
Some functions like pp_print_substring_as, with_size and if_newline are not used. Is offering an alternative API to Format a goal ?
Otherwise, why not keep it smaller to help maintenance ?
You mentioned localization in #12182 (comment). Could it be built on top of the this API ? Making it as small as needed could help adapting it later.
There was a problem hiding this comment.
My first goal was full source compatibility with Format, since this means compiler contributors don't need to know two APIs.
(And if we were to move away from Format compatibility, I would rather adopt Fmt naming convention)
For basic localization, a simple way to add support for compiler variant would be to add hooks for the format string interpreters since we only want to localize static texts.
c23a81b to
b9b1b82
Compare
gasche
left a comment
There was a problem hiding this comment.
There is some code duplication between utils/format_doc.ml and stdlib/format.ml, for example the printf-interpretation functions (output_acc and friends), and some higher-level printers such as pp_print_text, which decompose into calls into more basic pp_print_* functions.
I wonder if it would be reasonable to reduce this duplication by introducing a functor that builds this higher-level structure over the pp_print_* functions. One instance would be in stdlib/format.ml, the other in utils/format_doc.ml. One pain point is that the functor should be in the stdlib itself, which is slightly annoying -- camlinternalFormatter.ml?
|
There certainly some part that could be factorized using a Thus, I would rather report the standard library bikeshedding to another PR. |
3fc2f25 to
7a1e00b
Compare
This ensures that only the compatibility API is available after an `open Format_doc` used to replace an `open Format`.
and let the compatibility function at the top of the Location modules to improve backward compatibility for compilerlibs users. * add a `Doc.quoted_filename` function too.
7a1e00b to
07128d4
Compare
|
I have updated the PR after some noise clean-up, I plan to merge it tomorrow. |
ocaml/ocaml#13169 added a document type for error messages and this change makes base compatible with OCaml trunk.
Due to a [recent change] on `ocaml/ocaml` `trunk`, `base.v0.16.3` fails to [install]. `base` is a build dependency of `owl`, which only has sequential benchmarks (no parallel ones). So, the failure happens only for sequential benchmarks. Parallel ones run alright. Closes ocaml-bench#470 [recent change]: ocaml/ocaml#13169 [install]: ocaml-bench/sandmark-nightly@197c9bd#diff-c635461b488c1e7d7b7feaf5fcb01945d54f84c512cf4db7e2dc390a2a23c5e8R456-R476
Due to a [recent change] on `ocaml/ocaml` `trunk`, `base.v0.16.3` fails to [install]. `base` is a build dependency of `owl`, which only has sequential benchmarks (no parallel ones). So, the failure happens only for sequential benchmarks. Parallel ones run alright. Closes #470 [recent change]: ocaml/ocaml#13169 [install]: ocaml-bench/sandmark-nightly@197c9bd#diff-c635461b488c1e7d7b7feaf5fcb01945d54f84c512cf4db7e2dc390a2a23c5e8R456-R476
|
This PR breaks the build of Coq, which contains the following code: The problem is that @Octachron, do you have an idea of how the user code could be fixed, preferably in a backward-compatible way? |
|
Replacing |
|
Okay, I went along and wrote a commit that adds conditional-compilation support to Coq for this one function, to make it work correctly both before and after 5.3 (I haven't submitted it as a PR for reasons that are clarified below): rocq-prover/rocq@master...gasche:coq:compat-symtable-error-handler And then I realized that there are probably more projects affected by this sort of breakage. For example if I run a github search with I haven't gone through the rest of the list, I'm sure there are other affected projects. I am in favor of document types for error messages, but this is too must breakage. Could we rework this to keep the |
|
Rescript is a divergent fork of OCaml 4.06 which is not affected by the issue. Thus we have two and half breakages? We could keep the deprecated version, and I would probably submit a PR do so this week; however it is unclear to me if it would not be faster to fix the handful of broken packages. |
|
FTR, I encountered |
Currently when the compiler needs to build a fragment of an error message while playing nice with
Format, the best option is to construct aFormat.formatter -> unitclosure. This construction is used in many part of the compilers.However, it has proven to be quite brittle due to the simultaneous uses of many global states when printing error messages, see #13099 for a recent example (which was triggered by a printing closure run in a different environment that the one it was built in).
This PR proposes to remove this brittleness by a introducing a
documenttype which represents a list of formatting instructionsto be sent to an actual formatting engine at a later point. Since this
documenttype contains only data and no closures (except for an escape hatch), we are guaranteed that printing this type does not create side-effects without losing any of the formatting abilities ofFormat. In particular, this document type can be serialized while preserving all the break hints, formatting boxes and tags (which semantics could then be reimplemented by developer tools).In term of implementation, this PR adds a new
Format_docmodule which provides:documenttypeFormatmoduleFormat_docprinter toFormatprinterdocumentin an immutable wayAs a consequence the
Format_docmodule is source-compatible with theFormatmodule, and for most of the compiler modules replacingopen Formatby anopen Format_docis enough to switch to the new kinds of printers.