TMC manual: help! #5

Closed
gasche wants to merge 447 commits into Octachron:trunk from gasche:trmc-manual

@gasche gasche commented Oct 31, 2021

I'm preparing a manual chapter on TMC (see ocaml#9760). I made the mistake of writing the chapter in one go, without trying to compile, and now I'm plagued by many caml-tex errors that I have the hardest time fixing. Can you help me?

At some point I started creating commits for intermediate states that produce confusing or hard-to-debug errors. Some errors are sub-optimal because they do not contain line numbers:

  Error when parsing the following phrase:
  let[@tail_mod_cons rec bind (f : 'a -> 'a tree) (t : 'a tree) : 'a tree =
  match t with
  | Leaf v -> (f[@tailcall false]) v
  | Node (left, right) -> Node (bind f left, (bind[@tailcall]) f right)
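
For comparison, here is a version of that phrase that should parse. The quoted phrase appears to be missing the closing `]` after `tail_mod_cons`. This is only a sketch: it assumes a compiler that supports `[@tail_mod_cons]` (OCaml 4.14 or later) and repeats the `tree` type definition so that it stands alone.

```ocaml
type 'a tree = Leaf of 'a | Node of 'a tree * 'a tree

(* Note the closing bracket after tail_mod_cons. *)
let[@tail_mod_cons] rec bind (f : 'a -> 'a tree) (t : 'a tree) : 'a tree =
  match t with
  | Leaf v -> (f[@tailcall false]) v
  | Node (left, right) -> Node (bind f left, (bind[@tailcall]) f right)
```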

Some errors are horrible because, in addition, they contain no clue as to where the issue may be:

  Uncaught exception: Failure("lexing: empty token")
  Fatal error: exception Failure("lexing: empty token")

Finally, some errors are even worse, because they have line numbers (many) but I have no clue what it is about those lines that they are talking about:

    Error when evaluating a caml_example environment in extensions/tail_mod_cons.etex, line 176:
    Textual transforms must be well-separated.
    The "underline" transform spanned the interval 113-157,
    intersecting with another "underline" transform  on the 124-138 interval.
    Hind: did you try to elide a code fragment which raised a warning?

Note: one thing that seems to be improving the situation with my file has been to replace all occurrences of "[@tailcall]" and "[@tail_mod_cons]" in TeX context (outside the caml_example environment) by "[\@tailcall]" and "[\@tail_mod_cons]". I'm not sure this is necessary or valid, but I suspect that the @ sign there was somehow getting some other markup machinery very confused (there are occurrences of @"foo" "bar"@ in other .etex files).

I would be very interested in the following:

  • In the short term, help to debug the .etex / fix the markup errors, so that I can submit a documentation PR that builds (it's painful for reviewers to read the .etex directly, I would rather upload a HTML output).
  • In the medium term, it would be very nice if you could improve the error-message machinery in caml-tex to provide line numbers on the errors that are missing them. I still have no clue what the last error is about, but I suspect that there might also be something in the parsing code to be made more robust.

P.S.: your trunk reference appears to be very old.

Octachron and others added 30 commits August 24, 2021 16:54
…prim

ocaml#10450: restore support for %apply and %revapply with non-translucid types
So that all AMD64 cases come before the ARM and ARM64 cases.
Using the "return after stack overflow" approach.

Fixes:  ocaml#10547
Stack overflow detection and naked pointers checking for ARM64 (Linux and macOS)
These functions have been marked as deprecated for a while, are superseded by functions from the Marshal module, and seem unused in OPAM packages.  Plus, this removes the dependency of Obj on Marshal.
The presence of this field will be checked by opam lint starting with opam 2.2.0
When we run

    List.init 300 (Fun.const "a")

we can see that the toplevel in one case prints

    ""... (* string length 1; truncated *)

instead of just "a". The comment that the string was truncated takes at
least 36 characters, so truncating does not make much sense here.
If we are going to print the comment, it doesn't hurt much to also show
a part of the string to the user.

A byte can be printed as up to 4 characters, due to escaping.
So printing strings of length up to 8 bytes will always be shorter
without truncating.

There are still cases when truncating gives a longer text than not,
but it's unavoidable if the length of the printed prefix is a
nondecreasing function of the string length.
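
The arithmetic in this commit message can be checked with a short sketch. The helper names `notice_length` and `worst_case_width` are hypothetical, and the notice wording is assumed to be exactly the one quoted above; this is not the toplevel printer's actual code.

```ocaml
(* Length of the truncation notice for a string of [n] bytes,
   assuming the wording quoted in the commit message. *)
let notice_length n =
  String.length (Printf.sprintf "... (* string length %d; truncated *)" n)

(* Worst-case printed width of [n] bytes: each byte may be escaped
   to 4 characters, such as "\000". *)
let worst_case_width n = 4 * n

(* A string of [n] bytes is never longer printed in full than truncated
   when even its worst-case width is below the notice length. *)
let always_shorter_untruncated n = worst_case_width n < notice_length n
```

Under these assumptions the notice is at least 36 characters long, while 8 fully escaped bytes take 32 characters, which matches the 8-byte threshold chosen in the commit.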
oprint: Truncate strings only after 8 bytes
s390x: use 8 integer registers for parameter passing instead of 5.
POWER: use 16 integer registers for parameter passing instead of 8.
This makes tail calls more usable on these two platforms.
It brings them inline with ARM (8 param regs) and ARM64 and RISC-V (16 param regs).
The build system is vast enough without two names for the same thing!
Fix manpages build when libraries are disabled
gasche added 18 commits October 26, 2021 17:13
(suggestion from Konstantin Romanov)
  Error when parsing the following phrase:
  let[@tail_mod_cons rec bind (f : 'a -> 'a tree) (t : 'a tree) : 'a tree =
  match t with
  | Leaf v -> (f[@tailcall false]) v
  | Node (left, right) -> Node (bind f left, (bind[@tailcall]) f right)
I'm not sure why I need to repeat the type definition from the
previous phrase, but I guess I need to go re-read the documentation of
caml_example to understand how state is preserved.

Nothing particularly worrying here.
  Uncaught exception: Failure("lexing: empty token")
  Fatal error: exception Failure("lexing: empty token")
  Error when evaluating a caml_example environment in extensions/tail_mod_cons.etex, line 176:
  Textual transforms must be well-separated.
  The "underline" transform spanned the interval 113-157,
  intersecting with another "underline" transform  on the 124-138 interval.
  Hind: did you try to elide a code fragment which raised a warning?
    Unknown caml_example option: [ok].
    Supported options are "ok","error", or "warning=n" (with n a warning number).
    Error when evaluating a caml_example environment in extensions/tail_mod_cons.etex, line 176:
    Textual transforms must be well-separated.
    The "underline" transform spanned the interval 113-157,
    intersecting with another "underline" transform  on the 124-138 interval.
    Hind: did you try to elide a code fragment which raised a warning?
@Octachron
Owner

I will have a look. The most worrying part is the intersecting underlining: this is supposed to mean that more than one warning is trying to underline the same location, and I don't remember whether there is currently any way to turn off this error.
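
To illustrate what "well-separated" could mean here, the following sketch checks a list of intervals for overlaps. It is not the caml-tex implementation: the `span` type, the function names, and the choice of closed intervals are all assumptions made for the example.

```ocaml
(* Hypothetical representation: a transform covers the closed
   interval [start, stop] of character offsets. *)
type span = { start : int; stop : int }

(* Two closed intervals intersect when each starts before the other ends. *)
let intersects a b = a.start <= b.stop && b.start <= a.stop

(* After sorting by start offset, it is enough to compare neighbours. *)
let well_separated spans =
  let sorted = List.sort (fun a b -> compare a.start b.start) spans in
  let rec check = function
    | a :: (b :: _ as rest) -> not (intersects a b) && check rest
    | _ -> true
  in
  check sorted
```

With the intervals from the error message, `well_separated [{ start = 113; stop = 157 }; { start = 124; stop = 138 }]` is `false`, because the second interval is nested inside the first.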

@gasche
Author

gasche commented Oct 31, 2021

Let me first clarify that, by "short term", I certainly did not mean on a Sunday (nor: on a holiday tomorrow)!

What you say about the error makes sense; the tail-mod-cons implementation does often have overlapping warnings.

@Octachron
Owner

The intersection error should be fixed by ocaml#10746.
The remaining errors come from missing section labels and missing warning numbers:

--- a/manual/src/refman/extensions/tail_mod_cons.etex
+++ b/manual/src/refman/extensions/tail_mod_cons.etex
@@ -203,7 +203,7 @@ let[@tail_mod_cons] rec map_vars f exp =
     Let ((f v, (map_vars[@tailcall]) f def), (map_vars[@tailcall]) f body)
 \end{caml_example*}
 

-\subsection{Danger: getting out of tail-mod-cons}
+\subsection{ss:tmc_danger}{Danger: getting out of tail-mod-cons}
 
 Due to the nature of the tail-mod-cons transformation
 (see Section~\ref{ss:details} for a presentation of transformation):
@@ -254,7 +254,7 @@ The same warning occurs when "append_flatten" is a non-tail-mod-cons
 function of the same recursive group; using the tail-mod-cons
 transformation is a property of individual functions, not whole
 recursive groups.
-\begin{caml_example*}{verbatim}[warning]
+\begin{caml_example*}{verbatim}[warning=71]
 let[@tail_mod_cons] rec flatten = function
 | [] -> []
 | xs :: xss ->
@@ -283,7 +283,7 @@ Non-recursive functions can also be marked "[\@tail_mod_cons]"; this is
 typically useful for local bindings to recursive functions.
 
 Incorrect version:
-\begin{caml_example*}{verbatim}[warning]
+\begin{caml_example*}{verbatim}[warning=51,warning=71]
 let[@tail_mod_cons] rec map_vars f exp =
   let self exp = map_vars f exp in
   match exp with
@@ -308,7 +308,7 @@ parameter (the transformation only works with calls to
 known functions).
 
 For example, consider a substitution function on binary trees:
-\begin{caml_example*}{verbatim}[warning]
+\begin{caml_example*}{verbatim}[warning=72]
 type 'a tree = Leaf of 'a | Node of 'a tree * 'a tree
 
 let[@tail_mod_cons] rec bind (f : 'a -> 'a tree) (t : 'a tree) : 'a tree =
@@ -332,8 +332,7 @@ let[@tail_mod_cons] rec bind (f : 'a -> 'a tree) (t : 'a tree) : 'a tree =
   | Node (left, right) -> Node (bind f left, (bind[@tailcall

@gasche
Author

gasche commented Nov 26, 2021

@Octachron I'm revisiting this now. Thanks to your help and your location-intersection-support PR, I will have a working manual section and am unblocked. But I still think the error reporting in the other cases should be fixed (it's a UX problem if I need to ask the caml-tex maintainer to understand what's going on); it's not high-priority, but it matters. I propose that we look at this together the next time we meet in person and no one has to run urgently.

@gasche
Author

gasche commented Nov 26, 2021

Specifically the following error outputs need fixing:

  Error when parsing the following phrase:
  let[@tail_mod_cons rec bind (f : 'a -> 'a tree) (t : 'a tree) : 'a tree =
  match t with
  | Leaf v -> (f[@tailcall false]) v
  | Node (left, right) -> Node (bind f left, (bind[@tailcall]) f right)

(missing line numbers)

    Unknown caml_example option: [ok].
    Supported options are "ok","error", or "warning=n" (with n a warning number).

"ok" should work, and there should be a cue about the multi-warning case because I didn't know about it before your message above.

  Uncaught exception: Failure("lexing: empty token")
  Fatal error: exception Failure("lexing: empty token")

wat?
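
Regarding the `caml_example` option message above, here is a sketch of how a comma-separated option list such as `warning=51,warning=71` could be parsed. It is not caml-tex's code: the `expected` type and both functions are hypothetical, and only the three option forms named in the error message are handled.

```ocaml
(* Hypothetical type for the expected outcome of a caml_example phrase. *)
type expected =
  | Expect_ok
  | Expect_error
  | Expect_warning of int

(* Parse a single option: "ok", "error", or "warning=n". *)
let parse_option s =
  match String.split_on_char '=' (String.trim s) with
  | [ "ok" ] -> Some Expect_ok
  | [ "error" ] -> Some Expect_error
  | [ "warning"; n ] ->
    Option.map (fun n -> Expect_warning n) (int_of_string_opt n)
  | _ -> None

(* Parse a comma-separated list; fail as a whole if any option is unknown. *)
let parse_options s =
  let rec all acc = function
    | [] -> Some (List.rev acc)
    | x :: rest ->
      (match parse_option x with
       | Some o -> all (o :: acc) rest
       | None -> None)
  in
  all [] (String.split_on_char ',' s)
```

Returning `None` for the whole list keeps the sketch short; a real tool would more likely report which option was rejected and on which line.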

@Octachron
Owner

I agree that those error messages should be fixed.
We could definitely do that during a synchronous hacking session.

Octachron pushed a commit that referenced this pull request Jul 26, 2024
…l#13294)

The toplevel printer detects cycles by keeping a hashtable of values
that it has already traversed.

However, some OCaml runtime types (at least bigarrays) may be
partially uninitialized, and hashing them at arbitrary program points
may read uninitialized memory. In particular, the OCaml testsuite
fails when running with a memory-sanitizer enabled, as bigarray
printing results in reads to uninitialized memory:

```
==133712==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4e6d11 in caml_ba_hash /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45
    #1 0x52474a in caml_hash /var/home/edwin/git/ocaml/runtime/hash.c:251:35
    #2 0x599ebf in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1065:14
    #3 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    #4 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    #5 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #6 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

  Uninitialized value was created by a heap allocation
    #0 0x47d306 in malloc (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x47d306) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)
    #1 0x4e7960 in caml_ba_alloc /var/home/edwin/git/ocaml/runtime/bigarray.c:246:12
    #2 0x4e801f in caml_ba_create /var/home/edwin/git/ocaml/runtime/bigarray.c:673:10
    #3 0x59b8fc in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
    #4 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    #5 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    #6 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #7 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #8 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45 in caml_ba_hash
```

The only use of hashing in genprintval is to avoid cycles, that is, it
is only useful for OCaml values that contain other OCaml values
(including possibly themselves). Bigarrays cannot introduce cycles,
and they are always printed as "<abstr>" anyway.

The present commit proposes to be more conservative in which values
are hashed by the cycle detector to avoid this issue: we skip hashing
any value with tag above No_scan_tag -- which may not contain any
OCaml values.

Suggested-by: Gabriel Scherer <gabriel.scherer@gmail.com>

Signed-off-by: Edwin Török <edwin.torok@cloud.com>
Co-authored-by: Edwin Török <edwin.torok@cloud.com>
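
The idea in the commit message above can be sketched with the `Obj` module. This is an illustration rather than the code of the referenced commit: the function name is hypothetical, and the test is assumed to be "tag strictly below `Obj.no_scan_tag`".

```ocaml
(* Hypothetical predicate: only blocks whose tag is below No_scan_tag can
   contain other OCaml values, so only those can take part in a cycle and
   need to be recorded by a cycle detector. *)
let may_contain_values (v : Obj.t) : bool =
  Obj.is_block v && Obj.tag v < Obj.no_scan_tag
```

Under this assumption, immediate integers, strings, boxed floats and custom blocks such as bigarrays are all skipped, so their contents are never hashed.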
@Octachron Octachron closed this Oct 18, 2024
Octachron pushed a commit that referenced this pull request Jul 30, 2025
Move the orphaned ephemerons GC colour check inside the barrier.