Skip to content

ocamldoc: thread-safety tags#10983

Closed
Octachron wants to merge 2 commits intoocaml:trunkfrom
Octachron:ocamldoc_multicore_tags
Closed

ocamldoc: thread-safety tags#10983
Octachron wants to merge 2 commits intoocaml:trunkfrom
Octachron:ocamldoc_multicore_tags

Conversation

@Octachron
Copy link
Copy Markdown
Member

This PR adds basic support for the following ocamldoc documentation tags intended to be used to document thread-safety in OCaml 5:

  • @domain-safe
  • @systhread-safe
  • @fiber-safe
  • @concurrent-unsafe

or equivalently

  • @concurrency {domain-safe,systhread-safe,fiber-safe,concurrent-unsafe}

Those new tags are currently only displayed in the html backend, while waiting for discussion on the design. In particular, I hope that having a PR open will help to synchronize the design between both ocamldoc and odoc (cc @ocaml/odoc-dev)

In particular, currently a default @concurrent-unsafe tag is displayed for modules and module types if there are not annotated with an explicit thread-safety tag.

There is also the question if we want those tags to be displayed as ordinary documentation tags or have some more custom rendering at least on the html backend.

This commit adds basic support for the following tags intended to be used
to document concurrency-safety in OCaml 5:

@domain-safe
@systhread-safe
@fiber-safe
@concurrent-unsafe

or equivalently

@concurrency {domain-safe,systhread-safe,fiber-safe,concurrent-unsafe}

Those new tags are currently only displayed in the html backend,
while waiting for discussion on the design.

In particular, currently a default @concurrent-unsafe tag is displayed
for module and module types in there is no explicit concurrency tag.
We probably don't want to do that on individual items.
@dbuenzli
Copy link
Copy Markdown
Contributor

dbuenzli commented Feb 2, 2022

Two comments:

  1. I think I would rather like to have safe as the default and thus only have markings for unsafe stuff. Marking safe stuff is going to be incredibly noisy if you are a functional programmer.

  2. I don't think it's a good idea to add new ocamldoc tags. These things should be attributes (like @raise, @since @before should in fact be). This allows tooling to act upon these. odoc should then simply lift them into doc strings, like it should be doing for @@ocaml.deprecated which is at the moment completly unDRY as you have to make up both an alert and a doc string with the same content when you want to deprecated stuff.

@Julow
Copy link
Copy Markdown
Contributor

Julow commented Feb 2, 2022

I completely agree with @dbuenzli. Documentation tags and attributes are analogous and this might cause frustration if use cases are found for one in the future and people use the other.

Odoc currently doesn't read any attributes (other than the comments) and this should absolutely be changed.

Only attributes are in the parse tree. The syntax of tags is a bit easier than the syntax of attributes and is already in use, perhaps ocamldoc tags could be parsed by OCaml and inserted as attributes in the parsetree ?

@Octachron
Copy link
Copy Markdown
Member Author

I rather disagree, compiler attributes are meant for the compiler, and other code processing tools. A @since documentation tag is not consumed by the compiler in any way and should not be a attribute.

Documentation tags answer a separate concern and should not be mixed with compiler annotations because they are both annotations in a source file.

It would make sense for odoc to read attributes and integrate them to the documentation when there is no matching documentation tag, but that is a potential odoc feature. Moreover, it seem better to keep the separation of concerns even in those case: there is no reason to lock the deprecation tag information to be exactly the the deprecation message raised by the deprecation alert.

@dbuenzli
Copy link
Copy Markdown
Contributor

dbuenzli commented Feb 3, 2022

I rather disagree, compiler attributes are meant for the compiler, and other code processing tools.

Precisely :-) all these annotations are potentially useful for code analysis tools. Do you really want the next 100 phd students in OCaml static analysis to lose their time to parse ocamldoc comments to grab that information ?

A @since documentation tag is not consumed by the compiler in any way and should not be a attribute.

Well it should be consumed by the compiler ! There's no reason why we should discover through painful CI systems/opam repo requests when we use something too new while we target a lower version of OCaml (or a library). The compiler is in a perfect place to warn us about these trivialities if we can indicate it the minimal version numbers we have to target.

Moreover, it seem better to keep the separation of concerns even in those case: there is no reason to lock the deprecation tag information to be exactly the the deprecation message raised by the deprecation alert.

Which concerns are being separated exactly ? You can always be more flexible, but that doesn't mean it's useful. For now the system we have is just very annoying and non DRY, I have to repeat the same message both in the alert and in the documentation comment.

@dbuenzli
Copy link
Copy Markdown
Contributor

I was thinking again about this today and to make things more precise I think the questions to ask here are:

  1. How do we communicate thread (un)safety to users ?
  2. Do we want that information to be machine readable ?

Regarding 1., for me OCaml has always been a language safe by default. Never in my documentation life I indicated in my docs that something was thread-safe. I always added a warning when something was not thread-safe (example here). So I think the naming scheme proposed here is wrong. It should all be about unsafety not about safety. Safe should be the default like the OCaml language has always been.

Regarding 2. as soon as you want something to be machine readable for tooling (not necessarily the compiler itself), ocamldoc comments are not the right place to put information in, attributes are. If the machine readable information also needs to be communicated in the end user docs the corresponding attributes should simply be lifted into documentation rendering by the documentation tool.

@avsm
Copy link
Copy Markdown
Member

avsm commented Feb 10, 2022

On the topic of the nature of the tags, I agree with @dbuenzli that explicitly marking unsafety rather than marking safety would be better for the longer term. Documentation sticks around for the long term, and we want to converge on community code being safe by default.

I'm wondering about how to use these tags in a hierarchical way. It would be nice to be able to mark an entire module as unsafe while going through the reasons why it is, rather than tagging each individual function.

@Octachron
Copy link
Copy Markdown
Member Author

@avsm : we can already use tags at the module level (both with the ocamldoc tag and attribute variants) and function level.

Concerning, @dbuenzli 's first point: I tend to agree that having an unsafe default is probably too pessimistic.

Concerning the second point, I am still not convinced that we want the concurrency safety tags to be machine readable.
Does anyone has a clear use case in mind? A major part of my reluctance on that point is that currently ocamldoc doesn't support attributes at all. Adding last-minute support for attributes in ocamldoc in one of its last versions is not ideal in term of stability. Unfortunately, OCaml 5.0 seems too early for retiring ocamldoc in favor of odoc.

@dbuenzli
Copy link
Copy Markdown
Contributor

Does anyone has a clear use case in mind?

It could be as simple as a tool that indicates you all the places where you use thread-unsafe functions in a code base.

@edwintorok
Copy link
Copy Markdown
Contributor

edwintorok commented Feb 14, 2022

It would've been great if C function MT safety was available in a machine parseable way, e.g. currently the best I can think about is to take the Linux (3) manpages and look in the Attributes table (which although not meant to be machine parseable, is probably fairly regular) and generate a header file that would issue compiler warnings if a C stub attempts to use something that is not multi-thread safe (even though multithreading was already possible without multicore if a C stub never released its global OCaml runtime lock it wouldn't have run into multithreading issues).
Therefore I'd suggest that making thread safety in OCaml functions machine parseable to be a desirable feature: one could attempt to write some form of static analyzer that takes a look at an OCaml's function's safety annotation and the safety annotation of its C stub implementation and try to detect inconsistencies.

Similarly it'd be great if OCaml's interfacing with C documentation would have a table that documents which of the thread safety attributes of C library calls one should be careful of: https://man7.org/linux/man-pages/man7/attributes.7.html
It is not as simple as safe/unsafe, there are also conditionally-safe attributes, or functions that are safe to call only if you don't call any of the other MT-unsafe functions marked with 'const' (e.g. getenv is safe to call as long you don't call putenv anywhere in your code with multiple threads running)

Although I'd much prefer if the OCaml runtime was safe, do we need equivalents of some of the conditionally safe features in ocamldoc tags? E.g. looking at https://man7.org/linux/man-pages/man7/attributes.7.html again the following seem relevant:

  • init: there might be functions that you only call at program startup, and as long as you haven't started any additional domains yet it should be safe to call (e.g. Random.*init?). Although instead of an attribute I'd prefer if these functions raised an exception when called in unsafe context (i.e. with multiple domains already running)
  • race: I'd prefer if races could be eliminated, if not I'd consider them concurrency-unsafe
  • const: e.g. Unix.putenv, I'd prefer if this raises in a multi-domain context
  • sig: concurrency-unsafe, changing signal handlers in a multicore environment seems quite dangerous
  • term: nothing new here, calling these functions would've already been unsafe with systhreads without a Mutex, however they might be safe to call with a proper mutex.
  • cwd: depends what flags you call it with may or may not be safe
  • /condition: this might be a bit more complicated to check or surface in ocamldoc

Another reason to make these annotations machine-readable is possible interaction with a specification language such as https://ocaml-gospel.github.io/gospel/language/locations, where in order to prove your function correct you may also want to assert/prove it only calls functions with a given concurrency safety level (GOSPEL obviously doesn't have such an annotation yet, but it might be a desireable annotation to have in the future).

@gasche
Copy link
Copy Markdown
Member

gasche commented Feb 15, 2022

I think that there is a tension in this discussion between:

  • good-enough: We need a simple approach soon to update the Stdlib documentation before the first Multicore-enabled release (no ETA yet that I know of, but we don't want to wait too long).
  • best-practices: Whatever the compiler/stdlib does will be fairly influential within the ecosystem, other packages are likely to reuse the same approach, so we want the stdlib to do things right instead of being in a hurry.

I think the "best-practices" point is partly tamed by the fact that the stdlib uses ocamldoc, and most other packages have by now moved on to odoc, so at least the exact same approach of ocamldoc tags would probably not be reused. But the odoc people will have to decide if they want to do something, and they will also be in a hurry to release, so maybe they will do the same!

One issue with the "best-practices" point of view is that it's work, it takes more work, and currently no one has volunteered to do this work. On the other hand @Octachron has already done the "good-enough" work, and people actively contributing to the compiler right now tend to have their plates full already.

@Julow
Copy link
Copy Markdown
Contributor

Julow commented Feb 15, 2022

In every cases, odoc and ocamldoc have to be updated to recognize the annotations as both tools don't like new tags and both tools don't know alerts or other attributes.

I don't think it's that hard to change ocamldoc to recognize alerts. (we started the work in odoc here).
I volunteer to do the work.

@Octachron
Copy link
Copy Markdown
Member Author

The extractions of document comment in ocamldoc and odoc are currently done in completely different ways: ocamldoc is still extracting the documentation comment by itself and associate them to signature items after the fact. Thus I am not sure if you are volunteering for extending the ocamldoc document extractor to recognize attributes or for switching to a parsetree-based approach like odoc.

Note that I slowly considering that this might an opportunity to take the time to prepare ocamldoc for its retirement.

@Julow
Copy link
Copy Markdown
Contributor

Julow commented Feb 16, 2022

@Octachron You were right to warn me, it wasn't an easy modification! I've opened #11024 and took your first commit.

@Octachron
Copy link
Copy Markdown
Member Author

With #11024 on the point to be merged, it seems better to use attributes for thread safety. In fact, I am wondering if we should not directly use alerts for concurrency-safety now that they are exposed by the documentation.

@Octachron
Copy link
Copy Markdown
Member Author

Superseded by #11024.

@Octachron Octachron closed this Mar 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants