Concurrency safety documentation, part one: concurrency unsafe types#11193
Concurrency safety documentation, part one: concurrency unsafe types#11193Octachron merged 2 commits intoocaml:trunkfrom
Conversation
78350b9 to
0d86860
Compare
stdlib/buffer.mli
Outdated
| (** {b Concurrency safety} *) | ||
|
|
||
| [@@@alert concurrency_unsafe | ||
| "Concurrent use of buffers is a programming error." |
There was a problem hiding this comment.
Thanks for the PR to document the concurrency safety. Putting myself into the shoes of a user, I feel that the current documentation errs too much on the side of safety and makes the module unusable for reasonable use cases.
We are conflating concurrency and parallelism when we mean concurrency_unsafe. To be precise, we have three modes of concurrency in OCaml 5: fibers, systhreads and domains. OCaml 4 had 2: systhreads and monadic concurrency libraries such as Lwt and Async.
The observation is that if the user only uses fibers, with a library such as eio, then the Buffer module is safe. In particular, anything that is safe with Lwt (modulo the parts that uses systhreads -- Lwt_preemptive and others?) should remain safe with fibers.
There was a problem hiding this comment.
I could rename the section and alert to Parallelism safety or Domain safety? Thread safety might be confusing with systhreads being a concurrency mechanism.
Similarly, if we want to de-emphasize those warnings, I can remove the alert and move the section to the end of the module documentations (with an example of a potential issue?). That sounds sensible to me since I am not sure if people will ever enable the alert since it is such an imprecise tool. Note however, that the alert is disabled by default, thus the modules are still perfectly usable.
There was a problem hiding this comment.
If we use Parallelism safety then eventually we would need to decide what Concurrency safety will mean. Is that fibers or systhreads? Hence, a better option might be to name it Domain safety. CC @avsm who raised the issue.
Also, would it be possible to document individual functions with the safety tags? I'm thinking for modules like Arrays, only a few functions are domain unsafe.
We should also define what safety means. Does "safety" mean no crashes or something stronger like linearizability? @fpottier @gasche who may have opinions on this.
There was a problem hiding this comment.
It is possible to add alerts to individual functions (or documentation comments) when needed.
For arrays, do you meant the blit function?
In term of users documentation, iI was draw to a hierarchy of the form:
- no crashes
- no broken abstractions
- linearizability
since corrupting a data structure (breaking abstractions and expected invariants) is a surprising result for a sequential OCaml programmer.
If I am not mistaken, the first level of safety ought to be guarantee by the language, so it is better documented in the future memory model section of the manual to focus on module specific in the module-by-module documentation.
There was a problem hiding this comment.
For arrays, do you meant the blit function?
I did have the Array.blit function in mind. In particular, Array.blit on float arrays does not respect the memory model.
Lines 345 to 356 in 66128ba
|
I've re-reviewed the PR. I feel that the recommendation such as this one: is too coarse-grained and suffers the risk of misinterpretation in multiple ways. It is unclear that the intended meaning of "concurrent use" is "accessing a buffer from multiple domains without a mutex protecting the buffer". In the parallel programming literature, "concurrent use" does not mean necessarily mean accessing without a mutex. I also agree with the earlier comment that "thread-safety" is not an appropriate term given that we have 3 kinds of threading -- domains, systhreads and user-level threads. The challenge here is that we do not have a widely accepted terminology for ready use. I don't have a good answer to this. What do you think @gasche @fpottier @stedolan? What is the purpose of disabled-by-default alerts? When are the users likely to encounter these? If these are disabled by default, is their purpose to serve as documentation to be read by the users in the API docs? I don't know how alerts work in the compiler. Does building libraries using |
|
Error as warnings don't affect alerts. The intended purpose of the alerts was both for documentation purpose and to give users the option to selectively enable the alert in a limited scope. |
|
After discussing with @Octachron privately, I agree that trying to come up with precise documentation here is not the goal. |
|
Re. "concurrent use": one could say "racy concurrent use" (precise for people familiar with concurrent races, weird for everyone else), or "non-synchronized concurrent use", or "unsafe concurrent use". |
|
"Unsynchronized concurrent access" would also work |
|
|
|
What about establishing |
This sounds good to me. |
|
"(Un)synchronized access" is also the term we use when teaching Concurrent & Distributed Systems in Cambridge. The "concurrent" is dropped as synchronisation is rarely useful in a sequential setting (with callback-driven code being the exception). |
|
I like that the |
6aa3a7c to
96fb2d5
Compare
stdlib/weak.mli
Outdated
|
|
||
| (** {b Unsynchronized accesses} | ||
|
|
||
| Unsynchronized accesses of weak hash sets is a programming error. |
There was a problem hiding this comment.
The alert is missing here. Any particular reason for this?
There was a problem hiding this comment.
The alert should be on the module produced by the Make functor which is not currently supported as far as I can see.
stdlib/buffer.mli
Outdated
| ] | ||
|
|
||
| (** | ||
| Unsynchronized accesses might break module abstraction and create invalid |
There was a problem hiding this comment.
"Breaking module abstraction" sounds quite abstract. Is that needed?
Can we perhaps go with
Unsynchronized access to a buffer may lead to an invalid buffer state. Thus, concurrent uses of a buffer must be synchronized (for instance with a {!Mutex.t}).
I'm also avoiding the term "sequentialized" as it is unclear what that means.
A similar change needs to be made in other places.
There was a problem hiding this comment.
Creating an invalid state is a better phrasing indeed. I would rather avoid switching between access and uses between the two sentence. It is also not clear to me if there a case where a single unsynchronized access can create an invalid buffer state, thus accesses seems better to me:
Unsynchronized accesses to a buffer may lead to an invalid buffer state. Thus, concurrent accesses to a buffer must be synchronized (for instance with a {!Mutex.t}).
?
We can go with |
|
I don't know: a more explicit name for the alert would I fear that with Maybe a better shorter name would be |
|
Hmm. In a concurrent setting, which is where the term unsynchronized makes sense, unsynchronized access is always unsafe. So Taking a step back, what is the purpose of the alert? Is it to enable users to optionally enable this alert to discover whether they are using any "concurrency unsafe" code? Presumably, the users are enabling this alert for their concurrent or parallel code. In this case, |
|
The workflow that I am considering for the alert would look like:
But then, outside of false positives, the I think that I was trying too hard to harden the name against the false positive cases by taking in account that when enabling the alert on a new module, there will always be some false positives. I would thus propose to go with |
|
Sounds good to me. Let's go with |
96fb2d5 to
7c79659
Compare
Concurrency safety documentation, part one: concurrency unsafe types (cherry picked from commit db5949f)
|
Cherry-picked on 5.0 as 616fa13 . |
|
I get this alert in a CI build on macOS while running ocamldoc: https://github.com/ocaml-multicore/ocaml-tsan/actions/runs/3751404038/jobs/6372347665 |
|
@gadmm made an excellent point on Discuss: when using a datastructure, it is not necessarily obvious which operations are read-only and which operations may mutate the structure, and this information is necessary to reason about whether a given usage is racy. (Read/read operations in parallel are not racy.) I think that we should consider documenting explicitly which operations act as read-only on our datastructures. For example:
|
This seems like it should be fixed. It looks like a possibly premature optimization in the code. |
|
There is something that I don't understand about the unsynchronized_access alert:
|
|
I feart that this is a rebasing mistake on my side: the alert should also be added to the module aliases exported by the |
|
Ah, so this is basically #11867 making our life painful. |
This PR adds a warning about concurrency safety in the documentation of the modules:
This PR focuses on types for which module abstraction is broken on concurrent uses (excluding channels that are under work right now). I am planning to have a separate PR for at least modules with global mutable states, and a third one for more subtle types like arrays and
Lazy.t.Except for the
OoandWeakmodule, I have added a new globalconcurrency_unsafealert to the documented module. This alert is currently disabled by default to make it less invasive.I have tried to have a standardized warning note on all modules, at the cost of a small rewrite of the warning note newly added in
Scanfby @OlivierNicole .