Use "error checking" mutexes in the threads library#9846
Use "error checking" mutexes in the threads library#9846xavierleroy merged 14 commits intoocaml:trunkfrom
Conversation
070affa to
75a2a82
Compare
gadmm
left a comment
There was a problem hiding this comment.
A small comment about st_mutex_create. I did not review the win32 version.
|
The "asynchronous actions" test, taken from #9757, is failing with C stack overflows on some platforms, probably because of a bug in the runtime system, see my comment 15b9d00#r41619894 . After several attempts at making this test more robust, I dropped it. |
dra27
left a comment
There was a problem hiding this comment.
I made various comments, but the implementation looks correct to me. I have one concern about the Windows implementation, mainly as it looks like a possible side-channel attack.
If a thread terminates without releasing a mutex it holds, I believe the errorcheck semantics are that nothing will ever lock that mutex again?
For Windows, the behaviour of the critical section underpinning the mutex is undefined in that situation. Worse, thread and process IDs are global and can be re-used - so if Windows did allow an abandoned critical section to be re-locked by another thread (I'm not sure that it doesn't already), and an external agent manipulated the system such that the thread ID had been reused within that process, then the OCaml invariant on m->owner would be broken.
That makes me wonder if we shouldn't rely on GetCurrentThreadId but should allocate a unique ID of our own and store that in Tls? It wouldn't even waste a Tls slot if we ultimate convert condition variables to Windows native condition variables, since the slot used for the Event object would be "freed".
|
Thanks a lot for the review. I pushed some updates. Now we need to discuss whether this semantics for mutexes is the one we want. (It is the one I prefer.) |
|
Regarding my comment about |
|
For the semantics, can we offer both? It would just require the Unix implementation to include the OCaml thread ID as well? |
|
In case the recursive nature of Windows mutexes happened to be important to some, I had thought indeed that one could add an optional argument to Mutex.create to select the desired semantics, to help with code adaptation. Though I imagine that for the semaphore semantics you would prefer to introduce them as a separate module. |
Sorry I missed that. We could probably use the OCaml thread ID, with a small change of internal API (some st_posix.h and st_win32.h functions would take the thread ID as extra argument). It might even be more efficient than calling The general problem of a thread that terminates while holding mutexes remains. Some PThreads implementations have mutex attributes that cause pthread_mutex_lock to fail with a distinctive error code when trying to lock a mutex owned by a terminated thread, so that recovery can be attempted. |
|
Multiple "à la carte" semantics of mutexes (strict checking, recursive, semaphore-like) could probably be implemented, but at some conceptual cost (will users know which one to use?) and some performance cost too (it would be less efficient than just a POSIX thread mutex). Although I need to investigate alternate implementations that could take advantage of the master lock. |
|
We currently rely on the semaphore-like semantics of If you do decide to go ahead with this change – and it seems like a good idea to me – we'd very much appreciate a binary semaphore implementation to go along with it. We could then use a shim library that selects the right API according to the current OCaml version. Regarding the conceptual cost of multiple mutex implementations: I suspect that good documentation would make confusion unlikely. Thanks for your efforts :-) (CC: @icristescu, @pascutto, @samoht.) |
Done as suggested, commit 0d2200a. I tried several designs, including one where the OCaml thread identifier is used, but this design looks clearer to me. I would like to merge this PR within one week. A final review would be most welcome. |
gadmm
left a comment
There was a problem hiding this comment.
Minor comments; I did not review the win32 implementation.
Given the various incompatibilities reported, I am still on the fence as to whether it is not more prudent to keep a fast/legacy behaviour as the default, and let users choose/evolve their mutex variant explicitly with a flag to Mutex.create.
- Mutex.lock raises Sys_error if the mutex is already locked by the calling thread. - Mutex.unlock raises Sys_error if the mutex is unlocked or locked by another thread.
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Leftover code from an unrelated experiment.
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Identifiers returned by GetCurrentThreadId() can be reused early. If a thread stops while holding a mutex, and its thread ID is reused for another thread, the mutex could be unlocked by this other thread, which is undefined behavior. To make this scenario even less likely, this PR uses the OCaml thread identifier (a linearly-allocated integer) instead of the result of GetCurrentThreadId().
11618e2 to
7464511
Compare
7464511 to
b0291c0
Compare
dra27
left a comment
There was a problem hiding this comment.
There's a minor nit in a comment, and so while I was there I dealt with the check-typo nits (especially as they were in a previous suggestion of mine ... we need to persuade GitHub to run our AWK scripts in the UI 😉)
I see the problem in the earlier implementation calling TlsSetValue in the wrong thread - but couldn't the initialisation instead have moved caml_thread_start (perhaps with the introduction of an OS-specific st_thread_startup call or something immediately after the st_tls_set(thread_descriptor_key? Or was there something else I've missed? I haven't tried to see if it's measurable, but it seems the fast path for locking/unlocking has gone from an OS-optimised call (either GetCurrentThreadId or a direct call TlsGetValue) to something potentially slower, given the indirection?
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
|
Thanks for your help fighting check-typo!
Right, this would have been another way to go about it. But after much back-and-forth I decided it would duplicate in st_win32.h much of the thread ID machinery already present in st_stubs.c. Plus, OCaml thread identifiers are OCaml tagged integers, so they are never 0...
I would expect the cost of the extra indirection to be small compared with the cost of the call to TlsGetValue or to GetCurrentThreadId. |
|
Thanks for the reviews. It's time to cross that bridge... |
This is an alternate implementation of #9757. The purpose is the same:
Mutex.lockraisesSys_errorif the mutex is already locked by the calling thread.Mutex.unlockraisesSys_errorif the mutex is unlocked or locked by another thread.The main difference with #9757 is the Win32 implementation, which I hope is correct, and some more testing.