
Use "error checking" mutexes in the threads library#9846

Merged
xavierleroy merged 14 commits into ocaml:trunk from xavierleroy:strict-mutex-1
Oct 5, 2020

Conversation

@xavierleroy
Contributor

This is an alternate implementation of #9757. The purpose is the same:

  • Make sure that Mutex.lock raises Sys_error if the mutex is already locked by the calling thread.
  • Make sure that Mutex.unlock raises Sys_error if the mutex is unlocked or locked by another thread.

The main difference with #9757 is the Win32 implementation, which I hope is correct, and some more testing.

Contributor

@gadmm gadmm left a comment


A small comment about st_mutex_create. I did not review the win32 version.

@xavierleroy
Contributor Author

The "asynchronous actions" test, taken from #9757, is failing with C stack overflows on some platforms, probably because of a bug in the runtime system, see my comment 15b9d00#r41619894 . After several attempts at making this test more robust, I dropped it.

Member

@dra27 dra27 left a comment


I made various comments, but the implementation looks correct to me. I have one concern about the Windows implementation, mainly because it looks open to a possible side-channel attack.

If a thread terminates without releasing a mutex it holds, I believe the errorcheck semantics are that nothing will ever lock that mutex again?

For Windows, the behaviour of the critical section underpinning the mutex is undefined in that situation. Worse, thread and process IDs are global and can be re-used - so if Windows did allow an abandoned critical section to be re-locked by another thread (I'm not sure that it doesn't already), and an external agent manipulated the system such that the thread ID had been reused within that process, then the OCaml invariant on m->owner would be broken.

That makes me wonder if we shouldn't rely on GetCurrentThreadId, but should instead allocate a unique ID of our own and store that in Tls? It wouldn't even waste a Tls slot if we ultimately convert condition variables to Windows native condition variables, since the slot used for the Event object would be "freed".

@xavierleroy
Contributor Author

Thanks a lot for the review. I pushed some updates. Now we need to discuss whether this semantics for mutexes is the one we want. (It is the one I prefer.)

@dra27
Member

dra27 commented Sep 9, 2020

Regarding my comment about GetCurrentThreadId - we of course do have unique thread IDs of our own. Could we use it instead?

@dra27
Member

dra27 commented Sep 9, 2020

For the semantics, can we offer both? It would just require the Unix implementation to include the OCaml thread ID as well?

@gadmm
Contributor

gadmm commented Sep 9, 2020

In case the recursive nature of Windows mutexes happens to matter to some users, I had indeed thought that one could add an optional argument to Mutex.create to select the desired semantics, to ease code adaptation. Though I imagine that for the semaphore semantics you would prefer to introduce a separate module.

@xavierleroy
Contributor Author

Regarding my comment about GetCurrentThreadId - we of course do have unique thread IDs of our own. Could we use it instead?

Sorry I missed that. We could probably use the OCaml thread ID, with a small change of internal API (some st_posix.h and st_win32.h functions would take the thread ID as extra argument). It might even be more efficient than calling GetCurrentThreadId. I'll look into it.

The general problem of a thread that terminates while holding mutexes remains. Some PThreads implementations have mutex attributes that cause pthread_mutex_lock to fail with a distinctive error code when trying to lock a mutex owned by a terminated thread, so that recovery can be attempted.

@xavierleroy
Contributor Author

Multiple "à la carte" semantics of mutexes (strict checking, recursive, semaphore-like) could probably be implemented, but at some conceptual cost (will users know which one to use?) and some performance cost too (it would be less efficient than a plain POSIX thread mutex). I still need to investigate alternative implementations that could take advantage of the master lock.

@craigfe
Contributor

craigfe commented Sep 13, 2020

We currently rely on the semaphore-like semantics of Mutex.unlock in certain (Linux-only) Mirage libraries, in order to pass ownership of resources to background threads. Indeed, I wasn't aware that the standard library's mutex didn't explicitly support this until seeing @xavierleroy's post on discuss about it.

If you do decide to go ahead with this change – and it seems like a good idea to me – we'd very much appreciate a binary semaphore implementation to go along with it. We could then use a shim library that selects the right API according to the current OCaml version. Regarding the conceptual cost of multiple mutex implementations: I suspect that good documentation would make confusion unlikely.

Thanks for your efforts :-)

(CC: @icristescu, @pascutto, @samoht.)

@xavierleroy
Contributor Author

That makes me wonder if we shouldn't rely on GetCurrentThreadId but should allocate a unique ID of our own and store that in Tls?

Done as suggested, commit 0d2200a. I tried several designs, including one where the OCaml thread identifier is used, but this design looks clearer to me.

I would like to merge this PR within one week. A final review would be most welcome.

Contributor

@gadmm gadmm left a comment


Minor comments; I did not review the win32 implementation.

Given the various incompatibilities reported, I am still on the fence as to whether it would be more prudent to keep the fast/legacy behaviour as the default, and let users choose/evolve their mutex variant explicitly with a flag to Mutex.create.

xavierleroy and others added 9 commits October 1, 2020 13:58
- Mutex.lock raises Sys_error if the mutex is already locked by the
  calling thread.
- Mutex.unlock raises Sys_error if the mutex is unlocked or locked
  by another thread.
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Leftover code from an unrelated experiment.
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Identifiers returned by GetCurrentThreadId() can be reused early.
If a thread stops while holding a mutex, and its thread ID is reused
for another thread, the mutex could be unlocked by this other thread,
which is undefined behavior.

To make this scenario even less likely, this PR uses the OCaml thread
identifier (a linearly-allocated integer) instead of the result of
GetCurrentThreadId().
Member

@dra27 dra27 left a comment


There's a minor nit in a comment, and so while I was there I dealt with the check-typo nits (especially as they were in a previous suggestion of mine ... we need to persuade GitHub to run our AWK scripts in the UI 😉)

I see the problem in the earlier implementation calling TlsSetValue in the wrong thread - but couldn't the initialisation instead have moved to caml_thread_start (perhaps with the introduction of an OS-specific st_thread_startup call, or something immediately after the st_tls_set(thread_descriptor_key, ...) call)? Or was there something else I've missed? I haven't tried to see if it's measurable, but it seems the fast path for locking/unlocking has gone from an OS-optimised call (either GetCurrentThreadId or a direct TlsGetValue call) to something potentially slower, given the indirection?

xavierleroy and others added 3 commits October 2, 2020 19:13
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
@xavierleroy
Contributor Author

Thanks for your help fighting check-typo!

I see the problem in the earlier implementation calling TlsSetValue in the wrong thread - but couldn't the initialisation instead have moved to caml_thread_start (perhaps with the introduction of an OS-specific st_thread_startup call, or something immediately after the st_tls_set(thread_descriptor_key, ...) call)? Or was there something else I've missed?

Right, this would have been another way to go about it. But after much back-and-forth I decided it would duplicate in st_win32.h much of the thread ID machinery already present in st_stubs.c. Plus, OCaml thread identifiers are OCaml tagged integers, so they are never 0...

I haven't tried to see if it's measurable, but it seems the fast path for locking/unlocking has gone from an OS-optimised call (either GetCurrentThreadId or a direct call TlsGetValue) to something potentially slower, given the indirection?

I would expect the cost of the extra indirection to be small compared with the cost of the call to TlsGetValue or to GetCurrentThreadId.

@xavierleroy
Contributor Author

Thanks for the reviews. It's time to cross that bridge...
