
Use "error checking" mutexes in the threads library#9846

Merged
xavierleroy merged 14 commits into ocaml:trunk from xavierleroy:strict-mutex-1
Oct 5, 2020

Conversation

@xavierleroy
Contributor

This is an alternate implementation of #9757. The purpose is the same:

  • Make sure that Mutex.lock raises Sys_error if the mutex is already locked by the calling thread.
  • Make sure that Mutex.unlock raises Sys_error if the mutex is unlocked or locked by another thread.

The main difference with #9757 is the Win32 implementation, which I hope is correct, and some more testing.

Contributor

@gadmm gadmm left a comment


A small comment about st_mutex_create. I did not review the win32 version.

@xavierleroy
Contributor Author

The "asynchronous actions" test, taken from #9757, is failing with C stack overflows on some platforms, probably because of a bug in the runtime system, see my comment 15b9d00#r41619894 . After several attempts at making this test more robust, I dropped it.

Member

@dra27 dra27 left a comment


I made various comments, but the implementation looks correct to me. I have one concern about the Windows implementation, mainly because it looks open to a possible side-channel attack.

If a thread terminates without releasing a mutex it holds, I believe the errorcheck semantics are that nothing will ever lock that mutex again?

For Windows, the behaviour of the critical section underpinning the mutex is undefined in that situation. Worse, thread and process IDs are global and can be re-used - so if Windows did allow an abandoned critical section to be re-locked by another thread (I'm not sure that it doesn't already), and an external agent manipulated the system such that the thread ID had been reused within that process, then the OCaml invariant on m->owner would be broken.

That makes me wonder if we shouldn't rely on GetCurrentThreadId, but should instead allocate a unique ID of our own and store that in Tls? It wouldn't even waste a Tls slot if we ultimately convert condition variables to Windows native condition variables, since the slot used for the Event object would be "freed".

@xavierleroy
Contributor Author

Thanks a lot for the review. I pushed some updates. Now we need to discuss whether this semantics for mutexes is the one we want. (It is the one I prefer.)

@dra27
Member

dra27 commented Sep 9, 2020

Regarding my comment about GetCurrentThreadId - we of course do have unique thread IDs of our own. Could we use it instead?

@dra27
Member

dra27 commented Sep 9, 2020

For the semantics, can we offer both? It would just require the Unix implementation to include the OCaml thread ID as well?

@gadmm
Contributor

gadmm commented Sep 9, 2020

In case the recursive nature of Windows mutexes happens to matter to some users, I had indeed thought that one could add an optional argument to Mutex.create to select the desired semantics, to ease code adaptation. Though I imagine that for the semaphore semantics you would prefer to introduce a separate module.

@xavierleroy
Contributor Author

Regarding my comment about GetCurrentThreadId - we of course do have unique thread IDs of our own. Could we use it instead?

Sorry I missed that. We could probably use the OCaml thread ID, with a small change of internal API (some st_posix.h and st_win32.h functions would take the thread ID as extra argument). It might even be more efficient than calling GetCurrentThreadId. I'll look into it.

The general problem of a thread that terminates while holding mutexes remains. Some PThreads implementations have mutex attributes that cause pthread_mutex_lock to fail with a distinctive error code when trying to lock a mutex owned by a terminated thread, so that recovery can be attempted.

@xavierleroy
Contributor Author

Multiple "à la carte" semantics of mutexes (strict checking, recursive, semaphore-like) could probably be implemented, but at some conceptual cost (will users know which one to use?) and some performance cost too (it would be less efficient than a plain POSIX thread mutex). I still need to investigate alternative implementations that could take advantage of the master lock.

@craigfe
Contributor

craigfe commented Sep 13, 2020

We currently rely on the semaphore-like semantics of Mutex.unlock in certain (Linux-only) Mirage libraries, in order to pass ownership of resources to background threads. Indeed, I wasn't aware that the standard library's mutex didn't explicitly support this until seeing @xavierleroy's post on discuss about it.

If you do decide to go ahead with this change – and it seems like a good idea to me – we'd very much appreciate a binary semaphore implementation to go along with it. We could then use a shim library that selects the right API according to the current OCaml version. Regarding the conceptual cost of multiple mutex implementations: I suspect that good documentation would make confusion unlikely.

Thanks for your efforts :-)

(CC: @icristescu, @pascutto, @samoht.)

@xavierleroy
Contributor Author

That makes me wonder if we shouldn't rely on GetCurrentThreadId but should allocate a unique ID of our own and store that in Tls?

Done as suggested, commit 0d2200a. I tried several designs, including one where the OCaml thread identifier is used, but this design looks clearer to me.

I would like to merge this PR within one week. A final review would be most welcome.

Contributor

@gadmm gadmm left a comment


Minor comments; I did not review the win32 implementation.

Given the various incompatibilities reported, I am still on the fence as to whether it would be more prudent to keep the fast/legacy behaviour as the default, and let users choose/evolve their mutex variant explicitly with a flag to Mutex.create.

xavierleroy and others added 9 commits October 1, 2020 13:58
- Mutex.lock raises Sys_error if the mutex is already locked by the
  calling thread.
- Mutex.unlock raises Sys_error if the mutex is unlocked or locked
  by another thread.
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Leftover code from an unrelated experiment.
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Identifiers returned by GetCurrentThreadId() can be reused early.
If a thread stops while holding a mutex, and its thread ID is reused
for another thread, the mutex could be unlocked by this other thread,
which is undefined behavior.

To make this scenario even less likely, this PR uses the OCaml thread
identifier (a linearly-allocated integer) instead of the result of
GetCurrentThreadId().
Member

@dra27 dra27 left a comment


There's a minor nit in a comment, and so while I was there I dealt with the check-typo nits (especially as they were in a previous suggestion of mine ... we need to persuade GitHub to run our AWK scripts in the UI 😉)

I see the problem in the earlier implementation calling TlsSetValue in the wrong thread - but couldn't the initialisation instead have moved to caml_thread_start (perhaps with the introduction of an OS-specific st_thread_startup call, or something immediately after the st_tls_set(thread_descriptor_key, ...) call)? Or was there something else I've missed? I haven't tried to see if it's measurable, but it seems the fast path for locking/unlocking has gone from an OS-optimised call (either GetCurrentThreadId or a direct TlsGetValue call) to something potentially slower, given the indirection?

xavierleroy and others added 3 commits October 2, 2020 19:13
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
@xavierleroy
Contributor Author

Thanks for your help fighting check-typo!

I see the problem in the earlier implementation calling TlsSetValue in the wrong thread - but couldn't the initialisation instead have moved to caml_thread_start (perhaps with the introduction of an OS-specific st_thread_startup call, or something immediately after the st_tls_set(thread_descriptor_key, ...) call)? Or was there something else I've missed?

Right, this would have been another way to go about it. But after much back-and-forth I decided it would duplicate in st_win32.h much of the thread ID machinery already present in st_stubs.c. Plus, OCaml thread identifiers are OCaml tagged integers, so they are never 0...

I haven't tried to see if it's measurable, but it seems the fast path for locking/unlocking has gone from an OS-optimised call (either GetCurrentThreadId or a direct call TlsGetValue) to something potentially slower, given the indirection?

I would expect the cost of the extra indirection to be small compared with the cost of the call to TlsGetValue or to GetCurrentThreadId.

@xavierleroy
Contributor Author

Thanks for the reviews. It's time to cross that bridge...
