-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Domain.DLS is not thread-safe #12677
Description
The current Domain.DLS implementation is not (sys)thread-safe, because the implementations of the various state updating operations include safe points during which the runtime may switch between threads. I'll briefly highlight some specific scenarios of what can go wrong.
Consider the implementation of Domain.DLS.get:
Lines 120 to 127 in e397ed2
| let get (idx, init) = | |
| let st = maybe_grow idx in | |
| let v = st.(idx) in | |
| if v == unique_value then | |
| let v' = Obj.repr (init ()) in | |
| st.(idx) <- (Sys.opaque_identity v'); | |
| Obj.magic v' | |
| else Obj.magic v |
Two threads on a single domain that concurrently get with the same key before init has been called may both end up calling init (so init would be called twice for a single key). That can happen because maybe_grow (which potentially allocates) includes a safe point and because init (which often allocates) itself may include a safe point. What can happen then is that the two calls of get return two different results (e.g. two different mutable records), one of which is stored in DLS.
Consider the implementation of the internal Domain.DLS.maybe_grow:
Lines 98 to 111 in e397ed2
| let maybe_grow idx = | |
| let st = get_dls_state () in | |
| let sz = Array.length st in | |
| if idx < sz then st | |
| else begin | |
| let rec compute_new_size s = | |
| if idx < s then s else compute_new_size (2 * s) | |
| in | |
| let new_sz = compute_new_size sz in | |
| let new_st = Array.make new_sz unique_value in | |
| Array.blit st 0 new_st 0 sz; | |
| set_dls_state new_st; | |
| new_st | |
| end |
Two threads on a single domain that concurrently get with two different keys both call maybe_grow and may both end up allocating a new array for the DLS. This is possible because maybe_grow includes a safe point. Only one of the two arrays eventually ends up being stored as the DLS and one of the get operations ends up being undone. Furthermore if the two threads also perform set operations, some of those may end up being undone.
Here is an example interleaving (variation of the above) where the effects of DLS operations performed by Thread B vanish:
Thread A Thread B
get x starts
<--- switch at a safe point --->
get y starts
get y ends
set y starts
set y ends
<--- switch at a safe point --->
set_dls_state
get x ends
The effects of both get y and set y vanish. Even just having get y vanish is a problem, because get may call init, which may return a mutable object, which might be mutated or stored outside of the DLS.