Writing custom events causes GC deadlocks #12897

@talex5

On OCaml 5.1.1, this program hangs very quickly for me:

let unit =
  let encode _buf () = Gc.minor (); 0 in
  let decode _buf _len = () in
  Runtime_events.Type.register ~encode ~decode

type Runtime_events.User.tag += My_event
let my_event = Runtime_events.User.register "event" My_event unit

let run () =
  for i = 1 to 1_000_000 do
    Printf.printf "%d: %d\n%!" (Domain.self () :> int) i;
    Runtime_events.User.write my_event ();
  done

let () =
  Runtime_events.start ();
  let other = Domain.spawn run in
  run ();
  Domain.join other

The output is typically:

1: 1
0: 1
1: 2
[hangs]

This is a simplified version of what happens when tracing Eio's HTTP benchmark. The Gc.minor call in encode doesn't seem to be necessary (caml_runtime_events_user_write calls alloc_and_clear_stack_parent, which allocates), but triggering the bug without it takes longer.

I think what is happening is:

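  1. Domain 0 takes write_buffer_lock, then decides to do a GC while holding it. It waits for domain 1 to be ready.
  2. Domain 1 tries to take write_buffer_lock and blocks. While blocked it never becomes ready for the GC, so domain 0 never releases the lock and neither domain can proceed.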

(I'm not sure why the C code is managing the (single) write buffer. It would probably be safer to let the OCaml code deal with that and just call the events code with the buffer already filled in.)
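As a rough illustration of that suggestion, here is a minimal sketch in which the user-supplied encoder runs into a private buffer before any lock is taken, so the critical section is only a copy. This is not the runtime's code: ring, ring_pos, ring_lock and write_event are hypothetical names, the buffer sizes are arbitrary, and it assumes OCaml 5.1 or later for Mutex.protect.

```ocaml
(* Hypothetical sketch: none of these names exist in the OCaml runtime. *)
let ring_lock = Mutex.create ()
let ring = Bytes.create 4096      (* stand-in for the shared event buffer *)
let ring_pos = ref 0

(* Encode first, lock second. The encoder may allocate or trigger a GC
   freely because no lock is held while it runs. *)
let write_event ~(encode : Bytes.t -> 'a -> int) (v : 'a) =
  let scratch = Bytes.create 128 in
  let len = encode scratch v in
  (* Critical section: only a blit into the preallocated buffer. *)
  Mutex.protect ring_lock (fun () ->
    if !ring_pos + len <= Bytes.length ring then begin
      Bytes.blit scratch 0 ring !ring_pos len;
      ring_pos := !ring_pos + len
    end)
    (* Events that do not fit are silently dropped in this sketch. *)

let () =
  (* Same shape as the reproduction: the encoder forces a minor GC. *)
  let encode buf n =
    Gc.minor ();
    Bytes.set_int32_le buf 0 (Int32.of_int n);
    4
  in
  let run () = for i = 1 to 100 do write_event ~encode i done in
  let other = Domain.spawn run in
  run ();
  Domain.join other
```

The design point is only the ordering: user code finishes before the lock is acquired, so a GC requested by the encoder never happens while the buffer lock is held.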

/cc @sadiqj @OlivierNicole
