Skip to content

Restore the correct sigmask in systhreads#11880

Merged
shindere merged 1 commit intoocaml:trunkfrom
haesbaert:sigbug
Jan 11, 2023
Merged

Restore the correct sigmask in systhreads#11880
shindere merged 1 commit intoocaml:trunkfrom
haesbaert:sigbug

Conversation

@haesbaert
Copy link
Copy Markdown
Contributor

@haesbaert haesbaert commented Jan 10, 2023

The code correctly blocks and restores the signal mask on the parent while creating a systhread, but the new mask, not the old mask was being passed down the child to be restored.

The bug results on every systhread being run with all signals masked, which does not seem to be the intent, otherwise the init_mask has no function.

The following program shows the bug with before and after outputs:

let () =
  let cur_mask = Unix.sigprocmask SIG_BLOCK [] in
  Printf.printf "main mask: ";
  List.iter (fun signal -> Printf.printf "%d " signal) cur_mask;
  print_newline ();
  flush stdout;
  let tid = Thread.create
      (fun () ->
         let cur_mask = Unix.sigprocmask SIG_BLOCK [] in
         Printf.printf "systhread mask: ";
         List.iter (fun signal -> Printf.printf "%d " signal) cur_mask;
         print_newline ();
         flush stdout;
      ) ()
  in
  Thread.join tid

Before:

main mask:
systhread mask: 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 -24 30 -23 28 -21 -20 -28 -27 -26 -19 -18 -17 -15 -14 16 -11 -2 -8 -13 -10 -12 -3 -22 -1 -25 -5 -9 -6 -4

After:

main mask:
systhread mask:

@shindere
Copy link
Copy Markdown
Contributor

shindere commented Jan 10, 2023 via email

@haesbaert
Copy link
Copy Markdown
Contributor Author

Many thanks for a nice little fix! Given that you have taken the time to write a program to illustrate the problem, how about adding it to the testsuite? Would it seem relevant? I can help with converting it into a test, if needed and it's appropriate to add this as a test.

Sure thing ! I can add it, but first I have to go buy some chocolate, I'll do it later tonight 👍

@shindere
Copy link
Copy Markdown
Contributor

shindere commented Jan 10, 2023 via email

Copy link
Copy Markdown
Contributor

@gadmm gadmm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me

@haesbaert
Copy link
Copy Markdown
Contributor Author

I've just piggybacked the test on testsuite/tests/lib-systhreads/threadsigmask.ml, I see no sense in putting it in a separate file.

haesbaert added a commit to haesbaert/ocaml that referenced this pull request Jan 10, 2023
haesbaert added a commit to haesbaert/ocaml that referenced this pull request Jan 10, 2023
The code correctly blocks and restores the signal mask on the parent while
creating a systhread, but the new mask, not the old mask was being passed down
the child to be restored.
The code correctly blocks and restores the signal mask on the parent while
creating a systhread, but the new mask, not the old mask was being passed down
the child to be restored.
@shindere
Copy link
Copy Markdown
Contributor

shindere commented Jan 10, 2023 via email

@haesbaert
Copy link
Copy Markdown
Contributor Author

squashed as requested

@shindere shindere merged commit 8ac0970 into ocaml:trunk Jan 11, 2023
haesbaert added a commit to haesbaert/eio that referenced this pull request Jan 23, 2023
This makes sure we can process signals in luv the same way we do in uring. As
stated in ocaml-multicore#400 the main issue is that luv's mainloop will restart on EINTR and
we will never unwind back to ocaml land, so even though the process got the
signal, the runtime will not see it until something else goes on.

The trick here is to abuse POSIX thread semantics and shove all signals into one
specific systhread by blocking them in the other threads.

Additionally, I've fixed a bug in Ocaml 5.0 where the systhreads end up starting
with all signals blocked: ocaml/ocaml#11880. This PR
works even with/without the bug.

Danger Zone
~~~~~~~~~~~
This is tricky ! If you're thinking we can do better than pipes, think again !
Unix.sigsuspend doesn't work on multithreaded, we don't have a real Unix.pause
(it's implemented on top of sigsuspend !).
The semantics for kill(2) to "self" are different than "from outside" when
multithreaded, and the runtime doesn't expose pthread_kill(2). That's why on the
test we have to fork and signal back to the parent. I know, it's horrible.
talex5 added a commit to haesbaert/eio that referenced this pull request Jan 24, 2023
This makes sure we can process signals in luv the same way we do in uring. As
stated in ocaml-multicore#400 the main issue is that luv's mainloop will restart on EINTR and
we will never unwind back to ocaml land, so even though the process got the
signal, the runtime will not see it until something else goes on.

The trick here is to abuse POSIX thread semantics and shove all signals into one
specific systhread by blocking them in the other threads.

Additionally, I've fixed a bug in OCaml 5.0 where the systhreads end up starting
with all signals blocked: ocaml/ocaml#11880. This PR
works even with/without the bug.

Danger Zone
~~~~~~~~~~~
This is tricky ! If you're thinking we can do better than pipes, think again !
Unix.sigsuspend doesn't work on multithreaded, we don't have a real Unix.pause
(it's implemented on top of sigsuspend !).
The semantics for kill(2) to "self" are different than "from outside" when
multithreaded, and the runtime doesn't expose pthread_kill(2). That's why on the
test we have to fork and signal back to the parent. I know, it's horrible.

Co-authored-by: Thomas Leonard <talex5@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants