Avoid a race condition when the process started to read a configuration file crashes/is not found#1378
Merged
voodoos merged 2 commits intoocaml:masterfrom Nov 17, 2021
Conversation
Indented some comments and removed some stray trailing spaces
The existing code could error in the following case (seven-point framing due to @stedolan): 1. We call `Mconfig_dot.get_config`. 2. On (the former) line 236, `get_config` calls `Mconfig_dot.Configurator.get_process`; this function spawns a process and stores its information in the `running_processes` hashtable. (It doesn't matter which code path it goes down; let's suppose it starts it for the first time. i.e. the `with Not_found` path.) 3. The spawned process immediately exits in the background. 4. On (the former) line 237, `get_config` calls `Unix.waitpid` on the PID from `get_process`, returning the PID (nonzero) because the spawned process crashed. Importantly, this means the PID can never be waited on again. 5. This raises `Process_exited`, which `get_config` catches; `get_config` then proceeds to exit after reporting an error. 6. We call `Mconfig_dot.get_config` again. 7. As in step 2, we call `get_process` on (the former) line 236; this looks up the old stale PID in `running_processes`, finds it, and calls `Unix.waitpid` on it. Because that PID was already waited on after completion in step 4 above, this crashes with an `ECHILD` error, terminating messily. We have actually seen this crash in practice. To prevent this from happening again, I made two changes: A. I provided a version of `get_process` that runs the second `Unix.waitpid` itself, returning a `Process.t option`. The accuracy of this wait depends on exactly how fast the spawned process fails, but as this is mostly trying to catch `dot-merlin-reader` not being installed or not being on the PATH, that's fine; it only needs to be best-effort besides that. B. I provide a module signature for the local `Configurator` module, abstracting over the `running_processes` hashtable so that we can preserve the invariant that it only contains waitable PIDs. To make this abstraction robust, I removed the PID from the `Process.t` record, and have `running_processes` store a `Process.With_pid.t`, which is just a pair of a PID and a process; I never expose this type (or any other way to access the PID) from `Configurator`'s interface.
voodoos
approved these changes
Oct 25, 2021
Collaborator
voodoos
left a comment
There was a problem hiding this comment.
These changes looks good to me. If I understand well the main issue was that we forgot to remove the pid from the hash-table when raising Process_exited.
It is a good idea to hide this invariant from users of the Configurator module.
@trefis, do you want to have a look too ?
Collaborator
|
@trefis I am going to merge this one if you have no objections |
voodoos
added a commit
that referenced
this pull request
Nov 22, 2021
…s-race Avoid a race condition when the process started to read a configuration file crashes/is not found
voodoos
added a commit
that referenced
this pull request
Nov 22, 2021
…s-race Avoid a race condition when the process started to read a configuration file crashes/is not found
voodoos
added a commit
that referenced
this pull request
Nov 22, 2021
…s-race Avoid a race condition when the process started to read a configuration file crashes/is not found
voodoos
added a commit
to voodoos/opam-repository
that referenced
this pull request
Nov 23, 2021
CHANGES:
+ merlin binary
- ignore `-error-style` compiler flag (ocaml/merlin#1402, @nojb)
- avoid a race condition when the process started to read a configuration
file crashes/is not found (ocaml/merlin#1378, @antalsz)
voodoos
added a commit
to voodoos/opam-repository
that referenced
this pull request
Nov 23, 2021
CHANGES:
Mon Jul 26 11:12:21 PM CET 2021
+ merlin binary
- Mbrowse.select_leaf: correctly ignore merlin.hide (ocaml/merlin#1376)
- enable `occurences` to work when looking for locally abstract types
(ocaml/merlin#1382)
- handle `-alert` compiler flag (ocaml/merlin#1401)
- avoid a race condition when the process started to read a configuration
file crashes/is not found (ocaml/merlin#1378, @antalsz)
- log the backtrace even when the exception is a Failure (ocaml/merlin#1377, @antalsz)
- ignore `-error-style` compiler flag (ocaml/merlin#1402, @nojb)
- fix handling of record field expressions (ocaml/merlin#1375)
- allow -pp to return an AST (ocaml/merlin#1394)
- fix merlin crashing due to short-paths (ocaml/merlin#1334, fixes ocaml/merlin#1322)
+ editor modes
- update quick setup instructions for emacs (ocaml/merlin#1380, @ScriptDevil)
+ test suite
- improve record field destruction testing (ocaml/merlin#1375)
voodoos
added a commit
to voodoos/opam-repository
that referenced
this pull request
Nov 23, 2021
CHANGES:
Mon Jul 26 11:12:21 PM CET 2021
+ ocaml support
- add support for 4.13
- stopped actively supporting version older than 4.12
+ merlin binary
- Mbrowse.select_leaf: correctly ignore merlin.hide (ocaml/merlin#1376)
- enable `occurences` to work when looking for locally abstract types
(ocaml/merlin#1382)
- handle `-alert` compiler flag (ocaml/merlin#1401)
- avoid a race condition when the process started to read a configuration
file crashes/is not found (ocaml/merlin#1378, @antalsz)
- log the backtrace even when the exception is a Failure (ocaml/merlin#1377, @antalsz)
- ignore `-error-style` compiler flag (ocaml/merlin#1402, @nojb)
- fix handling of record field expressions (ocaml/merlin#1375)
- allow -pp to return an AST (ocaml/merlin#1394)
- fix merlin crashing due to short-paths (ocaml/merlin#1334, fixes ocaml/merlin#1322)
+ editor modes
- update quick setup instructions for emacs (ocaml/merlin#1380, @ScriptDevil)
+ test suite
- improve record field destruction testing (ocaml/merlin#1375)
voodoos
added a commit
to voodoos/opam-repository
that referenced
this pull request
Nov 23, 2021
CHANGES:
Mon Jul 26 11:12:21 PM CET 2021
+ merlin binary
- Mbrowse.select_leaf: correctly ignore merlin.hide (ocaml/merlin#1376)
- enable `occurences` to work when looking for locally abstract types
(ocaml/merlin#1382)
- handle `-alert` compiler flag (ocaml/merlin#1401)
- avoid a race condition when the process started to read a configuration
file crashes/is not found (ocaml/merlin#1378, @antalsz)
- log the backtrace even when the exception is a Failure (ocaml/merlin#1377, @antalsz)
- ignore `-error-style` compiler flag (ocaml/merlin#1402, @nojb)
- fix handling of record field expressions (ocaml/merlin#1375)
- allow -pp to return an AST (ocaml/merlin#1394)
- fix merlin crashing due to short-paths (ocaml/merlin#1334, fixes ocaml/merlin#1322)
+ editor modes
- update quick setup instructions for emacs (ocaml/merlin#1380, @ScriptDevil)
+ test suite
- improve record field destruction testing (ocaml/merlin#1375)
voodoos
added a commit
to voodoos/opam-repository
that referenced
this pull request
Nov 23, 2021
CHANGES:
Tue Nov 23 11:45:21 PM CET 2021
+ merlin binary
- Mbrowse.select_leaf: correctly ignore merlin.hide (ocaml/merlin#1376)
- make `occurences` work when looking for locally abstract types (ocaml/merlin#1382)
- handle `-alert` compiler flag
- improve destruct calls on record fields (ocaml/merlin#1375)
- avoid a race condition when the process started to read a configuration
file crashes/is not found (ocaml/merlin#1378, @antalsz)
- log the backtrace even when the exception is a Failure (ocaml/merlin#1377, @antalsz)
- allow -pp to return an AST (ocaml/merlin#1394)
- ignore `-error-style` compiler flag (ocaml/merlin#1402, @nojb)
- fix handling of record field expressions (ocaml/merlin#1375)
+ test suite
- improve record field destruction testing (ocaml/merlin#1375)
voodoos
added a commit
to voodoos/opam-repository
that referenced
this pull request
Nov 23, 2021
CHANGES:
+ merlin binary
- ignore `-error-style` compiler flag (ocaml/merlin#1402, @nojb)
- avoid a race condition when the process started to read a configuration
file crashes/is not found (ocaml/merlin#1378, @antalsz)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
In
Mconfig_dot, there's a race condition aroundwaitpidthat can causeocamlmerlinto crash if the process started to read a configuration file (eitherdot-merlin-readerordune) crashes immediately or is not found. This PR fixes that race condition; it also protects the PIDs behind a module abstraction boundary to prevent it from happening again.The existing code could error in the following case (seven-point framing due to @stedolan):
We call
Mconfig_dot.get_config.On (the former) line 236,
get_configcallsMconfig_dot.Configurator.get_process; this function spawns a process and stores its information in therunning_processeshashtable. (It doesn't matter which code path it goes down; let's suppose it starts it for the first time. i.e. thewith Not_foundpath.)The spawned process immediately exits in the background.
On (the former) line 237,
get_configcallsUnix.waitpidon the PID fromget_process, returning the PID (nonzero) because the spawned process crashed. Importantly, this means the PID can never be waited on again.This raises
Process_exited, whichget_configcatches;get_configthen proceeds to exit after reporting an error.We call
Mconfig_dot.get_configagain.As in step 2, we call
get_processon (the former) line 236; this looks up the old stale PID inrunning_processes, finds it, and callsUnix.waitpidon it. Because that PID was already waited on after completion in step 4 above, this crashes with anECHILDerror, terminating messily.We have actually seen this crash in practice.
To fix this and prevent it from happening again, I made two changes:
I provided a version of
get_processthat runs the secondUnix.waitpiditself, returning aProcess.t option. The accuracy of this wait depends on exactly how fast the spawned process fails, but as this is mostly trying to catchdot-merlin-readernot being installed or not being on the PATH, that's fine; it only needs to be best-effort besides that.I provide a module signature for the local
Configuratormodule, abstracting over therunning_processeshashtable so that we can preserve the invariant that it only contains waitable PIDs. To make this abstraction robust, I removed the PID from theProcess.trecord, and haverunning_processesstore aProcess.With_pid.t, which is just a pair of a PID and a process; I never expose this type (or any other way to access the PID) fromConfigurator's interface.