Fail in-progress jobs when the worker running them exits abnormally#277
Merged
So we can uniquely identify processes by supervisor and name, without having to rely on the PID, which can be duplicated across processes.
We were reusing the instances of Worker and Dispatcher from the initial configuration all the time, which could bring some problems with stopped pools. Now that we need a name to be generated and be unique per process instance, we really need to instantiate new processes every time they're started.
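As a rough illustration of why a fresh instance per start matters, here is a minimal sketch. `SketchWorker`, its `worker-` name prefix, and the start/stop behaviour are all hypothetical stand-ins, not the library's actual classes; the only point is that the name is generated once per instance, so a reused instance would carry both its old name and its already stopped state.

```ruby
require "securerandom"

# Hypothetical stand-in for a worker process object (not the real class).
class SketchWorker
  attr_reader :name, :queues

  def initialize(queues: "*")
    @queues = queues
    # The name is generated once, when the object is instantiated.
    @name = "worker-#{SecureRandom.hex(10)}"
    @stopped = false
  end

  def start
    # Simulates a thread pool that cannot be restarted after shutdown.
    raise "pool already shut down" if @stopped
    @running = true
  end

  def stop
    @running = false
    @stopped = true
  end
end

config = { queues: "default" }

first = SketchWorker.new(**config)
first.start
first.stop

# Instead of calling first.start again, build a new object from the same config.
replacement = SketchWorker.new(**config)
replacement.start
```

With this shape, the replacement gets its own name and its own internal state, while the configuration hash stays the single source of truth.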
This applies to:
- Killed workers that the supervisor detects as dead.
- Reaped workers without a clear exit status.
- Orphaned executions that somehow lost their worker.
- Workers whose heartbeat expired.
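The cases above share one idea: a job that is still claimed by a process that no longer exists gets marked as failed. A minimal in-memory sketch of that idea follows. `SketchExecution`, `fail_orphaned`, and the status symbols are hypothetical names chosen for illustration; the real implementation works against database records and is not shown in this conversation.

```ruby
# Hypothetical in-memory record of a job claimed by a process.
SketchExecution = Struct.new(:job_id, :process_name, :status, :error)

# Marks as failed every claimed execution whose process is not in the
# list of live process names. A nil process_name counts as orphaned.
def fail_orphaned(executions, live_process_names)
  executions.each do |execution|
    next unless execution.status == :claimed
    next if live_process_names.include?(execution.process_name)

    execution.status = :failed
    execution.error = "process #{execution.process_name.inspect} exited abnormally"
  end
end

executions = [
  SketchExecution.new(1, "worker-a", :claimed, nil), # worker still alive
  SketchExecution.new(2, "worker-b", :claimed, nil), # worker was killed
  SketchExecution.new(3, nil, :claimed, nil)         # lost its worker
]

fail_orphaned(executions, ["worker-a"])
```

Matching on a process name rather than a PID is what makes the lookup safe in this sketch, since the name is assumed to be unique per process instance.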
As it won't be possible to start new processes after the column is made NOT NULL and before deploying the code that uses that column.
rosa added a commit that referenced this pull request on Nov 27, 2024
Closes #422. Thanks to @salmonsteak1 for spotting this.
To do this easily, since the supervisor, for efficiency, doesn't register all workers, we need to rely on a new unique identifier that links the supervisor with its configured processes. Since the registration happens after forking, the supervisor doesn't know the registered process IDs of its supervised processes. This unique identifier is a name that gets randomly generated when the process is instantiated. This made me realise I was reusing the configured process objects to start new processes, which is quite prone to issues with already created thread pools and stuff like that 😬 Because of this, this PR also changes the approach: the Configuration object now returns configured processes that need to be instantiated before starting, creating a new object each time.
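To make the last point concrete, here is a small sketch of a configuration that returns descriptions of processes rather than live process objects. All names (`SketchConfiguration`, `SketchConfiguredProcess`, `SketchDispatcher`, `instantiate`, `batch_size`) are assumptions made for this example and may not match the PR's code.

```ruby
require "securerandom"

# Hypothetical process class; gets a random name on instantiation.
class SketchDispatcher
  attr_reader :name, :batch_size

  def initialize(batch_size: 500)
    @batch_size = batch_size
    @name = "dispatcher-#{SecureRandom.hex(10)}"
  end
end

# A description of a process: which class to build and with what attributes.
SketchConfiguredProcess = Struct.new(:process_class, :attributes) do
  # Builds a brand new process object on every call.
  def instantiate
    process_class.new(**attributes)
  end
end

class SketchConfiguration
  def initialize(dispatchers:)
    @dispatchers = dispatchers
  end

  # Returns descriptions only; nothing is instantiated here.
  def configured_processes
    @dispatchers.map do |attributes|
      SketchConfiguredProcess.new(SketchDispatcher, attributes)
    end
  end
end

configuration = SketchConfiguration.new(dispatchers: [{ batch_size: 100 }])
configured = configuration.configured_processes.first

first_start = configured.instantiate
restart = configured.instantiate # e.g. after the first one died
```

Because the supervisor in this sketch would call `instantiate` before forking, it knows the name of each process it starts and can later use that name to find the jobs a dead process left behind.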