Description
Today workers heartbeat to the scheduler. If the scheduler doesn't hear from a worker within a certain amount of time, it can ask the nanny to kill the worker, or just give up on the worker.
We have this code already, and the time to live (TTL) is configurable. Today the limit is set at infinity. We should maybe consider something more conservative.
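For readers unfamiliar with the mechanism, here is a minimal sketch of a TTL check, not the scheduler's actual code. The function name `expired_workers` and the `last_seen` mapping are hypothetical and used only for illustration. It shows why an infinite TTL never expires any worker, while a finite one does.

```python
import math


def expired_workers(last_seen, now, ttl=math.inf):
    """Return the workers whose last heartbeat is older than `ttl` seconds.

    `last_seen` maps a worker name to the timestamp of its last heartbeat.
    This is a hypothetical helper, not part of distributed.
    """
    return sorted(w for w, t in last_seen.items() if now - t > ttl)


last_seen = {"worker-a": 100.0, "worker-b": 10.0}

# With an infinite TTL, no worker is ever considered expired.
print(expired_workers(last_seen, now=130.0))          # []

# With a 60 second TTL, worker-b (silent for 120 s) is expired.
print(expired_workers(last_seen, now=130.0, ttl=60))  # ['worker-b']
```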
cc @fjetter
This has been brought up many times. Some folks don't like this idea because they have computations that take a long time and hold the GIL, which keeps the worker from heartbeating (this is a valid counter-argument). We've avoided making a decision here in the past. Maybe we should set a reasonable default, like one minute or five minutes. We would warn loudly about what's happening and point people to the config option for changing the behavior.
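A rough sketch of what such a loud warning could look like. The helper `ttl_warning` and the message wording are hypothetical, and the key `distributed.scheduler.worker-ttl` is assumed here to be the relevant config option.

```python
import logging

logger = logging.getLogger("ttl-sketch")

# Assumed name of the config option that controls the worker TTL.
CONFIG_KEY = "distributed.scheduler.worker-ttl"


def ttl_warning(worker, silent_for, ttl):
    """Build a warning that says what is happening and how to change it."""
    return (
        f"Worker {worker} has not sent a heartbeat for {silent_for:.0f}s "
        f"(TTL is {ttl:.0f}s) and will be removed. "
        f"Long tasks that hold the GIL can cause this. "
        f"To change the limit, set {CONFIG_KEY}."
    )


# Log the message at WARNING level so it is hard to miss.
logger.warning(ttl_warning("worker-b", silent_for=120, ttl=60))
```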
This was slightly inspired by (but does not fix) #6110