As the title says, Sablier is currently purely timeout-based, but some scenarios require an exclusive group. There is no way to force one container to stop when another is requested.
For example, I want to run two or more resource-heavy containers (like Ollama and whisper-asr-as-service). It is quite likely that a request for Whisper comes in while Ollama is still in its "idle timeout" window. In this case, Sablier starts Whisper while Ollama is still running. On machines with limited VRAM or RAM, this causes an immediate OOM crash (or extremely slow operation) because the second container tries to spin up before the first one has released the hardware.
I think it would be useful to be able to define an exclusive group for containers. If Container A and Container B are in the same group, Sablier should force-stop A before starting B.
Basically: the last request wins, and every other container in the exclusive group shuts down.
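To make the intended behavior concrete, here is a rough sketch of the "last request wins" rule as a toy model in Python. It is not Sablier code and does not use any Sablier or Docker API: the `ExclusiveGroupScheduler` class, the `groups` mapping, and the `actions` log are all hypothetical names I made up for illustration, and the real start/stop calls are replaced by entries in a list.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ExclusiveGroupScheduler:
    """Toy model of 'last request wins' inside an exclusive group (hypothetical)."""

    # Maps container name -> exclusive group name (assumed configuration shape).
    groups: dict[str, str]
    # Names of the containers that are currently running.
    running: set[str] = field(default_factory=set)
    # Ordered log of actions, standing in for real start/stop calls.
    actions: list[tuple[str, str]] = field(default_factory=list)

    def request(self, name: str) -> None:
        """Handle an incoming request for container `name`."""
        group = self.groups.get(name)
        if group is not None:
            # Stop every other running member of the same group first,
            # so the hardware is released before the new container starts.
            for other in sorted(self.running):
                if other != name and self.groups.get(other) == group:
                    self.running.remove(other)
                    self.actions.append(("stop", other))
        # Start the requested container only if it is not already running.
        if name not in self.running:
            self.running.add(name)
            self.actions.append(("start", name))


# Example: Ollama is running, then a request for Whisper arrives.
scheduler = ExclusiveGroupScheduler(groups={"ollama": "gpu", "whisper": "gpu"})
scheduler.request("ollama")
scheduler.request("whisper")
# scheduler.actions is now:
# [("start", "ollama"), ("stop", "ollama"), ("start", "whisper")]
```

The important part is the ordering: the stop of the other group members happens before the start of the requested container, regardless of how much idle timeout they have left. Containers that are not in any exclusive group would keep the existing timeout-based behavior.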
I’ve tried setting very short session-duration limits, but it’s a bad workaround. It either shuts things down too early while I'm actually using them, or doesn't shut them down fast enough when a different service is requested, leading to the same memory collision.