Feature Request: Exclusive service groups (or names) to prevent OOM on shared resources (GPU/RAM) #882

Description

@ldotlopez

As the title says, Sablier is purely timeout-based, but some scenarios require an exclusive group. There is currently no way to force one container to stop when another is requested.

For example, I want to run two or more resource-heavy containers (like Ollama and whisper-asr-as-service). It is quite likely that a request for Whisper comes in while Ollama is still in its "idle timeout" window. In that case, Sablier starts Whisper while Ollama is still running. On machines with limited VRAM or RAM, this causes an immediate OOM crash (or extremely slow operation) because the second container spins up before the first one has released the hardware.

I think it would be useful to be able to define an exclusive group for containers. If Container A and Container B are in the same group, Sablier should force-stop A before starting B.

Basically: the last request wins, and every other container in the exclusive group shuts down.
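
To make the requested behaviour concrete, here is a rough sketch of the "last request wins" logic in Python. It is only an illustration of the idea, not Sablier's code or API: the `ExclusiveGroupManager` class, the `request` method, the `gpu` group name, and the `start`/`stop` callbacks are all hypothetical names I made up for this example.

```python
from typing import Callable, Dict, List, Set


class ExclusiveGroupManager:
    """Hypothetical sketch of exclusive groups; not part of Sablier."""

    def __init__(
        self,
        groups: Dict[str, List[str]],
        start: Callable[[str], None],
        stop: Callable[[str], None],
    ) -> None:
        self._start = start
        self._stop = stop
        self._running: Set[str] = set()
        # group name -> member containers
        self._members: Dict[str, Set[str]] = {}
        # container name -> groups it belongs to
        self._groups_of: Dict[str, Set[str]] = {}
        for group, names in groups.items():
            self._members[group] = set(names)
            for name in names:
                self._groups_of.setdefault(name, set()).add(group)

    def request(self, name: str) -> List[str]:
        """Stop running peers of `name`, then start it.

        Returns the names of the containers that were stopped.
        """
        peers: Set[str] = set()
        for group in self._groups_of.get(name, ()):
            peers |= self._members[group]
        peers.discard(name)

        # Stop peers first so their memory is free before the new start.
        stopped = sorted(peers & self._running)
        for peer in stopped:
            self._stop(peer)
            self._running.discard(peer)

        if name not in self._running:
            self._start(name)
            self._running.add(name)
        return stopped


events = []
manager = ExclusiveGroupManager(
    {"gpu": ["ollama", "whisper"]},
    start=lambda n: events.append(("start", n)),
    stop=lambda n: events.append(("stop", n)),
)
manager.request("ollama")
manager.request("whisper")
print(events)
# [('start', 'ollama'), ('stop', 'ollama'), ('start', 'whisper')]
```

The important detail is the ordering: the `stop` callback would have to block until the old container has actually exited and released its VRAM/RAM, and only then is the new one started. Containers that are not in any group are started as usual and never stop anything.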

I’ve tried setting very short session-duration limits, but it’s a bad workaround. It either shuts things down too early while I'm actually using them, or doesn't shut them down fast enough when a different service is requested, leading to the same memory collision.
