…e deadlock

When all GPUs on a node are occupied, RollingUpdate creates a deadlock: the new pod cannot schedule without a GPU, and the old pod won't terminate until the new pod is Ready. This sets the deployment strategy to Recreate for GPU workloads (gpuCount > 0) so the old pod terminates first, freeing the GPU for the replacement.

Fixes #192

Signed-off-by: Christopher Maher <chris@mahercode.io>
Summary
Sets the deployment strategy to `Recreate` for GPU workloads (gpuCount > 0) to prevent a scheduling deadlock during rolling updates (previously `RollingUpdate`).

Problem
When all GPUs on a node are occupied, `RollingUpdate` creates a deadlock: the new pod cannot schedule without a GPU, and the old pod won't terminate until the new pod is Ready. This is especially common in homelab/small clusters with no spare GPUs.

How it works
The fix adds a `Recreate` strategy assignment inside the existing `if gpuCount > 0` block in `constructDeployment()`, alongside the GPU toleration logic. This means the old pod terminates first, freeing the GPU for the replacement pod. Brief downtime during updates is acceptable; the alternative is a permanent deadlock.

Existing GPU InferenceService deployments will pick up the fix on the next reconcile cycle.
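As a rough illustration, the change amounts to something like the sketch below. The `applyGPUStrategy` helper name and its signature are assumptions for readability; per the description, the real assignment lives inline in the `if gpuCount > 0` block of `constructDeployment()`. The `appsv1` types are the standard Kubernetes API types.

```go
package controller

import (
	appsv1 "k8s.io/api/apps/v1"
)

// applyGPUStrategy (hypothetical helper) mirrors the change described above:
// when the workload requests GPUs, switch the deployment to Recreate so the
// old pod releases its GPU before the replacement pod tries to schedule.
func applyGPUStrategy(deployment *appsv1.Deployment, gpuCount int) {
	if gpuCount > 0 {
		deployment.Spec.Strategy = appsv1.DeploymentStrategy{
			Type: appsv1.RecreateDeploymentStrategyType,
		}
	}
}
```

With `Recreate`, Kubernetes scales the old ReplicaSet to zero before creating pods for the new one, which is exactly the ordering needed to free the GPU.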
Fixes #192
Test plan
- `make test` passes (all existing + new tests)
- Verified `strategy: Recreate` on a GPU deployment via `kubectl get deploy <name> -o yaml`
- Verified non-GPU deployments still use `RollingUpdate`
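A sketch of the kind of unit test this plan implies, written against the hypothetical `applyGPUStrategy` helper from the earlier sketch (the repo's actual test exercises `constructDeployment()` directly):

```go
package controller

import (
	"testing"

	appsv1 "k8s.io/api/apps/v1"
)

func TestGPUWorkloadUsesRecreate(t *testing.T) {
	// GPU workloads (gpuCount > 0) should get the Recreate strategy.
	d := &appsv1.Deployment{}
	applyGPUStrategy(d, 1)
	if d.Spec.Strategy.Type != appsv1.RecreateDeploymentStrategyType {
		t.Fatalf("expected Recreate, got %q", d.Spec.Strategy.Type)
	}

	// Non-GPU workloads should be left alone (empty here, which the
	// API server defaults to RollingUpdate in-cluster).
	d2 := &appsv1.Deployment{}
	applyGPUStrategy(d2, 0)
	if d2.Spec.Strategy.Type == appsv1.RecreateDeploymentStrategyType {
		t.Fatalf("non-GPU deployment should not use Recreate")
	}
}
```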