Releases: defilantech/LLMKube

llmkube-0.5.0

05 Mar 09:35

Helm chart for LLMKube v0.5.0 — fixes appVersion to match published controller image

v0.5.0

04 Mar 09:40
b2e53b8

0.5.0 (2026-03-04)

Features

  • add pre-flight memory validation for Metal agent (#204) (ba252ef)
  • add health checks, metrics, and continuous monitoring to Metal agent (#205) (a113fd1)
  • add per-model memoryBudget and memoryFraction CRD fields (#206) (e632369)
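
To illustrate how a pre-flight memory check could combine with per-model limits, here is a minimal Python sketch. It is not LLMKube's implementation: the class and function names are hypothetical, and the precedence between an absolute budget and a fraction of total memory is an assumption, since these notes only name the `memoryBudget` and `memoryFraction` fields without defining their semantics.

```python
from dataclasses import dataclass
from typing import Optional

GIB = 1024 ** 3


@dataclass
class MemoryPolicy:
    # Hypothetical mirror of the per-model fields named in the release notes.
    memory_budget_bytes: Optional[int] = None  # absolute cap, if set
    memory_fraction: Optional[float] = None    # share of total memory, if set


def effective_budget(total_bytes: int, policy: MemoryPolicy) -> int:
    """Return the usable memory in bytes.

    Assumed precedence (not confirmed by the source): an explicit budget
    wins, then a fraction, otherwise all memory is available.
    """
    if policy.memory_budget_bytes is not None:
        return min(policy.memory_budget_bytes, total_bytes)
    if policy.memory_fraction is not None:
        if not 0.0 < policy.memory_fraction <= 1.0:
            raise ValueError("memory_fraction must be in (0, 1]")
        return int(total_bytes * policy.memory_fraction)
    return total_bytes


def preflight_ok(estimated_bytes: int, total_bytes: int, policy: MemoryPolicy) -> bool:
    """Reject a model before launch if its estimated footprint exceeds the budget."""
    return estimated_bytes <= effective_budget(total_bytes, policy)
```

Under these assumptions, a model estimated at 20 GiB would be rejected on a 32 GiB host with a fraction of 0.5, while a 12 GiB model would be admitted.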

Bug Fixes

  • agent: unregister service endpoints on metal process delete (#168) (147b9bc)
  • enable controller metrics endpoint in Helm chart (#195) (70940af)
  • prevent model re-download of cached models after helm upgrade (#203) (a8f9a88)
  • use Recreate strategy for GPU workloads to prevent rolling update deadlock (#196) (2e45181)
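
The rolling update deadlock mentioned above can be reasoned about with a small model. The Python sketch below is a simplified illustration, not the operator's code: the function names are hypothetical, and it assumes the Kubernetes Deployment defaults of 25% for both `maxSurge` (rounded up) and `maxUnavailable` (rounded down). With one replica holding the only GPU, a rolling update has to start a new pod before it may stop the old one, so the new pod stays Pending.

```python
import math


def rolling_update_limits(replicas: int, max_surge_pct: int = 25,
                          max_unavailable_pct: int = 25) -> tuple:
    """Kubernetes rounds maxSurge up and maxUnavailable down."""
    surge = math.ceil(replicas * max_surge_pct / 100)
    unavailable = math.floor(replicas * max_unavailable_pct / 100)
    return surge, unavailable


def rollout_can_progress(strategy: str, replicas: int, gpus_total: int,
                         gpus_per_pod: int = 1) -> bool:
    """Simplified check: can the first step of a rollout happen?"""
    if strategy == "Recreate":
        # Old pods are terminated first, so their GPUs are free for new pods.
        return gpus_total >= replicas * gpus_per_pod
    surge, unavailable = rolling_update_limits(replicas)
    free_gpus = gpus_total - replicas * gpus_per_pod
    # Progress needs either permission to stop an old pod first,
    # or a spare GPU on which a surge pod can be scheduled.
    return unavailable > 0 or (surge > 0 and free_gpus >= gpus_per_pod)


if __name__ == "__main__":
    print(rollout_can_progress("RollingUpdate", replicas=1, gpus_total=1))
    print(rollout_can_progress("Recreate", replicas=1, gpus_total=1))
```

The trade-off is a brief period with no serving pod during each update, which is usually acceptable when the alternative is a rollout that never completes.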

Documentation

  • rewrite README for clarity, positioning, and growth (#190) (a7fc152)

v0.4.20

01 Mar 00:21
205d91d

0.4.20 (2026-02-28)

Features

  • add license compliance scanning for GGUF models (#188) (c26400a)
  • add Prometheus metrics, OpenTelemetry tracing, and inference observability (#189) (c653ff1)
  • add PVC inspection to cache list for orphaned entry detection (#183) (2723d92)
  • agent: add structured zap logging to metal agent (#164) (e9d143c)
  • deps: upgrade to Kubernetes 1.35 and controller-runtime v0.23.1 (#175) (3c323f4)

Bug Fixes

  • correct Metal quickstart docs for selectorless services (#173) (89471ec)
  • prevent command injection in init container shell commands (#172) (3aa9cc3)
  • remove mutable latest tags and pin container images (#174) (3c4569a)
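
As background for the command injection fix, the following Python sketch shows one common way to keep user-supplied values from being interpreted by a shell. It is a generic illustration rather than the project's actual patch: the function names and the `curl` invocation are hypothetical. The idea is either to quote every interpolated value or, preferably, to pass an argument list so no shell parses the input at all.

```python
import shlex


def build_download_command(url: str, dest: str) -> str:
    """Build a shell command string with every interpolated value quoted."""
    return f"curl -fsSL -o {shlex.quote(dest)} {shlex.quote(url)}"


def build_download_argv(url: str, dest: str) -> list:
    """Preferred form: an argument vector that never passes through a shell."""
    return ["curl", "-fsSL", "-o", dest, url]


if __name__ == "__main__":
    hostile = "https://example.com/model.gguf; rm -rf /"
    # The hostile suffix stays inside a single quoted argument.
    print(build_download_command(hostile, "/models/model.gguf"))
```

Parsing the quoted string back with `shlex.split` yields the same tokens as the argument list, which is a convenient property to assert in tests.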

Documentation

  • add Apple Silicon Metal option to bug report template (#169) (e7689d8)

llmkube-0.4.20

01 Mar 00:21
205d91d

A Helm chart for LLMKube - Kubernetes operator for GPU-accelerated LLM inference

v0.4.19

21 Feb 01:48
07630b8

0.4.19 (2026-02-21)

Features

  • add --jinja flag for tool/function calling support (#162) (47624ca)

llmkube-0.4.19

21 Feb 01:48
07630b8

A Helm chart for LLMKube - Kubernetes operator for GPU-accelerated LLM inference

v0.4.18

20 Feb 19:12
1956000

0.4.18 (2026-02-20)

Bug Fixes

  • agent: read contextSize from InferenceService CRD (#160) (17f58d4)

Documentation

  • update README and Metal Agent guide for remote K8s architecture (#156) (79145b2)

v0.4.17

20 Feb 10:33
cbb16f0

0.4.17 (2026-02-20)

Bug Fixes

  • agent: filter InferenceServices by Metal accelerator type (#157) (5737bb7)

v0.4.16

20 Feb 09:19
7015b92

0.4.16 (2026-02-20)

Features

  • agent: add --host-ip flag for remote K8s cluster support (#155) (b425569)

Documentation

  • Add Metal Agent (Apple Silicon) support to README (#151) (3579426)

llmkube-0.4.18

20 Feb 19:12
1956000

A Helm chart for LLMKube - Kubernetes operator for GPU-accelerated LLM inference