Releases: ollama/ollama
v0.21.0
What's Changed
- launch: skip rewriting unchanged integration configuration by @hoyyeva in #15491
- Revert "launch/opencode: use inline config" by @hoyyeva in #15568
- launch/openclaw: fix --yes flag behaviour to skip channels configuration by @hoyyeva in #15589
- launch: OpenCode inline config by @hoyyeva in #15586
- Add MLX closure support by @jessegross in #15590
- mlx: Improve gemma4 performance with fused operations by @dhiltgen in #15587
- mlx: fix RotatingKVCache.concat() dropping context on mid-rotation by @dhiltgen in #15591
- launch: add hermes by @ParthSareen in #15569
- launch: always list cloud recommendations first by @hoyyeva in #15593
- Keep Gemma4 router projection in source precision by @dhiltgen in #15613
- gemma4: render differently based on model size by @drifkin in #15612
- cmd/launch: add Copilot CLI integration by @scaryrawr in #15583
- mlx: fix imagegen lookup by @dhiltgen in #15588
- mlx: fix gemma4 cache to use logical view by @dhiltgen in #15617
New Contributors
- @scaryrawr made their first contribution in #15583
Full Changelog: v0.20.8-rc0...v0.21.0-rc0
v0.20.8
What's Changed
- ROCm: Update to ROCm 7.2.1 on Linux by @saman-amd in #15483
- gemma4: fix nothink case renderer by @drifkin in #15553
- gemma4: fix compiler error on metal by @dhiltgen in #15550
- gemma4: add nothink renderer tests by @drifkin in #15554
- mlx: mixed-precision quant and capability detection improvements by @dhiltgen in #15409
- mlx: add op wrappers for Conv2d, Pad, activations, trig, and masked SDPA by @dhiltgen in #14913
- Revert "gemma4: add nothink renderer tests" by @drifkin in #15555
- cgo: suppress deprecated warning to quiet down go build by @dhiltgen in #15438
- mac: prevent generate on cross-compiles by @dhiltgen in #15120
- Revert "gemma4: fix nothink case renderer" by @drifkin in #15556
- launch/opencode: use inline config by @hoyyeva in #15462
- gemma4: restore e2b-style nothink prompt by @drifkin in #15560
- Gemma4 on MLX by @dhiltgen in #15244
Full Changelog: v0.20.6...v0.20.8-rc0
v0.20.7
What's Changed
- Fix quality of gemma4:e2b and gemma4:e4b when thinking is disabled
- ROCm: Update to ROCm 7.2.1 on Linux by @saman-amd in #15483
Full Changelog: v0.20.6...v0.20.7
v0.20.6
What's Changed
- Gemma 4 tool calling ability is improved and updated to use Google's latest post-launch fixes
- Parallel tool calling improved for streaming responses
- Hermes agent Ollama integration guide is now available
- Ollama app is updated to fix image attachment errors
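As a hedged illustration of the parallel tool-calling improvement, here is a minimal sketch of a request payload for Ollama's /api/chat endpoint that advertises two tools. The tool names, schemas, and helper functions are invented for illustration; it assumes a local Ollama server and a model that supports tool calling.

```python
import json

def make_tool(name: str, description: str, properties: dict, required: list) -> dict:
    """Wrap a function schema in the shape the /api/chat "tools" field expects."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }

def build_chat_payload(model: str, user_message: str, tools: list) -> dict:
    # stream=True exercises the streaming path the parallel tool-call fix targets
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "tools": tools,
        "stream": True,
    }

# Hypothetical example tools, illustrative only.
tools = [
    make_tool("get_weather", "Look up current weather",
              {"city": {"type": "string"}}, ["city"]),
    make_tool("get_time", "Look up local time",
              {"city": {"type": "string"}}, ["city"]),
]
payload = build_chat_payload("gemma4:e4b", "Weather and time in Paris?", tools)
# POST json.dumps(payload) to http://localhost:11434/api/chat to try it.
```

With both tools declared, a model that supports parallel tool calls may emit multiple tool_calls in a single streamed response.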
New Contributors
- @matteocelani made their first contribution in #15272
Full Changelog: v0.20.5...v0.20.6
v0.20.5
OpenClaw channel setup with ollama launch
What's Changed
- OpenClaw channel setup: connect WhatsApp, Telegram, Discord, and other messaging channels through `ollama launch openclaw`
- Enable flash attention for Gemma 4 on compatible GPUs
- `ollama launch opencode` now detects curl-based OpenCode installs at `~/.opencode/bin`
- Fix the `/save` command for models imported from safetensors
Full Changelog: v0.20.4...v0.20.5
v0.20.4
What's Changed
- mlx: Improve M5 performance with NAX
- gemma4: enable flash attention
Full Changelog: v0.20.3...v0.20.4
v0.20.3
What's Changed
- Gemma 4 Tool Calling improvements
- Added latest models to Ollama App
- OpenClaw fixes for launching TUI
Full Changelog: v0.20.2...v0.20.3
v0.20.2
What's Changed
- app: default app home view to new chat instead of launch by @jmorganca in #15312
Full Changelog: v0.20.1...v0.20.2
v0.20.1
What's Changed
- bench: add prompt calibration, context size flag, and NumCtx reporting by @dhiltgen in #15158
- model/parsers: fix gemma4 arg parsing when quoted strings contain " by @drifkin in #15254
- ggml: skip cublasGemmBatchedEx during graph reservation by @jessegross in #15301
- gemma4: enable flash attention by @dhiltgen in #15296
- ggml: fix ROCm build for cublasGemmBatchedEx reserve wrapper by @jessegross in #15305
- model/parsers: rework gemma4 tool call handling by @drifkin in #15306
Full Changelog: v0.20.0...v0.20.1
v0.20.0
Gemma 4
Effective 2B (E2B)
ollama run gemma4:e2b
Effective 4B (E4B)
ollama run gemma4:e4b
26B (Mixture of Experts model with 4B active parameters)
ollama run gemma4:26b
31B (Dense)
ollama run gemma4:31b
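The run commands above have an HTTP equivalent. Here is a minimal sketch against Ollama's /api/generate endpoint, assuming a local server on the default port 11434 with the model already pulled; the helper names are ours, not part of the release.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint

def build_payload(model: str, prompt: str) -> dict:
    # stream=False asks for one JSON object instead of newline-delimited chunks
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a single non-streaming generate request and return the response text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running server):
# print(generate("gemma4:e2b", "Why is the sky blue?"))
```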
What's Changed
- docs: update pi docs by @ParthSareen in #15152
- mlx: respect tokenizer add_bos_token setting in pipeline by @dhiltgen in #15185
- tokenizer: add SentencePiece-style BPE support by @dhiltgen in #15162
Full Changelog: v0.19.0...v0.20.0-rc0