docs(llama.cpp): note tensor split now works with quantized KV cache by mudler · Pull Request #10135 · mudler/LocalAI

mudler · 2026-06-02T13:52:03Z

The split_mode: tensor description claimed tensor parallelism requires KV-cache quantization to be disabled. ggml-org/llama.cpp#23792 lifts that restriction by extending the meta backend to preserve shape information through KV-cache flatten/reshape, so cache_type_k/cache_type_v quantization can be combined with -sm tensor on builds that include it.
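As a rough illustration of the combination described above, the sketch below composes a llama.cpp server command line that pairs `-sm tensor` with a quantized KV cache. It is a sketch under assumptions, not LocalAI's actual configuration path: the helper name `build_llama_server_args`, the model path, and the `q8_0` cache types are hypothetical, and `--cache-type-k`/`--cache-type-v` are assumed to be the llama.cpp CLI counterparts of `cache_type_k`/`cache_type_v`. Whether the resulting command runs depends on the llama.cpp build including ggml-org/llama.cpp#23792.

```python
import shlex


def build_llama_server_args(model_path, cache_type_k="q8_0", cache_type_v="q8_0"):
    """Compose an argument list pairing tensor split with a quantized KV cache.

    Hypothetical helper: it only assembles the arguments and does not
    check whether the installed llama.cpp build accepts the combination.
    """
    return [
        "llama-server",
        "-m", model_path,                # placeholder model path
        "-sm", "tensor",                 # split mode named in the PR description
        "--cache-type-k", cache_type_k,  # assumed CLI counterpart of cache_type_k
        "--cache-type-v", cache_type_v,  # assumed CLI counterpart of cache_type_v
    ]


if __name__ == "__main__":
    # Print the command as a single shell-quoted string.
    print(shlex.join(build_llama_server_args("/models/example.gguf")))
```

Keeping the arguments as a list rather than a preformatted string makes it straightforward to pass them to `subprocess.run` without extra shell quoting.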

Documentation only: no backend code, grpc-server.cpp comment, or llama.cpp pin changes.

Assisted-by: Claude Code:claude-opus-4-8

Description

This PR fixes #

Notes for Reviewers

Signed commits

Yes, I signed my commits.

The split_mode: tensor description claimed tensor parallelism requires KV-cache quantization to be disabled. ggml-org/llama.cpp#23792 lifts that restriction by extending the meta backend to preserve shape information through KV-cache flatten/reshape, so cache_type_k/cache_type_v quantization can be combined with -sm tensor on builds that include it.

Documentation only: no backend code, grpc-server.cpp comment, or llama.cpp pin changes.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude Code:claude-opus-4-8

mudler merged commit 595e448 into master Jun 2, 2026
52 checks passed

mudler deleted the worktree-docs-tensor-split-quant-kv branch June 2, 2026 13:52

localai-bot added the kind/documentation (Improvements or additions to documentation) label Jun 10, 2026

BrewTestBot mentioned this pull request Jun 10, 2026

localai 4.4.0 Homebrew/homebrew-core#287347

Merged

mudler merged 1 commit into master from worktree-docs-tensor-split-quant-kv
