Misc. bug: Tensor parallelism causes loops #21703

@eleqtrizit

Name and Version

❯ llama-cli --version
ggml_cuda_init: found 2 CUDA devices (Total VRAM: 64215 MiB):
Device 0: NVIDIA GeForce RTX 5090, compute capability 12.0, VMM: yes, VRAM: 32106 MiB
Device 1: NVIDIA GeForce RTX 5090, compute capability 12.0, VMM: yes, VRAM: 32109 MiB
version: 8740 (e34f042)
built with GNU 13.3.0 for Linux x86_64

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-cli

Command line

llama-cli -hf unsloth/Qwen3-Coder-Next-GGUF:MXFP4_MOE --split-mode tensor --temp 1.0 --top-p 0.95 --top-k 40 --min-p 0.01 --repeat-penalty 1.0

Problem description & steps to reproduce

I am testing the new tensor parallelism with my two RTX 5090s.

llama-cli -hf unsloth/Qwen3-Coder-Next-GGUF:MXFP4_MOE --split-mode tensor --temp 1.0 --top-p 0.95 --top-k 40 --min-p 0.01 --repeat-penalty 1.0

Simple prompt: Write me a python snake game

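With --split-mode tensor, generation gets stuck repeating the same phrase. Without --split-mode tensor, the same prompt completes without trouble.

Example of the looping output: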

food.penup()  # to prevent showing lines when turtle not using tracers
food.shapesize(0.5)  # size of the snake (0.5 is big) and by the snake (0.5 is big) and by the snake (0.5 is big) and by the snake (0.5 is big) and by the snake (0.5 is big) and by the snake (0.5 is big) and by the snake (0.5 is big) and by the snake (0.5 is big) and by the snake (0.5 is big) and by the snake (0.5 is big!0.5 is big) and by the snake (0.5 is big!0.5 is big?0.5 is big?0.5 is big?0.5 is big?0.5 is big?0.5 is big?0.5 is big?0.5 is big?0.5 is big?0.5 is big?0.5 is big?0.5 is big?0.5 is big?0.5 is big?0
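
For anyone comparing runs with and without --split-mode tensor, a rough way to flag this kind of loop in captured output is to count how often the most common word n-gram repeats. This is only a sketch: the function names, the n-gram size, and the threshold are assumptions chosen for illustration and are not part of llama.cpp.

```python
from collections import Counter


def max_ngram_repeats(text: str, n: int = 4) -> int:
    """Return how many times the most frequent word n-gram occurs in text.

    Returns 0 when the text has fewer than n whitespace-separated words.
    """
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0
    return Counter(ngrams).most_common(1)[0][1]


def looks_looped(text: str, n: int = 4, threshold: int = 5) -> bool:
    """Heuristic: treat the text as looping if any n-gram occurs at least
    `threshold` times. Both defaults are arbitrary assumptions."""
    return max_ngram_repeats(text, n) >= threshold
```

Saving the output of each run to a file and passing its contents to looks_looped gives a quick yes/no comparison between split modes. Legitimately repetitive output can trigger it too, so the threshold may need tuning.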

I am running CUDA 13.2 and compiled llama.cpp myself using this script:

#!/bin/sh

rm -rf build
git pull

cmake -B build -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc -DGGML_CUDA=ON -DGGML_CUDA_FA=ON -DLLAMA_CURL=ON -DGGML_RPC=ON -DLLAMA_BUILD_BORINGSSL=ON -DLLAMA_BUILD_LIBRESSL=ON #-DGGML_VULKAN=1
cmake --build build --config Release -j $(nproc)

First Bad Commit

Possibly one of the commits in #19378?

Relevant log output

No logs.
