Name and Version
» build_vk/bin/llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = NVIDIA GeForce RTX 3090 (NVIDIA) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 49152 | int dot: 1 | matrix cores: NV_coopmat2
version: 5835 (7209e12d6)
built with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu
Operating systems
Linux
GGML backends
Vulkan
Hardware
RTX 3090 (with and without coopmat2 or coopmat1), AMD Radeon Pro VII, Intel A770
Models
Mistral 7B Q8_0
Problem description & steps to reproduce
Output becomes incoherent (repetitions, missing spaces, random letters, etc.) after generating for a while. To reproduce, run:
build_vk/bin/llama-cli -p "Once upon a time in" -c 16384 -b 16384 -ub 512 -n 1024 --ignore-eos -m models/Mistral-7B-Instruct-v0.3-Q8_0.gguf -ngl 99 -no-cnv
First Bad Commit
bd9c981 (#14366)
Relevant log output
build_vk/bin/llama-cli -p "Once upon a time in" -c 16384 -b 16384 -ub 512 -n 1024 --ignore-eos -m models/Mistral-7B-Instruct-v0.3-Q8_0.gguf -ngl 99 -no-cnv