Name and Version
ggml_vulkan: Found 2 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6800 (AMD proprietary driver) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 32768 | int dot: 0 | matrix cores: none
ggml_vulkan: 1 = AMD Radeon RX 5700 XT (AMD proprietary driver) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 32768 | int dot: 0 | matrix cores: none
version: 5139 (84778e9)
built with MSVC 19.43.34809.0 for x64
Operating systems
Windows
GGML backends
Vulkan
Hardware
Ryzen 5900X + Rx 5700XT + Rx 6800
Models
DeepSeek-V2-Lite-Chat.IQ4_NL.gguf
Problem description & steps to reproduce
The program always crashes after generating a single token. Prompt processing seems to work fine.
.\llama-cli.exe -m .\models\DeepSeek-V2-Lite-Chat.IQ4_NL.gguf -ngl 99 -t 12 -p "Hello"
First Bad Commit
Most likely daa4228. I haven't done a proper bisect, but d6d2c2a was working.
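To confirm the first bad commit, a bisect between the two commits mentioned above could be run roughly like this (sketch only; the rebuild/test step depends on your local build setup):

```shell
# Bisect between the last known-good and suspected-bad commits.
git bisect start
git bisect bad daa4228     # suspected first bad commit
git bisect good d6d2c2a    # last known working commit
# At each step git checks out a candidate commit; rebuild with the Vulkan
# backend, rerun the llama-cli command above, then mark the result:
#   git bisect good   (no crash)
#   git bisect bad    (assertion failure)
# until git reports the first bad commit, then clean up:
git bisect reset
```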
Relevant log output
User: Hello
Assistant: Hello
C:\Users\user\llama.cpp\ggml\src\ggml-vulkan\ggml-vulkan.cpp:6078: GGML_ASSERT(ggml_vk_op_supports_incontiguous(op) || ggml_vk_dim01_contiguous(src0)) failed