Misc. bug: ggml-backend.cpp:746: pre-allocated tensor (cache_k_l0 (view) (copy of cache_k_l0 (view))) in a buffer (Vulkan0) that cannot run the operation (CPY) #12045

@simonchen

Name and Version

llama-server.exe --version
version: 4764 (7ad0779)
built with MSVC 19.42.34436.0 for x64

Operating systems

No response

Which llama.cpp modules do you know to be affected?

No response

Command line

llama-server.exe -m %file_path_16b% --no-mmap -fa -ctk q4_0 -c 8192 -np 2 -ngl 50 --temp 0.6 -t 10 -tb 8 -C FF000 --no-perf --host 0.0.0.0 --port 3000

Problem description & steps to reproduce

prompt eval time = 16975.44 ms / 282 tokens ( 60.20 ms per token, 16.61 tokens per second)
eval time = 2257.84 ms / 28 tokens ( 80.64 ms per token, 12.40 tokens per second)
total time = 19233.28 ms / 310 tokens
srv log_server_r: request: POST /v1/chat/completions 127.0.0.1 200
srv update_slots: all slots are idle
srv params_from_: Chat format: Content-only
slot launch_slot_: id 1 | task 1773 | processing task
slot update_slots: id 1 | task 1773 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 1426
slot update_slots: id 1 | task 1773 | kv cache rm [64, end)
slot update_slots: id 1 | task 1773 | prompt processing progress, n_past = 1426, n_tokens = 1362, progress = 0.955119
slot update_slots: id 1 | task 1773 | prompt done, n_past = 1426, n_tokens = 1362
D:\a\llama.cpp\llama.cpp\ggml\src\ggml-backend.cpp:746: pre-allocated tensor (cache_k_l0 (view) (copy of cache_k_l0 (view))) in a buffer (Vulkan0) that cannot run the operation (CPY)

[process exited with code 3221226505 (0xc0000409)]
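A few numbers in the log above can be cross-checked against the command line. The snippet below is only a sanity-check sketch in Python, not part of llama.cpp. The helper name `hex_status` is made up for illustration, and it assumes the server splits the `-c` context evenly across the `-np` slots. The decimal exit code is the same value as 0xC0000409, which Windows defines as STATUS_STACK_BUFFER_OVERRUN. That code is commonly seen when an MSVC-built program calls `abort()`, so it most likely reflects the abort raised at ggml-backend.cpp:746 rather than a separate memory fault.

```python
# Values copied from the command line and the log in this report.
exit_code = 3221226505               # "[process exited with code 3221226505 ...]"
n_ctx, n_parallel = 8192, 2          # -c 8192 -np 2
n_prompt_tokens = 1426               # "n_prompt_tokens = 1426"
n_cached = 64                        # "kv cache rm [64, end)": first 64 tokens reused
n_tokens = 1362                      # "n_tokens = 1362"


def hex_status(code: int) -> str:
    """Format a Windows exit code as an 8-digit hex status (hypothetical helper)."""
    return f"0x{code:08X}"


print(hex_status(exit_code))                   # 0xC0000409
print(n_ctx // n_parallel)                     # 4096, matches n_ctx_slot
print(n_prompt_tokens - n_cached)              # 1362, matches n_tokens
print(round(n_tokens / n_prompt_tokens, 6))    # 0.955119, matches progress
```

So the slot had 4096 tokens of context available and the 1426-token prompt fit comfortably. The crash happens after "prompt done", not because the context was exhausted.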

First Bad Commit

Please help resolve this error:

pre-allocated tensor (cache_k_l0 (view) (copy of cache_k_l0 (view))) in a buffer (Vulkan0) that cannot run the operation (CPY)

Relevant log output
