Name and Version
llama-server.exe --version
version: 4764 (7ad0779)
built with MSVC 19.42.34436.0 for x64
Operating systems
Windows (inferred from the MSVC build string, the .exe binary, and the NTSTATUS exit code in the log)
Which llama.cpp modules do you know to be affected?
llama-server with the Vulkan backend (the crash is raised from ggml-backend.cpp for the Vulkan0 buffer)
Command line
llama-server.exe -m %file_path_16b% --no-mmap -fa -ctk q4_0 -c 8192 -np 2 -ngl 50 --temp 0.6 -t 10 -tb 8 -C FF000 --no-perf --host 0.0.0.0 --port 3000
Problem description & steps to reproduce
prompt eval time = 16975.44 ms / 282 tokens ( 60.20 ms per token, 16.61 tokens per second)
eval time = 2257.84 ms / 28 tokens ( 80.64 ms per token, 12.40 tokens per second)
total time = 19233.28 ms / 310 tokens
srv log_server_r: request: POST /v1/chat/completions 127.0.0.1 200
srv update_slots: all slots are idle
srv params_from_: Chat format: Content-only
slot launch_slot_: id 1 | task 1773 | processing task
slot update_slots: id 1 | task 1773 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 1426
slot update_slots: id 1 | task 1773 | kv cache rm [64, end)
slot update_slots: id 1 | task 1773 | prompt processing progress, n_past = 1426, n_tokens = 1362, progress = 0.955119
slot update_slots: id 1 | task 1773 | prompt done, n_past = 1426, n_tokens = 1362
D:\a\llama.cpp\llama.cpp\ggml\src\ggml-backend.cpp:746: pre-allocated tensor (cache_k_l0 (view) (copy of cache_k_l0 (view))) in a buffer (Vulkan0) that cannot run the operation (CPY)
[process exited with code 3221226505 (0xc0000409)]
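Based on the log above, the assertion fires when the backend scheduler tries to run a CPY on a view of the q4_0 K cache (cache_k_l0) in a Vulkan buffer. As a hedged sketch only, not a confirmed fix: two command-line variants that are worth trying are (a) dropping -ctk q4_0 so the K cache stays in the default f16 format, avoiding the copy of quantized cache data, and (b) keeping the quantized cache but disabling context shift, assuming the build provides the --no-context-shift option, so the cache is not copied in place. All other flags are kept exactly as in the original command.

```shell
:: Variant (a): default f16 K cache (no -ctk q4_0), so no CPY of quantized data
llama-server.exe -m %file_path_16b% --no-mmap -fa -c 8192 -np 2 -ngl 50 ^
  --temp 0.6 -t 10 -tb 8 -C FF000 --no-perf --host 0.0.0.0 --port 3000

:: Variant (b): keep the quantized K cache but disable context shift
:: (--no-context-shift is assumed to exist in this build; check --help)
llama-server.exe -m %file_path_16b% --no-mmap -fa -ctk q4_0 --no-context-shift ^
  -c 8192 -np 2 -ngl 50 --temp 0.6 -t 10 -tb 8 -C FF000 --no-perf ^
  --host 0.0.0.0 --port 3000
```

If variant (a) runs cleanly while the original command crashes, that would narrow the problem to CPY support for quantized K-cache tensors in the Vulkan backend.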
First Bad Commit
Unknown (not bisected). Please help resolve the error:
pre-allocated tensor (cache_k_l0 (view) (copy of cache_k_l0 (view))) in a buffer (Vulkan0) that cannot run the operation (CPY)
Relevant log output
(see the server log in the problem description above)