Misc. bug: Off-by-one error causes one layer to not go on the GPU #24183

### Name and Version

```
version: 9527 (9c955c48b)
built with Clang 21.1.7 for Linux x86_64
```

### Operating systems

Linux

### Which llama.cpp modules do you know to be affected?

llama-server

### Command line

```shell
./build/bin/llama-server --jinja --model ~/llama/Qwen3.6-27B-MTP-GGUF/Qwen3.6-27B-UD-Q4_K_XL.gguf --threads -1 -fa on -ctv q5_1 -ctk q8_0 --host 0.0.0.0 --temp 1.0 --top-k 20 --top-p 0.95 --min-p 0 -fitt 320 --repeat-penalty 1.0 --presence-penalty 0.0 --reasoning on -np 1 -n 32768 --mmproj ~/llama/Qwen3.6-27B-MTP-GGUF/mmproj-F32.gguf --no-mmproj-offload --image-min-tokens 1024 --spec-type draft-mtp --spec-draft-n-max 3 --spec-draft-p-min 0.6 -lv 4 --cache-ram 32768 -c 131072
```

### Problem description & steps to reproduce

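According to the logs, one layer no longer seems to be offloaded to the GPU: `load_tensors` reports `offloaded 65/66 layers to GPU`. To reproduce, start `llama-server` with the command line above and check the `load_tensors` output.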

### First Bad Commit

Very likely to be caused by https://github.com/ggml-org/llama.cpp/pull/24060

### Relevant log output

```
0.02.063.505 I load_tensors: loading model tensors, this can take a while... (mmap = true, direct_io = false)
0.03.232.046 I load_tensors: offloading output layer to GPU
0.03.232.049 I load_tensors: offloading 64 repeating layers to GPU
0.03.232.050 I load_tensors: offloaded 65/66 layers to GPU
0.03.232.053 I load_tensors:   CPU_Mapped model buffer size =   942.97 MiB
0.03.232.053 I load_tensors:        CUDA0 model buffer size = 16126.00 MiB
```
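For anyone bisecting this, the summary line in the log above can be checked mechanically. The following is a minimal sketch, not part of llama.cpp: the helper name `layers_left_on_cpu` is hypothetical, and the pattern assumes the `offloaded N/M layers to GPU` wording shown in the log stays the same.

```python
import re
from typing import Optional

# Matches the load_tensors summary line as it appears in the log above.
# Assumption: the wording "offloaded N/M layers to GPU" is unchanged.
OFFLOAD_RE = re.compile(r"offloaded (\d+)/(\d+) layers to GPU")


def layers_left_on_cpu(log_text: str) -> Optional[int]:
    """Return how many layers were not offloaded, or None if no summary line is found."""
    match = OFFLOAD_RE.search(log_text)
    if match is None:
        return None
    offloaded, total = int(match.group(1)), int(match.group(2))
    return total - offloaded


# Summary line copied from the log in this report.
sample = "0.03.232.050 I load_tensors: offloaded 65/66 layers to GPU"
print(layers_left_on_cpu(sample))
```

A result of `0` would mean every layer was offloaded. Any positive number means that many layers stayed on the CPU.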
