Misc. bug: -sm tensor + MTP + ngram-mod = crash #23929

### Name and Version

version 9438
compiled for CUDA 13.2

### Tested on

Linux (CachyOS)
RTX 5060 Ti 16 GB + RTX 2060 Super, llama.cpp built with CUDA 13.2

Windows 11
2x RTX 5060 Ti 16 GB, llama.cpp built with CUDA 13.2

```shell
cmake -B build -G Ninja \
    -DGGML_CUDA=ON \
    -DCMAKE_CUDA_ARCHITECTURES="native" \
    -DGGML_CUDA_FA_ALL_QUANTS=ON \
    -DGGML_CUDA_F16=ON \
    -DGGML_NATIVE=ON \
    -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON \
    -DCMAKE_BUILD_TYPE=Release \
    -DLLAMA_BUILD_TESTS=OFF \
    -DLLAMA_BUILD_EXAMPLES=ON \
    -DLLAMA_BUILD_SERVER=ON

cmake --build build --config Release -j$(nproc)
```

### Operating systems

Linux (CachyOS)
Windows 11

### Which llama.cpp modules do you know to be affected?

llama-server

### Command line

```shell
./build/bin/llama-server \
      --verbosity 4 \
      -m /home/abdulrahman/Personal/Programs/llama/models/Qwen/Qwen3.6-27B-Q4_K_S.gguf \
      --chat_template_kwargs '{"preserve_thinking": "True"}' \
      --jinja \
      --host 0.0.0.0 --port 8080 \
      --spec-type draft-mtp,ngram-mod --spec-draft-n-max 2 --spec-ngram-mod-n-match 24 --spec-ngram-mod-n-min 2 --spec-ngram-mod-n-max 48 \
      -c 64000 \
      -sm tensor --tensor-split 66,24 -ub 48
```

### Problem description & steps to reproduce

llama-server crashes as soon as ngram-mod starts generating tokens. The only error is `terminated by signal SIGSEGV (Address boundary error)`; nothing else is printed. Generation lags for a bit, stops, and then the process crashes with that error.
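
Since the SIGSEGV message is the only output, a backtrace would probably help narrow this down. Below is a minimal sketch of how one could be captured on Linux, assuming `gdb` is installed. The function name `bt_on_crash` and the output file `crash-backtrace.txt` are hypothetical names chosen for illustration and are not part of llama.cpp.

```shell
# Sketch: run a command under gdb and save a backtrace if it crashes.
# Assumes gdb is installed; bt_on_crash and crash-backtrace.txt are
# illustrative names, not part of llama.cpp.
bt_on_crash() {
    # -batch makes gdb exit once the listed commands have run.
    # "run" starts the program; gdb stops when SIGSEGV is raised,
    # then "bt" and "thread apply all bt" print the stack traces.
    gdb -batch \
        -ex run \
        -ex bt \
        -ex "thread apply all bt" \
        --args "$@" 2>&1 | tee crash-backtrace.txt
}
```

Usage would be `bt_on_crash ./build/bin/llama-server` followed by the same flags as in the command line above. The traces are likely to be more readable if the build is configured with `-DCMAKE_BUILD_TYPE=RelWithDebInfo` instead of `Release`, so that debug symbols are available.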

