
Eval bug: llama.cpp crashes when running Qwen Next 80B Coder #19304

@mdierolf

Description


Name and Version

commit 91ea44e (HEAD -> master, tag: b7917, origin/master, origin/HEAD)
Author: lhez <lih@qti.qualcomm.com>
Date: Mon Feb 2 15:54:43 2026 -0800

opencl: refactor some ops, concat, repeat, tanh and scale (#19226)

* opencl: refactor concat

* opencl: refactor repeat

* opencl: refactor tanh

* opencl: enable fp16 for tanh

* opencl: refactor scale

* opencl: fix unused variables

Operating systems

Linux

GGML backends

CUDA

Hardware

RTX 6000

Models

Qwen3 Coder Next
https://huggingface.co/Qwen/Qwen3-Coder-Next-GGUF at Q8_0

Problem description & steps to reproduce

When running the model at 86K context length, llama.cpp crashed in llama_grammar_accept_token after 50+ tool calls, aborting with "Unexpected empty grammar stack after accepting piece" (full log below).

First Bad Commit

No response

Relevant log output

Logs
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
__syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
warning: 56     ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S: No such file or directory
#0  __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
56      in ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S
#1  0x00007fbac6e99668 in __internal_syscall_cancel (a1=<optimized out>, a2=<optimized out>, a3=<optimized out>, a4=<optimized out>, a5=a5@entry=0, a6=a6@entry=0, nr=61) at ./nptl/cancellation.c:49
warning: 49     ./nptl/cancellation.c: No such file or directory
#2  0x00007fbac6e996ad in __syscall_cancel (a1=<optimized out>, a2=<optimized out>, a3=<optimized out>, a4=<optimized out>, a5=a5@entry=0, a6=a6@entry=0, nr=61) at ./nptl/cancellation.c:75
75      in ./nptl/cancellation.c
#3  0x00007fbac6f04787 in __GI___wait4 (pid=<optimized out>, stat_loc=<optimized out>, options=<optimized out>, usage=<optimized out>) at ../sysdeps/unix/sysv/linux/wait4.c:30
warning: 30     ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory
#4  0x00007fbac7775deb in ggml_print_backtrace () from /home/mdierolf/gitprojects/llama.cpp/build/bin/libggml-base.so.0
#5  0x00007fbac7787a39 in ggml_uncaught_exception() () from /home/mdierolf/gitprojects/llama.cpp/build/bin/libggml-base.so.0
#6  0x00007fbac70b344a in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007fbac70a15e9 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007fbac70b36c8 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007fbac746b991 in llama_grammar_accept_token(llama_grammar&, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [clone .cold] () from /home/mdierolf/gitprojects/llama.cpp/build/bin/libllama.so.0
#10 0x00007fbac74c67ee in llama_grammar_accept_impl(llama_grammar&, int) () from /home/mdierolf/gitprojects/llama.cpp/build/bin/libllama.so.0
#11 0x000055f258046b0f in common_sampler_accept(common_sampler*, int, bool) ()
#12 0x000055f257eabf8e in server_context_impl::update_slots() ()
#13 0x000055f257ef10ff in server_queue::start_loop(long) ()
#14 0x000055f257e0cc9e in main ()
[Inferior 1 (process 1349221) detached]
terminate called after throwing an instance of 'std::runtime_error'
  what():  Unexpected empty grammar stack after accepting piece: =read (89871)
Aborted (core dumped)
