Name and Version
commit 91ea44e (HEAD -> master, tag: b7917, origin/master, origin/HEAD)
Author: lhez <lih@qti.qualcomm.com>
Date: Mon Feb 2 15:54:43 2026 -0800
opencl: refactor some ops, concat, repeat, tanh and scale (#19226)
* opencl: refactor concat
* opencl: refactor repeat
* opencl: refactor tanh
* opencl: enable fp16 for tanh
* opencl: refactor scale
* opencl: fix unused variables
Operating systems
Linux
GGML backends
CUDA
Hardware
RTX 6000
Models
Qwen3 Coder Next
https://huggingface.co/Qwen/Qwen3-Coder-Next-GGUF at Q8_0
Problem description & steps to reproduce
When running the model at an 86K context length, after 50+ tool calls llama.cpp crashed in llama_grammar_accept_token with an uncaught std::runtime_error ("Unexpected empty grammar stack after accepting piece").
First Bad Commit
No response
Relevant log output
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
__syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
warning: 56 ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S: No such file or directory
#0 __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
56 in ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S
#1 0x00007fbac6e99668 in __internal_syscall_cancel (a1=<optimized out>, a2=<optimized out>, a3=<optimized out>, a4=<optimized out>, a5=a5@entry=0, a6=a6@entry=0, nr=61) at ./nptl/cancellation.c:49
warning: 49 ./nptl/cancellation.c: No such file or directory
#2 0x00007fbac6e996ad in __syscall_cancel (a1=<optimized out>, a2=<optimized out>, a3=<optimized out>, a4=<optimized out>, a5=a5@entry=0, a6=a6@entry=0, nr=61) at ./nptl/cancellation.c:75
75 in ./nptl/cancellation.c
#3 0x00007fbac6f04787 in __GI___wait4 (pid=<optimized out>, stat_loc=<optimized out>, options=<optimized out>, usage=<optimized out>) at ../sysdeps/unix/sysv/linux/wait4.c:30
warning: 30 ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory
#4 0x00007fbac7775deb in ggml_print_backtrace () from /home/mdierolf/gitprojects/llama.cpp/build/bin/libggml-base.so.0
#5 0x00007fbac7787a39 in ggml_uncaught_exception() () from /home/mdierolf/gitprojects/llama.cpp/build/bin/libggml-base.so.0
#6 0x00007fbac70b344a in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x00007fbac70a15e9 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#8 0x00007fbac70b36c8 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x00007fbac746b991 in llama_grammar_accept_token(llama_grammar&, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [clone .cold] () from /home/mdierolf/gitprojects/llama.cpp/build/bin/libllama.so.0
#10 0x00007fbac74c67ee in llama_grammar_accept_impl(llama_grammar&, int) () from /home/mdierolf/gitprojects/llama.cpp/build/bin/libllama.so.0
#11 0x000055f258046b0f in common_sampler_accept(common_sampler*, int, bool) ()
#12 0x000055f257eabf8e in server_context_impl::update_slots() ()
#13 0x000055f257ef10ff in server_queue::start_loop(long) ()
#14 0x000055f257e0cc9e in main ()
[Inferior 1 (process 1349221) detached]
terminate called after throwing an instance of 'std::runtime_error'
what(): Unexpected empty grammar stack after accepting piece: =read (89871)
Aborted (core dumped)