Releases: withcatai/node-llama-cpp
v3.15.0
3.15.0 (2026-01-10)
Features
- LlamaCompletion: stopOnAbortSignal (#538) (734693d) (documentation: LlamaCompletionGenerationOptions["stopOnAbortSignal"])
- LlamaModel: useDirectIo (#538) (734693d) (documentation: LlamaModelOptions["useDirectIo"])
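A minimal sketch combining the two new options, assuming a local GGUF file at ./model.gguf (the path is illustrative) and that stopOnAbortSignal mirrors the existing LlamaChatSession option of the same name, resolving with the partial completion instead of throwing on abort:

```ts
import {getLlama, LlamaCompletion} from "node-llama-cpp";

const llama = await getLlama();

// useDirectIo: read the model file using direct I/O while loading
// (LlamaModelOptions["useDirectIo"], new in this release)
const model = await llama.loadModel({
    modelPath: "./model.gguf", // illustrative path
    useDirectIo: true
});

const context = await model.createContext();
const completion = new LlamaCompletion({
    contextSequence: context.getSequence()
});

const abortController = new AbortController();
setTimeout(() => abortController.abort(), 5_000); // give up after 5 seconds

// stopOnAbortSignal: assumed to stop generation gracefully when the signal
// aborts and resolve with the text produced so far, rather than rejecting
// (LlamaCompletionGenerationOptions["stopOnAbortSignal"], new in this release)
const res = await completion.generateCompletion("The quick brown fox", {
    maxTokens: 128,
    signal: abortController.signal,
    stopOnAbortSignal: true
});
console.log(res);
```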
Bug Fixes
- support new CUDA 13.1 archs (#538) (734693d)
- build the prebuilt binaries with CUDA 13.1 instead of 13.0 (#538) (734693d)
Shipped with llama.cpp release b7698
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
v3.14.5
v3.14.4
v3.14.3
3.14.3 (2025-12-08)
Features
Bug Fixes
- adapt to llama.cpp changes (#522) (e37835c)
- pad the context size to align with the implementation in llama.cpp (#522) (e37835c) (see #522 for more details)
Shipped with llama.cpp release b7315
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
v3.14.2
v3.14.1
3.14.1 (2025-10-26)
Bug Fixes
- Vulkan: include integrated GPU memory (#516) (47475ac)
- Vulkan: deduplicate the same device coming from different drivers (#516) (47475ac)
- adapt Llama chat wrappers to breaking llama.cpp changes (#516) (47475ac)
Shipped with llama.cpp release b6843
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
v3.14.0
3.14.0 (2025-10-02)
Features
- Qwen3 Reranker support (#506) (00305f7) (see #506 for prequantized Qwen3 Reranker models you can use)
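A minimal sketch of using a Qwen3 Reranker model through the library's ranking API, assuming a prequantized reranker GGUF at ./qwen3-reranker.gguf (the path is illustrative; see #506 for actual models):

```ts
import {getLlama} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: "./qwen3-reranker.gguf" // illustrative path to a Qwen3 Reranker GGUF
});

// A ranking context scores how relevant each document is to a given query
const rankingContext = await model.createRankingContext();

const documents = [
    "Mount Everest is the tallest mountain above sea level.",
    "The capital of France is Paris.",
    "Tokyo is the most populous metropolitan area in the world."
];

// rankAndSort returns the documents ordered from most to least relevant
const ranked = await rankingContext.rankAndSort(
    "Which mountain is the tallest?",
    documents
);
console.log(ranked[0].document, ranked[0].score);
```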
Bug Fixes
- handle HuggingFace rate limit responses (#506) (00305f7)
- adapt to llama.cpp breaking changes (#506) (00305f7)
Shipped with llama.cpp release b6673
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
v3.13.0
3.13.0 (2025-09-09)
Features
Bug Fixes
- adapt to breaking llama.cpp changes (#501) (76b505e)
- Vulkan: read external memory usage (#500) (d33cc31)
Shipped with llama.cpp release b6431
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
v3.12.4
✨ gpt-oss is here! ✨
Read about the release in the blog post
3.12.4 (2025-08-28)
Bug Fixes
Shipped with llama.cpp release b6301
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
v3.12.3
✨ gpt-oss is here! ✨
Read about the release in the blog post
3.12.3 (2025-08-26)
Bug Fixes
- Vulkan: context creation edge cases (#492) (12749c0)
- CUDA 13 support for the prebuilt binaries (#494) (b10999d)
- don't share loaded shared libraries between backends (#492) (12749c0)
- split prebuilt CUDA binaries into 2 npm modules (#495) (6e59160)
Shipped with llama.cpp release b6294
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
