Releases: withcatai/node-llama-cpp
v3.15.0
3.15.0 (2026-01-10)
Features
- LlamaCompletion: stopOnAbortSignal (#538) (734693d) (documentation: LlamaCompletionGenerationOptions["stopOnAbortSignal"])
- LlamaModel: useDirectIo (#538) (734693d) (documentation: LlamaModelOptions["useDirectIo"])
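A minimal sketch combining the two new options, assuming a local GGUF file at ./model.gguf (the path is illustrative) and that stopOnAbortSignal mirrors the existing LlamaChatSession option of the same name, resolving with the partial completion instead of throwing on abort:

```ts
import {getLlama, LlamaCompletion} from "node-llama-cpp";

const llama = await getLlama();

// useDirectIo: read the model file using direct I/O while loading
// (LlamaModelOptions["useDirectIo"], new in this release)
const model = await llama.loadModel({
    modelPath: "./model.gguf", // illustrative path
    useDirectIo: true
});

const context = await model.createContext();
const completion = new LlamaCompletion({
    contextSequence: context.getSequence()
});

const abortController = new AbortController();
setTimeout(() => abortController.abort(), 5_000); // give up after 5 seconds

// stopOnAbortSignal: assumed to stop generation gracefully when the signal
// aborts and resolve with the text produced so far, rather than rejecting
// (LlamaCompletionGenerationOptions["stopOnAbortSignal"], new in this release)
const res = await completion.generateCompletion("The quick brown fox", {
    maxTokens: 128,
    signal: abortController.signal,
    stopOnAbortSignal: true
});
console.log(res);
```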
Bug Fixes
- support new CUDA 13.1 archs (#538) (734693d)
- build the prebuilt binaries with CUDA 13.1 instead of 13.0 (#538) (734693d)
Shipped with llama.cpp release b7698
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
v3.14.5
v3.14.4
v3.14.3
3.14.3 (2025-12-08)
Features
Bug Fixes
- adapt to llama.cpp changes (#522) (e37835c)
- pad the context size to align with the implementation in llama.cpp (#522) (e37835c) (see #522 for more details)
Shipped with llama.cpp release b7315
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
v3.14.2
v3.14.1
3.14.1 (2025-10-26)
Bug Fixes
- Vulkan: include integrated GPU memory (#516) (47475ac)
- Vulkan: deduplicate the same device coming from different drivers (#516) (47475ac)
- adapt Llama chat wrappers to breaking llama.cpp changes (#516) (47475ac)
Shipped with llama.cpp release b6843
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
v3.14.0
3.14.0 (2025-10-02)
Features
- Qwen3 Reranker support (#506) (00305f7) (see #506 for prequantized Qwen3 Reranker models you can use)
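A minimal sketch of using a Qwen3 Reranker model through the library's ranking API, assuming a prequantized reranker GGUF at ./qwen3-reranker.gguf (the path is illustrative; see #506 for actual models):

```ts
import {getLlama} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: "./qwen3-reranker.gguf" // illustrative path to a Qwen3 Reranker GGUF
});

// A ranking context scores how relevant each document is to a given query
const rankingContext = await model.createRankingContext();

const documents = [
    "Mount Everest is the tallest mountain above sea level.",
    "The capital of France is Paris.",
    "Tokyo is the most populous metropolitan area in the world."
];

// rankAndSort returns the documents ordered from most to least relevant
const ranked = await rankingContext.rankAndSort(
    "Which mountain is the tallest?",
    documents
);
console.log(ranked[0].document, ranked[0].score);
```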
Bug Fixes
- handle HuggingFace rate limit responses (#506) (00305f7)
- adapt to llama.cpp breaking changes (#506) (00305f7)
Shipped with llama.cpp release b6673
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
v3.13.0
3.13.0 (2025-09-09)
Features
Bug Fixes
- adapt to breaking llama.cpp changes (#501) (76b505e)
- Vulkan: read external memory usage (#500) (d33cc31)
Shipped with llama.cpp release b6431
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
v3.12.4
✨ gpt-oss is here! ✨
Read about the release in the blog post
3.12.4 (2025-08-28)
Bug Fixes
Shipped with llama.cpp release b6301
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
v3.12.3
✨ gpt-oss is here! ✨
Read about the release in the blog post
3.12.3 (2025-08-26)
Bug Fixes
- Vulkan: context creation edge cases (#492) (12749c0)
- CUDA 13 support for the prebuilt binaries (#494) (b10999d)
- don't share loaded shared libraries between backends (#492) (12749c0)
- split prebuilt CUDA binaries into 2 npm modules (#495) (6e59160)
Shipped with llama.cpp release b6294
To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)
