fix(llamacpp-upstream): bump Windows CUDA-13 backend 13.1 → 13.3 (fixes "Failed to download GPU backend") by danyurkin · Pull Request #42 · AtomicBot-ai/Atomic-Chat

danyurkin · 2026-06-03T21:12:20Z

Problem

Since 1.1.95, Windows users on CUDA 13 hosts get "Failed to download GPU backend". The "Better Configuration Available → CUDA 13" prompt offers a backend that 404s on download.

Root cause: ggml-org/llama.cpp renamed the Windows CUDA-13 release asset from llama-{tag}-bin-win-cuda-13.1-x64.zip to ...-cuda-13.3-x64.zip (visible on release b9495). The llamacpp-upstream provider hardcodes the 13.1 minor as the canonical backend id, so:

detectIdealBackendType() recommends the literal win-cuda-13.1-x64 to every CUDA-13 host (the prompt), and
getBackendDownloadUrl() turns it into …/b9495/llama-b9495-bin-win-cuda-13.1-x64.zip → HTTP 404.

Observed download attempt from a reporter (1.1.95):

Downloading backend b9495/win-cuda-13.1-x64 from
https://github.com/ggml-org/llama.cpp/releases/download/b9495/llama-b9495-bin-win-cuda-13.1-x64.zip
Error: Failed to get file size: HTTP status 404 Not Found

The release only ships llama-b9495-bin-win-cuda-13.3-x64.zip.
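To make the failure concrete, here is a minimal sketch of how a hardcoded backend id ends up in a dead URL. The helper name `buildBackendDownloadUrl` and the constant `RELEASE_BASE` are hypothetical stand-ins; the real getBackendDownloadUrl() may be structured differently. Only the URL shape is taken from the reporter's log above.

```typescript
// Hypothetical reconstruction of the URL shape seen in the reporter's log.
// This is not the provider's actual getBackendDownloadUrl() implementation.
const RELEASE_BASE = "https://github.com/ggml-org/llama.cpp/releases/download";

function buildBackendDownloadUrl(tag: string, backendId: string): string {
  // The backend id is pasted verbatim into the asset name, so a stale
  // minor version ("13.1") produces a URL for an asset that no longer exists.
  return `${RELEASE_BASE}/${tag}/llama-${tag}-bin-${backendId}.zip`;
}

// Id recommended before this PR: the asset is missing from release b9495.
const staleUrl = buildBackendDownloadUrl("b9495", "win-cuda-13.1-x64");
// Id recommended after this PR: matches the asset the release actually ships.
const fixedUrl = buildBackendDownloadUrl("b9495", "win-cuda-13.3-x64");

console.log(staleUrl);
console.log(fixedUrl);
```

Because the id doubles as the asset-name fragment, any upstream rename of the minor breaks the download until the id is updated.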

Fix (hotfix)

Bump the canonical Windows CUDA-13 minor 13.1 → 13.3 everywhere it's treated as an identity string, across the TS extension and the Rust plugin:

recommendation tier (detectIdealBackendType), static variant list, remote-asset whitelist
cudart sidecar filename + toolkit-version maps + backend regex
category matcher + priority list + is_cuda_installed cudart lib lookup
map_old_backend_to_new canonical target

Back-compat: legacy cuda-13.1 is still recognized by the category matcher and is migrated forward to 13.3 by map_old_backend_to_new, so an already-installed win-cuda-13.1-x64 folder is not orphaned. Updated the affected Rust unit tests accordingly.
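The back-compat behaviour can be sketched as follows. This is an illustration in TypeScript, not the plugin's Rust code: `categorize`, `mapOldBackendToNew`, and both constants are hypothetical names chosen to mirror the category matcher and map_old_backend_to_new described above.

```typescript
// Illustrative sketch only; names and structure are assumptions, not the
// plugin's actual Rust implementation.
const CANONICAL_CUDA13_MINOR = "13.3";
const LEGACY_CUDA13_MINORS = ["13.1"];

type BackendCategory = "cuda-13" | "other";

// The category matcher accepts both the canonical and the legacy minor,
// so an installed win-cuda-13.1-x64 folder is still seen as a CUDA-13 backend.
function categorize(backendId: string): BackendCategory {
  const minors = [CANONICAL_CUDA13_MINOR, ...LEGACY_CUDA13_MINORS];
  return minors.some((minor) => backendId.includes(`cuda-${minor}-`))
    ? "cuda-13"
    : "other";
}

// Legacy ids are migrated forward to the canonical minor; every other id
// passes through unchanged.
function mapOldBackendToNew(backendId: string): string {
  for (const legacy of LEGACY_CUDA13_MINORS) {
    const needle = `cuda-${legacy}-`;
    if (backendId.includes(needle)) {
      return backendId.replace(needle, `cuda-${CANONICAL_CUDA13_MINOR}-`);
    }
  }
  return backendId;
}

console.log(mapOldBackendToNew("win-cuda-13.1-x64"));
```

Matching on the trailing hyphen keeps a future id such as `cuda-13.10` from being mistaken for `cuda-13.1`.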

Out of scope / follow-ups

Driver gate unchanged. The CUDA-13 gate (>= 581.15) is the documented minimum for Toolkit 13.1; CUDA 13.3's minimum Windows driver should be verified separately. Low risk here because the runtime device probe already degrades to 12.4 / Vulkan / CPU when a recommended tier can't enumerate a GPU.
Root-cause hardening (so this never recurs on the next upstream minor bump) is tracked separately — see the companion issue: resolve the CUDA-13.x asset name dynamically from the release instead of hardcoding the minor.
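One possible shape for the follow-up, sketched under assumptions: given the asset names of a release (however they are fetched), pick the Windows CUDA-13 asset by pattern instead of by a hardcoded minor. The function `resolveWinCuda13Backend`, the `ResolvedBackend` type, and the regex are hypothetical; the companion issue may settle on a different design.

```typescript
// Hypothetical sketch of dynamic resolution; not the design adopted upstream.
// Assumes release tags look like "b9495" and assets follow the naming seen
// in this PR: llama-{tag}-bin-win-cuda-13.{minor}-x64.zip
const WIN_CUDA13_ASSET = /^llama-(b\d+)-bin-(win-cuda-13\.(\d+)-x64)\.zip$/;

interface ResolvedBackend {
  backendId: string; // e.g. "win-cuda-13.3-x64"
  minor: number;     // e.g. 3
  assetName: string; // full asset file name
}

function resolveWinCuda13Backend(assetNames: string[]): ResolvedBackend | null {
  let best: ResolvedBackend | null = null;
  for (const assetName of assetNames) {
    const match = WIN_CUDA13_ASSET.exec(assetName);
    if (!match) continue; // not a Windows CUDA-13 llama.cpp build
    const minor = Number(match[3]);
    // If a release ever ships several 13.x builds, prefer the highest minor.
    if (best === null || minor > best.minor) {
      best = { backendId: match[2], minor, assetName };
    }
  }
  return best; // null means the release has no Windows CUDA-13 asset
}

const resolved = resolveWinCuda13Backend([
  "llama-b9495-bin-win-cuda-12.4-x64.zip",
  "llama-b9495-bin-win-cuda-13.3-x64.zip",
]);
console.log(resolved?.backendId);
```

A `null` result gives the caller an explicit signal to skip the CUDA-13 recommendation rather than offer a backend that cannot be downloaded.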

Testing

CI (TS typecheck/vitest + cargo test for tauri-plugin-llamacpp-upstream)
Manual: on a Windows CUDA-13 host, accept the "CUDA 13" upgrade prompt and confirm the backend downloads (no 404).

🤖 Generated with Claude Code

ggml-org/llama.cpp renamed the Windows CUDA-13 release asset from `...-bin-win-cuda-13.1-x64.zip` to `...-bin-win-cuda-13.3-x64.zip` (observed on release b9495). The upstream provider hardcoded the `13.1` minor as the canonical backend id, so on >= 1.1.95:

- detectIdealBackendType() recommends `win-cuda-13.1-x64` to every CUDA-13 host ("Better Configuration Available → CUDA 13"), and
- getBackendDownloadUrl() resolves it to `llama-b9495-bin-win-cuda-13.1-x64.zip` → HTTP 404 ("Failed to download GPU backend").

Bump the canonical Windows CUDA-13 minor to 13.3 across the TS extension and the Rust plugin (recommendation, static variant list, remote whitelist, cudart sidecar filename, toolkit-version map, category matcher, priority list, cudart lib lookup).

Legacy `cuda-13.1` is still recognized by the category matcher and migrated forward by map_old_backend_to_new(), so already-installed `win-cuda-13.1-x64` folders are not orphaned.

Note: the CUDA-13 driver gate (>= 581.15) is left unchanged; it is the documented minimum for Toolkit 13.1. CUDA 13.3's minimum driver should be verified as a follow-up. The runtime device probe already degrades to 12.4 / Vulkan / CPU if a recommended tier cannot enumerate a GPU.

Fixes the "Failed to download GPU backend" report on Windows CUDA 13.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

danyurkin · 2026-06-03T21:12:52Z

Root-cause hardening follow-up tracked in #43 (resolve the CUDA-13 asset minor dynamically so this can't recur on the next ggml-org bump).

danyurkin mentioned this pull request Jun 3, 2026

llamacpp-upstream: resolve Windows CUDA-13 asset minor dynamically (stop hardcoding 13.x) #43

Closed

Vect0rM merged commit 6d876ec into AtomicBot-ai:main Jun 3, 2026

danyurkin mentioned this pull request Jun 3, 2026

llamacpp-upstream: unsupported mmproj projector type (gemma4a) crashes whole llama-server with opaque "unexpected error" #44

Closed

Vect0rM merged 1 commit into AtomicBot-ai:main from danyurkin:fix/cuda-13-3-backend-download
