Background
PR #42 hotfixes the "Failed to download GPU backend" 404 by bumping the hardcoded Windows CUDA-13 minor 13.1 → 13.3, matching ggml-org's rename on release b9495.
This is a recurring class of bug: ggml-org periodically bumps the CUDA toolkit minor used in its Windows CI (…→ 13.1 → 13.3 → 13.x …). Each time, our hardcoded canonical id (win-cuda-13.x-x64) drifts from the actual release asset name and the download returns a 404.
Root cause
The recommendation path and the actual release assets are decoupled:
- detectIdealBackendType() emits a literal win-cuda-13.x-x64 from hardware features, without consulting the release.
- fetchRemoteBackends() whitelists an exact win-cuda-13.x-x64 string.
- getBackendDownloadUrl() / getCudartDownloadUrl() substitute the id straight into the filename.
So a recommendation can name a CUDA-13 backend that doesn't exist in the target release.
Proposed fix
Treat the CUDA-13 family as stable and the minor as dynamic:
- In fetchRemoteBackends(), accept any win-cuda-1[23]\.\d+-x64 asset the release actually publishes (regex instead of a literal whitelist), capturing the real minor.
- Have detectIdealBackendType() (and the Rust determine_supported_backends) pick the recommended CUDA-13 tier from the fetched asset list rather than emitting a literal, guaranteeing the recommended id is downloadable.
- Derive the cudart sidecar filename + toolkit version from the resolved id instead of the WINDOWS_CUDART_FILENAME / WINDOWS_CUDA_TOOLKIT_VERSION literal maps.
- Verify the CUDA-13 driver gate against the current toolkit's documented minimum (the >= 581.15 threshold is for Toolkit 13.1; confirm/adjust for whatever 13.x is live).
Result: no code change needed when ggml-org bumps the CUDA minor again.
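A minimal sketch of the first three steps, to make the shape of the change concrete. This is not the existing implementation: the CudaBackend type and the parseCudaBackends, pickCudaBackend, and cudartFor helpers are hypothetical names, the "highest minor wins" rule is an assumption, and the cudart filename pattern is an assumed naming scheme that would need checking against a real release.

```typescript
// Hypothetical parsed form of a Windows CUDA backend id.
interface CudaBackend {
  id: string; // e.g. "win-cuda-13.3-x64"
  major: number;
  minor: number;
}

// The family from the proposal: win-cuda-12.<minor>-x64 or win-cuda-13.<minor>-x64.
const WIN_CUDA_ID = /win-cuda-(1[23])\.(\d+)-x64/;

// Collect the CUDA backend ids a release actually publishes.
function parseCudaBackends(assetNames: string[]): CudaBackend[] {
  const found: CudaBackend[] = [];
  for (const name of assetNames) {
    // Assumption: cudart sidecars embed the same id but are not backends.
    if (name.indexOf("cudart-") === 0) continue;
    const m = WIN_CUDA_ID.exec(name);
    if (!m) continue;
    const id = m[0];
    if (!found.some((b) => b.id === id)) {
      found.push({ id: id, major: Number(m[1]), minor: Number(m[2]) });
    }
  }
  return found;
}

// Pick the tier for a CUDA major from the published list.
// Assumption: if several minors are published, the highest one wins.
function pickCudaBackend(
  backends: CudaBackend[],
  major: number
): CudaBackend | undefined {
  return backends
    .filter((b) => b.major === major)
    .sort((a, b) => b.minor - a.minor)[0];
}

// Derive the toolkit version and sidecar filename from the resolved id
// instead of looking them up in literal maps.
function cudartFor(backend: CudaBackend): {
  toolkitVersion: string;
  filename: string;
} {
  return {
    toolkitVersion: backend.major + "." + backend.minor,
    // Assumed naming pattern; verify against a real release before relying on it.
    filename: "cudart-llama-bin-" + backend.id + ".zip",
  };
}
```

With these helpers, a release whose asset list contains a win-cuda-13.4-x64 archive would yield that id from pickCudaBackend(parsed, 13) without any source change, and a release with no CUDA-13 asset would yield undefined, so the caller can fall back to another backend rather than recommend something that 404s.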
Acceptance
- Windows CUDA-13 host gets a download that matches a real release asset for any ggml-org CUDA-13.x minor, with no source change.
- Existing win-cuda-13.1-x64 / win-cuda-13.3-x64 installs continue to resolve (migration preserved).
- Unit tests cover an unexpected future minor (e.g. 13.4).
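The migration and future-minor criteria could be exercised with something like the sketch below. It is one possible reading of "continue to resolve", not the project's actual migration logic: resolveInstalled is a hypothetical helper that keeps an exact match when the release still publishes it and otherwise falls back to the highest published minor of the same CUDA major.

```typescript
// Strict form of the id pattern: the whole string must be a Windows CUDA id.
const WIN_CUDA = /^win-cuda-(1[23])\.(\d+)-x64$/;

// Hypothetical helper: map an installed backend id onto an id the release
// actually publishes, staying within the same CUDA major.
function resolveInstalled(
  installedId: string,
  publishedIds: string[]
): string | undefined {
  const installed = WIN_CUDA.exec(installedId);
  if (!installed) return undefined; // not a Windows CUDA backend id

  // An exact match needs no migration.
  if (publishedIds.indexOf(installedId) !== -1) return installedId;

  // Otherwise take the highest published minor of the same major.
  let best: string | undefined;
  let bestMinor = -1;
  for (const id of publishedIds) {
    const m = WIN_CUDA.exec(id);
    if (m && m[1] === installed[1] && Number(m[2]) > bestMinor) {
      best = id;
      bestMinor = Number(m[2]);
    }
  }
  return best;
}

// A release that has moved on to a minor the source has never seen.
const published = ["win-cuda-12.4-x64", "win-cuda-13.4-x64"];

resolveInstalled("win-cuda-13.1-x64", published); // "win-cuda-13.4-x64"
resolveInstalled("win-cuda-13.3-x64", published); // "win-cuda-13.4-x64"
resolveInstalled("win-cuda-12.4-x64", published); // "win-cuda-12.4-x64"
```

A unit test built on this would assert those three results plus the negative case, where an id outside the CUDA family resolves to undefined.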
Companion to #42 (hotfix).