llamacpp-upstream: resolve Windows CUDA-13 asset minor dynamically (stop hardcoding 13.x) #43

@danyurkin

Background

PR #42 hotfixes the "Failed to download GPU backend" 404 by bumping the hardcoded Windows CUDA-13 minor 13.1 → 13.3, matching ggml-org's rename on release b9495.

This is a recurring class of bug: ggml-org periodically bumps the CUDA toolkit minor used in its Windows CI (…→ 13.1 → 13.3 → 13.x …). Each time, our hardcoded canonical id (win-cuda-13.x-x64) drifts from the actual release asset name, and the download returns a 404.

Root cause

The recommendation path and the actual release assets are decoupled:

  • detectIdealBackendType() emits a literal win-cuda-13.x-x64 from hardware features, without consulting the release.
  • fetchRemoteBackends() whitelists an exact win-cuda-13.x-x64 string.
  • getBackendDownloadUrl() / getCudartDownloadUrl() substitute the id straight into the filename.

So a recommendation can name a CUDA-13 backend that doesn't exist in the target release.

Proposed fix

Treat the CUDA-13 family as stable and the minor as dynamic:

  1. In fetchRemoteBackends(), accept any win-cuda-1[23]\.\d+-x64 asset the release actually publishes (regex instead of a literal whitelist), capturing the real minor.
  2. Have detectIdealBackendType() (and the Rust determine_supported_backends) pick the recommended CUDA-13 tier from the fetched asset list rather than emitting a literal — guaranteeing the recommended id is downloadable.
  3. Derive the cudart sidecar filename + toolkit version from the resolved id instead of the WINDOWS_CUDART_FILENAME / WINDOWS_CUDA_TOOLKIT_VERSION literal maps.
  4. Verify the CUDA-13 driver gate against the current toolkit's documented minimum (the >= 581.15 threshold is for Toolkit 13.1; confirm/adjust for whatever 13.x is live).

Result: no code change needed when ggml-org bumps the CUDA minor again.
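
Steps 1 to 3 could be sketched roughly as below. This is an illustration under stated assumptions, not the project's actual code: the helper names (parseCudaBackends, pickCudaBackend, toolkitVersion, cudartFilename) are hypothetical, and both the idea that asset filenames embed the backend id as a substring and the cudart filename template are assumptions to verify against a real release listing.

```typescript
// Hypothetical helpers for illustration; none of these names exist in the codebase.
interface CudaBackend {
  id: string; // e.g. "win-cuda-13.3-x64"
  major: number;
  minor: number;
}

// Same family pattern as proposed in step 1, with capture groups for major and minor.
const WIN_CUDA_ID = /win-cuda-(1[23])\.(\d+)-x64/;

// Step 1: collect the CUDA backend ids a release actually publishes.
// Assumption: each asset filename contains the backend id as a substring.
function parseCudaBackends(assetNames: string[]): CudaBackend[] {
  const seen = new Map<string, CudaBackend>();
  for (const name of assetNames) {
    const m = WIN_CUDA_ID.exec(name);
    if (m === null) continue;
    // Keyed by id so the backend zip and its cudart sidecar count once.
    seen.set(m[0], { id: m[0], major: Number(m[1]), minor: Number(m[2]) });
  }
  return Array.from(seen.values());
}

// Step 2: recommend only an id that exists in the fetched list.
// Picks the highest published minor for the requested major, if any.
function pickCudaBackend(
  backends: CudaBackend[],
  major: number,
): CudaBackend | undefined {
  return backends
    .filter((b) => b.major === major)
    .sort((a, b) => b.minor - a.minor)[0];
}

// Step 3: derive the toolkit version and sidecar name from the resolved id
// instead of literal lookup maps. The filename template is an assumption.
function toolkitVersion(b: CudaBackend): string {
  return `${b.major}.${b.minor}`;
}

function cudartFilename(b: CudaBackend): string {
  return `cudart-llama-bin-${b.id}.zip`;
}
```

When a release publishes no CUDA-13 asset at all, pickCudaBackend returns undefined, so the caller has to choose a fallback tier explicitly rather than emit an id that will 404.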

Acceptance

  • A Windows CUDA-13 host gets a download that matches a real release asset for any ggml-org CUDA-13.x minor, with no source change.
  • Existing win-cuda-13.1-x64 / win-cuda-13.3-x64 installs continue to resolve (migration preserved).
  • Unit tests cover an unexpected future minor (e.g. 13.4).
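
The test coverage asked for above might look something like this table-driven check. It is a sketch only: isCudaBackendId is a hypothetical stand-in for whatever predicate fetchRemoteBackends() ends up using, and the anchored regex is the one from the proposed fix.

```typescript
// Hypothetical predicate; the real check would live inside fetchRemoteBackends().
const WIN_CUDA_BACKEND = /^win-cuda-1[23]\.\d+-x64$/;

function isCudaBackendId(id: string): boolean {
  return WIN_CUDA_BACKEND.test(id);
}

// [backend id, expected to be accepted]
const cases: Array<[string, boolean]> = [
  ["win-cuda-13.1-x64", true], // existing install, must keep resolving
  ["win-cuda-13.3-x64", true], // current minor after the #42 hotfix
  ["win-cuda-13.4-x64", true], // unexpected future minor
  ["win-cuda-12.4-x64", true], // CUDA-12 family is also covered by 1[23]
  ["win-cuda-13.x-x64", false], // placeholder, never a real asset id
  ["win-cuda-13-x64", false], // minor is required
  ["win-cuda-14.0-x64", false], // a new major is outside this pattern
];

// Returns the ids whose actual result differs from the expectation.
function failingCases(): string[] {
  return cases
    .filter(([id, expected]) => isCudaBackendId(id) !== expected)
    .map(([id]) => id);
}
```

Note that the pattern as proposed rejects a future CUDA 14 id, so a major bump would still need a source change; that is consistent with treating only the minor as dynamic.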

Companion to #42 (hotfix).
