Automatic GPU arch detection

When a user builds a package, they don't need to tell Spack which platform/OS/target/CPU they want to build for. Spack automatically detects this using libraries like distro and archspec.

This is not the case for GPUs and other accelerators. For users of ML libraries, the CPU is barely used; most computation actually happens on the GPU. Currently, users need to manually figure out what GPU they are using (by running a command like `nvidia-smi`), look up the corresponding CUDA arch from https://developer.nvidia.com/cuda-gpus, and add a section to their `packages.yaml` like:
```yaml
packages:
  all:
    variants: +cuda cuda_arch=XY
```
As far as I know, this is not documented anywhere, so users need to figure it out by trial and error. If you don't set this under `all:`, or instead try to set it on the command line, you may end up with a DAG with different settings for each package. For some packages, you'll see a concretization error message telling you to set `cuda_arch` if you want to use `+cuda`. For others, Spack will simply build `~cuda` if you don't tell it otherwise. This is not ideal.
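To make the manual steps above concrete, here is a rough sketch of how they could be scripted today. It is not part of Spack or archspec. It assumes a reasonably recent NVIDIA driver whose `nvidia-smi` supports the `compute_cap` query field, and the function names (`parse_compute_caps`, `query_compute_caps`, `packages_yaml_snippet`) are made up for illustration.

```python
import shutil
import subprocess


def parse_compute_caps(smi_output):
    """Convert lines like '8.6' into sorted, unique cuda_arch values like '86'."""
    archs = set()
    for line in smi_output.splitlines():
        major, _, minor = line.strip().partition(".")
        if major.isdigit() and minor.isdigit():
            archs.add(major + minor)
    return sorted(archs, key=int)


def query_compute_caps():
    """Ask nvidia-smi for the compute capability of every visible GPU.

    Assumption: the driver is new enough to support the 'compute_cap' field.
    Returns an empty list when nvidia-smi is missing or the query fails.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_compute_caps(result.stdout)


def packages_yaml_snippet(archs):
    """Render the packages.yaml section shown above for the detected archs."""
    if not archs:
        return ""
    return (
        "packages:\n"
        "  all:\n"
        f"    variants: +cuda cuda_arch={','.join(archs)}\n"
    )


if __name__ == "__main__":
    print(packages_yaml_snippet(query_compute_caps()), end="")
```

On a machine with no NVIDIA GPU this prints nothing, which is exactly the case where a user still has to decide by hand what to build for.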

We should allow Spack to automatically detect things like `cuda_arch` (NVIDIA) and `amdgpu_target` (AMD) and set them accordingly for all packages. The groundwork for this will need to be done in archspec; see https://github.com/archspec/archspec/issues/25. Once this is done, we'll need to change Spack to remove variants like `cuda_arch` and `amdgpu_target` and include them directly in the spec architecture. This issue is to track progress on this for the new ML SIG.
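For the AMD side, detection could look something like the sketch below. This is only an illustration of the idea, not the archspec design. It assumes the ROCm `rocminfo` tool is on `PATH` and that its output mentions each GPU agent by its `gfx...` name. The helper names are hypothetical.

```python
import re
import shutil
import subprocess

# gfx names look like gfx906, gfx90a, gfx1030 (hex digits after the prefix).
_GFX_RE = re.compile(r"\bgfx[0-9a-f]+\b")


def parse_amdgpu_targets(rocminfo_output):
    """Extract unique gfx target names from rocminfo-style text."""
    targets = set(_GFX_RE.findall(rocminfo_output))
    # gfx000 is sometimes used as a placeholder for the CPU agent; skip it.
    targets.discard("gfx000")
    return sorted(targets)


def query_amdgpu_targets():
    """Run rocminfo (assumed to be installed) and return detected targets."""
    if shutil.which("rocminfo") is None:
        return []
    result = subprocess.run(
        ["rocminfo"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return []
    return parse_amdgpu_targets(result.stdout)


if __name__ == "__main__":
    targets = query_amdgpu_targets()
    if targets:
        print("amdgpu_target=" + ",".join(targets))
```

Shelling out to vendor tools like this is fragile, which is part of why the detection logic belongs in archspec rather than in each user's scripts.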

Automatic GPU arch detection #31581
