What prebuilt wheels AO ships and why #1747

@drisspg

Current Wheel Distribution

TorchAO currently ships two types of wheels:

  1. Linux CUDA wheel with custom extensions:

    • Filename format: torchao-0.8.0-cp39-abi3-linux_x86_64.whl
    • Built specifically for Linux platforms with CUDA support
    • Contains compiled custom extensions
  2. Pure Python wheel:

    • Filename format: torchao-0.7.0-py3-none-any.whl
    • Used for all other platforms
    • No compiled extensions
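The two filenames above differ only in their trailing tags. As a minimal sketch (the helper names `parse_wheel_filename` and `is_pure_python` are made up for illustration and are not part of TorchAO), the standard wheel filename layout `{name}-{version}[-{build}]-{python}-{abi}-{platform}.whl` can be split like this:

```python
from typing import NamedTuple, Optional


class WheelTags(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python: str
    abi: str
    platform: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between version and python tag.
        name, version, build, python, abi, platform = parts
    else:
        raise ValueError(f"unexpected number of fields: {filename}")
    return WheelTags(name, version, build, python, abi, platform)


def is_pure_python(filename: str) -> bool:
    """A wheel with abi 'none' and platform 'any' has no compiled code."""
    tags = parse_wheel_filename(filename)
    return tags.abi == "none" and tags.platform == "any"


cuda_wheel = parse_wheel_filename("torchao-0.8.0-cp39-abi3-linux_x86_64.whl")
pure_wheel = parse_wheel_filename("torchao-0.7.0-py3-none-any.whl")
```

Here `cuda_wheel` carries the tags `cp39`, `abi3`, `linux_x86_64`, while `pure_wheel` carries `py3`, `none`, `any`, which is why installers accept it on any platform.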

Historical Context

Prior to PR #1276, TorchAO built separate binaries across all operating systems due to the presence of init.cpp, which required platform-specific compilation. After removing init.cpp, the package became a pure Python wheel for all platforms except Linux CUDA.

Recent Changes

PRs #1276 and #1277 introduced two significant changes:

  1. Removed init.cpp, simplifying the build process for most platforms
  2. Implemented py_limited_api semantics in setup.py, which changed the Linux CUDA wheel naming convention from a CPython-version-specific name, torchao-0.7.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl, to a single abi3 name, torchao-0.8.0-cp39-abi3-linux_x86_64.whl
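For readers unfamiliar with py_limited_api, the following is a rough sketch of the settings typically involved, not a copy of TorchAO's setup.py. It assumes the usual setuptools mechanics: the extension is declared with `py_limited_api=True`, the C/C++ sources are compiled with the `Py_LIMITED_API` macro set to the minimum supported CPython version, and `bdist_wheel` is given a matching `py_limited_api` option so the wheel is tagged `cp39-abi3`. The helper name `limited_api_settings` and the dictionary keys are hypothetical.

```python
def limited_api_settings(min_python=(3, 9)):
    """Collect the settings a limited-API (abi3) build usually needs.

    This only assembles plain data; a real setup.py would pass these
    values on to setuptools when declaring the extension and wheel options.
    """
    major, minor = min_python
    # Py_LIMITED_API uses the PY_VERSION_HEX layout, e.g. 0x03090000 for 3.9.
    hex_version = f"0x{major:02X}{minor:02X}0000"
    return {
        # Passed to the compiler so only the stable ABI is exposed.
        "define_macros": [("Py_LIMITED_API", hex_version)],
        # Keyword argument accepted by setuptools' Extension.
        "py_limited_api": True,
        # Option for the bdist_wheel command; yields the cp39-abi3 tag.
        "bdist_wheel_options": {"py_limited_api": f"cp{major}{minor}"},
    }


settings = limited_api_settings()
```

With the default minimum of Python 3.9, the wheel tag becomes `cp39-abi3`, matching the new filename above.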

Build Types & Outputs

1. Platform-Specific Build

Currently, only Linux CUDA produces platform-specific wheels:

  • Input: Linux CUDA build
  • Output: torchao-0.8.0-cp39-abi3-linux_x86_64.whl
  • Contains: Custom CUDA extensions

2. Accelerator-Specific Builds

These are pure Python wheels with accelerator-specific suffixes:

  • ROCm: torchao-0.7.0+rocm-py3-none-any.whl
  • XPU: torchao-0.7.0+xpu-py3-none-any.whl
  • Note: Although these builds run on specialized CI runners, the resulting wheels contain no hardware-specific extensions, so we want to disable these builds

3. Pure Python Builds

All other configurations produce identical pure Python wheels:

  • Output: torchao-0.7.0-py3-none-any.whl
  • Platforms:
    • Linux CPU
    • Windows
    • ARM64/aarch64
    • M1 (Apple Silicon)
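The three build types above amount to a small decision rule. A minimal sketch, assuming hypothetical `os_name` and `accelerator` labels (these are not the identifiers used in the actual CI configuration):

```python
def expected_wheel(version: str, os_name: str, accelerator: str = "cpu") -> str:
    """Return the wheel filename a given build configuration would produce."""
    if os_name == "linux" and accelerator == "cuda":
        # Only this configuration compiles custom extensions.
        return f"torchao-{version}-cp39-abi3-linux_x86_64.whl"
    if accelerator in ("rocm", "xpu"):
        # Pure Python wheel; the accelerator appears only as a
        # local version suffix such as "+rocm".
        return f"torchao-{version}+{accelerator}-py3-none-any.whl"
    # Every other configuration yields the same pure Python wheel.
    return f"torchao-{version}-py3-none-any.whl"


windows_wheel = expected_wheel("0.7.0", "windows")
m1_wheel = expected_wheel("0.7.0", "macos")
```

Both `windows_wheel` and `m1_wheel` come out as the same filename, which is the redundancy this issue is about.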

Build Pipeline Visualization

flowchart TD
    A[TorchAO Build Process] --> B{Platform/Accelerator Check}
    
    B -->|Linux CUDA| C[Build CUDA Extensions]
    B -->|Linux ROCm| D[Pure Python Build]
    B -->|Linux XPU| E[Pure Python Build]
    B -->|Linux CPU| F[Pure Python Build]
    B -->|Windows| G[Pure Python Build]
    B -->|ARM64| H[Pure Python Build]
    B -->|M1| I[Pure Python Build]
    
    C --> J[CUDA Wheel<br>cp39-abi3-linux_x86_64]
    D --> K[Pure Python + ROCm suffix<br>+rocm py3-none-any]
    E --> L[Pure Python + XPU suffix<br>+xpu py3-none-any]
    F --> M[Pure Python Wheel<br>py3-none-any]
    G --> M
    H --> M
    I --> M
    
    subgraph Wheel_Types [Resulting Wheels]
        J[torchao-0.8.0-cp39-abi3-linux_x86_64.whl]
        K[torchao-0.7.0+rocm-py3-none-any.whl]
        L[torchao-0.7.0+xpu-py3-none-any.whl]
        M[torchao-0.7.0-py3-none-any.whl]
    end

Planned Changes

Given that we currently only produce two distinct types of wheels (CUDA-specific and pure Python), we plan to streamline our CI/CD process:

  1. Reduce Redundant CI Steps

    • Since most builds result in identical pure Python wheels, we can consolidate these build steps
    • Only maintain specialized runners for builds that produce unique artifacts (currently only Linux CUDA)
  2. Future Extensibility

    • If native code support is added for other accelerators (ROCm, XPU, M1), we can re-enable dedicated CI pipelines
    • Each accelerator-specific build pipeline would need to include:
      • Wheel building
      • Platform-specific validation
      • PyTorch S3 publishing workflow
  3. Ownership Model

    • Platform-specific builds and validation will be owned by the respective platform teams
    • This keeps responsibility for build pipeline maintenance and CI/CD health clearly with each platform team

This approach allows us to maintain efficiency in our current setup while keeping the door open for future hardware-specific optimizations.
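To illustrate the consolidation argument, here is a small sketch that groups CI jobs by the artifact they produce. The job names are hypothetical labels, and treating the `+rocm` and `+xpu` wheels as equivalent to the plain pure Python wheel is an assumption based on the note that they contain no hardware-specific extensions.

```python
from collections import defaultdict

# Hypothetical job names mapped to the wheel each one currently produces.
BUILD_OUTPUTS = {
    "linux-cuda": "torchao-0.8.0-cp39-abi3-linux_x86_64.whl",
    "linux-rocm": "torchao-0.7.0+rocm-py3-none-any.whl",
    "linux-xpu": "torchao-0.7.0+xpu-py3-none-any.whl",
    "linux-cpu": "torchao-0.7.0-py3-none-any.whl",
    "windows": "torchao-0.7.0-py3-none-any.whl",
    "arm64": "torchao-0.7.0-py3-none-any.whl",
    "m1": "torchao-0.7.0-py3-none-any.whl",
}


def strip_local_version(filename: str) -> str:
    """Drop a local version suffix such as '+rocm' from a wheel filename."""
    name, version, rest = filename.split("-", 2)
    return f"{name}-{version.split('+')[0]}-{rest}"


def group_jobs_by_artifact(outputs):
    """Group job names by the (suffix-free) wheel they produce."""
    groups = defaultdict(list)
    for job, wheel in outputs.items():
        groups[strip_local_version(wheel)].append(job)
    return dict(groups)


groups = group_jobs_by_artifact(BUILD_OUTPUTS)
```

Any group with more than one job is a candidate for consolidation; under these assumptions only the Linux CUDA job stands alone.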

flowchart TD
    A[TorchAO Build Process] --> B{Has Native Extensions?}
    
    B -->|Yes: CUDA| C[Linux CUDA Pipeline]
    B -->|No| D[Pure Python Pipeline]
    
    C --> E[Build CUDA Extensions]
    E --> F[Platform Validation]
    F --> G[PyTorch S3 Upload]
    
    D --> H[Build Pure Python Wheel]
    H --> I[Basic Validation]
    I --> J[PyPI Upload]
    
    subgraph Future_Extensions [Future Platform Extensions]
        K[ROCm Pipeline]
        L[XPU Pipeline]
        K & L -->|When Native Code Added| C
    end
    
    subgraph Outputs [Current Wheel Types]
        G --> W1[torchao-0.X.0-cp39-abi3-linux_x86_64.whl]
        J --> W2[torchao-0.X.0-py3-none-any.whl]
    end

    style Future_Extensions stroke-dasharray: 5 5