What prebuilt wheels AO ships and why

## Current Wheel Distribution

TorchAO currently ships two types of wheels:

1. Linux CUDA wheel with custom extensions:
   - Filename format: `torchao-0.8.0-cp39-abi3-linux_x86_64.whl`
   - Built specifically for Linux platforms with CUDA support
   - Contains compiled custom extensions

2. Pure Python wheel:
   - Filename format: `torchao-0.7.0-py3-none-any.whl`
   - Used for all other platforms
   - No compiled extensions
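
The two filenames above differ mainly in their trailing compatibility tags (Python tag, ABI tag, platform tag). As a minimal sketch of how those tags can be read off a filename, the hypothetical helpers below (`split_wheel_filename` and `is_pure_python` are illustrative names, not part of TorchAO) split a wheel filename on `-` and treat an ABI tag of `none` plus a platform tag of `any` as the conventional marker of a pure Python wheel:

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def split_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # An optional build tag may sit between the version and the Python tag,
    # so a valid name has either 5 or 6 dash-separated fields.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    distribution, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(distribution, version, python_tag, abi_tag, platform_tag)


def is_pure_python(tags: WheelTags) -> bool:
    """By convention, 'none' ABI plus 'any' platform means no compiled code."""
    return tags.abi_tag == "none" and tags.platform_tag == "any"


cuda = split_wheel_filename("torchao-0.8.0-cp39-abi3-linux_x86_64.whl")
pure = split_wheel_filename("torchao-0.7.0-py3-none-any.whl")
print(cuda.abi_tag, is_pure_python(cuda))
print(pure.abi_tag, is_pure_python(pure))
```

For real tooling, the `packaging` library offers a more complete parser; the version here only aims to make the tag layout visible.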

## Historical Context

Prior to PR #1276, TorchAO built separate binaries for every operating system because `init.cpp` required platform-specific compilation. Once `init.cpp` was removed, the package could ship as a pure Python wheel on all platforms except Linux CUDA.

## Recent Changes

PRs #1276 and #1277 introduced two significant changes:
1. Removed `init.cpp`, simplifying the build process for most platforms
2. Implemented `py_limited_api` semantics in `setup.py`, which changed the Linux CUDA wheel naming convention:
   - From: `torchao-0.7.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl`
   - To: `torchao-0.8.0-cp39-abi3-linux_x86_64.whl`
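
The practical effect of the rename is that a `cp312-cp312` wheel targets exactly one CPython version, while a `cp39-abi3` wheel uses the stable ABI and is installable on CPython 3.9 and later. A rough sketch of that rule follows; `minimum_cpython` and `wheel_supports` are hypothetical helpers written for illustration, and they look only at the Python and ABI tags, ignoring the platform tag:

```python
import re
from typing import Tuple


def minimum_cpython(python_tag: str) -> Tuple[int, int]:
    """Turn a CPython tag such as 'cp39' or 'cp312' into (3, 9) or (3, 12)."""
    match = re.fullmatch(r"cp(\d)(\d+)", python_tag)
    if match is None:
        raise ValueError(f"not a CPython tag: {python_tag!r}")
    return int(match.group(1)), int(match.group(2))


def wheel_supports(python_tag: str, abi_tag: str, interpreter: Tuple[int, int]) -> bool:
    """Check whether a CPython wheel's tags admit the given interpreter version."""
    tagged = minimum_cpython(python_tag)
    if abi_tag == "abi3":
        # Stable ABI: the Python tag is a lower bound within the same major version.
        return interpreter[0] == tagged[0] and interpreter >= tagged
    # Version-specific ABI (for example cp312-cp312): exact match only.
    return interpreter == tagged


for version in [(3, 8), (3, 9), (3, 12)]:
    print(version, wheel_supports("cp39", "abi3", version))
```

Under this rule, one `cp39-abi3` wheel can replace a per-version set of CUDA wheels.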


### Build Types & Outputs

#### 1. Platform-Specific Build
Currently, only Linux CUDA produces platform-specific wheels:
- Input: Linux CUDA build
- Output: `torchao-0.8.0-cp39-abi3-linux_x86_64.whl`
- Contains: Custom CUDA extensions

#### 2. Accelerator-Specific Builds
These are pure Python wheels with accelerator-specific suffixes:
- ROCm: `torchao-0.7.0+rocm-py3-none-any.whl`
- XPU: `torchao-0.7.0+xpu-py3-none-any.whl`
- Note: Although these builds run on specialized CI runners, they contain no hardware-specific extensions, so we want to disable them

#### 3. Pure Python Builds
All other configurations produce identical pure Python wheels:
- Output: `torchao-0.7.0-py3-none-any.whl`
- Platforms:
  - Linux CPU
  - Windows
  - ARM64/aarch64
  - M1 (Apple Silicon)

### Build Pipeline Visualization

```mermaid
flowchart TD
    A[TorchAO Build Process] --> B{Platform/Accelerator Check}
    
    B -->|Linux CUDA| C[Build CUDA Extensions]
    B -->|Linux ROCm| D[Pure Python Build]
    B -->|Linux XPU| E[Pure Python Build]
    B -->|Linux CPU| F[Pure Python Build]
    B -->|Windows| G[Pure Python Build]
    B -->|ARM64| H[Pure Python Build]
    B -->|M1| I[Pure Python Build]
    
    C --> J[CUDA Wheel<br>cp39-abi3-linux_x86_64]
    D --> K[Pure Python + ROCm suffix<br>+rocm-py3-none-any]
    E --> L[Pure Python + XPU suffix<br>+xpu-py3-none-any]
    F --> M[Pure Python Wheel<br>py3-none-any]
    G --> M
    H --> M
    I --> M
    
    subgraph Wheel_Types [Resulting Wheels]
        J[torchao-0.8.0-cp39-abi3-linux_x86_64.whl]
        K[torchao-0.7.0+rocm-py3-none-any.whl]
        L[torchao-0.7.0+xpu-py3-none-any.whl]
        M[torchao-0.7.0-py3-none-any.whl]
    end
```
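
The branching in the diagram can also be written as a small lookup. This is only a sketch of the mapping described in this issue, not the actual build logic: `expected_wheel` is a hypothetical function, and the `platform` and `accelerator` strings are labels chosen for the example.

```python
def expected_wheel(version: str, platform: str, accelerator: str = "cpu") -> str:
    """Return the wheel filename this issue lists for a build configuration."""
    if platform == "linux" and accelerator == "cuda":
        # The only configuration that compiles custom extensions.
        return f"torchao-{version}-cp39-abi3-linux_x86_64.whl"
    if platform == "linux" and accelerator in ("rocm", "xpu"):
        # Pure Python content; only the local version suffix differs.
        return f"torchao-{version}+{accelerator}-py3-none-any.whl"
    # Linux CPU, Windows, ARM64/aarch64, and M1 all yield the same wheel.
    return f"torchao-{version}-py3-none-any.whl"


print(expected_wheel("0.8.0", "linux", "cuda"))
print(expected_wheel("0.7.0", "linux", "rocm"))
print(expected_wheel("0.7.0", "windows"))
```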

### Planned Changes

Given that we currently only produce two distinct types of wheels (CUDA-specific and pure Python), we plan to streamline our CI/CD process:

1. **Reduce Redundant CI Steps**
   - Since most builds result in identical pure Python wheels, we can consolidate these build steps
   - Only maintain specialized runners for builds that produce unique artifacts (currently only Linux CUDA)

2. **Future Extensibility**
   - If native code support is added for other accelerators (ROCm, XPU, M1), we can re-enable dedicated CI pipelines
   - Each accelerator-specific build pipeline would need to include:
     - Wheel building
     - Platform-specific validation
     - PyTorch S3 publishing workflow

3. **Ownership Model**
   - Platform-specific builds and validation will be owned by the respective platform teams
   - This keeps responsibility for build pipeline maintenance and CI/CD health clearly assigned

This approach allows us to maintain efficiency in our current setup while keeping the door open for future hardware-specific optimizations.

```mermaid
flowchart TD
    A[TorchAO Build Process] --> B{Has Native Extensions?}
    
    B -->|Yes: CUDA| C[Linux CUDA Pipeline]
    B -->|No| D[Pure Python Pipeline]
    
    C --> E[Build CUDA Extensions]
    E --> F[Platform Validation]
    F --> G[PyTorch S3 Upload]
    
    D --> H[Build Pure Python Wheel]
    H --> I[Basic Validation]
    I --> J[PyPI Upload]
    
    subgraph Future_Extensions [Future Platform Extensions]
        K[ROCm Pipeline]
        L[XPU Pipeline]
    end
    K & L -->|When Native Code Added| C
    
    subgraph Outputs [Current Wheel Types]
        W1[torchao-0.X.0-cp39-abi3-linux_x86_64.whl]
        W2[torchao-0.X.0-py3-none-any.whl]
    end
    G --> W1
    J --> W2

    style Future_Extensions stroke-dasharray: 5 5
```
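
To make the redundancy argument concrete, the sketch below groups the build configurations named in this issue by the compatibility tags of the wheel each one produces. The `WHEELS` table simply restates the filenames listed earlier; `compatibility_tags` and `group_by_tags` are hypothetical helpers, not part of any TorchAO tooling.

```python
from collections import defaultdict
from typing import Dict, List

# Build configuration -> wheel filename, as listed earlier in this issue.
WHEELS = {
    "Linux CUDA": "torchao-0.8.0-cp39-abi3-linux_x86_64.whl",
    "Linux ROCm": "torchao-0.7.0+rocm-py3-none-any.whl",
    "Linux XPU": "torchao-0.7.0+xpu-py3-none-any.whl",
    "Linux CPU": "torchao-0.7.0-py3-none-any.whl",
    "Windows": "torchao-0.7.0-py3-none-any.whl",
    "ARM64": "torchao-0.7.0-py3-none-any.whl",
    "M1": "torchao-0.7.0-py3-none-any.whl",
}


def compatibility_tags(filename: str) -> str:
    """Return the trailing 'python-abi-platform' tags of a wheel filename."""
    stem = filename[: -len(".whl")]
    return "-".join(stem.split("-")[-3:])


def group_by_tags(wheels: Dict[str, str]) -> Dict[str, List[str]]:
    """Group build configurations that produce wheels with identical tags."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for config, filename in wheels.items():
        groups[compatibility_tags(filename)].append(config)
    return dict(groups)


for tags, configs in group_by_tags(WHEELS).items():
    print(f"{tags}: {', '.join(configs)}")
```

Grouping by tags collapses the seven configurations into two artifact types, which is the basis for keeping a dedicated pipeline only for Linux CUDA.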
