Built specifically for Linux platforms with CUDA support
Contains compiled custom extensions
Pure Python wheel:
Filename format: torchao-0.7.0-py3-none-any.whl
Used for all other platforms
No compiled extensions
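The two filename formats differ only in their trailing tags. As a minimal sketch, assuming the standard wheel filename layout (`name-version[-build]-python-abi-platform.whl`), the tags can be read directly from the filename. The helper names `parse_wheel_filename` and `is_pure_python` are illustrative and not part of TorchAO or its CI:

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts normally; six when an optional build tag follows the version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    distribution, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(distribution, version, python_tag, abi_tag, platform_tag)


def is_pure_python(tags: WheelTags) -> bool:
    """A wheel tagged none-any depends on neither an ABI nor a platform."""
    return tags.abi_tag == "none" and tags.platform_tag == "any"


cuda = parse_wheel_filename("torchao-0.8.0-cp39-abi3-linux_x86_64.whl")
pure = parse_wheel_filename("torchao-0.7.0-py3-none-any.whl")
print(cuda.platform_tag, is_pure_python(cuda))
print(pure.platform_tag, is_pure_python(pure))
```

For real tooling, the `packaging` library offers a more complete parser. The hand-rolled version here only keeps the example dependency-free.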
Historical Context
Prior to PR #1276, TorchAO built separate binaries across all operating systems due to the presence of init.cpp, which required platform-specific compilation. After removing init.cpp, the package became a pure Python wheel for all platforms except Linux CUDA.
Recent Changes
PRs #1276 and #1277 introduced two significant changes:
Removed init.cpp, simplifying the build process for most platforms
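Implemented py_limited_api semantics in setup.py, which changed the Linux CUDA wheel naming convention from torchao-0.7.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl to torchao-0.8.0-cp39-abi3-linux_x86_64.whl. The cp39-abi3 tag marks a single wheel built against the CPython stable ABI, usable on CPython 3.9 and later, instead of one wheel per Python version.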
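To make the effect of the tag change concrete, here is a small sketch of the compatibility rule implied by the two tag pairs. It is deliberately simplified: it handles only `cp3X` Python tags with either `abi3` or a matching version-specific ABI tag, and the function name `supports_interpreter` is hypothetical. This is not how pip resolves tags in full.

```python
import re


def supports_interpreter(python_tag: str, abi_tag: str, minor: int) -> bool:
    """Return True if CPython 3.<minor> can use a wheel with these tags.

    Simplified rule: an abi3 wheel works on its minimum version and later;
    a version-specific wheel works only on the exact version it names.
    """
    match = re.fullmatch(r"cp3(\d+)", python_tag)
    if match is None:
        raise ValueError(f"unsupported python tag: {python_tag!r}")
    built_for = int(match.group(1))
    if abi_tag == "abi3":
        # Stable ABI: forward compatible with later CPython 3 releases.
        return minor >= built_for
    # Version-specific ABI: only the exact interpreter version matches.
    return abi_tag == python_tag and minor == built_for


# Old naming: one wheel per interpreter version.
print(supports_interpreter("cp312", "cp312", 11))
# New naming: one wheel for CPython 3.9 and later.
print(supports_interpreter("cp39", "abi3", 11))
```

Under this rule, the old `cp312-cp312` wheel serves only Python 3.12, while the `cp39-abi3` wheel serves every supported CPython release from 3.9 onward.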
Build Types & Outputs
1. Platform-Specific Build
Currently, only Linux CUDA produces platform-specific wheels:
Input: Linux CUDA build
Output: torchao-0.8.0-cp39-abi3-linux_x86_64.whl
Contains: Custom CUDA extensions
2. Accelerator-Specific Builds
These are pure Python wheels with accelerator-specific suffixes:
ROCm: torchao-0.7.0+rocm-py3-none-any.whl
XPU: torchao-0.7.0+xpu-py3-none-any.whl
Note: Although these builds run on specialized CI runners, the resulting wheels contain no hardware-specific extensions, so we want to disable these builds.
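The `+rocm` and `+xpu` suffixes are local version labels attached to the version, not to the wheel tags. As a rough illustration, stripping that label shows the accelerator-specific names collapse to the plain pure Python wheel name. The helper `strip_local_version` is an assumption made for this example. It compares filenames only, not wheel contents.

```python
def strip_local_version(filename: str) -> str:
    """Drop a local version label (the "+suffix") from a wheel filename."""
    name, version, rest = filename.split("-", 2)
    public_version = version.split("+", 1)[0]
    return f"{name}-{public_version}-{rest}"


wheels = [
    "torchao-0.7.0+rocm-py3-none-any.whl",
    "torchao-0.7.0+xpu-py3-none-any.whl",
    "torchao-0.7.0-py3-none-any.whl",
]

# All three names reduce to the same pure Python wheel name.
unique = {strip_local_version(w) for w in wheels}
print(unique)
```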
3. Pure Python Builds
All other configurations produce identical pure Python wheels:
Output: torchao-0.7.0-py3-none-any.whl
Platforms:
Linux CPU
Windows
ARM64/aarch64
M1 (Apple Silicon)
Build Pipeline Visualization
flowchart TD
A[TorchAO Build Process] --> B{Platform/Accelerator Check}
B -->|Linux CUDA| C[Build CUDA Extensions]
B -->|Linux ROCm| D[Pure Python Build]
B -->|Linux XPU| E[Pure Python Build]
B -->|Linux CPU| F[Pure Python Build]
B -->|Windows| G[Pure Python Build]
B -->|ARM64| H[Pure Python Build]
B -->|M1| I[Pure Python Build]
C --> J[CUDA Wheel<br>cp39-abi3-linux_x86_64]
D --> K[Pure Python + ROCm suffix<br>py3-none-any+rocm]
E --> L[Pure Python + XPU suffix<br>py3-none-any+xpu]
F --> M[Pure Python Wheel<br>py3-none-any]
G --> M
H --> M
I --> M
subgraph Wheel_Types [Resulting Wheels]
J[torchao-0.8.0-cp39-abi3-linux_x86_64.whl]
K[torchao-0.7.0+rocm-py3-none-any.whl]
L[torchao-0.7.0+xpu-py3-none-any.whl]
M[torchao-0.7.0-py3-none-any.whl]
end
Planned Changes
Given that we currently only produce two distinct types of wheels (CUDA-specific and pure Python), we plan to streamline our CI/CD process:
Reduce Redundant CI Steps
Since most builds result in identical pure Python wheels, we can consolidate these build steps
Only maintain specialized runners for builds that produce unique artifacts (currently only Linux CUDA)
Future Extensibility
If native code support is added for other accelerators (ROCm, XPU, M1), we can re-enable dedicated CI pipelines
Each accelerator-specific build pipeline would need to include:
Wheel building
Platform-specific validation
PyTorch S3 publishing workflow
Ownership Model
Platform-specific builds and validation will be owned by the respective platform teams
This keeps responsibility for build pipeline maintenance and CI/CD health clearly assigned
This approach allows us to maintain efficiency in our current setup while keeping the door open for future hardware-specific optimizations.
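One way to find the redundant steps is to group build configurations by the wheel they produce and keep one configuration per distinct artifact. The sketch below assumes a hypothetical build matrix. The configuration names are made up for illustration, and the wheel names are the examples used in this issue.

```python
from collections import defaultdict

# Hypothetical build matrix: (CI configuration, wheel it produces).
BUILD_MATRIX = [
    ("linux-cuda", "torchao-0.8.0-cp39-abi3-linux_x86_64.whl"),
    ("linux-cpu", "torchao-0.7.0-py3-none-any.whl"),
    ("windows", "torchao-0.7.0-py3-none-any.whl"),
    ("arm64", "torchao-0.7.0-py3-none-any.whl"),
    ("m1", "torchao-0.7.0-py3-none-any.whl"),
]


def group_by_artifact(matrix):
    """Map each wheel filename to the configurations that produce it."""
    groups = defaultdict(list)
    for config, wheel in matrix:
        groups[wheel].append(config)
    return dict(groups)


def redundant_configs(matrix):
    """Configurations beyond the first one for each distinct wheel."""
    redundant = []
    for configs in group_by_artifact(matrix).values():
        redundant.extend(configs[1:])
    return redundant


print(redundant_configs(BUILD_MATRIX))
```

With this matrix, only `linux-cuda` and one pure Python configuration produce unique artifacts. The rest are candidates for consolidation.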
flowchart TD
A[TorchAO Build Process] --> B{Has Native Extensions?}
B -->|Yes: CUDA| C[Linux CUDA Pipeline]
B -->|No| D[Pure Python Pipeline]
C --> E[Build CUDA Extensions]
E --> F[Platform Validation]
F --> G[PyTorch S3 Upload]
D --> H[Build Pure Python Wheel]
H --> I[Basic Validation]
I --> J[PyPI Upload]
subgraph Future_Extensions [Future Platform Extensions]
K[ROCm Pipeline]
L[XPU Pipeline]
K & L -->|When Native Code Added| C
end
subgraph Outputs [Current Wheel Types]
G --> W1[torchao-0.X.0-cp39-abi3-linux_x86_64.whl]
J --> W2[torchao-0.X.0-py3-none-any.whl]
end
style Future_Extensions stroke-dasharray: 5 5
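The routing in the flowchart above reduces to a single question per build target: does it ship native extensions? A minimal sketch of that decision follows, with step names copied from the diagram. The `BuildTarget` type and the `plan_pipeline` function are hypothetical and do not correspond to existing workflow code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BuildTarget:
    name: str
    has_native_extensions: bool


def plan_pipeline(target: BuildTarget) -> List[str]:
    """Return the ordered pipeline steps for a build target."""
    if target.has_native_extensions:
        # Dedicated pipeline, currently used only by Linux CUDA.
        return ["Build Extensions", "Platform Validation", "PyTorch S3 Upload"]
    # Shared pipeline for every target that yields the pure Python wheel.
    return ["Build Pure Python Wheel", "Basic Validation", "PyPI Upload"]


cuda = BuildTarget("linux-cuda", has_native_extensions=True)
rocm = BuildTarget("linux-rocm", has_native_extensions=False)
print(plan_pipeline(cuda))
print(plan_pipeline(rocm))
```

If ROCm or XPU later gains native code, flipping `has_native_extensions` for that target moves it onto the dedicated pipeline without changing the routing logic.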