[release/1.7.0] Added AITER as a submodule and use in fused_rope.py #226

Merged
amd-sriram merged 11 commits into release/1.7.0 from add_aiter_fused_rope_kernels_1.7.0 on Jul 9, 2025

Conversation

@amd-sriram
Collaborator

@amd-sriram amd-sriram commented Jun 4, 2025

Added AITER support in fused_rope.py for all 4 variants. Updated the fused RoPE test and reduced its tolerances to match the unit test in the aiter repo.
Unit test run: python tests/L0/run_transformer/test_fused_rope.py

Added aiter as a submodule; setup.py builds it when the build targets ROCm.
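
A minimal sketch of what a ROCm-only build step for the submodule might look like. The submodule path `third_party/aiter`, the helper name `aiter_build_commands`, and the `is_rocm` flag are assumptions for illustration; the setup.py in this PR may be structured differently. The helper only assembles the commands, so it can be inspected without running git or pip.

```python
import sys
from pathlib import Path

# Hypothetical location of the AITER submodule; the real path is not stated in this PR.
AITER_DIR = Path("third_party") / "aiter"


def aiter_build_commands(is_rocm, aiter_dir=AITER_DIR, python=sys.executable):
    """Return the commands a setup.py could run to fetch and install AITER.

    Returns an empty list for non-ROCm builds, which keep using apex's own kernels.
    """
    if not is_rocm:
        return []
    return [
        # Make sure the submodule sources are checked out.
        ["git", "submodule", "update", "--init", "--recursive", str(aiter_dir)],
        # Editable install of the submodule with pip.
        [python, "-m", "pip", "install", "-e", str(aiter_dir)],
    ]
```

Each returned command could then be passed to `subprocess.check_call` from the build script.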

On ROCm, fused RoPE uses the AITER backend.
On CUDA, it uses the apex native kernels.
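
The platform split above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in fused_rope.py: the function name `select_rope_backend` is hypothetical, the caller is assumed to pass in `torch.version.hip` (which is `None` on CUDA builds of PyTorch), and falling back to apex when the `aiter` package is not importable is an assumption.

```python
import importlib.util


def select_rope_backend(hip_version, aiter_available=None):
    """Pick 'aiter' on ROCm when the aiter package is importable, else 'apex'.

    hip_version: the value of torch.version.hip (None on CUDA builds).
    aiter_available: optional override; by default the package is probed for.
    """
    if aiter_available is None:
        # find_spec returns None when no top-level package named "aiter" is installed.
        aiter_available = importlib.util.find_spec("aiter") is not None
    if hip_version is not None and aiter_available:
        return "aiter"
    return "apex"
```

Passing the HIP version in as an argument keeps the helper importable on machines without PyTorch.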

Tested with release/2.7 on both the ROCm fork and upstream.

Fixes: https://ontrack-internal.amd.com/browse/SWDEV-496182

…d rope test, reduced tolerances according to unit test in aiter repo.
@amd-sriram amd-sriram self-assigned this Jun 4, 2025
…r backend if it is rocm and aiter is installed
…y error - ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
@amd-sriram amd-sriram changed the title Added AITER as a submodule and use in fused_rope.py [release/1.7.0] Added AITER as a submodule and use in fused_rope.py Jun 4, 2025
@jithunnair-amd
Collaborator

jithunnair-amd commented Jun 4, 2025

@amd-sriram @pruthvistony

Tested with 6.5_internal_testing and upstream main

This PR should be tested with release/2.7 (both ROCm fork and upstream) to ensure it's compatible.

@amd-sriram
Collaborator Author

@jithunnair-amd I have tested with release/2.7 (rocm and upstream). I also updated the description.

@pruthvistony

Wait for PR #222 to be merged before merging this.

amd-sriram and others added 8 commits July 8, 2025 12:45
…nd use pip install -e . instead of python setup.py develop for installing aiter.
…nc and select apex or aiter subclass based on AITER_ROPE_BACKEND value. The user can specify the environment variable USE_ROCM_AITER_ROPE_BACKEND to select between aiter and apex backends for fused rope.
…est otherwise use the original precision 1e-3
remove spaces
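
One of the commits above mentions choosing between an apex and an aiter subclass through the USE_ROCM_AITER_ROPE_BACKEND environment variable. A minimal sketch of that kind of dispatch is below. The class names, the accepted values ("1", "true", "on", "yes") and the default are all assumptions for illustration; the PR text does not specify them.

```python
import os


class ApexFusedRoPE:
    """Placeholder for the apex-native implementation (hypothetical name)."""
    backend = "apex"


class AiterFusedRoPE:
    """Placeholder for the AITER-backed implementation (hypothetical name)."""
    backend = "aiter"


def use_aiter_from_env(environ=None, default=True):
    """Read USE_ROCM_AITER_ROPE_BACKEND; the accepted values are an assumption."""
    environ = os.environ if environ is None else environ
    raw = environ.get("USE_ROCM_AITER_ROPE_BACKEND")
    if raw is None:
        # Variable not set: fall back to the caller-supplied default.
        return default
    return raw.strip().lower() in ("1", "true", "on", "yes")


def rope_impl(environ=None, default=True):
    """Return the class to dispatch to for the given environment."""
    return AiterFusedRoPE if use_aiter_from_env(environ, default) else ApexFusedRoPE
```

For example, `rope_impl({"USE_ROCM_AITER_ROPE_BACKEND": "0"})` returns the apex placeholder, while an unset variable falls back to `default`.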
@amd-sriram amd-sriram merged commit 53f3c64 into release/1.7.0 Jul 9, 2025
@amd-sriram amd-sriram deleted the add_aiter_fused_rope_kernels_1.7.0 branch July 9, 2025 10:26
jithunnair-amd pushed a commit to ROCm/pytorch that referenced this pull request Jul 14, 2025
Fixing the C10_warpsize issue. replacing the macros with at::cuda::warp_size() - ROCm/apex#244

[[release/1.7.0] Added AITER as a submodule and use in fused_rope.py](ROCm/apex@53f3c64) - ROCm/apex#226

[Replaced warpsize with C10_WARP_SIZE](ROCm/apex@f417097) - ROCm/apex#253

[Disabling Aiter Installation in default build](ROCm/apex@1c50337) - ROCm/apex#255

Fixes https://ontrack-internal.amd.com/browse/SWDEV-496182

3 participants