Add Ascend NPU as a backend#1826
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1826
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 1485314 with merge base 24d3579 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
ca78eca to
b6332dd
Compare
|
Hi @ebsmothers, @RdoubleA: I hope you’re doing well! Could you please help me review my code? I would really appreciate it if you could take a look and share any feedback or suggestions. Thank you so much in advance for your time and support! 😊 Best regards |
ebsmothers
left a comment
There was a problem hiding this comment.
Hi @noemotiovon thanks for the PR! And apologies for the delay in getting to the review here. A couple other questions I have that don't really fit neatly anywhere inline:
- Do we expect compile to work? If so, we should test that. If not, we could raise an error
- Do we expect quant-related APIs (e.g. QLoRA or QAT) from torchao to work? Same as point 1: if so we should test or possibly raise an error
- PyTorch has now released 2.5 as stable. In general we do not claim to support anything but the latest stable release of PyTorch -- do you know the contract on torch_npu releases here?
|
distributed training seems to have problems e.g qat_distributed @noemotiovon |
I would be very happy to! I will contact you via email. |
@noemotiovon through 126 email thanks. Looking forward to your email. |
b6332dd to
1aad0d2
Compare
Basic Usage Test
|
Basic Usage Test
|
Hi @ebsmothers, Thank you very much for reviewing my code! |
ecbacac to
5d6cf85
Compare
|
Hi @ebsmothers, could you please take a moment to review the code |
5d6cf85 to
03f782c
Compare
|
Hi @noemotiovon sorry for the delay! I will take a look tomorrow if that's alright. Until then I'll tag @RdoubleA and @joecummings in case either of them gets a minute to take a look |
03f782c to
4478809
Compare
|
Hi @ebsmothers, when you have a moment, could you take a quick look at the recent changes I made? Your feedback would be greatly appreciated. Thank you! |
ebsmothers
left a comment
There was a problem hiding this comment.
Thanks @noemotiovon for your patience! I left a handful more comments, please let me know if anything is unclear
Thank you for your review! Your feedback is very clear, and I will make the necessary code changes as soon as possible based on your suggestions. |
|
Hi @ebsmothers, I’ve made the code changes based on your suggestions; could you please review it again? Additionally:
Best regards |
ebsmothers
left a comment
There was a problem hiding this comment.
Thanks @noemotiovon for the updates! I left a couple more comments but I think this is pretty close now. It looks like a unit test is failing in CI though, can you take a look? Happy to provide any debugging pointers if you need
Hi @ebsmothers, Best regards |
c950861 to
073c3d2
Compare
|
Hi @noemotiovon looks like CI is green now but there are merge conflicts. Can you pull from latest main and merge with your changes? |
2a6c63b to
1485314
Compare
Hi @ebsmothers, I’d be happy to! I’ve resolved the conflicts and made the necessary adjustments. Could you please help merge it? Thank you for your continued support, and wishing you all the best in work and life! 😊 |
What does this PR do?
Overview
🚀This PR enables the users of
torhtuneto leverage the Ascend NPU for better performance in inferencing.This PR primarily addresses the initial refactoring of device-independent code. In upcoming changes, we’ll focus on further adjustments, using NPU as an example to refine each recipe and complete the remaining device-independent modifications. For now, this PR only touches on recipe lora_finetune_single_device and full_finetune_single_device.
For more details, see: [#1797].
Environment
Note
To properly install CANN, see [here] for more details.
The version of
torch-npushould match that oftorch, see [here] for more details.In addition,
torch_npuhas a pre-release version, 2.4.0 RC1, which is also the basis for this test. For more information, please visit [here].Examples
To start with, the library
torch_npushould be correctly installed and imported. Part of the codes are showed below:torchtune/utils/_device_support.py:Plus, there are some other places of the codes might be adjusted, which won't be too much.
Feel free to leave comments to guide me in further improvements 😊.