fix build failed with NCCL #5

Merged
Yancey0623 merged 2 commits into merge_disc from fix_nccl_failed
Aug 13, 2024
Conversation

@Yancey0623

No description provided.

@CLAassistant

CLAassistant commented Aug 12, 2024

CLA assistant check
All committers have signed the CLA.

@Yancey0623 Yancey0623 requested a review from yitongh August 12, 2024 08:46
Comment thread torch_xla/csrc/runtime/disc/BUILD Outdated
Comment thread torch_xla/csrc/runtime/disc/BUILD Outdated
@Yancey0623 Yancey0623 merged commit 1e499d9 into merge_disc Aug 13, 2024
@Yancey0623 Yancey0623 deleted the fix_nccl_failed branch August 13, 2024 02:45
anw90 pushed a commit that referenced this pull request Oct 11, 2024
* build with BladeDISC (#8)

* [to #53687860] feat: DISC client header, implement DISCComputation and DISCData

POC implement in : https://code.alibaba-inc.com/torchx/xla/codereview/14984824

Link: https://code.alibaba-inc.com/torchx/xla/codereview/14987956

* Disc computation (#2)

Support Disc as backend
Co-authored-by: yancey.yx <yancey.yx@antfin.com>
Co-authored-by: wangang.wa <wangang.wa@alibaba-inc.com>

* add bazel flag to disable disc backend (#23)

* add flag to disable disc backend in bazel workspace

* support disc debug mode to dump mhlo and logs (#25)

support disc backend debug mode to dump DISC compilation logs

* support flash attention in disc (pytorch#34)

* fix disc flag when compiling python (pytorch#39)

* fix bazel flag when compiling python

* fix lint.

* support bf16 on disc backend (pytorch#40)

add float-norm pass to support bf16 amp training

* Support Flash Attention 2.5.6 for disc backend (#4)

* fix build failed with NCCL (#5)

* fix build failed on nccl

* using nccl hdrs

* Use the value of DISC_DEVICE as the device type of disc backend (#8)

* change the device type of disc to cuda to make amp work properly

* Use the value of DISC_DEVICE as the device type of disc backend

* disable compilation of DISC by default (#15)

---------

Co-authored-by: Yan Xu <yancey1989@gmail.com>
Co-authored-by: wenting.swt <wenting.swt@alibaba-inc.com>
Co-authored-by: Dalong <yuanxiulong.yxl@alibaba-inc.com>
Co-authored-by: Baole Ai <baole.abl@alibaba-inc.com>
Co-authored-by: Yan Xu <yancey.yx@alibaba-inc.com>
3 participants