[Distributed] Add more runtime distributed communication functions #314

Merged
yaoyaoding merged 15 commits into hidet-org:main from soodoshll:dist on Jul 14, 2023
@soodoshll
Collaborator

This PR adds the following distributed communication functions:

Runtime API

The runtime APIs mimic those of PyTorch's distributed package (torch.distributed):

  • all_reduce
  • broadcast
  • reduce
  • all_gather
  • all_gather_into_tensor
  • gather
  • scatter
  • reduce_scatter
  • reduce_scatter_tensor
  • barrier
  • send
  • recv
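
To make the semantics of a few of these collectives concrete, here is a minimal single-process sketch in plain Python. It is not Hidet's or PyTorch's implementation and makes no real communication calls: each "rank" is just one list inside `bufs`, the reduction is assumed to be a sum, and the function names are only borrowed from the list above for readability. `all_gather` here returns one concatenated buffer per rank (closer to all_gather_into_tensor), and `reduce_scatter` takes one flat buffer per rank (closer to reduce_scatter_tensor).

```python
from typing import List

Buffer = List[int]


def _elementwise_sum(bufs: List[Buffer]) -> Buffer:
    # Sum the buffers of all ranks position by position.
    return [sum(vals) for vals in zip(*bufs)]


def all_reduce(bufs: List[Buffer]) -> List[Buffer]:
    # Every rank ends up with the element-wise sum over all ranks.
    total = _elementwise_sum(bufs)
    return [list(total) for _ in bufs]


def broadcast(bufs: List[Buffer], src: int) -> List[Buffer]:
    # Every rank ends up with a copy of the buffer held by rank `src`.
    return [list(bufs[src]) for _ in bufs]


def reduce(bufs: List[Buffer], dst: int) -> List[Buffer]:
    # Only rank `dst` receives the sum; other ranks keep their own data.
    total = _elementwise_sum(bufs)
    return [list(total) if r == dst else list(b) for r, b in enumerate(bufs)]


def all_gather(bufs: List[Buffer]) -> List[Buffer]:
    # Every rank ends up with the concatenation of all ranks' buffers.
    gathered = [x for b in bufs for x in b]
    return [list(gathered) for _ in bufs]


def reduce_scatter(bufs: List[Buffer]) -> List[Buffer]:
    # Sum across ranks, then give rank r the r-th equal-sized chunk.
    world_size = len(bufs)
    total = _elementwise_sum(bufs)
    if len(total) % world_size != 0:
        raise ValueError("buffer length must be divisible by the number of ranks")
    chunk = len(total) // world_size
    return [total[r * chunk:(r + 1) * chunk] for r in range(world_size)]


# Two simulated ranks, each holding two elements.
ranks = [[1, 2], [3, 4]]
print(all_reduce(ranks))      # [[4, 6], [4, 6]]
print(all_gather(ranks))      # [[1, 2, 3, 4], [1, 2, 3, 4]]
print(reduce_scatter(ranks))  # [[4], [6]]
```

In a real distributed run each rank is a separate process that sees only its own buffer; the list-of-lists view is used here only so that the before and after state of every rank is visible at once.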

Hidet OPs

  • all_reduce
  • all_gather
  • reduce_scatter

We keep these two usages of distributed primitives separate because many communication primitives cannot be directly viewed as nodes in a computational graph.

Still working on documentation and comments.

@soodoshll soodoshll changed the title [WIP] Add more distributed communication functions [WIP] Add more runtime distributed communication functions Jul 13, 2023
@soodoshll soodoshll changed the title [WIP] Add more runtime distributed communication functions Add more runtime distributed communication functions Jul 13, 2023
@soodoshll soodoshll changed the title Add more runtime distributed communication functions [Distributed] Add more runtime distributed communication functions Jul 13, 2023
@soodoshll
Collaborator Author

@yaoyaoding Hi, could you please take a look at this PR? It is basically the counterpart of torch.distributed: it adds imperative runtime APIs that are not OPs and will not be part of flow graphs. The semantics of each communication function are documented in the comments.

@yaoyaoding
Member

The PR looks good to me, thanks @soodoshll !

@yaoyaoding yaoyaoding merged commit 99feace into hidet-org:main Jul 14, 2023
2 participants