add cuda memory and distributed metadata #57252

Closed
guotuofeng wants to merge 5 commits into pytorch:master from guotuofeng:myguo/dist
Conversation

@guotuofeng
Contributor

Implementation for pytorch/kineto#155

@facebook-github-bot
Contributor

facebook-github-bot commented Apr 29, 2021

💊 CI failures summary and remediations

As of commit 853cb0f (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


@codecov

codecov bot commented Apr 29, 2021

Codecov Report

Merging #57252 (853cb0f) into master (3a66a1c) will decrease coverage by 0.00%.
The diff coverage is 57.14%.

@@            Coverage Diff             @@
##           master   #57252      +/-   ##
==========================================
- Coverage   76.82%   76.82%   -0.01%     
==========================================
  Files        1984     1984              
  Lines      197163   197190      +27     
==========================================
+ Hits       151465   151484      +19     
- Misses      45698    45706       +8     

@gchanan requested a review from gdankel April 29, 2021 16:44
@gchanan added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Apr 29, 2021
@facebook-github-bot
Contributor

@ilia-cher has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@gdankel
Contributor

gdankel commented May 7, 2021

@guotuofeng can you add an example json or screenshot of the metadata section (when pressing M button in chrome trace)?

@guotuofeng
Contributor Author

> @guotuofeng can you add an example json or screenshot of the metadata section (when pressing M button in chrome trace)?

The metadata looks like this:

"metadata": {
"devices": "[{"id": 0, "name": "Tesla V100-DGXS-32GB", "multi_processor_count": 80, "total_memory": 34084028416}, {"id": 1, "name": "Tesla V100-DGXS-32GB", "multi_processor_count": 80, "total_memory": 34087305216}, {"id": 2, "name": "Tesla V100-DGXS-32GB", "multi_processor_count": 80, "total_memory": 34087305216}, {"id": 3, "name": "Tesla V100-DGXS-32GB", "multi_processor_count": 80, "total_memory": 34087305216}]",
"distributed": "{"backend": "nccl", "rank": 0, "world_size": 2}"
},
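
For readers who want to inspect these fields outside the Chrome trace viewer, here is a minimal sketch of decoding them. It assumes each metadata value is stored in the trace as a properly escaped JSON-encoded string, which is an assumption about the file format and not something this thread confirms. The `trace` dictionary and the `decode_metadata` helper are hypothetical and exist only for illustration; the field names and sample values are taken from the comment above.

```python
import json

# Sample values copied from the metadata shown above (first two GPUs only).
devices = [
    {"id": 0, "name": "Tesla V100-DGXS-32GB",
     "multi_processor_count": 80, "total_memory": 34084028416},
    {"id": 1, "name": "Tesla V100-DGXS-32GB",
     "multi_processor_count": 80, "total_memory": 34087305216},
]
distributed = {"backend": "nccl", "rank": 0, "world_size": 2}

# Hypothetical trace fragment. Assumption: every metadata value is a
# JSON-encoded string, so it has to be decoded a second time.
trace = {
    "metadata": {
        "devices": json.dumps(devices),
        "distributed": json.dumps(distributed),
    }
}


def decode_metadata(trace):
    """Return the metadata section with JSON-encoded string values decoded."""
    decoded = {}
    for key, value in trace.get("metadata", {}).items():
        if isinstance(value, str):
            try:
                decoded[key] = json.loads(value)
            except json.JSONDecodeError:
                # Not JSON text: keep the raw string unchanged.
                decoded[key] = value
        else:
            decoded[key] = value
    return decoded


meta = decode_metadata(trace)

# Print one summary line per GPU, converting bytes to GiB.
for dev in meta["devices"]:
    gib = dev["total_memory"] / 2**30
    print(f'GPU {dev["id"]}: {dev["name"]}, '
          f'{dev["multi_processor_count"]} SMs, {gib:.1f} GiB')

dist = meta["distributed"]
print(f'rank {dist["rank"]} of {dist["world_size"]} using {dist["backend"]}')
```

If the values turn out to be written without escaping, the outer file would not parse as JSON at all and this approach would need a different loader.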

@facebook-github-bot
Contributor

@ilia-cher has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ilia-cher self-requested a review May 10, 2021 19:16
@facebook-github-bot
Contributor

@ilia-cher merged this pull request in 98fcdb8.

@facebook-github-bot
Contributor

This pull request has been reverted by 0361671.

@ilia-cher
Contributor

(will resend)

ilia-cher pushed a commit that referenced this pull request May 11, 2021
Summary:
Resending #57252

Test Plan: CI

[ghstack-poisoned]
ilia-cher pushed a commit that referenced this pull request May 11, 2021
Summary:
Resending #57252

Test Plan: CI

ghstack-source-id: 6f18720
Pull Request resolved: #58010
ilia-cher pushed a commit that referenced this pull request May 11, 2021
…tributed metadata"

Summary:
Resending #57252

Test Plan: CI

Differential Revision: [D28345161](https://our.internmc.facebook.com/intern/diff/D28345161)

[ghstack-poisoned]
ilia-cher pushed a commit that referenced this pull request May 11, 2021
Summary:
Resending #57252

Test Plan: CI

Differential Revision: [D28345161](https://our.internmc.facebook.com/intern/diff/D28345161)

[ghstack-poisoned]
dgl-intel pushed a commit to dgl-intel/pytorch that referenced this pull request May 11, 2021
Summary:
Resending pytorch#57252

Test Plan: CI

ghstack-source-id: 2e63611
Pull Request resolved: pytorch#58010
ilia-cher pushed a commit that referenced this pull request May 12, 2021
…tributed metadata"

Summary:
Resending #57252

Test Plan: CI

Differential Revision: [D28345161](https://our.internmc.facebook.com/intern/diff/D28345161)

[ghstack-poisoned]
ilia-cher pushed a commit that referenced this pull request May 12, 2021
Summary:
Resending #57252

Test Plan: CI

Differential Revision: [D28345161](https://our.internmc.facebook.com/intern/diff/D28345161)

[ghstack-poisoned]
facebook-github-bot pushed a commit that referenced this pull request May 12, 2021
Summary:
Pull Request resolved: #58010

Resending #57252

Test Plan: CI

Reviewed By: gdankel

Differential Revision: D28345161

Pulled By: ilia-cher

fbshipit-source-id: 18be07b275403205f5b5487ae3589bd39a8eac96
krshrimali pushed a commit to krshrimali/pytorch that referenced this pull request May 19, 2021
Summary:
Implementation for pytorch/kineto#155

Pull Request resolved: pytorch#57252

Reviewed By: gdankel

Differential Revision: D28294662

Pulled By: ilia-cher

fbshipit-source-id: 3c71ffa333e341ff8113e891681a4905f54802dc
krshrimali pushed a commit to krshrimali/pytorch that referenced this pull request May 19, 2021
…58010)

Summary:
Pull Request resolved: pytorch#58010

Resending pytorch#57252

Test Plan: CI

Reviewed By: gdankel

Differential Revision: D28345161

Pulled By: ilia-cher

fbshipit-source-id: 18be07b275403205f5b5487ae3589bd39a8eac96
Labels

cla signed, Merged, open source, Reverted, triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
