[1/n] Add graph function to list DataPipes from graph #888
ejguan wants to merge 2 commits into meta-pytorch:main
@ejguan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
f65a440 to 5993878
    for dp_id, (dp, src_graph) in g.items():
        if dp_id not in cache:
            cache.add(dp_id)
            dps.append(dp)
Side note: this returns a list with no guaranteed order. That should suffice for the current use case.
I wonder if people will want a Dict instead (basically just traverse_dps without the IDs). To be clear, this seems fine to me for now.
If we treat the graph as a tree, there are several possible orders, just as with tree traversal: pre-order, post-order, etc. The right choice depends heavily on users' preference, especially since our graph can be more complicated than a tree. I would say let's wait until we receive a concrete use case.
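To make the ordering point concrete, here is a rough sketch of how pre-order and post-order differ on a nested graph mapping. It is only an illustration: the string keys stand in for the DataPipe IDs, and the `toy` graph, `pre_order`, `post_order`, and `dedup` names are made up for this example and are not part of the PR.

```python
from typing import Dict, List, Tuple

# Assumed shape: {key: (node, upstream_graph)}, nested recursively.
# Strings stand in for DataPipe objects and their IDs.
Graph = Dict[str, Tuple[str, "Graph"]]


def pre_order(g: Graph) -> List[str]:
    """Visit each node before its upstream graph."""
    out: List[str] = []
    for _, (node, src_graph) in g.items():
        out.append(node)
        out.extend(pre_order(src_graph))
    return out


def post_order(g: Graph) -> List[str]:
    """Visit each node after its upstream graph."""
    out: List[str] = []
    for _, (node, src_graph) in g.items():
        out.extend(post_order(src_graph))
        out.append(node)
    return out


def dedup(nodes: List[str]) -> List[str]:
    """Keep the first occurrence of each node, preserving order."""
    return list(dict.fromkeys(nodes))


# A diamond: "zip" reads from "a" and "b", which share one "src".
toy: Graph = {
    "zip": ("zip", {
        "a": ("a", {"src": ("src", {})}),
        "b": ("b", {"src": ("src", {})}),
    })
}

pre = dedup(pre_order(toy))    # ['zip', 'a', 'src', 'b']
post = dedup(post_order(toy))  # ['src', 'a', 'b', 'zip']
```

The shared `src` node is what makes this more than a tree: without de-duplication it shows up once per path, and where it lands in the result depends on the traversal order chosen.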
Add a list_dps function to list DataPipes from the graph.
- get_all_graph_pipes from pytorch core
- exclude_dps to exclude the DataPipe and its prior graph from the result.
Reason to add this function:
- sharding_filter
Step 1 for #885
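For readers who want a feel for the behavior described above, here is a minimal sketch of listing nodes from a nested graph while excluding a node and everything upstream of it. It is not the implementation in this PR: `Node`, `build_graph`, and `list_nodes` are hypothetical stand-ins, and the graph shape `{id: (node, upstream_graph)}` is assumed from the loop quoted in the review.

```python
from typing import Any, Dict, List, Optional, Set, Tuple

# Assumed shape: {id(node): (node, upstream_graph)}, nested recursively.
Graph = Dict[int, Tuple[Any, "Graph"]]


class Node:
    """Hypothetical stand-in for a DataPipe."""

    def __init__(self, name: str, *sources: "Node") -> None:
        self.name = name
        self.sources = sources


def build_graph(node: Node) -> Graph:
    """Build the nested mapping for one node and its upstream nodes."""
    upstream: Graph = {}
    for src in node.sources:
        upstream.update(build_graph(src))
    return {id(node): (node, upstream)}


def list_nodes(graph: Graph, exclude: Optional[List[Node]] = None) -> List[Node]:
    """List each node once, skipping excluded nodes and their upstream graph."""
    skipped: Set[int] = set()
    for node in exclude or []:
        stack = [build_graph(node)]
        while stack:
            for node_id, (_, src_graph) in stack.pop().items():
                skipped.add(node_id)
                stack.append(src_graph)

    seen: Set[int] = set()
    found: List[Node] = []

    def visit(g: Graph) -> None:
        for node_id, (node, src_graph) in g.items():
            if node_id in skipped or node_id in seen:
                continue
            seen.add(node_id)
            found.append(node)
            visit(src_graph)

    visit(graph)
    return found


# Hypothetical pipeline: source -> sharding_filter -> map -> batch
source = Node("source")
shard = Node("sharding_filter", source)
mapped = Node("map", shard)
batched = Node("batch", mapped)

graph = build_graph(batched)
all_names = [n.name for n in list_nodes(graph)]
kept_names = [n.name for n in list_nodes(graph, exclude=[shard])]
# all_names  -> ['batch', 'map', 'sharding_filter', 'source']
# kept_names -> ['batch', 'map']
```

Excluding `shard` drops both it and `source`, so only the nodes downstream of the excluded one remain in the result.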