[Fixbug] Fix graph metadata hash #428
Very interesting! Is there an example that reproduces the differing hashes?
@yaoyaoding I can observe this on my ResNet app branch, with more than 30 layers. I was not able to observe it on smaller graphs.
@@ -145,6 +145,11 @@ def get_graph_meta_data(graph: FlowGraph, num_kernels, space: int) -> GraphMetaD
lines.append(str(node.task))
lines.append(str(graph))
Will this line produce a different string when the topological order is different?
Yes, I think anything that relies on graph.nodes will cause this.
@KTong821 Why is the graph node traversal used to generate the hash non-deterministic?
Good catch! We might need to avoid iterating over a set when we construct the topological order of the flow graph's nodes. @vadiklyutiy, to answer your question: we use a set and iterate over it when we compute the topological order of the computation graph.
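To illustrate the point about set iteration, here is a minimal sketch of a topological ordering that depends only on graph structure. `Node` and `topo_order` are hypothetical stand-ins, not Hidet's `FlowGraph` API; the only assumption carried over from the thread is that the original ordering iterated over a set.

```python
class Node:
    """Hypothetical stand-in for a graph operator (not Hidet's class)."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)  # a list keeps the input order stable


def topo_order(outputs):
    """Return nodes so that every node appears after all of its inputs.

    The result depends only on the graph structure: nodes are discovered by
    walking lists, and the set below is used for membership tests only. It
    is never iterated, so object hashes cannot influence the order.
    """
    order = []
    done = set()
    # Iterative depth-first search, so deep graphs do not hit the
    # recursion limit.
    stack = [(out, False) for out in reversed(outputs)]
    while stack:
        node, expanded = stack.pop()
        if id(node) in done:
            continue
        if expanded:
            # All inputs of this node have already been emitted.
            done.add(id(node))
            order.append(node)
        else:
            stack.append((node, True))
            for inp in reversed(node.inputs):
                if id(inp) not in done:
                    stack.append((inp, False))
    return order


def build_graph():
    # A small diamond: conv feeds both relu and add.
    x = Node("x")
    conv = Node("conv", [x])
    relu = Node("relu", [conv])
    add = Node("add", [conv, relu])
    return [add]


names_first = [n.name for n in topo_order(build_graph())]
names_second = [n.name for n in topo_order(build_graph())]
```

Building the same graph twice from fresh objects gives the same order of names, which is the property the hash input needs.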
@yaoyaoding yes, making the topological ordering one-to-one is a better solution than sorting the hash input string. Sorting here might make different graphs have the same hash. I will update the PR.
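A small sketch of the collision risk with sorting, under the assumption that the hash input is a list of per-node strings joined together. The line contents and the `digest` helper are hypothetical; the real `get_graph_meta_data` builds its own lines.

```python
import hashlib


def digest(lines, sort_lines):
    """Hash a list of description lines, optionally sorting them first."""
    if sort_lines:
        lines = sorted(lines)
    return hashlib.sha256("\n".join(lines).encode()).hexdigest()


# Two different graphs built from the same operators in a different order
# (hypothetical line contents).
graph_a = ["conv2d", "relu"]  # x -> conv2d -> relu
graph_b = ["relu", "conv2d"]  # x -> relu -> conv2d

# Sorting discards the ordering, so the two graphs hash identically.
collide_when_sorted = digest(graph_a, True) == digest(graph_b, True)

# Keeping a deterministic traversal order preserves the difference.
distinct_when_ordered = digest(graph_a, False) != digest(graph_b, False)
```

This is why fixing the traversal order at the source is preferable to sorting the hash input.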
Thanks! @KTong821
Identical graphs should have the same hash. The graph node traversal used to generate the hash is non-deterministic, which leads to different hashes for the same graph. This can prevent graph runs from using the fast path, since dispatch tables cannot be found for the current (different) hash!
Adds ResNet and image classifier pipeline functionality. Includes changes from #428. See the huggingface implementation for original API inspiration. Resolves CentML/hidet#60