We ran into this in vLLM, where vLLM:
- creates one big graph containing many transformer layers
- splits the big graph into multiple structurally identical subgraphs
- compiles each subgraph separately (sending it through AOTAutograd). All of these graphs miss the cache because their input names differ.
The workaround I am currently applying is to normalize the input names of the graphs before sending them through AOTAutograd. It's not clear to me if the right long-term solution is:
- the AOTAutograd cache becomes agnostic to input names
- the user is supposed to do some sort of hierarchical compilation
- the user is supposed to know these quirks of the AOTAutograd cache and program with them in mind
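To illustrate the workaround, here is a minimal toy sketch (not the actual vLLM or AOTAutograd code; `cache_key`, `normalize_input_names`, and the string-based "graph body" are hypothetical stand-ins) showing how renaming inputs to canonical positional names makes structurally identical graphs produce identical cache keys:

```python
# Toy illustration of input-name normalization for caching.
# Graphs are modeled as (input_names, body_string) pairs; real graphs
# would be torch.fx GraphModules, normalized before AOTAutograd.

def normalize_input_names(input_names):
    """Map original input names to canonical positional names arg0, arg1, ..."""
    return {name: f"arg{i}" for i, name in enumerate(input_names)}

def cache_key(input_names, body):
    """Compute a cache key that ignores the original input names."""
    mapping = normalize_input_names(input_names)
    canon = body
    for old, new in mapping.items():
        canon = canon.replace(old, new)
    return (tuple(mapping.values()), canon)

# Two "layers" with identical structure but different input names,
# as produced by splitting one big graph:
layer0 = (["l0_x", "l0_w"], "matmul(l0_x, l0_w)")
layer1 = (["l1_x", "l1_w"], "matmul(l1_x, l1_w)")

# Without normalization the raw bodies differ; after normalization
# both layers map to the same key, so the second compile can cache-hit.
assert cache_key(*layer0) == cache_key(*layer1)
```

This is only meant to show why name-agnostic keys (the first option above) would make the per-subgraph compiles hit the cache automatically.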
cc @chauhang @penguinwu @oulgen @jamesjwu @aorenste @anijain2305 @laithsakka @masnesral @coconutruben