Skip to content

Allow custom backend to execute getitem when they want#87007

Closed
wschin wants to merge 1 commit intopytorch:masterfrom
wschin:fix-regression
Closed

Allow custom backend to execute getitem when they want#87007
wschin wants to merge 1 commit intopytorch:masterfrom
wschin:fix-regression

Conversation

@wschin
Copy link
Collaborator

@wschin wschin commented Oct 15, 2022

If a backend wants to execute getitem, CapabilityBasedPartitioner should let that backend to execute getitem if that backend wants. The function getitem is just a normal Python function. If CapabilityBasedPartitioner rejects to partition getitem, it will cause 2x slow down when running ONNXRuntime via TorchDynamo.

Figure 1. Profiling result when rejecting getitem. There are much more sub-graphs (each gray bin below "F" bin is a sub-graph) and there is a launching cost for each of them.
image

Figure 2. Profiling result when partitioning getitem (each gray bin below "F" bin is a sub-graph). Most BERT-base is partitioned into a single sub-graph.
image

@pytorch-bot pytorch-bot bot added the release notes: fx release notes category label Oct 15, 2022
@pytorch-bot
Copy link

pytorch-bot bot commented Oct 15, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/87007

Note: Links to docs will display an error until the docs builds have been completed.

❌ 7 Failures, 11 Pending

As of commit 098f1b0:

The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

receive much smaller sub-graphs (from ~5 graphs to ~50 graphs).
For ORT, this is 2x slow down.
@wschin wschin requested a review from jjsjann123 October 15, 2022 00:26
@wschin wschin requested a review from SherlockNoMad October 15, 2022 00:26
@mikaylagawarecki mikaylagawarecki added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Oct 18, 2022
@SherlockNoMad
Copy link
Contributor

See discussion in #87073

@jjsjann123
Copy link
Collaborator

Yes, we did realize the problem caused by last refactor. Totally my fault. We are trying to patch it as well, but still working out how.

One thing we absolutely should add is a test case where getitem node connects fusible ops, so we won't accidentally break this in our getitem special handling refactor in the future.

_get_qualified_name(node.target) != "_operator.getitem" # type: ignore[arg-type]
)
)
return self.operator_support.is_node_supported(dict(self.graph_module.named_modules()), node)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, this is not safe.

You'll have getitem node absorbed by its consumer and later hitting an assert somewhere in the merge_single_node function. I think you'll be failing one of the getitem special handling tests we have.

@wschin
Copy link
Collaborator Author

wschin commented Oct 18, 2022

Thanks for the replies. Close before we have #87073.

@wschin wschin closed this Oct 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

open source release notes: fx release notes category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants