[nativert] Move GraphSignature to pytorch core #152969

Closed
yiming0416 wants to merge 1 commit into pytorch:main from yiming0416:export-D73895378

Conversation

@yiming0416 (Contributor) commented May 6, 2025

Summary:
Torch Native Runtime RFC: pytorch/rfcs#72

Added an in-memory representation for the input and output specs of a graph. The GraphSignature class models the input and output specs of an exported graph produced by torch.export, holding the graph information deserialized from the pt2 archive package. The runtime relies on the GraphSignature for weight-name lookup and weight loading.

The serialization schema is defined in torch/_export/serde/schema.py.
See more at: https://docs.pytorch.org/docs/stable/export.html#torch.export.ExportGraphSignature

Test Plan: Added tests under test/cpp/nativert/test_graph_signature.cpp

Differential Revision: D73895378

@pytorch-bot (bot) commented May 6, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152969

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 7a34259 with merge base 5163bf0:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot (Contributor): This pull request was exported from Phabricator. Differential Revision: D73895378


@cyyever: This comment was marked as resolved.

@yiming0416: This comment was marked as resolved.


@zhxchen17: This comment was marked as resolved.

@zhxchen17 requested a review from Skylion007 on May 7, 2025 at 17:25

Comment on lines 112 to 125
Contributor (@swolchok):

This many maps of strings to other strings is an efficiency code smell.

  • Why do we have multiple parallel maps keyed by the same string? Couldn't we, for example, map each input to a struct containing its parameters, buffers, tensor constants, and custom objects?
  • Is a string really the right data type for all this stuff?

More context about what is really going on here would be helpful for making concrete suggestions

Contributor Author (@yiming0416):

@swolchok This representation of GraphSignature in nativert follows the existing ExportGraphSignature (https://docs.pytorch.org/docs/stable/export.html#torch.export.ExportGraphSignature).

Contributor Author (@yiming0416):

@swolchok Oh, I believe there is some misunderstanding: these maps are not keyed by the same strings, since the inputs in inputsToParameters_ and inputsToBuffers_ are not the same inputs.

Comment on lines 125 to 141
Contributor (@swolchok):

Here we are duplicating the map keys again! Can't we just iterate over the map?

Contributor Author (@yiming0416):

@swolchok Again, these follow the existing ExportGraphSignature in torch.export (please see my comments above). And later on we rely on these strings for weight look-up and loading. If we don't store them and just iterate over the map, wouldn't that add some overhead every time we access them?

Contributor (@swolchok) commented May 12, 2025:

> If we don't store them and just iterate over the map, it might add some overhead every time we access them?

If we use a good hash table (such as c10::FastMap), I would not expect iterating over map keys to be materially more expensive than iterating over an array.

> these follow the existing ExportGraphSignature

Directly porting Python data structures to C++ is not necessarily appropriate. Python strings are stored by pointer and are often interned (if my sources are to be trusted), whereas C++ std::string is a value type with no interning. Efficiency demands on C++ code are also typically higher; Python is by its nature inefficient and so micro-optimizing Python code and data structures tends to miss the point.

Contributor Author (@yiming0416):

@swolchok I changed unordered_set to c10::FastSet and unordered_map to c10::FastMap.
But I have to keep these std::vector<std::string>, because we need to maintain the order of these strings; that ordering is an important assumption in nativert for weight loading and for some unused-weight optimization later.

Contributor (@swolchok):

> we need to maintain the order of these strings as it is an important assumption in nativert for weight loading and some unused weight optimization later.

Sounds like a great code comment, together with a pointer to the components that care.

How big is this data structure typically? Could we do something like store std::string_view or std::string* in the maps (except the ones that get touched in replaceAllUses!) since the data is owned by the vectors? (note that this would require appropriate care to make sure the referenced strings don't get moved, such as by vector reallocation)

Contributor Author (@yiming0416):

@swolchok I think we can't use string_view here because the strings stored in parameters_ are loaded from an in-memory representation of the serialized JSON format torch::_export::GraphSignature, and we free it after torch::native::GraphSignature is constructed. So we have to store actual strings.
There was an attempt to dedupe attributes in GraphSignature in D69926343, but it was reverted because it required C++20. I think we can revisit this later when the PyTorch build supports C++20.

Contributor (@swolchok):

> I think we can't use string_view here because the strings stored in parameters_ are loaded from an in-memory representation of the serialized json format torch::_export::GraphSignature and we free it after torch::native::GraphSignature is constructed. So we have to store actual strings.

I meant that the string_view or string* would refer to the string data owned by the std::vectors in this class, not in the input.

Contributor Author (@yiming0416):

> I meant that the string_view or string* would refer to the string data owned by the std::vectors in this class, not in the input.

@swolchok I see your points, and I have given that a try. Unfortunately under the current design it's difficult to avoid vector reallocation, so using string_view in the map to refer to strings in the vector will cause downstream weight loading failures. Therefore, we'd prefer to keep the current implementation. We'll definitely resume the attempt to optimize this, as we did in D69926343, when the PyTorch build supports C++20.
Do you think that's okay? Are there any other blocking comments on this PR?

Contributor (@swolchok):

> it's difficult to avoid vector reallocation

AIUI replaceAllUses is the only non-const method in this class. I understand why it's difficult to avoid reallocation for the vectors it touches, but it looks like there is a significant subset of the state that is never modified after construction. Why is it difficult to avoid reallocation for that subset?

Contributor Author (@yiming0416):

@swolchok In the latest version I have done the deduplication. Now only two maps, inputsToWeights_ and inputsToCustomObjs_, actually store the strings (these two maps are disjoint). The rest of the maps and vectors use string_view to refer to the strings stored in these two maps.

Comment on lines 351 to 333
Contributor (@swolchok):

If we kept our data nicely deduplicated in the first place, we wouldn't have to do this expensive operation here.

Collaborator (@Skylion007):

True, can we not just use an ordered set?

Contributor Author (@yiming0416):

@Skylion007 do you mind elaborating?

Contributor (@malfet) left a review comment:

Neither the RFC nor this PR sheds any light on what a graph signature is or why it is needed for the native runtime.

@Skylion007: This comment was marked as resolved.


@yiming0416 (Contributor Author):

@swolchok Sorry for pinging again, but can you please take a look at my response above and let me know if there are any other blocking comments? Thanks!


pytorch-bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) on May 20, 2025

Summary:
Pull Request resolved: pytorch#152969

An in-memory representation of `GraphSignature` which will be consumed by the runtime.

Test Plan: Added tests under `test/cpp/nativert/test_graph_signature.cpp`

Reviewed By: zhxchen17

Differential Revision: D73895378

@facebook-github-bot (Contributor):

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorch-bot (bot) commented May 20, 2025:

This PR has pending changes requested. Please address the comments and update the PR before merging.

@yiming0416 (Contributor Author):

@pytorchbot merge -f "landed internally"

@pytorchmergebot (Collaborator):

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f only as a last resort, and instead consider -i/--ignore-current to continue the merge while ignoring current failures. This allows currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.


Labels

ciflow/trunk (Trigger trunk jobs on your pull request), fb-exported, Merged, topic: not user facing (topic category)


8 participants