[export] Add lifted constant obj to input #116985
angelayi wants to merge 1 commit into pytorch:main
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/116985
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (3 unrelated failures) As of commit 0ea005e with merge base a669319.
BROKEN TRUNK - The following jobs failed but were also present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D52556070
a7f08e3 to 62b5f28 (Compare)
62b5f28 to eeda880 (Compare)
suo left a comment:
Mostly looks good! Had some questions inline about the implementation.
For other types, deserialize_graph populates a self.serialized_name_to_meta table, which we then use in sync_fx_node to fill in the correct val. Why are custom objects taking a different path here?
This assumes that our `val` is the custom obj itself. If we serialize it into `serialized_name_to_meta`, then the serialized custom object will live in two places: in `serialized_name_to_meta`, which is attached to the serialized exported program, and in `constants`, which is saved at a different location from the exported program. I was thinking it's probably not good to save one thing in two places, especially if it's a custom object blob.
serialized_name_to_meta is a field on GraphModuleDeserializer and isn't persisted to the final ExportedProgram, as far as I can tell. It's just bookkeeping to map an IR name to meta['val'], which seems like what we're doing here. So tracking it there should not affect the serialized format.
I think the more fundamental issue is that we don't have a serialized table mapping names to custom objs in schema.Graph like we do for every other dynamic value type (tensors, symints, symbools). We are representing it implicitly through placeholder arguments/inputs to nodes.
The asymmetry of how we handle these nodes is a good signal that there's a missing piece in our design. The missing piece is that we don't have a metadata object for ScriptObjects: Tensor : TensorMeta :: ScriptObject : ???.
If we had such a piece (call it ScriptObjectMeta), the right fix would be:
- `schema.Graph` gets a `custom_object_values: Dict[str, ScriptObjectMeta]` field, analogous to `tensor_values`.
- We deserialize that and populate `serialized_name_to_meta` like everything else in `GraphModuleDeserializer.deserialize_graph`.
- `meta["val"]` should contain `ScriptObjectMeta`, not the `ScriptObject` itself.
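Concretely, the proposed schema addition could be sketched like this (a minimal illustration of the shape being discussed, not the actual torch.export schema; `TensorMeta` is simplified to a stand-in):

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class ScriptObjectMeta:
    # Hypothetical: key into the constants table for the real ScriptObject.
    constant_name: str

@dataclass
class TensorMeta:
    # Simplified stand-in for the real TensorMeta (dtype, sizes, etc.).
    dtype: str

@dataclass
class Graph:
    # Existing table mapping IR names to tensor metadata.
    tensor_values: Dict[str, TensorMeta] = field(default_factory=dict)
    # Proposed analogous table for ScriptObjects.
    custom_object_values: Dict[str, ScriptObjectMeta] = field(default_factory=dict)
```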
Okay, so what should ScriptObjectMeta look like? If we were being faithful to the Tensor : TensorMeta analogy, it should be something that we can deserialize into a FakeScriptObject that will work when invoking the node in FakeMode. So that's probably the right thing.
That's out of scope for this diff. But we should try to design things with a structure as close as possible to the right thing, so that when we go fix it later it will be easier (and we don't create weird divergences in the meantime).
My suggestion for a temporary ScriptObjectMeta would be a dataclass that looks like:
from dataclasses import dataclass
from typing import Optional

@dataclass
class ScriptObjectMeta:
    # Key into the constants table to retrieve the real ScriptObject.
    # Optional because not all ScriptObject nodes come from constants.
    constant_name: Optional[str]
    # Fully qualified type of the ScriptObject.
    typename: str
What do you think?
From your comment I realized I was implementing it stupidly, so I updated the code, though not exactly as you suggested. Let me know what you think! Basically I just populated `serialized_name_to_meta` based on the signature, since the signature stores the custom obj's node name and its fqn into the constants dictionary.
The reason I didn't make another `Dict[str, ScriptObjectMeta]` is that the signature already stores something similar: `InputSpec(kind=InputKind.CUSTOM_OBJ, arg=CustomObjArgument(node_name), target=custom_obj_fqn)`.
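A minimal sketch of that signature-based approach (the class shapes and the helper function are illustrative assumptions, not the actual PyTorch deserializer code):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List

class InputKind(Enum):
    USER_INPUT = 0
    CUSTOM_OBJ = 1

@dataclass
class CustomObjArgument:
    name: str  # node name of the lifted custom-obj placeholder

@dataclass
class InputSpec:
    kind: InputKind
    arg: CustomObjArgument
    target: str  # fully qualified name, used as the key into constants

def populate_custom_obj_meta(
    input_specs: List[InputSpec],
    constants: Dict[str, Any],
    serialized_name_to_meta: Dict[str, Any],
) -> None:
    # Walk the signature: every CUSTOM_OBJ input spec ties a node name to
    # an entry in the constants table, so the meta table can be filled in
    # without serializing the object a second time.
    for spec in input_specs:
        if spec.kind == InputKind.CUSTOM_OBJ:
            serialized_name_to_meta[spec.arg.name] = constants[spec.target]
```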
Based on what we discussed, I added a ScriptObjectMeta, which stores just the name of the constant in the constants table.
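A rough sketch of how that indirection works (the helper names are illustrative, not the actual serializer code): the heavy object lives in the side `constants` table, and only the lightweight key travels with the serialized program.

```python
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class ScriptObjectMeta:
    # Name of the constant in the constants table; the actual object is
    # stored separately from the serialized exported program.
    constant_name: str

def serialize_custom_obj(name: str, obj: Any, constants: Dict[str, Any]) -> ScriptObjectMeta:
    constants[name] = obj          # heavy blob goes into the side table
    return ScriptObjectMeta(name)  # only the key is serialized inline

def deserialize_custom_obj(meta: ScriptObjectMeta, constants: Dict[str, Any]) -> Any:
    return constants[meta.constant_name]
```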
eeda880 to 56f3acf (Compare)
56f3acf to d16953a (Compare)
Summary: Pull Request resolved: pytorch#116985. This is in preparation for tracing support for custom class holder objects. We expect them to initially appear in the graph as getattr nodes. This diff lifts the getattr nodes as inputs to the graph, marking them in the graph signature as `InputKind.CUSTOM_OBJ`, and adds serialization support.
Test Plan: Added test
Differential Revision: D52556070
fbshipit-source-id: abe6358980fa1af112bee75d49f9a6ebbda264e7
d16953a to 753d967 (Compare)
753d967 to 03ff13c (Compare)
03ff13c to 9f0080c (Compare)
Reviewed By: suo
Differential Revision: D52556070
fbshipit-source-id: 530c1004ac65ad7997a6408cad43e9b3c2414245
9f0080c to 357d0bf (Compare)
357d0bf to 128609c (Compare)
Reviewed By: suo
Differential Revision: D52556070
fbshipit-source-id: be6743c5a94b115052274b7afbe54711f76d5b80
128609c to 0ea005e (Compare)
@pytorchbot merge -f 'Landed internally' (Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Test Plan: wip
Differential Revision: D52556070