[schema_upgrader] add C++ upgrader for json based upgrading#156761
[schema_upgrader] add C++ upgrader for json based upgrading#156761ydwu4 wants to merge 10 commits intogh/ydwu4/267/basefrom
Conversation
[ghstack-poisoned]
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/156761
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (1 Unrelated Failure)As of commit f466c82 with merge base 3ee75b7 ( UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
[ghstack-poisoned]
[ghstack-poisoned]
torch/csrc/export/upgrader.cpp
Outdated
|
|
||
| // NOTE: The following version_0 and version_1 upgraders are for testing | ||
| // purposes only. They demonstrate the upgrader system functionality and are | ||
| // used in test/export/test_upgrader.py. |
There was a problem hiding this comment.
should these be put inside of a testing folder? you could follow what I did to test AOTI custom ops and just added a cpp file in the testing directory.
There was a problem hiding this comment.
Good idea. I put the test upgraders in a seperate file, add a python binding for it and the python test can dynamically register/deregister it. I don't want to add a legit C++ test, the build system is complicated lol
torch/csrc/export/upgrader.cpp
Outdated
| return keypath < other.keypath; | ||
| } | ||
|
|
||
| static void registerUpgrader( |
There was a problem hiding this comment.
how come this function isn't in the .h? it seems like this function is the most user facing so we should add proper docs
There was a problem hiding this comment.
Yeah, I put it into the interface now. The initial idea is all regsiteration can be just done inside this file but it doesn't hurt to add some flexibility.
torch/csrc/export/upgrader.cpp
Outdated
| } | ||
|
|
||
| static void registerUpgrader( | ||
| int version, |
There was a problem hiding this comment.
nit: add docstring that version is the version to upgrade from?
torch/csrc/export/upgrader.cpp
Outdated
|
|
||
| static void registerUpgrader( | ||
| int version, | ||
| const std::vector<std::string>& keypath, |
There was a problem hiding this comment.
what if we just passed in a string keypath and split it with .? ex. instead of passing in {"graph_module", "graph", "nodes"} we can just pass in "graph_module.graph.nodes"?
There was a problem hiding this comment.
Add a separate overload for this purpose. My initial thought is that we don't want to hardcode some deliminator into the key path, which can be troublesome somtimes (e.g. the nn_module_stack deliminator).
torch/csrc/export/upgrader.cpp
Outdated
| int current_version = current_artifact["schema_version"]["major"]; | ||
|
|
||
| // Iteratively apply upgraders until no more are available | ||
| while (true) { |
There was a problem hiding this comment.
should we also stop if current_version == the current highest schema version? Maybe that's a little hard to find, since it's in python...
just wondering what if there aren't any more upgraders, but current_version hasn't reached the version consumable by the deserializer yet.
There was a problem hiding this comment.
Yeah, we could query the schema version and pass it to the upgrader function.
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
torch/csrc/export/upgrader.cpp
Outdated
| } | ||
|
|
||
| // Validate that we reached the target version if requested | ||
| if (validate_target && current_version != target_version) { |
There was a problem hiding this comment.
Wouldn't we always want to validate this?
There was a problem hiding this comment.
sometimes, it could be easier to use not provide a target_version e.g. in testing lol.
torch/csrc/export/upgrader.cpp
Outdated
| } | ||
|
|
||
| nlohmann::json upgrade(const nlohmann::json& artifact) { | ||
| return upgradeImpl(artifact, std::numeric_limits<int>::max(), false); |
There was a problem hiding this comment.
In what case would we want to use max? I feel like we would just always want the existing schema version?
There was a problem hiding this comment.
easier for testing. Could change to always upgrade to a target version though
There was a problem hiding this comment.
People might not want to always upgrade to the latest version e.g. for debug reasons maybe? It doesn't hurt to provide some flexibility i feel
There was a problem hiding this comment.
hm but are you using this anywhere? all your test cases are specifying the version
|
|
||
| // Query the current Python schema version as target | ||
| py::module_ schema_module = | ||
| py::module_::import("torch._export.serde.schema"); |
There was a problem hiding this comment.
I wonder if we could add the schema version to generated_serialization_types.h so that we can query the schema version directly in the upgrade cpp function rather than here. cc @zhxchen17
can leave as a TODO for now
There was a problem hiding this comment.
Right now it's inside the exported program class, we would need to deserialize ExportedProgram first in order to access the field but we cannot because we need to first do upgrade. Either we could have a top-level class that wraps up ExportedProgram and the version and make sure the schema version can be accessed all the time or we store it somewhere else.
edit: oh, you mean we expose the schema_version in generated_serialization_types.h as a constant?
There was a problem hiding this comment.
edit: oh, you mean we expose the schema_version in generated_serialization_types.h as a constant?
yup
[ghstack-poisoned]
| @@ -0,0 +1,89 @@ | |||
| #include <torch/csrc/export/example_upgraders.h> | |||
There was a problem hiding this comment.
I still wish this file was in test/ but I guess if the build system is too annoying to figure out it's ok
There was a problem hiding this comment.
lol, it's indeed annoying
torch/csrc/export/upgrader.cpp
Outdated
| } | ||
|
|
||
| nlohmann::json upgrade(const nlohmann::json& artifact) { | ||
| return upgradeImpl(artifact, std::numeric_limits<int>::max(), false); |
There was a problem hiding this comment.
hm but are you using this anywhere? all your test cases are specifying the version
angelayi
left a comment
There was a problem hiding this comment.
letsgoo
you might want to import to internal before landing, just to make sure none of the cpp changes will break
|
@ydwu4 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
|
@pytorchbot revert -m "break linter test, which doesn't show up in the pr" -c nosignal |
|
@pytorchbot successfully started a revert job. Check the current status here. |
…156761)" This reverts commit 61712e6. Reverted #156761 on behalf of https://github.com/ydwu4 due to break linter test, which doesn't show up in the pr ([comment](#156761 (comment)))
|
@ydwu4 your PR has been successfully reverted. |
Differential Revision: [D77459912](https://our.internmc.facebook.com/intern/diff/D77459912) [ghstack-poisoned]
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Stack from ghstack (oldest at bottom):
Differential Revision: D77459912