[caffe2] Support deserializing tensors using alternate serialization formats #53403
Closed
simpkins wants to merge 4 commits into gh/simpkins/6/base
…formats

This updates the `TensorProto` field to independently track the data type of the in-memory (deserialized) data from the serialized data format.

This will allow us to support multiple different serialization formats in the future. For instance, we could choose to perform quantization of floating point data types, or varint encoding for integer fields.

For now this diff does not actually change the serialization code path yet, and does not introduce any new serialization formats, but only refactors the deserialization code path to make it easier to introduce new formats.

I'm not really that thrilled with the heavy use of macros and templates here, but I didn't really see better alternatives that made it as simple to specify new deserialization function implementations.

Differential Revision: [D26658206](https://our.internmc.facebook.com/intern/diff/D26658206/)

[ghstack-poisoned]
This was referenced Mar 5, 2021
simpkins added a commit that referenced this pull request Mar 10, 2021
…formats

Pull Request resolved: #53403

ghstack-source-id: 123579495

Differential Revision: [D26658206](https://our.internmc.facebook.com/intern/diff/D26658206/)
Codecov Report
@@                 Coverage Diff                  @@
##           gh/simpkins/6/base   #53403    +/-   ##
==================================================
- Coverage               77.63%   77.63%   -0.01%
==================================================
  Files                    1869     1869
  Lines                  182379   182379
==================================================
- Hits                   141590   141588       -2
- Misses                  40789    40791       +2
This pull request has been merged in 33aaea9.
xsacha pushed a commit to xsacha/pytorch that referenced this pull request Mar 31, 2021
…formats (pytorch#53403)

Summary:
Pull Request resolved: pytorch#53403

This updates the `TensorProto` field to independently track the data type of the in-memory (deserialized) data from the serialized data format.

This will allow us to support multiple different serialization formats in the future. For instance, we could choose to perform quantization of floating point data types, or varint encoding for integer fields.

For now this diff does not actually change the serialization code path yet, and does not introduce any new serialization formats, but only refactors the deserialization code path to make it easier to introduce new formats.

I'm not really that thrilled with the heavy use of macros and templates here, but I didn't really see better alternatives that made it as simple to specify new deserialization function implementations.

ghstack-source-id: 123594220

Test Plan:
Confirmed that the existing unit tests pass. This diff only touches the deserialization code path and not the serialization code to help ensure that the deserialization code works with the existing serialization logic, and that there are no changes to the current serialization format.

Reviewed By: mraway

Differential Revision: D26658206

fbshipit-source-id: d7297d600aee28b92fd9f4ece437b7f519060942
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
This updates the `TensorProto` field to independently track the data type of
the in-memory (deserialized) data from the serialized data format.
This will allow us to support multiple different serialization formats in the
future. For instance, we could choose to perform quantization of floating
point data types, or varint encoding for integer fields.
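To make the varint idea concrete, here is a minimal sketch of base-128 varint encoding for unsigned integers, the scheme protobuf itself uses on the wire. The helper names `encodeVarint` and `decodeVarint` are hypothetical and are not part of this diff, which does not add any new format.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append `value` to `out` as a base-128 varint.
// Each byte carries 7 payload bits; the high bit is set on every byte
// except the last one.
void encodeVarint(uint64_t value, std::vector<uint8_t>& out) {
  while (value >= 0x80) {
    out.push_back(static_cast<uint8_t>((value & 0x7f) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<uint8_t>(value));
}

// Hypothetical helper: decode one varint starting at `pos`, advancing `pos`
// past it. Returns false if the input is truncated or longer than 10 bytes.
bool decodeVarint(const std::vector<uint8_t>& in, size_t& pos, uint64_t& value) {
  value = 0;
  for (int shift = 0; shift < 64; shift += 7) {
    if (pos >= in.size()) {
      return false;
    }
    const uint8_t byte = in[pos++];
    value |= static_cast<uint64_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) {
      return true;
    }
  }
  return false;
}
```

Small values take one byte instead of a fixed four or eight, which is why such a format could pay off for integer tensors dominated by small magnitudes.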
For now, this diff does not change the serialization code path and does not
introduce any new serialization formats; it only refactors the
deserialization code path to make new formats easier to introduce.
I'm not really that thrilled with the heavy use of macros and templates here,
but I didn't really see better alternatives that made it as simple to specify
new deserialization function implementations.
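As a rough illustration of the general shape (not the code in this diff), the sketch below keeps the in-memory data type and the serialized format as two separate fields and selects a decoder from the pair. Every name here (`DataType`, `Format`, `SerializedTensor`, `TypeTag`, `decodeAs`, `deserialize`) is a hypothetical stand-in, and the `AsDouble` format is invented purely for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are not caffe2's real types.
enum class DataType { INT32, FLOAT };      // type of the in-memory data
enum class Format { Native, AsDouble };    // how the bytes were serialized

struct SerializedTensor {
  DataType data_type;
  Format format;
  std::string bytes;
};

// Maps a C++ element type to its DataType tag.
template <typename T> struct TypeTag;
template <> struct TypeTag<int32_t> { static constexpr DataType value = DataType::INT32; };
template <> struct TypeTag<float> { static constexpr DataType value = DataType::FLOAT; };

// Reads elements stored as `Stored` and converts each one to `T`.
template <typename Stored, typename T>
std::vector<T> decodeAs(const std::string& bytes) {
  if (bytes.size() % sizeof(Stored) != 0) {
    throw std::runtime_error("truncated payload");
  }
  std::vector<T> out(bytes.size() / sizeof(Stored));
  for (size_t i = 0; i < out.size(); ++i) {
    Stored s;
    std::memcpy(&s, bytes.data() + i * sizeof(Stored), sizeof(Stored));
    out[i] = static_cast<T>(s);
  }
  return out;
}

// Checks the in-memory type, then picks a decoder based on the format.
template <typename T>
std::vector<T> deserialize(const SerializedTensor& t) {
  if (t.data_type != TypeTag<T>::value) {
    throw std::runtime_error("data type mismatch");
  }
  switch (t.format) {
    case Format::Native:
      return decodeAs<T, T>(t.bytes);
    case Format::AsDouble:
      return decodeAs<double, T>(t.bytes);
  }
  throw std::runtime_error("unknown serialization format");
}
```

With this layout, supporting another format means adding one enumerator and one `case`, which is the kind of extension point the refactoring is aiming for; the real diff reaches it with macros and templates rather than this hand-written switch.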
Differential Revision: D26658206