[caffe2] Support deserializing tensors using alternate serialization formats #53403

Closed
simpkins wants to merge 4 commits into gh/simpkins/6/base from gh/simpkins/6/head

simpkins commented Mar 5, 2021

Stack from ghstack:

This updates the `TensorProto` field to track the data type of the in-memory
(deserialized) data independently of the serialized data format.

This will allow us to support multiple different serialization formats in the
future. For instance, we could choose to perform quantization of floating
point data types, or varint encoding for integer fields.
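
To make the varint idea concrete, here is a minimal sketch of base-128 varint encoding, the scheme Protocol Buffers uses for its integer wire type. It is an illustration only and not code from this diff: the names `encodeVarint` and `decodeVarint` are hypothetical, and this PR does not add such a format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of caffe2): append `value` to `out` as a
// base-128 varint. Each byte carries 7 payload bits, least-significant group
// first; the high bit is set on every byte except the last.
inline void encodeVarint(uint64_t value, std::vector<uint8_t>& out) {
  while (value >= 0x80) {
    out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<uint8_t>(value));
}

// Hypothetical helper: decode one varint starting at `pos` and advance `pos`
// past it. Returns false if the input is truncated or longer than 10 bytes.
inline bool decodeVarint(
    const std::vector<uint8_t>& in, size_t& pos, uint64_t& value) {
  assert(pos <= in.size());
  uint64_t result = 0;
  for (int shift = 0; shift < 64; shift += 7) {
    if (pos >= in.size()) {
      return false; // ran out of bytes before the terminating byte
    }
    const uint8_t byte = in[pos++];
    result |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) {
      value = result;
      return true;
    }
  }
  return false; // continuation bit still set after 10 bytes
}
```

With this scheme the value 300 occupies two bytes (`0xAC 0x02`) rather than a fixed four or eight, which is where the savings for small integer fields would come from.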

For now, this diff does not change the serialization code path and does not
introduce any new serialization formats. It only refactors the
deserialization code path to make it easier to introduce new formats.

I'm not thrilled with the heavy use of macros and templates here, but I
didn't see a better alternative that made it as simple to specify new
deserialization function implementations.
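
As a rough illustration of the kind of structure described above, the sketch below dispatches on an (in-memory type, serialized format) pair using a template specialization per pair and a small macro for the `switch` cases. Everything in it is an assumption made for illustration: `DataType`, `Format`, `SerializedTensor`, `Deserializer`, `DESERIALIZE_CASE`, and the toy `FLOAT_AS_UINT8` format are hypothetical names, not the enums, macros, or functions this diff actually adds to caffe2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins; NOT the real caffe2 TensorProto definitions.
enum class DataType { FLOAT, INT32 };
enum class Format { RAW_BYTES, FLOAT_AS_UINT8 };

struct SerializedTensor {
  DataType dataType;   // type of the in-memory (deserialized) elements
  Format format;       // how the payload bytes are encoded
  std::string payload; // serialized bytes
};

// One implementation per (in-memory type, format) pair. The primary template
// is left undefined, so using an unsupported pair is a compile-time error.
template <typename T, Format F>
struct Deserializer;

// Raw bytes: the payload is a plain copy of the element array.
template <typename T>
struct Deserializer<T, Format::RAW_BYTES> {
  static std::vector<T> run(const std::string& payload) {
    if (payload.size() % sizeof(T) != 0) {
      throw std::runtime_error("payload size is not a multiple of element size");
    }
    std::vector<T> out(payload.size() / sizeof(T));
    if (!payload.empty()) {
      std::memcpy(out.data(), payload.data(), payload.size());
    }
    assert(out.size() * sizeof(T) == payload.size());
    return out;
  }
};

// Toy quantized format: each byte b is expanded to b / 255.0f in [0, 1].
template <>
struct Deserializer<float, Format::FLOAT_AS_UINT8> {
  static std::vector<float> run(const std::string& payload) {
    std::vector<float> out;
    out.reserve(payload.size());
    for (unsigned char b : payload) {
      out.push_back(static_cast<float>(b) / 255.0f);
    }
    return out;
  }
};

// Supporting a new format for a type means one specialization plus one line.
#define DESERIALIZE_CASE(T, FMT, tensor) \
  case Format::FMT:                      \
    return Deserializer<T, Format::FMT>::run((tensor).payload)

inline std::vector<float> deserializeFloat(const SerializedTensor& t) {
  if (t.dataType != DataType::FLOAT) {
    throw std::runtime_error("tensor does not hold float data");
  }
  switch (t.format) {
    DESERIALIZE_CASE(float, RAW_BYTES, t);
    DESERIALIZE_CASE(float, FLOAT_AS_UINT8, t);
  }
  throw std::runtime_error("unsupported format for float");
}

inline std::vector<int32_t> deserializeInt32(const SerializedTensor& t) {
  if (t.dataType != DataType::INT32) {
    throw std::runtime_error("tensor does not hold int32 data");
  }
  switch (t.format) {
    DESERIALIZE_CASE(int32_t, RAW_BYTES, t);
    default:
      break; // no other formats are defined for int32 in this sketch
  }
  throw std::runtime_error("unsupported format for int32");
}

#undef DESERIALIZE_CASE
```

The trade-off in a design like this is the one the description hints at: the macro and template machinery is harder to read, but each new format then needs only a specialization and a single `case` line per supported type.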

Differential Revision: D26658206


facebook-github-bot commented Mar 5, 2021

💊 CI failures summary and remediations

As of commit bda5422 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


simpkins added a commit that referenced this pull request Mar 10, 2021
…formats

Pull Request resolved: #53403

ghstack-source-id: 123579495

Differential Revision: [D26658206](https://our.internmc.facebook.com/intern/diff/D26658206/)
codecov Bot commented Mar 11, 2021

Codecov Report

Merging #53403 (bda5422) into gh/simpkins/6/base (e7a7424) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@                  Coverage Diff                   @@
##           gh/simpkins/6/base   #53403      +/-   ##
======================================================
- Coverage               77.63%   77.63%   -0.01%     
======================================================
  Files                    1869     1869              
  Lines                  182379   182379              
======================================================
- Hits                   141590   141588       -2     
- Misses                  40789    40791       +2     

@facebook-github-bot

This pull request has been merged in 33aaea9.

facebook-github-bot deleted the gh/simpkins/6/head branch March 16, 2021 14:14
xsacha pushed a commit to xsacha/pytorch that referenced this pull request Mar 31, 2021
…formats (pytorch#53403)

Summary:
Pull Request resolved: pytorch#53403

ghstack-source-id: 123594220

Test Plan:
Confirmed that the existing unit tests pass.  This diff only touches the
deserialization code path and not the serialization code to help ensure that
the deserialization code works with the existing serialization logic, and that
there are no changes to the current serialization format.

Reviewed By: mraway

Differential Revision: D26658206

fbshipit-source-id: d7297d600aee28b92fd9f4ece437b7f519060942
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
…formats (pytorch#53403)
