[caffe2] Support deserializing tensors using alternate serialization formats by simpkins · Pull Request #53403 · pytorch/pytorch

simpkins · 2021-03-05T20:32:22Z

Stack from ghstack:

#53404 [caffe2] Refactor tensor serialization function
#53403 [caffe2] Support deserializing tensors using alternate serialization formats
#53754 [caffe2] add a CAFFE2_NODISCARD macro to help support old compilers
#53402 [caffe2] add a SerializationOptions field for the save operator
#53401 [caffe2] update load_save_test.py to also verify the chunking behavior
#53400 [caffe2] use AddNAlreadyReserved() when serializing blobs

This updates TensorProto to track the data type of the in-memory
(deserialized) data independently from the serialized data format.

This will allow us to support multiple different serialization formats in the
future. For instance, we could choose to perform quantization of floating
point data types, or varint encoding for integer fields.
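To illustrate the idea of tracking the in-memory type separately from the wire format, here is a minimal sketch. All names in it (`InMemoryType`, `SerializedFormat`, `SerializedTensor`, `EncodeVarint`, `DecodeVarint`, `Deserialize`) are hypothetical and are not the fields or functions this PR adds to caffe2. It assumes a LEB128-style varint (7 payload bits per byte) as one possible alternate format for unsigned 32-bit data, and a raw format that uses host byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical names: these are NOT the actual caffe2 TensorProto fields.
enum class InMemoryType { kUint32 };
enum class SerializedFormat { kRaw, kVarint };

struct SerializedTensor {
  InMemoryType data_type;   // type of the data once it is deserialized
  SerializedFormat format;  // how `bytes` is encoded on the wire
  std::string bytes;
};

// Encode each value as a LEB128-style varint: 7 payload bits per byte,
// high bit set on every byte except the last one of a value.
std::string EncodeVarint(const std::vector<uint32_t>& values) {
  std::string out;
  for (uint32_t v : values) {
    while (v >= 0x80) {
      out.push_back(static_cast<char>((v & 0x7F) | 0x80));
      v >>= 7;
    }
    out.push_back(static_cast<char>(v));
  }
  return out;
}

std::vector<uint32_t> DecodeVarint(const std::string& bytes) {
  std::vector<uint32_t> out;
  uint32_t value = 0;
  int shift = 0;
  for (unsigned char b : bytes) {
    if (shift > 28) throw std::runtime_error("varint too long for uint32");
    value |= static_cast<uint32_t>(b & 0x7F) << shift;
    if (b & 0x80) {
      shift += 7;  // continuation bit set: more bytes follow
    } else {
      out.push_back(value);
      value = 0;
      shift = 0;
    }
  }
  if (shift != 0) throw std::runtime_error("truncated varint");
  return out;
}

// The in-memory result type is the same for both formats; only the
// decoding step depends on `format`.
std::vector<uint32_t> Deserialize(const SerializedTensor& t) {
  if (t.data_type != InMemoryType::kUint32) {
    throw std::invalid_argument("unsupported in-memory type");
  }
  switch (t.format) {
    case SerializedFormat::kRaw: {
      if (t.bytes.size() % sizeof(uint32_t) != 0) {
        throw std::runtime_error("raw payload has a partial element");
      }
      std::vector<uint32_t> out(t.bytes.size() / sizeof(uint32_t));
      if (!t.bytes.empty()) {
        std::memcpy(out.data(), t.bytes.data(), t.bytes.size());
      }
      return out;
    }
    case SerializedFormat::kVarint:
      return DecodeVarint(t.bytes);
  }
  throw std::invalid_argument("unknown serialization format");
}
```

In this sketch a reader of the data only ever sees `std::vector<uint32_t>`, so a new wire format could be added by extending the enum and the `switch` without touching callers.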

For now this diff does not change the serialization code path and does not
introduce any new serialization formats; it only refactors the
deserialization code path to make it easier to introduce new formats.

I'm not really that thrilled with the heavy use of macros and templates here,
but I didn't really see better alternatives that made it as simple to specify
new deserialization function implementations.
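As a rough illustration of the trade-off described above, the following sketch shows one way macros and templates can be combined so that each new deserialization function is a single short definition. It is not the code from this PR: the `Format` enum, the `Deserializer` template, the `DESERIALIZE_IMPL` macro, and `DeserializeAs` are all hypothetical names, and the "byte narrowed" format (one signed byte per element) is invented purely for the example.

```cpp
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical serialization formats, for illustration only.
enum class Format { kRaw, kByteNarrowed };

// Primary template is declared but never defined, so using an unsupported
// (type, format) pair is a compile-time error.
template <typename T, Format F>
struct Deserializer;

// Hypothetical helper macro: declares the specialization and opens the
// definition of its Run() function; the function body follows the macro.
#define DESERIALIZE_IMPL(T, F)                               \
  template <>                                                \
  struct Deserializer<T, F> {                                \
    static std::vector<T> Run(const std::string& bytes);     \
  };                                                         \
  inline std::vector<T> Deserializer<T, F>::Run(const std::string& bytes)

// Raw format: the payload is the in-memory representation (host byte order).
DESERIALIZE_IMPL(int32_t, Format::kRaw) {
  if (bytes.size() % sizeof(int32_t) != 0) {
    throw std::runtime_error("raw payload has a partial element");
  }
  std::vector<int32_t> out(bytes.size() / sizeof(int32_t));
  if (!bytes.empty()) {
    std::memcpy(out.data(), bytes.data(), bytes.size());
  }
  return out;
}

// Narrowed format: each element was stored as a single signed byte.
DESERIALIZE_IMPL(int32_t, Format::kByteNarrowed) {
  std::vector<int32_t> out;
  out.reserve(bytes.size());
  for (char c : bytes) {
    out.push_back(static_cast<signed char>(c));
  }
  return out;
}

// Runtime dispatch from the format recorded with the data to the matching
// compile-time specialization.
template <typename T>
std::vector<T> DeserializeAs(Format format, const std::string& bytes) {
  switch (format) {
    case Format::kRaw:
      return Deserializer<T, Format::kRaw>::Run(bytes);
    case Format::kByteNarrowed:
      return Deserializer<T, Format::kByteNarrowed>::Run(bytes);
  }
  throw std::invalid_argument("unknown serialization format");
}
```

The cost of this style is that the control flow is hidden behind the macro; the benefit is that adding a format is one `DESERIALIZE_IMPL` block plus one `case` in the dispatcher.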

Differential Revision: D26658206

facebook-github-bot · 2021-03-05T20:32:47Z

💊 CI failures summary and remediations

As of commit bda5422 (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚


codecov · 2021-03-11T02:04:16Z

Codecov Report

Merging #53403 (bda5422) into gh/simpkins/6/base (e7a7424) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@                  Coverage Diff                   @@
##           gh/simpkins/6/base   #53403      +/-   ##
======================================================
- Coverage               77.63%   77.63%   -0.01%     
======================================================
  Files                    1869     1869              
  Lines                  182379   182379              
======================================================
- Hits                   141590   141588       -2     
- Misses                  40789    40791       +2

facebook-github-bot · 2021-03-12T19:35:41Z

This pull request has been merged in 33aaea9.

…formats (pytorch#53403)

Summary:
Pull Request resolved: pytorch#53403

This updates the `TensorProto` field to independently track the data type of the in-memory (deserialized) data from the serialized data format.

This will allow us to support multiple different serialization formats in the future. For instance, we could choose to perform quantization of floating point data types, or varint encoding for integer fields.

For now this diff does not actually change the serialization code path yet, and does not introduce any new serialization formats, but only refactors the deserialization code path to make it easier to introduce new formats.

I'm not really that thrilled with the heavy use of macros and templates here, but I didn't really see better alternatives that made it as simple to specify new deserialization function implementations.

ghstack-source-id: 123594220

Test Plan:
Confirmed that the existing unit tests pass. This diff only touches the deserialization code path and not the serialization code to help ensure that the deserialization code works with the existing serialization logic, and that there are no changes to the current serialization format.

Reviewed By: mraway

Differential Revision: D26658206

fbshipit-source-id: d7297d600aee28b92fd9f4ece437b7f519060942

This was referenced Mar 5, 2021

[caffe2] move the SaveOp implementation from a header to a .cc file #53298

Closed

[caffe2] use AddNAlreadyReserved() when serializing blobs #53400

Closed

facebook-github-bot added the cla signed label Mar 5, 2021

This was referenced Mar 5, 2021

[caffe2] update load_save_test.py to also verify the chunking behavior #53401

Closed

[caffe2] add a SerializationOptions field for the save operator #53402

Closed

simpkins mentioned this pull request Mar 5, 2021

[caffe2] Refactor tensor serialization function #53404

Closed

simpkins mentioned this pull request Mar 10, 2021

[caffe2] support serializing float data as bfloat16 #53735

Closed

simpkins mentioned this pull request Mar 10, 2021

[caffe2] add a CAFFE2_NODISCARD macro to help support old compilers #53754

Closed

facebook-github-bot closed this in 33aaea9 Mar 12, 2021

facebook-github-bot added the Merged label Mar 12, 2021

facebook-github-bot deleted the gh/simpkins/6/head branch March 16, 2021 14:14

simpkins wants to merge 4 commits into gh/simpkins/6/base from gh/simpkins/6/head
