[C++ API] Protobuf serialization#11619
Conversation
|
I wouldn't worry about (2). How would you change this if you didn't have to support Cereal at all? |
|
Is the problem this is solving primarily "how to serialize to both protobuf and cereal", or are there other problems? |
dzhulgakov
left a comment
There was a problem hiding this comment.
Looks neat!
For future we should have some mechanism of encoding at least typename of the c++ module in the serialized file for reference. Since for now modules are different from jit modules I guess it's ok
P.S. reusing jit module serialization is... cute :)
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
36d2455 to
3f83954
Compare
|
After discussing with @ebetica we found that backwards compatibility with cereal is not as important and that StarCraft would be happy to switch to protobuf. For this reason I've opted to specialize my design a little more towards our Protobuf serialization format. The new primitives are which gets populated e.g. in This API is not intended for inheritence. The reason why is that This means we only support protobuf serialization out of the box. However, I've made sure that all parameters and buffers in modules and optimizers that one would want to store are properly exposed. This means if one wanted to, one could write a new serialization library on top of the |
6379c68 to
7881ad9
Compare
|
Do you think overloading torch::save for a stream is nice? It just feels more C++-y to me, and is more flexible for users. |
7881ad9 to
be74fad
Compare
ebetica
left a comment
There was a problem hiding this comment.
Happy with the API changes, I didn't do a deep reading of this PR.
be74fad to
95ac9b7
Compare
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
.gitmodules
Outdated
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
test/cpp/api/util.h
Outdated
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
ezyang
left a comment
There was a problem hiding this comment.
Please don't leak file descriptors
7bb46aa to
03d08f6
Compare
03d08f6 to
1e92bf3
Compare
facebook-github-bot
left a comment
There was a problem hiding this comment.
goldsborough has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary:
This PR serves two purposes:
1. Design an abstraction over a serialization scheme for C++ modules, optimizers and tensors in general,
2. Add serialization to the ONNX/PyTorch proto format.
This is currently a rough prototype I coded up today, to get quick feedback.
For this I propose the following serialization interface within the C++ API:
```cpp
namespace torch { namespace serialize {
class Reader {
public:
virtual ~Reader() = default;
virtual void read(const std::string& key, Tensor& tensor, bool is_buffer = false) = 0;
virtual void finish() { }
};
class Writer {
public:
virtual ~Reader() = default;
virtual void writer(const std::string& key, const Tensor& tensor, bool is_buffer = false) = 0;
virtual void finish() { }
};
}} // namespace torch::serialize
```
There are then subclasses of these two for (1) Cereal and (2) Protobuf (called the "DefaultWriter" and "DefaultReader" to hide the implementation details). See `torch/serialize/cereal.h` and `torch/serialize/default.h`. This abstraction and subclassing for these two allows us to:
1. Provide a cereal-less serialization forward that we can ship and iterate on going forward,
2. Provide no-friction backwards compatibility with existing C++ API uses, mainly StarCraft.
The user-facing API is (conceptually):
```cpp
void torch::save(const Module& module, Writer& writer);
void torch::save(const Optimizer& optimizer, Writer& writer);
void torch::read(Module& module, Reader& reader);
void torch::read(Optimizer& optimizer, Reader& reader);
```
with implementations for both optimizers and modules that write into the `Writer` and read from the `Reader`
ebetica ezyang zdevito dzhulgakov
Pull Request resolved: pytorch/pytorch#11619
Differential Revision: D9984664
Pulled By: goldsborough
fbshipit-source-id: e03afaa646221546e7f93bb8dfe3558e384a5847
Summary:
This PR serves two purposes:
1. Design an abstraction over a serialization scheme for C++ modules, optimizers and tensors in general,
2. Add serialization to the ONNX/PyTorch proto format.
This is currently a rough prototype I coded up today, to get quick feedback.
For this I propose the following serialization interface within the C++ API:
```cpp
namespace torch { namespace serialize {
class Reader {
public:
virtual ~Reader() = default;
virtual void read(const std::string& key, Tensor& tensor, bool is_buffer = false) = 0;
virtual void finish() { }
};
class Writer {
public:
virtual ~Reader() = default;
virtual void writer(const std::string& key, const Tensor& tensor, bool is_buffer = false) = 0;
virtual void finish() { }
};
}} // namespace torch::serialize
```
There are then subclasses of these two for (1) Cereal and (2) Protobuf (called the "DefaultWriter" and "DefaultReader" to hide the implementation details). See `torch/serialize/cereal.h` and `torch/serialize/default.h`. This abstraction and subclassing for these two allows us to:
1. Provide a cereal-less serialization forward that we can ship and iterate on going forward,
2. Provide no-friction backwards compatibility with existing C++ API uses, mainly StarCraft.
The user-facing API is (conceptually):
```cpp
void torch::save(const Module& module, Writer& writer);
void torch::save(const Optimizer& optimizer, Writer& writer);
void torch::read(Module& module, Reader& reader);
void torch::read(Optimizer& optimizer, Reader& reader);
```
with implementations for both optimizers and modules that write into the `Writer` and read from the `Reader`
ebetica ezyang zdevito dzhulgakov
Pull Request resolved: pytorch#11619
Differential Revision: D9984664
Pulled By: goldsborough
fbshipit-source-id: e03afaa646221546e7f93bb8dfe3558e384a5847
This PR serves two purposes:
This is currently a rough prototype I coded up today, to get quick feedback.
For this I propose the following serialization interface within the C++ API:
There are then subclasses of these two for (1) Cereal and (2) Protobuf (called the "DefaultWriter" and "DefaultReader" to hide the implementation details). See
torch/serialize/cereal.handtorch/serialize/default.h. This abstraction and subclassing for these two allows us to:The user-facing API is (conceptually):
with implementations for both optimizers and modules that write into the
Writerand read from theReader@ebetica @ezyang @zdevito @dzhulgakov