[WIP] Module export using custom file format#9794
jamesr66a wants to merge 4 commits into pytorch:master
23dd447 to 17e4045
@pytorchbot retest this please
zdevito left a comment
Generally the on-disk format looks right, but the APIs for reading and writing need serious cleanup. We need to get them right now; otherwise the improvements we will want to make later won't be possible.
// string versions of their file offsets. This also finalizes the file,
// and calling serializeTensor after calling this method is illegal.
// NOTE: this method mutates the model proto
size_t serializeModelProto(::ONNX_NAMESPACE::ModelProto *model_proto) {
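The comment above says that writing a tensor after the model proto has been serialized is illegal. One way to make that rule enforceable rather than documented-only is a guard flag on the writer. The following is a minimal sketch under stated assumptions: `RecordWriterSketch`, `writeRecord`, and `finalize` are hypothetical names, not the PR's API, and an in-memory buffer stands in for the file.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the PR's writer; only the finalize guard is modeled.
class RecordWriterSketch {
 public:
  // Appends a record and returns the offset at which it starts.
  // Throws if the writer has already been finalized.
  std::size_t writeRecord(const std::string& bytes) {
    if (finalized_) {
      throw std::logic_error("writeRecord called after finalize");
    }
    const std::size_t offset = data_.size();
    data_.insert(data_.end(), bytes.begin(), bytes.end());
    return offset;
  }

  // Writes the last record (the model proto in the PR) and seals the writer.
  std::size_t finalize(const std::string& last_record) {
    const std::size_t offset = writeRecord(last_record);
    finalized_ = true;
    return offset;
  }

  bool finalized() const { return finalized_; }

 private:
  std::vector<char> data_;  // in-memory stand-in for the output file
  bool finalized_ = false;
};
```

With a guard like this, misuse fails loudly at the call site instead of silently producing a file whose last record is not the model proto.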
}

// Serialize a tensor to file, then return its offset
size_t serializeTensor(const std::string& name, at::Tensor t) {
padToNextAlignmentBoundary();
}
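The body of `padToNextAlignmentBoundary` is not visible in this excerpt. As an illustration of what padding to an alignment boundary typically involves, here is a small sketch. The names `kAlignment`, `paddingFor`, and `padToAlignment` are hypothetical, the alignment value of 64 is an assumption (the PR's actual value is not shown here), and a byte vector stands in for the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical alignment; the value used by the PR is not shown in this excerpt.
constexpr std::size_t kAlignment = 64;

// Number of padding bytes needed to move `offset` up to the next multiple of
// `alignment`. Returns 0 when `offset` is already aligned.
inline std::size_t paddingFor(std::size_t offset, std::size_t alignment) {
  // The bit trick below requires a power-of-two alignment.
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (alignment - (offset & (alignment - 1))) & (alignment - 1);
}

// Appends zero bytes to an in-memory buffer until its size is aligned.
inline void padToAlignment(std::vector<std::uint8_t>& buf,
                           std::size_t alignment) {
  buf.insert(buf.end(), paddingFor(buf.size(), alignment), std::uint8_t{0});
}
```

Padding after every record means each tensor's data starts at an aligned offset, which is what would allow a reader to use the bytes in place without copying them to an aligned buffer first.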

void swapTensorAttributeNames(::ONNX_NAMESPACE::GraphProto *g) {
readAndValidateFileHeader();
}

// returns raw data map and the index of the last record (i.e. the ModelProto)
}

// returns raw data map and the index of the last record (i.e. the ModelProto)
std::tuple<std::unordered_map<std::string, std::string>&, size_t> read_raw() {
~PyTorchFileReader() {
std::fclose(fp);
}
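The destructor above closes `fp` unconditionally. The excerpt does not show how `fp` is opened, so whether it can ever be null is unknown; if it can, `std::fclose` on a null pointer is undefined behavior. A common alternative is to let a `std::unique_ptr` with a custom deleter own the handle. The sketch below uses hypothetical names (`FileCloser`, `UniqueFile`, `openOrThrow`) and is not taken from the PR.

```cpp
#include <cstdio>
#include <memory>
#include <stdexcept>
#include <string>

// Deleter that closes a FILE*. unique_ptr never invokes it on a null pointer.
struct FileCloser {
  void operator()(std::FILE* fp) const noexcept { std::fclose(fp); }
};
using UniqueFile = std::unique_ptr<std::FILE, FileCloser>;

// Opens a file or throws, so callers never hold a null handle.
inline UniqueFile openOrThrow(const std::string& path, const char* mode) {
  UniqueFile f(std::fopen(path.c_str(), mode));
  if (!f) {
    throw std::runtime_error("cannot open " + path);
  }
  return f;
}
```

A class holding a `UniqueFile` member needs no hand-written destructor, and it becomes move-only by default, which rules out closing the same handle twice through an accidental copy.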
return std::make_tuple(model_proto.SerializeAsString(), graph_encoder.get_raw_data_export_map());
}

class PyTorchFileWriter {
}, py::arg("onnx_opset_version")=0,
py::arg("operator_export_type")=::torch::onnx::OperatorExportTypes::RAW)
.def("export_to_pytorch_file", [](const std::shared_ptr<Module> m, const std::string& filename,
int64_t onnx_opset_version,
Starting new PR soon
Summary: This is a follow-up to #9794 that contains only the serialization library and exposes a cleaner API. This should later be incorporated into the module export code.
Pull Request resolved: #9900
Reviewed By: zdevito
Differential Revision: D9021057
Pulled By: jamesr66a
fbshipit-source-id: 01af74a7fdd1b90b2f5484644c3121d8ba9eb3b3
Stacked on #9746. Look at the last commit.
This implements module (de)serialization using a custom aligned data format.