
[Onnx] - refactoring serialization of ONNX initializers to be name-based #17420

Closed
spandantiwari wants to merge 8 commits into pytorch:master from spandantiwari:spandantiwari/params_with_names

Conversation

@spandantiwari

Currently, serialization of model parameters in ONNX export depends on the order in which they are stored in a container (a list on the Python side, a std::vector on the C++ side). This has worked fine until now, but if we need to run a pass on the graph that mutates the parameter list, strictly order-based serialization may break.

This PR is the first in a set that brings in more passes (such as constant folding) related to ONNX export. It lays the groundwork by moving serialization in ONNX export from an order-based to a name-based approach, which is more amenable to such passes.

@houseroad - As discussed this change uses a map for export, and removes the code from export.cpp that relies on the order to compute initializer names.
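For illustration, here is a minimal Python sketch (hypothetical parameter names, not the actual PyTorch export code) of the difference between order-based and name-based serialization of initializers:

```python
# Hypothetical sketch: how order-based vs. name-based initializer
# serialization behaves when a pass mutates the parameter list.

params = [("conv.weight", [1.0, 2.0]), ("conv.bias", [0.5])]

# Order-based: initializer i is matched to graph input i purely by
# position, so reordering or dropping an entry corrupts the mapping.
order_based = [tensor for _, tensor in params]

# Name-based: each initializer carries its name, so passes such as
# constant folding can mutate the list without breaking the mapping.
name_based = {name: tensor for name, tensor in params}

print(name_based["conv.bias"])  # looked up by name, not by position
```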

@facebook-github-bot added the oncall: jit label on Feb 22, 2019
@houseroad (Member) left a comment


Overall, the idea looks good to me. Could you fix the build issue?

@spandantiwari (Author)

@houseroad - yes, working on it.

@spandantiwari force-pushed the spandantiwari/params_with_names branch from cbf0ff6 to ba538db on February 27, 2019 01:50
@spandantiwari (Author)

Submitted a fix for the build issue.

@spandantiwari (Author)

Seeing test failure(s). Investigating...

@spandantiwari force-pushed the spandantiwari/params_with_names branch 2 times, most recently from ea85ca5 to e9ee672 on February 28, 2019 21:43
@spandantiwari (Author)

spandantiwari commented Feb 28, 2019

A couple of ONNX export tests in test_jit.py were failing because the order of initializers can now differ. Did a sanity check that the model is indeed the same as before and that the model outputs are still the same. Fixed the tests by updating their .expect files.

@houseroad - I also saw some unexpected SIGFPE type failure in the last build (possibly similar to your reserve_param PR CI build). Waiting to see if that shows up in this build or not.

@spandantiwari (Author)

@houseroad - I am seeing that SIGFPE-type failure again. Could you please take a look and suggest what can be done to fix it? Thanks.

@spandantiwari force-pushed the spandantiwari/params_with_names branch 2 times, most recently from 55038e3 to 0acd0e3 on March 4, 2019 23:59
@spandantiwari (Author)

Rebased to master.

@houseroad (Member) left a comment


If we would like to preserve the order, we could try using an ordered map (OrderedDict in Python).
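As a sketch of that suggestion (illustrative only, with made-up parameter names, not the PR's actual code), an OrderedDict keys initializers by name while preserving insertion order for serialization:

```python
from collections import OrderedDict

# Illustrative: name-based lookup plus deterministic, insertion-preserving
# iteration order for serialization.
initializers = OrderedDict()
initializers["conv1.weight"] = [1.0]
initializers["conv1.bias"] = [0.1]

# Serialization can walk the map in insertion order...
ordered_names = list(initializers)

# ...while passes still look entries up by name.
assert "conv1.bias" in initializers
print(ordered_names)
```

Plain dicts also preserve insertion order in CPython 3.6+ (guaranteed from Python 3.7), but at the time of this PR OrderedDict was the explicit, version-safe choice.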

@spandantiwari force-pushed the spandantiwari/params_with_names branch from 0acd0e3 to 6c1a725 on March 6, 2019 21:58
@spandantiwari (Author)

Updated the export method to use std::map so that initializers are serialized in a deterministic order during ONNX export, for testing purposes.
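Because std::map iterates its keys in sorted order, serialization becomes deterministic regardless of how the map was populated. A rough Python analogue of that behavior (illustrative, with made-up names, not the actual export.cpp code):

```python
# Illustrative: iterating initializers sorted by name yields the same
# serialization order no matter how the map was built, mirroring
# std::map's sorted-key iteration on the C++ export path.
initializers = {"fc.bias": [0.1], "conv.weight": [2.0], "fc.weight": [3.0]}

serialization_order = sorted(initializers)
print(serialization_order)  # ['conv.weight', 'fc.bias', 'fc.weight']
```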

@houseroad (Member)

@pytorchbot retest this please

@houseroad (Member) left a comment


Looks good! This is a great change.

@facebook-github-bot (Contributor) left a comment


@houseroad has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@spandantiwari (Author)

@houseroad - thanks for your help!

petrex pushed a commit to petrex/pytorch that referenced this pull request Mar 7, 2019
* upstream/master: (24 commits)
  Automatic update of fbcode/onnx to 96c58ceeacf0f2b73d752e413e4fd78787a12da3 (pytorch#17676)
  Set the default ONNX opset to the latest stable opset (i.e., 9) (pytorch#17736)
  Add module attributes (pytorch#17309)
  - refactoring serialization of ONNX initializers to be name-based (pytorch#17420)
  ONNX Export for Max and Average Pooling in CEIL_MODE
  use flake8-mypy (pytorch#17721)
  use fp16<->fp32 intrinsic (pytorch#17496)
  Implement a Caffe2 standalone LSTM operator (pytorch#17726)
  caffe2:libtorch_cuda depends on caffe2:caffe2_gpu (pytorch#17729)
  add tensor and cost inference functions (pytorch#17684)
  ONNX Export Narrow op
  Keep the dim_type of hinted shape as BATCH if possible (pytorch#17734)
  fix different round behavior on CPU and GPU pytorch#16498 (pytorch#17443)
  Warn about memory overlaps on expanded tensors (pytorch#17576)
  fix exp fam. formula
  refactor caffe2 operator constructors - 10/9 (pytorch#17659)
  Improve ONNX symbolic for logsoftmax and softmax (pytorch#17672)
  Enable using CMD when building cpp extensions on Windows
  Do not rename net boundary inputs/outputs during ssaRewrite. (pytorch#17545)
  Reapply D14078519 (pytorch#17596)
  ...
facebook-github-bot pushed a commit that referenced this pull request Mar 29, 2019
…bmission) (#17830)

Summary:
houseroad - this is the resubmission of #17420, as suggested.
Pull Request resolved: #17830

Reviewed By: zrphercule

Differential Revision: D14398714

Pulled By: houseroad

fbshipit-source-id: bda475f1ae8a5273ebdb0f6883fc66036c29d326

Labels

oncall: jit · open source


5 participants