
IValue pickle does not work properly if an empty tensor table is not provided #25591

@yxjiang

Description


🐛 Bug

According to the description for the torch::jit::pickle API, the parameter tensor_table is optional. If not provided, the tensors are expected to be stored in the same byte stream as the pickle data. See https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/pickle.h#L9-L39.

However, if we pass only the IValue without an empty tensor_table, the program throws an exception:

C++ exception with description "Expected length to be 8, got 10 (readInstruction at external/torch/torch/csrc/jit/pickler.cpp:654)
frame #0: <unknown function> +  (0x7f8afbe1ee9f in [0x7f8afbe1ee9f])
frame #1: <unknown function> +  (0x7f8afbe1ee3f in [0x7f8afbe1ee3f])
...

To Reproduce

Steps to reproduce the behavior:

  1. Reuse the existing test case:

    TEST(TorchScriptTest, TestPickle) {
      torch::IValue float_value(2.3);
      // TODO: when tensors are stored in the pickle, delete this
      std::vector<at::Tensor> tensor_table;
      auto data = torch::jit::pickle(float_value, &tensor_table);
      torch::IValue ivalue = torch::jit::unpickle(data.data(), data.size());
      double diff = ivalue.toDouble() - float_value.toDouble();
      double eps = 0.0001;
      ASSERT_TRUE(diff < eps && diff > -eps);
    }

    and change only:

    auto data = torch::jit::pickle(float_value, &tensor_table);

    to

    auto data = torch::jit::pickle(float_value);
  2. It would be better to also test more complex cases. For example:

    // Test with a list of tensors
    c10::List<torch::Tensor> list;
    std::vector<torch::Tensor> tensor_table;
    list.push_back(torch::rand({2, 3}));
    list.push_back(torch::rand({3, 2}));
    auto data = torch::jit::pickle(list, &tensor_table);
    c10::IValue deserialized = torch::jit::unpickle(data.data(), data.size());

Or

    // Elements:
    //   list<dict<string, tensor>, dict<string, tensor>>
    //   1x1 tensor
    //   2x1 tensor
    c10::impl::GenericList inputs = static_cast<c10::impl::GenericList>(c10::impl::deprecatedUntypedList());
    torch::Dict<std::string, torch::Tensor> iDict1;
    iDict1.insert("key1", torch::ones(1));
    iDict1.insert("key2", torch::ones({2, 2}));

    torch::Dict<std::string, torch::Tensor> iDict2;
    iDict2.insert("key3", torch::rand({3, 3}));
    iDict2.insert("key4", torch::rand({4, 4}));

    // list<dict<string, tensor>, dict<string, tensor>>
    torch::List<c10::Dict<std::string, torch::Tensor>> iComplex;
    iComplex.push_back(iDict1);
    iComplex.push_back(iDict2);

    torch::IValue tensorA = torch::ones(1);
    torch::IValue tensorB = torch::zeros({2, 1});

    inputs.push_back(iComplex);
    inputs.push_back(tensorA);
    inputs.push_back(tensorB);

    std::vector<char> data = torch::jit::pickle(inputs);
    torch::IValue deserialized = torch::jit::unpickle(data.data(), data.size());

Expected behavior

The deserialized IValue should compare equal to the original IValue.

Environment

Please copy and paste the output from our environment collection script (or fill out the checklist below manually).

You can get the script and run it with:

wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py

  • PyTorch Version (e.g., 1.0): 50161f3 (after 1.3)
  • OS (e.g., Linux): Ubuntu 18.04.3 LTS
  • How you installed PyTorch (conda, pip, source): source
  • Build command you used (if compiling from source): bazel
  • Python version: 3.4.10
  • cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.2
  • GPU models and configuration: GPU 0: Quadro P2000
  • Nvidia driver version: 430.40

Additional context

We are Uber ATG, and this feature was requested by us (#23241; thanks @driazati for working on it!). The feature was not included in PyTorch 1.3.

cc @suo
