New features in cloudpickle give unexpected results #214

@suquark

I am a developer of Ray, and we have something we want to fix soon. We are updating cloudpickle to 0.6.1 to fix an issue. Version 0.6.1 does fix that issue, but it fails to pickle several objects correctly. These failures only occur when we run a complex set of unit tests, so we have found it hard to give simple instructions that reproduce them.

I investigated those cases and created a PR that fixes the failures. The PR showed that:

  • When restoring global variables, Ray should always give the serialized global variables higher priority (that is, they always override existing global variables). The new changes in cloudpickle give higher priority to existing global variables.
  • Ray should not try to infer global variables from the __module__ attribute of a function. The new changes in cloudpickle try to make use of __module__ when fetching the global variables fails.
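
To make the two differences concrete, here is a minimal sketch of the two policies. It is not cloudpickle's actual code: `restore_globals`, `lookup_base_globals`, and the flags `override_existing` and `infer_from_module` are hypothetical names used only for illustration, and the real logic in cloudpickle may differ in detail.

```python
import sys


def restore_globals(existing, serialized, override_existing):
    """Merge pickled globals into an existing globals dict (hypothetical helper)."""
    merged = dict(existing)
    for name, value in serialized.items():
        # With override_existing=True the serialized value always wins;
        # otherwise a name that already exists keeps its current value.
        if override_existing or name not in merged:
            merged[name] = value
    return merged


def lookup_base_globals(module_name, infer_from_module):
    """Pick the starting globals for an unpickled function (hypothetical helper)."""
    if infer_from_module and module_name in sys.modules:
        # Reuse the live namespace of the module named by __module__.
        return vars(sys.modules[module_name])
    # Otherwise start from a fresh, isolated namespace.
    return {}


existing = {"x": "existing", "y": "existing"}
serialized = {"x": "serialized", "z": "serialized"}

# Behavior Ray needs: serialized globals override existing ones.
ray_style = restore_globals(existing, serialized, override_existing=True)

# Behavior this issue attributes to the new cloudpickle changes:
# existing globals take priority.
new_style = restore_globals(existing, serialized, override_existing=False)
```

In this sketch, `ray_style["x"]` ends up as the serialized value while `new_style["x"]` keeps the existing one, which is the kind of divergence that only shows up when a name is already defined in the target namespace.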

It seems that these new features are not compatible with Ray, but they are not really bugs that should be reverted. Since we would like to follow cloudpickle upstream instead of maintaining our own copy, the proper approach is probably to add switches upstream that control whether these new features are enabled. Do you have any suggestions about how and where we should put those switches?
