Skip to content

Permanently pin object IDs that are captured in functions or serialized out of band#7358

Merged
edoakes merged 2 commits intoray-project:masterfrom
edoakes:captured-ids
Feb 28, 2020
Merged

Permanently pin object IDs that are captured in functions or serialized out of band#7358
edoakes merged 2 commits intoray-project:masterfrom
edoakes:captured-ids

Conversation

@edoakes
Copy link
Copy Markdown
Collaborator

@edoakes edoakes commented Feb 27, 2020

Why are these changes needed?

ObjectIDs can't be properly reference counted when they're used in unsupported ways (e.g., serialized directly with cloudpickle or captured in a function/actor). This PR permanently increases their reference count when serialized, effectively pinning the object until the offending worker/driver disconnects.

Related issue number

Checks

@AmplabJenkins
Copy link
Copy Markdown

Can one of the admins verify this patch?

@AmplabJenkins
Copy link
Copy Markdown

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/22520/
Test FAILed.

# then pin the object for the lifetime of this worker by adding
# a local reference that won't ever be removed.
ray.worker.get_global_worker(
).core_worker.add_object_id_reference(obj)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of modifying the serialization tracking, could we instead call get_and_clear_contained_object_ids() at the end of pickling a function definition, and add all of those to globals?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that solution would be enough, since ObjectIDs can also get arbitrarily pickled by other parts of the user's code.

# then pin the object for the lifetime of this worker by adding
# a local reference that won't ever be removed.
ray.worker.get_global_worker(
).core_worker.add_object_id_reference(obj)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that solution would be enough, since ObjectIDs can also get arbitrarily pickled by other parts of the user's code.

@edoakes edoakes merged commit f321eae into ray-project:master Feb 28, 2020
ffbin pushed a commit to antgroup/ant-ray that referenced this pull request Mar 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants