[Object Spilling] Multi node file spilling#13336
rkooo567 wants to merge 3 commits into ray-project:master
Conversation
@stephanie-wang Can you take a look at the logic part before I start cleaning up the code? I need to
Just skimmed the code so far, and I'm a bit surprised there are changes to the ObjectManager. Is there a way we can keep the changes more contained to the LocalObjectManager or the Python-based spill/restore code? I guess I was imagining something like this:
It seems like we don't need many (any?) changes for steps 1 and 3, and just need to implement step 2. Or maybe I'm missing something?
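As a rough illustration of the proposed flow (a minimal sketch; the class and method names here are illustrative stand-ins, not Ray's actual API), spilling records where the object went, restoring is done only by the node that spilled it, and other nodes then pull the restored copy through the normal object transfer path:

```python
# Minimal sketch of the proposed design (illustrative names, not Ray's API).
class LocalObjectManager:
    def __init__(self, node_id, external_storage):
        self.node_id = node_id
        self.storage = external_storage  # shared dict standing in for disk/S3
        self.spilled_urls = {}           # object_id -> spill URL on this node

    def spill(self, object_id, value):
        # Step 1: write the object to external storage and remember the URL.
        url = f"spill://{self.node_id}/{object_id}"
        self.storage[url] = value
        self.spilled_urls[object_id] = url
        return url

    def restore(self, object_id):
        # Step 2: only the node that spilled the object reads it back;
        # remote nodes would reach this via an RPC, then pull normally.
        return self.storage[self.spilled_urls[object_id]]
```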
    RAY_CHECK(!local_objects_.empty() || used_memory_ == 0);
    ray::Status status =
        object_directory_->ReportObjectRemoved(object_id, self_node_id_, object_info);
    if (!is_object_spilled_locally_(object_id)) {
Without this, remote nodes will never be able to find that this node has an object because the location is removed when the object is evicted.
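The guard can be sketched like this (hypothetical Python stand-ins for the C++ `ObjectDirectory`; `on_object_evicted` and the `spilled_locally` flag are illustrative names, not Ray's): the directory entry is removed on eviction only when the evicted copy was not also spilled to this node's disk, so remote nodes can still discover the spilled copy.

```python
class ObjectDirectory:
    """Toy stand-in for the object directory: object_id -> node locations."""

    def __init__(self):
        self.locations = {}  # object_id -> set of node ids holding the object

    def report_object_added(self, object_id, node_id):
        self.locations.setdefault(object_id, set()).add(node_id)

    def report_object_removed(self, object_id, node_id):
        self.locations.get(object_id, set()).discard(node_id)


def on_object_evicted(directory, object_id, node_id, spilled_locally):
    # Keep the location entry alive if the object was spilled on this node,
    # so remote pulls can still find (and trigger a restore of) the object.
    if not spilled_locally:
        directory.report_object_removed(object_id, node_id)
```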
    }
    // If the object was spilled on this node, we should try pulling it.
    // In this case, the object will be pulled from external storage on the local node.
    if (is_object_spilled_locally_(object_id)) {
This is essential because HandlePull assumes that the object already exists on the local node. In the object spilling case that assumption does not hold, so we need another pull here.
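The extra pull path described here might look roughly like this (a sketch only; the function names and signature are invented for illustration, not Ray's HandlePull): serve the object if it is already local, and otherwise restore it from external storage first when it was spilled on this node.

```python
def handle_pull(local_store, spilled_locally, restore_from_disk, object_id):
    # Sketch of the extra pull path (illustrative, not Ray's HandlePull).
    if object_id in local_store:
        # Normal case: the object is already in the local object store.
        return local_store[object_id]
    if spilled_locally(object_id):
        # Spilled case: bring the object back from external storage first.
        local_store[object_id] = restore_from_disk(object_id)
        return local_store[object_id]
    # Not here and not spilled here: fall back to the normal remote pull.
    return None
```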
I think that's one of the possible implementations. In that case, I think we should also store which node spilled the object alongside the URL (so that we can send the RPC). The current implementation is based on https://docs.google.com/document/d/1kN5D50kBkUp_xs8nTGHexxvn88zkZ3rKB22Vnc3rDMI/edit#heading=h.yy0u8n49j8ns, and it was chosen for its simplicity (and it is more intuitive imo). This implementation restores the object only when it is spilled locally. But it turned out that there were some edge cases, because our Pull/Push logic assumes the object already exists in the local object store, and handling them requires some changes in the object manager. That said, I am not against your idea, since the current implementation is more complex than expected. Thoughts @ericl?
Hmm, I agree @stephanie-wang's solution is simpler given this PR. If I understand correctly, we just need to add a new RPC and keep retrying it, right? Also, can the raylet address be encoded in the URL?
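Encoding the raylet address in the spill URL could be as simple as a query parameter (a sketch of one possible convention; the `node_id` parameter and these helper names are assumptions, not what Ray ships):

```python
from urllib.parse import urlparse, parse_qs


def make_spill_url(base_url, node_id):
    # Append the spilling raylet's id as a query parameter (illustrative).
    return f"{base_url}?node_id={node_id}"


def parse_spill_url(url):
    # Recover the storage location and the node that spilled the object.
    parsed = urlparse(url)
    node_id = parse_qs(parsed.query)["node_id"][0]
    base = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
    return base, node_id
```

Any node that learns the URL can then tell which raylet to send the restore RPC to, without a separate directory field.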
@stephanie-wang I think we should also figure out how to avoid restoring the same object again and again when there are multiple requests from many nodes. This means we should keep track of which objects are being restored and return early from a restore request if one is already in progress. @ericl Yeah, that's also possible. I feel like just having one more field is probably simpler than encoding it in the URL (since we would need to pass the IP address to the IO worker). Besides, I don't think the current solution is much more complicated than the other? Both solutions require changes to the pull mechanism: Stephanie's solution adds a new path for pulling an object that was previously handled only by Pull, and debugging might be harder because the work is split across two moving components, whereas the current solution needs changes in the pulling logic to take spilled objects into account.
This seems orthogonal to multi-node file spilling. Doesn't this just mean we should deduplicate all restore requests in the LocalObjectManager, not just ones that are reloading from disk?
I think the advantage of what I suggested is that it keeps the logic related to spilling contained to the LocalObjectManager. This design seems to leak details related to spilling to the ObjectManager.
Yeah, you are right that we need it for every storage backend. But I am not sure we need dedup with the current approach, because I believe the pull is already deduped and we restore objects based on that pull (lmk if that's wrong; then we need to add dedup logic). I agree the new approach has a better abstraction barrier, and I don't think there are any more edge cases beyond this dedup issue. How about this: I will keep this PR open, try implementing the new approach, and see if it is simple enough. If so, I will make a PR with that and close this one. If there are tricky edge cases, I will probably go with the current approach.
I think either way, we should dedupe at the LocalObjectManager because the Pull itself can duplicate calls (e.g., if the timer fires again before the restoration is complete).
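A minimal sketch of that dedup inside the LocalObjectManager (hypothetical names, assuming single-threaded event-loop execution as in the raylet): track in-flight restores so a duplicate request, e.g. from the pull timer firing again, becomes a no-op until the first restore finishes.

```python
class RestoreDeduper:
    """Track in-flight restores so duplicate requests become no-ops (sketch)."""

    def __init__(self):
        self.in_progress = set()

    def try_start(self, object_id):
        # True  -> caller should kick off the restore now.
        # False -> a restore for this object is already in flight
        #          (e.g. the pull retry timer fired again).
        if object_id in self.in_progress:
            return False
        self.in_progress.add(object_id)
        return True

    def finish(self, object_id):
        # Called when the restore completes (or fails), allowing retries.
        self.in_progress.discard(object_id)
```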
Sounds good, feel free to open another PR for the other approach and assign me again.
Why are these changes needed?
NOTE: Not done yet. Waiting for Stephanie's review just to make sure we are on the same page in terms of implementation before making changes that are necessary (e.g., cleaning stuff)
Related issue number
Checks
scripts/format.sh to lint the changes in this PR.