[core][gpu-objects] Fix the performance regression by clearing object_ref for small and non-GPU objects #53692
Merged
stephanie-wang merged 7 commits into ray-project:master on Jun 10, 2025
Signed-off-by: Kai-Hsun Chen <kaihsun@anyscale.com>
The regression was fixed. https://buildkite.com/ray-project/release/builds/45142#019758ea-6d72-4590-a80c-d43a2ecaf2f2
Pull Request Overview
This PR addresses a performance regression in handling small and non-GPU objects by conditionally clearing the object_ref based on the tensor transport. The changes introduce a new TensorTransportGetter parameter across task submitters, dependency resolvers, and reference counting components while updating tests and mocks accordingly.
- Introduced TensorTransportGetter to determine whether to clear the object reference.
- Updated constructors and function signatures in production and test files.
- Added tests to validate behavior for mixed tensor transports.
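The conditional clearing described above can be sketched roughly as follows. This is a minimal illustration, not the code from `dependency_resolver.cc`: `TaskArg`, the string-based `ObjectID`, the two-value `TensorTransport` enum, and this simplified `InlineDependencies` signature are all hypothetical stand-ins. It also assumes, following the PR title, that the ref is cleared for non-GPU (OBJECT_STORE) arguments and kept for other transports, and that an unknown transport defaults to OBJECT_STORE.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; the real Ray types are protobuf-backed and richer.
enum class TensorTransport { OBJECT_STORE, NCCL };
using ObjectID = std::string;

// Callback that reports the transport recorded for an object, if any.
using TensorTransportGetter =
    std::function<std::optional<TensorTransport>(const ObjectID &)>;

struct TaskArg {
  std::optional<ObjectID> object_ref;  // set while the arg is passed by reference
  std::string inlined_data;            // filled once the value is inlined
};

// Inline resolved values into the task args. The ref is dropped only for
// object-store args (assumption: unknown transports default to OBJECT_STORE);
// args using another transport keep their ref.
void InlineDependencies(const std::unordered_map<ObjectID, std::string> &resolved,
                        std::vector<TaskArg> &args,
                        const TensorTransportGetter &get_transport) {
  for (auto &arg : args) {
    if (!arg.object_ref) continue;  // already passed by value
    const ObjectID id = *arg.object_ref;  // copy before a possible reset
    auto it = resolved.find(id);
    if (it == resolved.end()) continue;  // not resolved locally
    arg.inlined_data = it->second;
    const TensorTransport transport =
        get_transport(id).value_or(TensorTransport::OBJECT_STORE);
    if (transport == TensorTransport::OBJECT_STORE) {
      arg.object_ref.reset();  // nothing needs to look this object up by ID
    }
  }
}
```

Passing the transport lookup as a callback keeps the resolver independent of whichever component tracks ownership, which is also what lets tests substitute a simple lambda.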
Reviewed Changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/ray/core_worker/transport/normal_task_submitter.h | Added TensorTransportGetter to the constructor and passed it to the resolver. |
| src/ray/core_worker/transport/dependency_resolver.h | Introduced TensorTransportGetter type and stored it for dependency resolution. |
| src/ray/core_worker/transport/dependency_resolver.cc | Updated InlineDependencies to use the new tensor transport check before clearing object_ref. |
| src/ray/core_worker/transport/actor_task_submitter.h | Injected TensorTransportGetter into the actor task submitter. |
| src/ray/core_worker/test/normal_task_submitter_test.cc | Updated tests to provide a lambda returning OBJECT_STORE for tensor transport. |
| src/ray/core_worker/test/direct_actor_transport_mock_test.cc | Updated ActorTaskSubmitter instantiation in tests with the new parameter. |
| src/ray/core_worker/test/dependency_resolver_test.cc | Modified LocalDependencyResolver instantiation and added tests for mixed tensor transports. |
| src/ray/core_worker/test/actor_task_submitter_test.cc | Added TensorTransportGetter in test instantiation. |
| src/ray/core_worker/task_manager.cc | Propagated the tensor transport value when adding pending tasks. |
| src/ray/core_worker/reference_count.h and reference_count.cc | Updated object ownership functions with the new tensor_transport parameter and added accessor for tensor transport. |
| src/ray/core_worker/core_worker.cc | Modified ActorTaskSubmitter construction and lease submitter instantiation with the new tensor transport logic. |
| src/mock/ray/core_worker/reference_count.h | Updated mock method signature for AddOwnedObject to match the new parameters. |
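The plumbing summarized in the table (the reference counter records a tensor transport per owned object, exposes an accessor, and the submitters and resolver receive a getter) might look something like the sketch below. The class shapes are assumptions for illustration only: `GetTensorTransport`, `ShouldClearObjectRef`, the simplified `AddOwnedObject` signature, and this `DependencyResolver` are hypothetical and do not mirror Ray's actual declarations.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins for the real Ray types.
enum class TensorTransport { OBJECT_STORE, NCCL };
using ObjectID = std::string;
using TensorTransportGetter =
    std::function<std::optional<TensorTransport>(const ObjectID &)>;

// Hypothetical reference counter that records each owned object's transport.
class ReferenceCounter {
 public:
  void AddOwnedObject(const ObjectID &id,
                      TensorTransport transport = TensorTransport::OBJECT_STORE) {
    transports_[id] = transport;
  }

  // Accessor: returns nullopt when the object is not tracked here.
  std::optional<TensorTransport> GetTensorTransport(const ObjectID &id) const {
    auto it = transports_.find(id);
    if (it == transports_.end()) return std::nullopt;
    return it->second;
  }

 private:
  std::unordered_map<ObjectID, TensorTransport> transports_;
};

// Hypothetical resolver that sees only the callback, not the counter itself.
class DependencyResolver {
 public:
  explicit DependencyResolver(TensorTransportGetter getter)
      : get_transport_(std::move(getter)) {}

  // Assumption: objects with no recorded transport are treated as OBJECT_STORE.
  bool ShouldClearObjectRef(const ObjectID &id) const {
    return get_transport_(id).value_or(TensorTransport::OBJECT_STORE) ==
           TensorTransport::OBJECT_STORE;
  }

 private:
  TensorTransportGetter get_transport_;
};
```

In production wiring the getter would be a lambda that captures the reference counter and forwards to its accessor; in tests it can be a lambda that always returns OBJECT_STORE, as the updated submitter tests are described as doing.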
kevin85421 commented on Jun 10, 2025
      ASSERT_EQ(resolver.NumPendingTasks(), 0);
    }

    TEST(LocalDependencyResolverTest, TestMixedTensorTransport) {
unit test for this PR
Nice, thanks for adding this.
stephanie-wang approved these changes on Jun 10, 2025
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu> Signed-off-by: Kai-Hsun Chen <kaihsun@apache.org>
elliot-barn pushed a commit that referenced this pull request on Jun 18, 2025
…t_ref` for small and non-GPU objects (#53692)
This PR is based on #53630.
See #53623 for the issue. In this PR, we clear the object ref when the arg's tensor transport is not OBJECT_STORE.
Closes #53623
---------
Signed-off-by: Stephanie wang <smwang@cs.washington.edu>
Signed-off-by: Kai-Hsun Chen <kaihsun@anyscale.com>
Signed-off-by: Kai-Hsun Chen <kaihsun@apache.org>
Co-authored-by: Stephanie wang <smwang@cs.washington.edu>
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
elliot-barn pushed a commit that referenced this pull request on Jul 2, 2025
…t_ref` for small and non-GPU objects (#53692)
This PR is based on #53630.
See #53623 for the issue. In this PR, we clear the object ref when the arg's tensor transport is not OBJECT_STORE.
Closes #53623
---------
Signed-off-by: Stephanie wang <smwang@cs.washington.edu>
Signed-off-by: Kai-Hsun Chen <kaihsun@anyscale.com>
Signed-off-by: Kai-Hsun Chen <kaihsun@apache.org>
Co-authored-by: Stephanie wang <smwang@cs.washington.edu>
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
Why are these changes needed?
This PR is based on #53630.
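See #53623 for the issue. In this PR, we clear the object ref only when the arg's tensor transport is OBJECT_STORE, that is, for small, non-GPU objects as stated in the title. For any other tensor transport the object ref is kept.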
Related issue number
Closes #53623
Checks
- ... git commit -s) in this PR.
- ... scripts/format.sh to lint the changes in this PR.
- ... method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.