Closed
Labels
P1: Issue that should be fixed within a few weeks
bug: Something that is supposed to be working; but isn't
rdt: Ray Direct Transport
Description
What happened + What you expected to happen
The GPU objects POC PR (#52938) causes a performance degradation because it passes ObjectRef for small objects.
- Before the GPU objects POC PR, mutable_arg->clear_object_ref(); was called. After the PR, the whole object ref is transferred, which slows down a very niche case: stage 2 of test_many_tasks.py. (See [DON'T MERGE] #53575 for more details.)
  - Before GPU objects POC PR: stage_2_time 168.2283821105957 seconds
  - After GPU objects POC PR: stage_2_time 287.988343000412 seconds (about 71% slower)
- Clearing owner_address ([core] Clear owner_address when inline an small object to speed up #53590) reduces the size of the data transfer, because only the object ID is needed.
  - stage_2_time 198.5783519744873 seconds (still about 18% above the pre-PR time)
The solution is to clear the object ref whenever it is not a GPU object ref.
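The proposed rule can be sketched roughly as follows. This is a minimal Python model of the logic, not Ray's actual C++ code: `ObjectRef`, `TaskArg`, `is_gpu_object`, and `inline_small_object` are hypothetical stand-ins introduced only for illustration, and the real fix would live wherever `mutable_arg->clear_object_ref();` used to be called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectRef:
    """Hypothetical stand-in for the object ref attached to a task argument."""
    object_id: bytes
    owner_address: Optional[str] = None
    is_gpu_object: bool = False  # assumed flag, not a real Ray field


@dataclass
class TaskArg:
    """Hypothetical stand-in for a task argument that may be inlined."""
    inlined_data: Optional[bytes] = None
    object_ref: Optional[ObjectRef] = None


def inline_small_object(arg: TaskArg, data: bytes) -> TaskArg:
    """Inline a small object's bytes into the argument.

    The ref is dropped for ordinary objects, since the inlined bytes are
    enough, and kept for GPU objects, which (by assumption) still need it.
    """
    arg.inlined_data = data
    if arg.object_ref is not None and not arg.object_ref.is_gpu_object:
        arg.object_ref = None  # models mutable_arg->clear_object_ref()
    return arg
```

Under this sketch, only GPU object refs pay the cost of shipping the full ref, so the common small-object path would go back to the pre-PR payload size.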
Versions / Dependencies
nightly
Reproduction script
Trigger release tests with name:stress_test_many_tasks.aws
Issue Severity
None