[Core] Cancel lease requests before returning a PG bundle #46116

Merged
jjyao merged 6 commits into ray-project:master from jjyao:llkea
Jun 20, 2024

@jjyao
Contributor

@jjyao jjyao commented Jun 18, 2024

Why are these changes needed?

Redo #45919

Related issue number

Closes #45642

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@jjyao jjyao requested a review from a team as a code owner June 18, 2024 04:13
jjyao added 2 commits June 17, 2024 22:02
Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>
Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>
@jjyao jjyao added the "go" label (add ONLY when ready to merge, run all tests) Jun 18, 2024
Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>
Comment on lines +1908 to +1909
    return (bundle_id.first == bundle_spec.PlacementGroupId()) &&
           (work->GetState() == internal::WorkStatus::WAITING_FOR_WORKER);
Contributor Author

Fix for the revert: the CancelResourceReserve RPC might be called when prepare fails. In that case we shouldn't cancel the lease request, since the PG will be retried and eventually created. So this fix limits the cancellation to lease requests that are waiting for workers, since that is the only case where we need to free bundle resources.
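
To illustrate the filter described above, here is a minimal, self-contained sketch. It is not the raylet code: `LeaseRequest`, `BundleId`, `ShouldCancel`, and `CancelMatching` are hypothetical stand-ins, and the `WorkStatus` enum is a simplified local copy that only assumes the `WAITING_FOR_WORKER` state named in the diff. It only shows the shape of the condition: a request is cancelled when it targets the placement group being returned and is already waiting for a worker.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for illustration only; not the raylet's real enum.
enum class WorkStatus { WAITING, WAITING_FOR_WORKER, CANCELLED };

// (placement group id, bundle index), mirroring the `bundle_id.first`
// access in the diff. The string id type is an assumption.
using BundleId = std::pair<std::string, int>;

// Hypothetical, stripped-down lease request.
struct LeaseRequest {
  BundleId bundle_id;
  WorkStatus state;
};

// Same shape as the predicate in the diff: the request must belong to the
// placement group being returned AND already be waiting for a worker.
bool ShouldCancel(const LeaseRequest &req, const std::string &pg_id) {
  return (req.bundle_id.first == pg_id) &&
         (req.state == WorkStatus::WAITING_FOR_WORKER);
}

// Marks every matching request as cancelled and returns how many matched.
int CancelMatching(std::vector<LeaseRequest> &requests,
                   const std::string &pg_id) {
  int cancelled = 0;
  for (auto &req : requests) {
    if (ShouldCancel(req, pg_id)) {
      req.state = WorkStatus::CANCELLED;
      ++cancelled;
    }
  }
  return cancelled;
}
```

In this sketch, a request for the same placement group that is still in the `WAITING` state is left alone, which corresponds to the prepare-failed case where the PG is expected to be retried.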

jjyao added 2 commits June 20, 2024 07:09
Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>

Labels

go (add ONLY when ready to merge, run all tests)

Development

Successfully merging this pull request may close these issues.

[Core] Raylet check failed: placement_group_resource_manager.cc:29: Check failed: ReturnBundle(*iter->second).ok()

2 participants