Skip to content

[Core] task scheduling hangs forever #19326

@scv119

Description

@scv119

Search before asking

  • I searched the issues and found no similar issues.

Ray Component

Ray Core

What happened + What you expected to happen

The dataset ingestion e2e test hangs with following error. This could either be dataset bug/scheduling error/reporting error.

2021-10-12 05:44:32,010	WARNING worker.py:1228 -- The actor or task with ID 38b7b52b8a7371117241b1a3f60f62e6f940fd5479d1b53b cannot be scheduled right now. It requires {node:172.31.57.230: 0.001000}, {CPU: 1.000000} for placement, however the cluster currently cannot provide the requested resources. The required resources may be added as autoscaling takes place or placement groups are scheduled. Otherwise, consider reducing the resource requirements of the task.

another case:

https://anyscaleteam.slack.com/archives/CLJQR6420/p1634346237198100

and another one:

#19207

Versions / Dependencies

latest ray

Reproduction script

N/A

Anything else

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Metadata

Metadata

Assignees

Labels

P2Important issue, but not time-criticalbugSomething that is supposed to be working; but isn't

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions