[release test] train_pytorch_linear_test hangs since Monday 2/21 #22595

@xwjiang2010

Search before asking

  • I searched the issues and found no similar issues.

Ray Component

Ray Train

What happened + What you expected to happen

Failed run: https://buildkite.com/ray-project/periodic-ci/builds/2877#7efe4aca-68fd-414f-9798-153a692cebd2
Successful run: https://buildkite.com/ray-project/periodic-ci/builds/2841#9b925354-82d8-46bd-864b-508db98904db

In the successful run, training finishes in about 5 minutes. The failed run hangs until it hits the 2-hour test timeout.

Versions / Dependencies

master

Reproduction script

NA

Anything else

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Metadata

Labels

  • bug: Something that is supposed to be working; but isn't
  • triage: Needs triage (eg: priority, bug/not-bug, and owning component)
