Conversation
Signed-off-by: ujjawal-khare <khareu460@gmail.com>
Future-Outlier
left a comment
Sorry, I remember now that this PR is not going to be added.
cc @EagleLo, do you remember the reason?
Schedulers (e.g., Volcano, YuniKorn, Kueue, scheduler-plugins, KAI scheduler, etc.) should ensure gang scheduling.
@Future-Outlier Last time we discussed that since this applies to InteractiveMode, it's up to the user to decide when to stop or clean up the job. So we decided not to enforce a hard cutoff time in this case.
This popped up in my notifications for some reason (really no idea why)... But anyway, @kevin85421, my original request was indeed related to InteractiveMode. I'm not using a gang scheduler right now, but I don't want some of my interactive jobs to hang forever.
Why are these changes needed?
Added a WaitingTtlSeconds parameter to RayJob. WaitingTtlSeconds is the TTL, in seconds, after which a RayJob that is still waiting to be scheduled is marked as failed.
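For illustration only, the TTL check described above could be sketched roughly as follows. This is not the code in this PR: the function name `waiting_ttl_expired`, the `waiting_since` timestamp, and the choice to treat an unset or zero TTL as "no limit" are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def waiting_ttl_expired(
    waiting_since: datetime,
    waiting_ttl_seconds: Optional[int],
    now: datetime,
) -> bool:
    """Return True when a job that is still waiting has outlived its TTL.

    Hypothetical helper, not part of KubeRay.
    Assumption: a TTL of None, 0, or a negative value means "no limit".
    """
    if waiting_ttl_seconds is None or waiting_ttl_seconds <= 0:
        return False
    # The job is considered expired once it has waited for the full TTL.
    return now - waiting_since >= timedelta(seconds=waiting_ttl_seconds)


if __name__ == "__main__":
    started_waiting = datetime(2024, 1, 1, tzinfo=timezone.utc)
    current_time = started_waiting + timedelta(seconds=600)
    # A job that has waited 600 seconds against a 300 second TTL
    # would be marked as failed under this sketch.
    print(waiting_ttl_expired(started_waiting, 300, current_time))
```

Under these assumptions, a controller would run this check only while the job is still waiting to be scheduled and, once it returns true, mark the RayJob as failed.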
Related issue number
Closes #4037
Checks