🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/102426
Note: Links to docs will display an error until the docs builds have been completed.
❌ 3 New Failures
As of commit 9181bd5: NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Per the discussion with @clee2000, there will be another PR to update trymerge to handle unstable failures:
An unstable test job will have
@pytorchbot merge

Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
After pytorch/pytorch#102426 and pytorch/pytorch#102784 landed, unstable jobs are now hidden correctly on HUD https://hud.pytorch.org and also won't block PRs. Previously, this was done by moving unstable jobs to an unstable workflow. Now unstable jobs will stay in the same workflow, but have `unstable` in their names. This is very similar to how `rerun_disabled_tests` are ignored atm.

### Testing

https://torchci-git-fork-huydhn-ignore-unstable-jobs-fbopensource.vercel.app/metrics
Per title, after #102426 landed, it makes sense to have a new category for UNSTABLE jobs and handle them accordingly in trymerge.

* The simple approach is to check for `unstable` in the check (job) name. I plan to roll this out first and then see if we need to cover the more complicated but less common case of an unstable build job. Specifically, an unstable build job has no `unstable` in its name.
* An unstable job is ignored by trymerge. This is the same behavior we have atm when a job is moved to unstable: it's completely ignored.
* The update to Dr. CI will come later, so that unstable failures would also be hidden like broken trunk or flaky failures.

### Testing

Leverage the broken trunk Windows CPU job atm and mark Windows CPU jobs as unstable #102297

Pull Request resolved: #102784
Approved by: https://github.com/clee2000
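A rough sketch of the "simple approach" from the commit message above, assuming the checks are available as a mapping from check name to conclusion. The names `is_unstable_check` and `split_failures`, the `"failure"` conclusion string, and the example job names are hypothetical and are not taken from trymerge itself:

```python
from typing import Dict, List, Tuple


def is_unstable_check(check_name: str) -> bool:
    # Simple approach: a job counts as unstable if `unstable` appears in its name.
    # As noted above, this misses unstable build jobs, whose names lack the marker.
    return "unstable" in check_name


def split_failures(checks: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Split failed checks into (blocking, ignored) lists.

    `checks` maps a check (job) name to its conclusion; this shape is an
    assumption made for illustration only.
    """
    blocking: List[str] = []
    ignored: List[str] = []
    for name, conclusion in checks.items():
        if conclusion != "failure":
            continue
        if is_unstable_check(name):
            ignored.append(name)  # unstable failures do not block the merge
        else:
            blocking.append(name)
    return blocking, ignored


if __name__ == "__main__":
    example = {
        "pull / linux-build": "success",
        "pull / win-cpu / test (default, 1, 3, unstable)": "failure",
        "pull / linux-test (default, 1, 2)": "failure",
    }
    blocking, ignored = split_failures(example)
    print("blocking:", blocking)
    print("ignored:", ignored)
```

Under this sketch, a merge would be blocked only while the `blocking` list is non-empty.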
Allow CI jobs to be marked as unstable dynamically. This uses the same mechanism as disabling a job, but with a different issue title, `UNSTABLE JOB_NAME`. The action will output an `is-unstable` flag to let the CI know whether the job it's currently running is unstable. This is similar to the way the `keep-going` flag is exposed. Once this is merged, I will follow up with another PR to actually use the `is-unstable` flag in CI.

### Testing

* `is-unstable` set https://github.com/pytorch/pytorch/actions/runs/5114544576/jobs/9194921978#step:9:172
* `is-unstable` set https://github.com/pytorch/pytorch/actions/runs/5114544576/jobs/9194922158#step:9:190
* `is-unstable` set https://github.com/pytorch/pytorch/actions/runs/5114544572/jobs/9198630766#step:13:265
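For illustration, the title-based mechanism in the description above might look roughly like the sketch below. It assumes the open issue titles have already been fetched and that a job is unstable when an issue titled `UNSTABLE JOB_NAME` names it exactly. The function name `is_job_unstable`, the exact-match rule, and the example job names are assumptions, not the action's actual implementation:

```python
from typing import Iterable

# Issue title convention from the PR description: "UNSTABLE JOB_NAME".
UNSTABLE_PREFIX = "UNSTABLE "


def is_job_unstable(job_name: str, issue_titles: Iterable[str]) -> bool:
    """Return True if any issue title marks `job_name` as unstable.

    Exact matching on the job name is an assumption for this sketch;
    the real action may match names differently.
    """
    for title in issue_titles:
        title = title.strip()
        if not title.startswith(UNSTABLE_PREFIX):
            continue
        target = title[len(UNSTABLE_PREFIX):].strip()
        if target == job_name:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical issue titles and job names, for illustration only.
    titles = [
        "UNSTABLE pull / win-cpu / test",
        "DISABLED test_example (__main__.TestExample)",
    ]
    print(is_job_unstable("pull / win-cpu / test", titles))
    print(is_job_unstable("pull / linux-build", titles))
```

The boolean result corresponds to what the description calls the `is-unstable` flag; how the action serializes it as an output is not shown here.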