
Get ARC jobs to run on both classic and ARC infra#124753

Closed
ZainRizvi wants to merge 4 commits into main from zainr/parallel-arc-jobs

@ZainRizvi
Contributor

@ZainRizvi ZainRizvi commented Apr 23, 2024

ARC jobs are too unstable right now.

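We're going to mitigate this by:

  • Reverting ARC jobs to run on the classic infra (pytorch#124748)
  • Spinning up new jobs in parallel to run on the new infra (this PR)
  • Marking these ARC jobs as unstable (will be done before merging this PR)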

More details in pytorch/ci-infra#149

@ZainRizvi ZainRizvi requested a review from a team as a code owner April 23, 2024 18:25
@pytorch-bot

pytorch-bot bot commented Apr 23, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/124753

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e939c10 with merge base c82fcb7 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@seemethere
Member

How are these actually marked as unstable?

@ZainRizvi
Contributor Author

How are these actually marked as unstable?

They're not marked as unstable yet (I'll make sure that happens before merging this PR).

I'm currently looking into how exactly that part is done. It'll be one of the following:

  • I'll create issues for each job that mark them as unstable
  • If that doesn't work, then I'll manually move this to the unstable.yml workflow instead

@kit1980
Contributor

kit1980 commented Apr 23, 2024

If that doesn't work, then I'll manually move this to the unstable.yml workflow instead

I'd suggest just putting the jobs into unstable.yml from the beginning.
The issue mechanism is good for doing things fast for oncalls, but for this kind of change I think normal files in CI are much better.

@ZainRizvi ZainRizvi added the ciflow/unstable label (Run all experimental or flaky jobs on PyTorch unstable workflow) Apr 23, 2024
@ZainRizvi
Contributor Author

I'd suggest just putting the jobs into unstable.yml from the beginning. The issue mechanism is good for doing things fast for oncalls, but for this kind of change I think normal files in CI are much better.

Yeah, that makes sense for now, while we know the jobs don't work too well. When we have more faith in ARC and it's time to properly enable the jobs in prod again, we can switch to using issues as part of a deployment plan.

@ZainRizvi
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) Apr 23, 2024
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

petrex pushed a commit to petrex/pytorch that referenced this pull request May 3, 2024
ARC jobs are too unstable right now.

We're going to mitigate this by:

- Reverting ARC jobs to run on the classic infra (pytorch#124748)
- Spin up new jobs in parallel to run on the new infra. (this PR)
- Mark these ARC jobs as unstable (will be done before merging this PR)

More details in pytorch/ci-infra#149
Pull Request resolved: pytorch#124753
Approved by: https://github.com/zxiiro, https://github.com/seemethere
@github-actions github-actions bot deleted the zainr/parallel-arc-jobs branch June 3, 2024 01:54

Labels

  • ci-td-distributed
  • ciflow/trunk (Trigger trunk jobs on your pull request)
  • ciflow/unstable (Run all experimental or flaky jobs on PyTorch unstable workflow)
  • Merged
  • topic: not user facing (topic category)

5 participants