enable NVFuser by default#76006

Closed
davidberard98 wants to merge 21 commits into gh/davidberard98/97/base from gh/davidberard98/97/head

@davidberard98
Contributor

@davidberard98 davidberard98 commented Apr 19, 2022

Stack from ghstack:

Enable NVFuser in OSS.
Tests are passing, and we've also run tests in torchvision and torchaudio

Differential Revision: D35736977

Testing to see what breaks

[ghstack-poisoned]
@facebook-github-bot
Contributor

facebook-github-bot commented Apr 19, 2022

❌ 1 New Failures

As of commit ad73bd7 (more details on the Dr. CI page):

  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / win-vs2019-cuda11.3-py3 / test (default, 2, 2, windows.8xlarge.nvidia.gpu) (1/1)

Step: "Test"

2022-05-07T02:35:04.5353894Z RuntimeError: test_linalg failed!
2022-05-07T02:35:03.7439498Z FAILED (errors=1, skipped=110, expected failures=3)
2022-05-07T02:35:03.7439685Z 
2022-05-07T02:35:03.7439803Z Generating XML reports...
2022-05-07T02:35:03.7440229Z Generated XML report: test-reports\python-unittest\test_linalg\TEST-TestLinalgCPU-20220507022901.xml
2022-05-07T02:35:03.7440790Z Generated XML report: test-reports\python-unittest\test_linalg\TEST-TestLinalgCUDA-20220507022901.xml
2022-05-07T02:35:04.5352310Z Traceback (most recent call last):
2022-05-07T02:35:04.5352793Z   File "run_test.py", line 1070, in <module>
2022-05-07T02:35:04.5353050Z     main()
2022-05-07T02:35:04.5353326Z   File "run_test.py", line 1048, in main
2022-05-07T02:35:04.5353630Z     raise RuntimeError(err_message)
2022-05-07T02:35:04.5353894Z RuntimeError: test_linalg failed!
2022-05-07T02:35:04.9393507Z 
2022-05-07T02:35:04.9394477Z (base) C:\actions-runner\_work\pytorch\pytorch\test>popd
2022-05-07T02:35:04.9401267Z 
2022-05-07T02:35:04.9401858Z (base) C:\actions-runner\_work\pytorch\pytorch>if ERRORLEVEL 1 exit /b 1 
2022-05-07T02:35:04.9433590Z + cleanup
2022-05-07T02:35:04.9433895Z + retcode=1
2022-05-07T02:35:04.9434493Z + set +x
2022-05-07T02:35:04.9473072Z ##[error]Process completed with exit code 1.
2022-05-07T02:35:04.9895836Z ##[group]Run pytorch/pytorch/.github/actions/get-workflow-job-id@master
2022-05-07T02:35:04.9896192Z with:

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

@facebook-github-bot facebook-github-bot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Apr 19, 2022
@davidberard98 davidberard98 added the ciflow/trunk Trigger trunk jobs on your pull request label Apr 19, 2022
Testing to see what breaks

[ghstack-poisoned]
davidberard98 added a commit that referenced this pull request Apr 19, 2022
Testing to see what breaks

ghstack-source-id: 3fb9efc
Pull Request resolved: #76006
@davidberard98
Contributor Author

@davidberard98 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@davidberard98
Contributor Author

cc @jjsjann123 did you have a plan for how to flip the switch? I was trying to test it here, but it looks like this strategy doesn't work (I think it's due to static initialization order: CUDAHooks needs to be registered before at::globalContext().hasCUDA() will work in cuda_graph_fuser.h).

Do you have another idea for how to do this? I think we could probably wrap the registration in functions and call those functions to enforce registration order... but I'm not sure if there's a nicer way to do it.
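
For illustration, here is a rough sketch of the "wrap the registration in functions" idea, using function-local statics so that the first caller forces registration. Everything in it is hypothetical: `HooksRegistry`, `ensureCudaHooksRegistered`, `hasCudaHooks`, and `fuserEnabledByDefault` are made-up stand-ins, not the actual PyTorch types or the code in cuda_graph_fuser.h, and the registration body is a placeholder.

```cpp
#include <cassert>

// Hypothetical stand-in for the registry that a hooks object (e.g. CUDAHooks)
// would register itself into. Not PyTorch's actual type.
struct HooksRegistry {
  bool cuda_registered = false;
};

// Construct-on-first-use: a function-local static is initialized the first
// time control reaches it, so every caller sees a fully constructed registry.
HooksRegistry& hooksRegistry() {
  static HooksRegistry registry;
  return registry;
}

// Hypothetical registration step wrapped in a function. The lambda body runs
// exactly once, on the first call, no matter which translation unit calls it.
void ensureCudaHooksRegistered() {
  static const bool done = [] {
    hooksRegistry().cuda_registered = true;
    return true;
  }();
  (void)done;  // silence unused-variable warnings
}

// Hypothetical analogue of a hasCUDA() query: it forces registration first,
// so the answer does not depend on static initialization order.
bool hasCudaHooks() {
  ensureCudaHooksRegistered();
  return hooksRegistry().cuda_registered;
}

// The default is computed lazily on first use instead of in a namespace-scope
// initializer, which could run before the hooks are registered.
bool fuserEnabledByDefault() {
  static const bool enabled = hasCudaHooks();
  return enabled;
}

// Small self-check used to exercise the sketch.
void checkDefault() {
  assert(fuserEnabledByDefault());
}
```

The trade-off in this sketch is that the default is decided at the first call rather than at load time, so any code that reads it must go through the accessor function instead of a global variable.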

Testing to see what breaks

Differential Revision: [D35736977](https://our.internmc.facebook.com/intern/diff/D35736977)

[ghstack-poisoned]
davidberard98 added a commit that referenced this pull request Apr 19, 2022
Testing to see what breaks

ghstack-source-id: 0e18b9b
Pull Request resolved: #76006
davidberard98 added a commit that referenced this pull request Apr 20, 2022
Testing to see what breaks

ghstack-source-id: 0199581
Pull Request resolved: #76006
davidberard98 added a commit that referenced this pull request Apr 21, 2022
Testing to see what breaks

ghstack-source-id: f5a0d7a
Pull Request resolved: #76006
davidberard98 added a commit that referenced this pull request Apr 23, 2022
Testing to see what breaks

ghstack-source-id: 45950d1
Pull Request resolved: #76006
@davidberard98
Contributor Author

@davidberard98 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

davidberard98 added a commit that referenced this pull request Apr 26, 2022
Testing to see what breaks

ghstack-source-id: d4e82cf
Pull Request resolved: #76006
@facebook-github-bot
Contributor

@pytorchbot force merge this

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot
Collaborator

Merge failed due to Matched rule NVFuser, but PR has not been reviewed yet
Raised by https://github.com/pytorch/pytorch/actions/runs/2301021974

@facebook-github-bot
Contributor

@pytorchbot force merge this

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot
Collaborator

Merge failed due to Matched rule NVFuser, but PR has not been reviewed yet
Raised by https://github.com/pytorch/pytorch/actions/runs/2301167574

@facebook-github-bot
Contributor

@pytorchbot force merge this

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot
Collaborator

Merge failed due to Matched rule NVFuser, but PR has not been reviewed yet
Raised by https://github.com/pytorch/pytorch/actions/runs/2301240336

@facebook-github-bot
Contributor

@pytorchbot force merge this

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot
Collaborator

Merge failed due to Matched rule NVFuser, but PR has not been reviewed yet
Raised by https://github.com/pytorch/pytorch/actions/runs/2301316270

@facebook-github-bot
Contributor

@pytorchbot force merge this

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot
Collaborator

Merge failed due to Matched rule NVFuser, but PR has not been reviewed yet
Raised by https://github.com/pytorch/pytorch/actions/runs/2301390729

@facebook-github-bot
Contributor

@pytorchbot force merge this

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot
Collaborator

Merge failed due to Matched rule NVFuser, but PR has not been reviewed yet
Raised by https://github.com/pytorch/pytorch/actions/runs/2301689752

@facebook-github-bot
Contributor

@pytorchbot force merge this

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot
Collaborator

Merge failed due to Matched rule NVFuser, but PR has not been reviewed yet
Raised by https://github.com/pytorch/pytorch/actions/runs/2301733362

@facebook-github-bot
Contributor

@pytorchbot force merge this

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot
Collaborator

Merge failed due to Matched rule NVFuser, but PR has not been reviewed yet
Raised by https://github.com/pytorch/pytorch/actions/runs/2301762179

@facebook-github-bot
Contributor

@pytorchbot force merge this

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot
Collaborator

Merge failed due to Matched rule NVFuser, but PR has not been reviewed yet
Raised by https://github.com/pytorch/pytorch/actions/runs/2301795555

@facebook-github-bot
Contributor

@pytorchbot force merge this

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot
Collaborator

Merge failed due to Matched rule NVFuser, but PR has not been reviewed yet
Raised by https://github.com/pytorch/pytorch/actions/runs/2301855761

@bigfootjon
Member

Apologies for the spam here. For Meta engineers the fix is D36284533

davidberard98 added a commit that referenced this pull request May 11, 2022
Enable NVFuser in OSS.

Tests are passing, and we've also run tests in [torchvision](pytorch/vision#5959) and [torchaudio](pytorch/audio#2372)

Retry of #76006, because that PR had GH1/ghstack issues.

[ghstack-poisoned]
davidberard98 added a commit that referenced this pull request May 11, 2022
Enable NVFuser in OSS.

Tests are passing, and we've also run tests in [torchvision](pytorch/vision#5959) and [torchaudio](pytorch/audio#2372)

Retry of #76006, because that PR had GH1/ghstack issues.

ghstack-source-id: b1c68c1
Pull Request resolved: #77213
@davidberard98
Contributor Author

retrying in #77213

pytorchmergebot pushed a commit that referenced this pull request May 11, 2022
Enable NVFuser in OSS.

Tests are passing, and we've also run tests in [torchvision](pytorch/vision#5959) and [torchaudio](pytorch/audio#2372)

Retry of #76006, because that PR had GH1/ghstack issues.

Pull Request resolved: #77213

Approved by: https://github.com/eellison
facebook-github-bot pushed a commit that referenced this pull request May 13, 2022
Summary:
Enable NVFuser in OSS.

Tests are passing, and we've also run tests in [torchvision](pytorch/vision#5959) and [torchaudio](pytorch/audio#2372)

Retry of #76006, because that PR had GH1/ghstack issues.

Pull Request resolved: #77213

Approved by: https://github.com/eellison

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/24f7dcd8161544fc55a19e3b471bd95d37f9ec18

Reviewed By: H-Huang

Differential Revision: D36302686

Pulled By: H-Huang

fbshipit-source-id: 2c380e46d8a532dfe0f52fb28d24ca310ffbc43a
@facebook-github-bot facebook-github-bot deleted the gh/davidberard98/97/head branch June 10, 2022 14:17
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 25, 2026
Enable NVFuser in OSS.

Tests are passing, and we've also run tests in [torchvision](pytorch/vision#5959) and [torchaudio](pytorch/audio#2372)

Retry of pytorch#76006, because that PR had GH1/ghstack issues.

Pull Request resolved: pytorch#77213

Approved by: https://github.com/eellison
Labels

ciflow/trunk (Trigger trunk jobs on your pull request), cla signed, oncall: jit (Add this issue/PR to JIT oncall triage queue)
