Closed
db84c64 to 48db2b7
Contributor
I suspect this is related to some of the recently enabled tests. cc @iotamudelta
Collaborator
Author
Could be. I just kicked off a run with ROCm on the default stream, too, since that was changed recently.
Collaborator
Author
Putting ROCm back on the default stream appears to address the hang. I'm going to update the PR to just make that change.
bddppq
approved these changes
Sep 18, 2019
Contributor
facebook-github-bot
left a comment
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Contributor
facebook-github-bot
pushed a commit
that referenced
this pull request
Sep 18, 2019
… test classes (#26375)
Summary:
- Adds dtypes, dtypesIfCPU, and dtypesIfCUDA decorators.
- Eliminates the need for nontest members to be defined in an inherited base.
- Updates one test to use the decorators and updates TestTorchDeviceType with helpers.
This PR appears to be hanging the ROCm build, which is not entirely surprising. See #26394, which demonstrates that the ROCm build can be hung by commenting out a Python test that was never run on ROCm.
gchanan - what type list, if any, do you want to expose? I imagine most test suites will define their own lists like today. SCALAR_TYPES, QUANTIZED_TYPES, and ALL_TYPES seem reasonable to me. DOCUMENTED_TENSOR_TYPES will be removed, of course.
Pull Request resolved: #26375
Test Plan: Edit is to tests themselves.
Differential Revision: D17462294
Pulled By: mruberry
fbshipit-source-id: f8259ec66709749b1bf8077efc737676af901436
jeffdaily
added a commit
to ROCm/pytorch
that referenced
this pull request
Dec 2, 2020
laurentdupin
pushed a commit
to laurentdupin/pytorch
that referenced
this pull request
Apr 24, 2026
Summary:
This PR has been updated. See the ORIGINAL PR comment below.
ROCm CI builds have been hanging as we've been refactoring tests, even when these refactors seem entirely innocuous. For example, this PR started by commenting out test_stft, a Python test never run on ROCm, and that alone was sufficient to reliably hang the ROCm build in CI.
Putting ROCm tests back on the default stream appears to remove this hang. So this PR now does that. This is likely to unblock development.
ORIGINAL: Some test changes appear to be causing ROCm builds to hang in CI. This PR is an attempt to diagnose the source of the hang.
Pull Request resolved: pytorch#26394
Test Plan: Change is to tests themselves.
Differential Revision: D17456678
Pulled By: mruberry
fbshipit-source-id: 38d00d01c64b5055c1dfed01687ce3e1c9372887
laurentdupin
pushed a commit
to laurentdupin/pytorch
that referenced
this pull request
Apr 24, 2026
… test classes (pytorch#26375)
Summary:
- Adds dtypes, dtypesIfCPU, and dtypesIfCUDA decorators.
- Eliminates the need for nontest members to be defined in an inherited base.
- Updates one test to use the decorators and updates TestTorchDeviceType with helpers.
This PR appears to be hanging the ROCm build, which is not entirely surprising. See pytorch#26394, which demonstrates that the ROCm build can be hung by commenting out a Python test that was never run on ROCm.
gchanan - what type list, if any, do you want to expose? I imagine most test suites will define their own lists like today. SCALAR_TYPES, QUANTIZED_TYPES, and ALL_TYPES seem reasonable to me. DOCUMENTED_TENSOR_TYPES will be removed, of course.
Pull Request resolved: pytorch#26375
Test Plan: Edit is to tests themselves.
Differential Revision: D17462294
Pulled By: mruberry
fbshipit-source-id: f8259ec66709749b1bf8077efc737676af901436
laurentdupin
pushed a commit
to laurentdupin/pytorch
that referenced
this pull request
Apr 24, 2026
Summary:
Revert pytorch#26394. Fixes pytorch#27356. Not all MIOpen handles were setting their stream to the current stream prior to running the op.
Pull Request resolved: pytorch#48424
Reviewed By: H-Huang
Differential Revision: D25420384
Pulled By: mruberry
fbshipit-source-id: 051683ba9e3d264b71162bd344031a0c58bf6a41
This PR has been updated. See the ORIGINAL PR comment below.
ROCm CI builds have been hanging as we've been refactoring tests, even when these refactors seem entirely innocuous. For example, this PR started by commenting out test_stft, a Python test never run on ROCm, and that alone was sufficient to reliably hang the ROCm build in CI.
Putting ROCm tests back on the default stream appears to remove this hang. So this PR now does that. This is likely to unblock development.
ORIGINAL: Some test changes appear to be causing ROCm builds to hang in CI. This PR is an attempt to diagnose the source of the hang.
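The default-stream change described above can be sketched roughly as follows. This is an illustration only, not the diff from this PR: `choose_stream_context`, `run_test`, and the `make_side_stream_context` factory are hypothetical names, and the sketch takes the stream factory as a parameter so it runs without a GPU. With PyTorch, the factory could plausibly be `lambda: torch.cuda.stream(torch.cuda.Stream())`.

```python
import contextlib


def choose_stream_context(cuda_available, is_rocm, make_side_stream_context):
    """Return the context manager a test body should run under.

    make_side_stream_context is a hypothetical zero-argument factory that
    returns a context manager switching to a non-default stream.
    """
    if cuda_available and not is_rocm:
        # CUDA builds: exercise a non-default stream.
        return make_side_stream_context()
    # ROCm builds (and CPU-only runs): stay on the default stream.
    return contextlib.nullcontext()


def run_test(test_fn, cuda_available, is_rocm, make_side_stream_context):
    """Run test_fn under whichever stream context applies to this build."""
    with choose_stream_context(cuda_available, is_rocm, make_side_stream_context):
        return test_fn()
```

Keeping the ROCm check in one place like this would make it a one-line change to move ROCm back onto a non-default stream later.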