[Caffe2] Update hip files #9826
rohithkrn wants to merge 19 commits into pytorch:master from rohithkrn:update-hip-files
@bddppq is there an update on the base Docker image?
  hipStream_t GetStream(int gpu, int stream_id) {
    vector<hipStream_t>& gpu_streams = hip_streams_[gpu];
-   if (gpu_streams.size() <= stream_id) {
+   if (gpu_streams.size() <= (unsigned)stream_id) {
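The `(unsigned)` cast in the changed line makes the comparison between `size()` (an unsigned type) and the signed `stream_id` explicit, presumably to silence a sign-compare warning. A cast alone would let a negative id wrap to a very large value, so a defensive variant might check the sign first. The following is only a minimal sketch of that idea: `StreamPool`, `FakeStream`, and `kUnset` are hypothetical stand-ins, and integer handles replace `hipStream_t` so the example builds without HIP.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for hipStream_t so the sketch compiles without HIP.
using FakeStream = int;
constexpr FakeStream kUnset = -1;

// Hypothetical, simplified version of a lazily grown per-GPU stream table.
class StreamPool {
 public:
  // Returns the handle for stream_id, growing the table on demand.
  FakeStream GetStream(int stream_id) {
    // A negative id would wrap to a huge unsigned value after the cast.
    assert(stream_id >= 0);
    const std::size_t idx = static_cast<std::size_t>(stream_id);
    if (streams_.size() <= idx) {
      streams_.resize(idx + 1, kUnset);
    }
    if (streams_[idx] == kUnset) {
      // Placeholder for real stream creation.
      streams_[idx] = next_handle_++;
    }
    return streams_[idx];
  }

  std::size_t size() const { return streams_.size(); }

 private:
  std::vector<FakeStream> streams_;
  FakeStream next_handle_ = 100;
};
```

Converting once to `std::size_t` keeps both sides of the comparison unsigned and reuses the same index for the later element access.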
  const int D = X.size_from_dim(canonical_axis);

  Y->ResizeLike(X);
  auto* Y_data = Y->template mutable_data<T>();
There is currently an outage for Caffe2 ROCm builds. It will be fixed in the next hour.

@pytorchbot retest this please
@@ -0,0 +1,309 @@
+#include "caffe2/core/THCCachingAllocator.h"
@pytorchbot retest this please

It would be better to hipify the THCCachingAllocator files.

@rohithkrn I just checked: there are only two places in ATen that use cudaError instead of cudaError_t. I think we can simply change those two places and remove the "cudaError" -> "hipError_t" mapping.

@bddppq I have changed the mapping order to make it work in the current state. But yes, changing ATen should also work.
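To illustrate why the mapping order matters here: "cudaError" is a prefix of "cudaError_t", so a replacer that tries the shorter key first can turn `cudaError_t` into `hipError_t_t`. The sketch below is not the actual hipify script. `ApplyMappings` and `Mapping` are hypothetical names, and the single-pass, first-match-wins strategy is an assumption chosen only to demonstrate the ordering effect.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical ordered list of (source identifier, replacement) pairs.
using Mapping = std::vector<std::pair<std::string, std::string>>;

// Scans src left to right; at each position the first matching key wins.
// With longest_first set, keys are tried from longest to shortest.
std::string ApplyMappings(const std::string& src, Mapping mappings,
                          bool longest_first) {
  if (longest_first) {
    std::stable_sort(mappings.begin(), mappings.end(),
                     [](const Mapping::value_type& a,
                        const Mapping::value_type& b) {
                       return a.first.size() > b.first.size();
                     });
  }
  std::string out;
  std::size_t i = 0;
  while (i < src.size()) {
    bool matched = false;
    for (const Mapping::value_type& m : mappings) {
      if (!m.first.empty() &&
          src.compare(i, m.first.size(), m.first) == 0) {
        out += m.second;      // emit the replacement
        i += m.first.size();  // skip past the matched key
        matched = true;
        break;
      }
    }
    if (!matched) {
      out += src[i++];  // copy unmatched characters through unchanged
    }
  }
  return out;
}
```

With the two mappings `cudaError -> hipError_t` and `cudaError_t -> hipError_t`, trying the longer key first handles both spellings. Removing the shorter mapping, as suggested above, avoids the ambiguity altogether.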
facebook-github-bot left a comment
bddppq has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: This was introduced in #9826, following the corresponding CUDA file context_gpu.cu. Tests passed in the PR, at which point master was at 94439d7. However, during the long landing process a new master commit, aebf3b4, came in that removed `CAFFE_KNOWN_TYPE(Tensor<HIPContext>)` from context_hip.cc, which broke the HIP BlobStatGetter. We did NOT run tests again during the merge, so when #9826 later landed on master the ROCm tests started breaking.
Pull Request resolved: #9973
Differential Revision: D9040671
Pulled By: bddppq
fbshipit-source-id: f3b16cabaf681fc0535ca733db0b48430868f922
Summary: The goal of this PR is to update the hip files to reflect relevant changes in cuda source files.
Pull Request resolved: pytorch#9826
Differential Revision: D9032840
Pulled By: bddppq
fbshipit-source-id: 504e55c46308eebfee3c9a7beea1f294fe03470f
The goal of this PR is to update the hip files to reflect relevant changes in cuda source files.