
If CUDA is not found and USE_CUDA!=0, report error and ask user to set USE_CUDA=0#24939

Closed
xuhdev wants to merge 3 commits into pytorch:master from xuhdev:force-use-cuda

Conversation

@xuhdev
Collaborator

xuhdev commented Aug 20, 2019

Currently we emit a warning and set USE_CUDA=0 in our script by ourselves. Instead, we should report an error and let the user set it explicitly, since whether CUDA is used or not affects so much of PyTorch.

Best viewed with "hide whitespace changes".

Per request by @jithunnair-amd at #23884 (comment)

…t USE_CUDA=0

Currently we emit a warning and set USE_CUDA=0 in our script by
ourselves. Instead, we should report an error and let user do this
explicitly, as whether CUDA is used or not would affect so much of
PyTorch.

Per request by @jithunnair-amd
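The behavior proposed in this PR could be sketched in CMake roughly as follows. This is a hypothetical illustration, not the actual diff; the variable names mirror the existing USE_CUDA flag, but the exact message and placement in PyTorch's build scripts are assumptions:

```cmake
# Sketch of the proposed behavior: instead of silently downgrading
# USE_CUDA, fail the configure step and ask the user to decide.
if(USE_CUDA)
  find_package(CUDA)
  if(NOT CUDA_FOUND)
    message(FATAL_ERROR
      "CUDA was not found on the system, but USE_CUDA is enabled. "
      "Set USE_CUDA=0 explicitly if you intend to build without CUDA.")
  endif()
endif()
```

The point of the change is that the configure step stops with an actionable error rather than quietly producing a CPU-only build the user did not ask for.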
@ezyang
Contributor

ezyang commented Aug 21, 2019

cc @soumith @gchanan

@ezyang
Contributor

ezyang commented Aug 21, 2019

@pytorchbot rebase this please

@gchanan
Contributor

gchanan commented Aug 21, 2019

I don't have an opinion on this change, but I don't actually see anyone requesting this. The request linked, as I understand it, is about having a canonical variable that defines whether we are actually building with CUDA or not.

Edit: actually I think this is too user hostile, particularly when AFAICT no one is asking for it.

@ezyang
Contributor

ezyang commented Aug 21, 2019

@gchanan To be clear, this PR is to unblock the AMD patch which doesn't work because the cmake variable is not set appropriately early enough. So there is a use case, but not from end users.

xuhdev added a commit to xuhdev/pytorch-builder that referenced this pull request Aug 21, 2019
xuhdev added a commit to xuhdev/pytorch-builder that referenced this pull request Aug 21, 2019
ezyang pushed a commit to pytorch/builder that referenced this pull request Aug 22, 2019
@xuhdev
Collaborator Author

xuhdev commented Aug 22, 2019

Since pytorch/builder#345 is resolved, any idea when that will take effect so we can run a retest?

@gchanan
Contributor

gchanan commented Aug 22, 2019

@ezyang I understand, I believe that issue could be resolved in a different way.

@xuhdev
Collaborator Author

xuhdev commented Aug 22, 2019

The alternative I can think of would be to run all CUDA detection code before the dependent options are defined. Given the current state of the CUDA detection code, this might be difficult.

@jithunnair-amd
Collaborator

@gchanan I sense an impasse. Can you please provide some direction, so the RCCL upstreaming PR can move forward (it's waiting on this)?

@xuhdev
Collaborator Author

xuhdev commented Aug 23, 2019

@pytorchbot rebase this please

@gchanan
Contributor

gchanan commented Aug 26, 2019

CC @kostmo.

@jithunnair-amd: is my summary above incorrect?

The request linked, as I understand it, is about having a canonical variable that defines whether we are actually building with CUDA or not.

So, I'd suggest we have a canonical way of saying whether we are actually building with CUDA or not.

Now, I don't know all the rules we've defined for flags; it seems like in general we are using USE_X for both "did the user request X" and "are we building with X". Do you know, @kostmo?

@jithunnair-amd
Collaborator

@jithunnair-amd: is my summary above incorrect?

The request linked, as I understand it, is about having a canonical variable that defines whether we are actually building with CUDA or not.

That sounds correct to me.

@xuhdev
Collaborator Author

xuhdev commented Aug 27, 2019

@gchanan I agree that we should be consistent on this point, but the current behavior isn't consistent. For example, USE_CUDA is not automatically turned off when the installed CUDA version is too low (a fatal error is raised instead), but if CUDA is simply not found, USE_CUDA is automatically turned off:

if(CUDA_VERSION VERSION_LESS 9.0)
  message(FATAL_ERROR "PyTorch requires CUDA 9.0 and above.")
endif()
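For contrast, the not-found path being criticized could be sketched like this. This is a hypothetical reconstruction of the behavior described in the thread, not the actual code in PyTorch's build scripts:

```cmake
# Sketch of the current auto-disable path: a missing CUDA toolkit only
# produces a warning and silently flips the flag the user set.
if(USE_CUDA AND NOT CUDA_FOUND)
  message(WARNING "CUDA not found; building without CUDA support.")
  set(USE_CUDA OFF)  # silently overrides the user's request
endif()
```

The inconsistency xuhdev is pointing out: an unusable CUDA version aborts the configure step, while an entirely absent CUDA merely warns and proceeds.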

@gchanan
Contributor

gchanan commented Aug 29, 2019

idk, that seems like a plausible choice.

@xuhdev
Collaborator Author

xuhdev commented Sep 10, 2019

Closing now


Labels

module: build Build system issues


5 participants