Skip to content

Improve Windows Compatibility(for csrc/scripts)#2941

Merged
apaszke merged 30 commits intopytorch:masterfrom
peterjc123:csrc_script_fix
Nov 8, 2017
Merged

Improve Windows Compatibility(for csrc/scripts)#2941
apaszke merged 30 commits intopytorch:masterfrom
peterjc123:csrc_script_fix

Conversation

@peterjc123
Copy link
Collaborator

Win64 support for csrc and scripts

@soumith
Copy link
Collaborator

soumith commented Oct 3, 2017

@peterjc123 is there any chance you could rebase your commits instead of making merge commits? this branch cannot be merged into master, it still has conflicts.

@soumith
Copy link
Collaborator

soumith commented Oct 3, 2017

@pytorchbot add to whitelist

@soumith
Copy link
Collaborator

soumith commented Oct 3, 2017

if you think rebasing is too difficult, let me know i can try to do it.

@peterjc123
Copy link
Collaborator Author

I don't know where the conflicts are.

@peterjc123
Copy link
Collaborator Author

This time it's done using git diff and git apply. I think it should be rebased now.

from .env import check_env_flag

LINUX_HOME = '/usr/local/cuda'
WINDOWS_HOME = 'C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0'

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

_original_pid = False
_cudart = None

CUDA_WINDOWS_LIB = 'cudart64_80'

This comment was marked as off-topic.

DEFAULT_PROTOCOL = 2

LONG_SIZE = struct.Struct('=l').size
LONG_SIZE = struct.Struct('=Q').size

This comment was marked as off-topic.

This comment was marked as off-topic.


LuaFunction = namedtuple('LuaFunction', ['size', 'dumped', 'upvalues'])

LONGLONG_TYPECODE = 'q' if sys.version[0] == '3' else 'l'

This comment was marked as off-topic.

Copy link
Contributor

@apaszke apaszke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the story for multiprocessing? Can you please write up a small summary explaining how does it work on Windows and if we still have two modes there?

setup.py Outdated

# Debug option for Windows
if IS_WINDOWS:
extra_link_args.append('/DEBUG:FULL')

This comment was marked as off-topic.

setup.py Outdated
ATEN_LIB = os.path.join(lib_path, 'ATen.lib')
_C_LIB = 'build/temp.win-amd64-' + str(sys.version_info[0]) + '.' + str(
sys.version_info[1]) + '/Release/torch/csrc/_C.cp' + str(
sys.version_info[0]) + str(sys.version_info[1]) + '-win_amd64.lib'

This comment was marked as off-topic.

setup.py Outdated
THCUNN_LIB,
make_relative_rpath('../lib'),
]
] + [_C_LIB] if _C_LIB is not None else []

This comment was marked as off-topic.

This comment was marked as off-topic.

test/common.py Outdated
expected_file += "-" + subname
expected_file += ".expect"
if sys.platform == 'win32' and os.path.exists(expected_file + '.expect.win'):
expected_file += ".expect.win"

This comment was marked as off-topic.

self.daemon = True

def run(self):
self.tensor.add_(3)

This comment was marked as off-topic.

This comment was marked as off-topic.

__assume(0);
#else
__builtin_unreachable();
#endif

This comment was marked as off-topic.

#include "torch/csrc/utils/auto_gpu.h"

#if defined(WITH_CUDA) && defined(_MSC_VER)
class THP_CLASS THCPAutoGPU : public AutoGPU {

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

size_t view_size = (size_t)THPUtils_unpackLong(_view_size);

long device = THPUtils_unpackLong(_device);
int device = (int) THPUtils_unpackLong(_device);

This comment was marked as off-topic.

int dim = 0;
#else
int64_t dim = 0;
#endif

This comment was marked as off-topic.

return self._read(LONGLONG_TYPECODE)
elif self.long_size is 8:
return self._read('q')
return self._read(LONGLONG_TYPECODE)

This comment was marked as off-topic.

@peterjc123
Copy link
Collaborator Author

@apaszke About multiprocessing, only spawn is supported. Shared file mapping(File sharing) was used to share data between processes.

@apaszke
Copy link
Contributor

apaszke commented Oct 3, 2017

But are there still two different methods (file system + file descriptor)? Or do they dispatch to the same thing?

@peterjc123
Copy link
Collaborator Author

peterjc123 commented Oct 3, 2017

@apaszke File descriptor is not supported in Windows. So only file system is used.

@soumith
Copy link
Collaborator

soumith commented Oct 5, 2017

@apaszke can you check that all the changes you requested are done?

@peterjc123
Copy link
Collaborator Author

@fmassa Yes, and the native size of 'l' is 4 in Windows but 8 in Unix. So there 'll be no option for py2 in Windows.

@fmassa
Copy link
Member

fmassa commented Oct 6, 2017

Does this mean that Windows Py2 people won't be able to load legacy models?

@peterjc123
Copy link
Collaborator Author

@fmassa Yes, however, nvcc doesn't compile with any compiler other than MSVC in Windows. So that one is not the only block for PyTorch on windows py2.

@peterjc123
Copy link
Collaborator Author

peterjc123 commented Oct 10, 2017

Are there more comments? I think that I have covered all the points above except for those can't fix.

@peterjc123 peterjc123 force-pushed the csrc_script_fix branch 7 times, most recently from a0bd882 to fb9d3aa Compare October 15, 2017 03:26
@peterjc123 peterjc123 force-pushed the csrc_script_fix branch 3 times, most recently from 22ea9e8 to 87c8d54 Compare October 20, 2017 09:38
@peterjc123
Copy link
Collaborator Author

@apaszke Fixed. Very sorry for my misunderstanding.

Copy link
Contributor

@apaszke apaszke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to apologise, my comment was unclear. Sorry for that. Last fix and it should be good to go

ws.data,
ws.size));
return best_algo = getBestAlgorithm<cudnnConvolutionBwdFilterAlgoPerf_t>(perfResults.release(), deterministic, n_algo);
return best_algo = getBestAlgorithm<cudnnConvolutionBwdFilterAlgoPerf_t>(perfResults.get(), deterministic, n_algo);

This comment was marked as off-topic.

out,
1,
&algoCount,
&algoCount.get(),

This comment was marked as off-topic.

@peterjc123
Copy link
Collaborator Author

@apaszke It finally passes the build phase.

Copy link
Contributor

@apaszke apaszke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks a lot!

@apaszke apaszke merged commit aa91193 into pytorch:master Nov 8, 2017
@apaszke
Copy link
Contributor

apaszke commented Nov 8, 2017

Alright, landed in master! Thank you so much!!

@soumith
Copy link
Collaborator

soumith commented Nov 8, 2017

woooohooo!!! finally!

@fmassa
Copy link
Member

fmassa commented Nov 8, 2017

Looks like this broke compilation on clang 8.0.0 with error messages like

torch/csrc/autograd/functions/init.cpp:222:29: error: address of overloaded function 'getValueAttr' does not match required type '_object *(_object *, void *)'
  {(char*)"groups", (getter)getValueAttr<ConvBackwardBackward, int, ConvParams,
                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
torch/csrc/autograd/functions/init.cpp:99:11: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'Convert'
PyObject* getValueAttr(PyObject* obj, void* _unused)
          ^

@lantiga
Copy link
Contributor

lantiga commented Nov 9, 2017

I'm on this

@lantiga
Copy link
Contributor

lantiga commented Nov 9, 2017

Ok, never mind, @ezyang already has the patch (#3573)

@peterjc123 peterjc123 deleted the csrc_script_fix branch November 9, 2017 04:50
@yf225 yf225 mentioned this pull request Dec 20, 2017
13 tasks
xuhdev added a commit to xuhdev/pytorch that referenced this pull request Jul 23, 2019
…ipt.

---

How does the current code subsume all detections in the deleted `nccl.py`?

- The dependency of `USE_NCCL` on the OS and `USE_CUDA` is handled as dependency options in `CMakeLists.txt`.

- The main NCCL detection happens in [FindNCCL.cmake](https://github.com/pytorch/pytorch/blob/8377d4b32c12206a0f9401e81a5e5796c8fc01a8/cmake/Modules/FindNCCL.cmake), which is called by [nccl.cmake](https://github.com/pytorch/pytorch/blob/8377d4b32c12206a0f9401e81a5e5796c8fc01a8/cmake/External/nccl.cmake). When `USE_SYSTEM_NCCL` is false, the previous Python code defer the detection to `find_package(NCCL)`. The change in `nccl.cmake` retains this.

- `USE_STATIC_NCCL` in the previous Python code simply changes the name of the detected library. This is done in `IF (USE_STATIC_NCCL)`.

- Now we only need to look at how the lines below line 20 in `nccl.cmake` are subsumed. These lines list paths to header and library directories that NCCL headers and libraries may reside in and try to search these directories for the key header and library files in turn. These are done by `find_path` for headers and `find_library` for the library files in `FindNCCL.cmake`.
  * The call of [find_path](https://cmake.org/cmake/help/v3.8/command/find_path.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for headers in `<prefix>/include` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`. Like the Python code, this commit sets `CMAKE_PREFIX_PATH` to search for `<prefix>` in `NCCL_ROOT_DIR` and home to CUDA.  `CMAKE_SYSTEM_PREFIX_PATH` includes the standard directories such as `/usr/local` and `/usr`. `NCCL_INCLUDE_DIR` is also specifically handled.

  * Similarly, the call of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for libraries in directories including `<prefix>/lib` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`. But it also handles the edge cases intended to be solved in the Python code more properly:
     - It only searches for `<prefix>/lib64` (and `<prefix>/lib32`) if it is appropriate on the system.
     - It only searches for `<prefix>/lib/<arch>` for the right `<arch>`, unlike the Python code searches for `lib/<arch>` in a generic way (e.g., the Python code searches for `/usr/lib/x86_64-linux-gnu` but in reality systems have `/usr/lib/x86_64-some-customized-name-linux-gnu`, see https://unix.stackexchange.com/a/226180/38242 ).

---

Regarding for relevant issues:

- pytorch#12063 and pytorch#2877: These are properly handled, as explained in the updated comment.
- pytorch#2941 does not changes NCCL detection specifically for Windows (it changed CUDA detection).
- b7e258f A versioned library detection is added, but the order is reversed: The unversioned library becomes preferred. This is because normally unversioned libraries are linked to versioned libraries and preferred by users, and local installation by users are often unversioned. Like the document of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) suggests:

> When using this to specify names with and without a version suffix, we recommend specifying the unversioned name first so that locally-built packages can be found before those provided by distributions.
facebook-github-bot pushed a commit that referenced this pull request Jul 23, 2019
…ipt. (#22930)

Summary:
 ---

How does the current code subsume all detections in the deleted `nccl.py`?

- The dependency of `USE_NCCL` on the OS and `USE_CUDA` is handled as dependency options in `CMakeLists.txt`.

- The main NCCL detection happens in [FindNCCL.cmake](https://github.com/pytorch/pytorch/blob/8377d4b32c12206a0f9401e81a5e5796c8fc01a8/cmake/Modules/FindNCCL.cmake), which is called by [nccl.cmake](https://github.com/pytorch/pytorch/blob/8377d4b32c12206a0f9401e81a5e5796c8fc01a8/cmake/External/nccl.cmake). When `USE_SYSTEM_NCCL` is false, the previous Python code defer the detection to `find_package(NCCL)`. The change in `nccl.cmake` retains this.

- `USE_STATIC_NCCL` in the previous Python code simply changes the name of the detected library. This is done in `IF (USE_STATIC_NCCL)`.

- Now we only need to look at how the lines below line 20 in `nccl.cmake` are subsumed. These lines list paths to header and library directories that NCCL headers and libraries may reside in and try to search these directories for the key header and library files in turn. These are done by `find_path` for headers and `find_library` for the library files in `FindNCCL.cmake`.
  * The call of [find_path](https://cmake.org/cmake/help/v3.8/command/find_path.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for headers in `<prefix>/include` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`. Like the Python code, this commit sets `CMAKE_PREFIX_PATH` to search for `<prefix>` in `NCCL_ROOT_DIR` and home to CUDA.  `CMAKE_SYSTEM_PREFIX_PATH` includes the standard directories such as `/usr/local` and `/usr`. `NCCL_INCLUDE_DIR` is also specifically handled.

  * Similarly, the call of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for libraries in directories including `<prefix>/lib` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`. But it also handles the edge cases intended to be solved in the Python code more properly:
     - It only searches for `<prefix>/lib64` (and `<prefix>/lib32`) if it is appropriate on the system.
     - It only searches for `<prefix>/lib/<arch>` for the right `<arch>`, unlike the Python code searches for `lib/<arch>` in a generic way (e.g., the Python code searches for `/usr/lib/x86_64-linux-gnu` but in reality systems have `/usr/lib/x86_64-some-customized-name-linux-gnu`, see https://unix.stackexchange.com/a/226180/38242 ).

 ---

Regarding for relevant issues:

- #12063 and #2877: These are properly handled, as explained in the updated comment.
- #2941 does not changes NCCL detection specifically for Windows (it changed CUDA detection).
- b7e258f A versioned library detection is added, but the order is reversed: The unversioned library becomes preferred. This is because normally unversioned libraries are linked to versioned libraries and preferred by users, and local installation by users are often unversioned. Like the document of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) suggests:

> When using this to specify names with and without a version suffix, we recommend specifying the unversioned name first so that locally-built packages can be found before those provided by distributions.
Pull Request resolved: #22930

Differential Revision: D16440275

Pulled By: ezyang

fbshipit-source-id: 11fe80743d4fe89b1ed6f96d5d996496e8ec01aa
zdevito pushed a commit to zdevito/ATen that referenced this pull request Jul 23, 2019
…ipt. (#22930)

Summary:
 ---

How does the current code subsume all detections in the deleted `nccl.py`?

- The dependency of `USE_NCCL` on the OS and `USE_CUDA` is handled as dependency options in `CMakeLists.txt`.

- The main NCCL detection happens in [FindNCCL.cmake](https://github.com/pytorch/pytorch/blob/8377d4b32c12206a0f9401e81a5e5796c8fc01a8/cmake/Modules/FindNCCL.cmake), which is called by [nccl.cmake](https://github.com/pytorch/pytorch/blob/8377d4b32c12206a0f9401e81a5e5796c8fc01a8/cmake/External/nccl.cmake). When `USE_SYSTEM_NCCL` is false, the previous Python code defer the detection to `find_package(NCCL)`. The change in `nccl.cmake` retains this.

- `USE_STATIC_NCCL` in the previous Python code simply changes the name of the detected library. This is done in `IF (USE_STATIC_NCCL)`.

- Now we only need to look at how the lines below line 20 in `nccl.cmake` are subsumed. These lines list paths to header and library directories that NCCL headers and libraries may reside in and try to search these directories for the key header and library files in turn. These are done by `find_path` for headers and `find_library` for the library files in `FindNCCL.cmake`.
  * The call of [find_path](https://cmake.org/cmake/help/v3.8/command/find_path.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for headers in `<prefix>/include` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`. Like the Python code, this commit sets `CMAKE_PREFIX_PATH` to search for `<prefix>` in `NCCL_ROOT_DIR` and home to CUDA.  `CMAKE_SYSTEM_PREFIX_PATH` includes the standard directories such as `/usr/local` and `/usr`. `NCCL_INCLUDE_DIR` is also specifically handled.

  * Similarly, the call of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) (Search for `NO_DEFAULT_PATH` in the link) by default searches for libraries in directories including `<prefix>/lib` for each `<prefix>` in `CMAKE_PREFIX_PATH` and `CMAKE_SYSTEM_PREFIX_PATH`. But it also handles the edge cases intended to be solved in the Python code more properly:
     - It only searches for `<prefix>/lib64` (and `<prefix>/lib32`) if it is appropriate on the system.
     - It only searches for `<prefix>/lib/<arch>` for the right `<arch>`, unlike the Python code searches for `lib/<arch>` in a generic way (e.g., the Python code searches for `/usr/lib/x86_64-linux-gnu` but in reality systems have `/usr/lib/x86_64-some-customized-name-linux-gnu`, see https://unix.stackexchange.com/a/226180/38242 ).

 ---

Regarding for relevant issues:

- pytorch/pytorch#12063 and pytorch/pytorch#2877: These are properly handled, as explained in the updated comment.
- pytorch/pytorch#2941 does not changes NCCL detection specifically for Windows (it changed CUDA detection).
- b7e258f81ef61d19b884194cdbcd6c7089636d46 A versioned library detection is added, but the order is reversed: The unversioned library becomes preferred. This is because normally unversioned libraries are linked to versioned libraries and preferred by users, and local installation by users are often unversioned. Like the document of [find_library](https://cmake.org/cmake/help/v3.8/command/find_library.html) suggests:

> When using this to specify names with and without a version suffix, we recommend specifying the unversioned name first so that locally-built packages can be found before those provided by distributions.
Pull Request resolved: pytorch/pytorch#22930

Differential Revision: D16440275

Pulled By: ezyang

fbshipit-source-id: 11fe80743d4fe89b1ed6f96d5d996496e8ec01aa
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants