Introduce libtorch to setup.py build by anderspapitto · Pull Request #8792 · pytorch/pytorch

anderspapitto · 2018-06-22T17:11:38Z

Prior to this diff, there have been two ways of compiling the bulk of the torch codebase. There was no interaction between them - you had to pick one or the other.

1. With setup.py. This method

   - used the setuptools C extension functionality
   - worked on all platforms
   - did not build test_jit/test_api binaries
   - did not include the C++ api
   - always included python functionality
   - produced _C.so

2. With cpp_build. This method

   - used CMake
   - did not support Windows or ROCM
   - was capable of building the test binaries
   - included the C++ api
   - did not build the python functionality
   - produced libtorch.so

This diff combines the two.

1. cpp_build/CMakeLists.txt has become torch/CMakeLists.txt. This build

   - is CMake-based
   - works on all platforms
   - builds the test binaries
   - includes the C++ api
   - does not include the python functionality
   - produces libtorch.so

2. The setup.py build

   - compiles the python functionality
   - calls into the CMake build to build libtorch.so
   - produces _C.so, which has a dependency on libtorch.so

In terms of code changes, this mostly means extending the CMake build to support the full variety of environments and platforms. There are also a small number of changes related to the fact that there are now two shared objects. In particular, Windows requires annotating some symbols with dllimport/dllexport, and does not allow exposing thread_local globals directly.
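The dllimport/dllexport annotation mentioned above usually takes the form of a single macro that expands differently depending on whether the library itself or one of its consumers is being compiled. The following is a minimal sketch of that pattern, not the actual contents of torch/csrc/WindowsTorchApiMacro.h: `MYLIB_API`, `mylib_EXPORTS`, and `add_one` are hypothetical names chosen for illustration. It relies on CMake defining `<target>_EXPORTS` while compiling the sources of a shared library target.

```cpp
#include <cassert>

// Hypothetical stand-in for the define that CMake adds automatically
// (as <target>_EXPORTS) when compiling a shared library's own sources.
// It is set by hand here so this single file compiles as "the library".
#define mylib_EXPORTS

#if defined(_WIN32)
#  if defined(mylib_EXPORTS)
#    define MYLIB_API __declspec(dllexport)  // building the DLL: export the symbol
#  else
#    define MYLIB_API __declspec(dllimport)  // using the DLL: import the symbol
#  endif
#else
#  define MYLIB_API  // non-Windows: no annotation needed in this sketch
#endif

// Declaration as it would appear in a public header.
MYLIB_API int add_one(int x);

// Definition as it would appear in a source file inside the library.
int add_one(int x) { return x + 1; }

// Usage sketch: callers see an ordinary function.
void export_macro_usage() { assert(add_one(41) == 42); }
```

With a macro of this shape, the same header can be included both by the library's own sources and by code that links against it.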

facebook-github-bot

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

ezyang · 2018-07-08T23:44:42Z

build failures look legit

anderspapitto · 2018-07-12T14:43:34Z

This diff is ready for another round of review. I highlight two points of potential concern.

1. The ROCM builds consistently fail due to

   - timeouts (example https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang3.8-rocm1.7.1-ubuntu16.04-test/2182/console)
   - OOM (example https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang3.8-rocmnightly-ubuntu16.04-build/2775/console).

   This is particularly annoying because I did fix legit ROCM issues along the way, so I would like to get a clear signal that everything is now correct. However, my initial feeling is that fixing these infra-level resource issues is outside the scope of this diff.

2. On Windows, a shared library cannot expose thread_local global variables with dllexport. Straightforward application of the new TORCH_API macro results in error messages like

C:\Jenkins\workspace\pytorch-builds\pytorch-win-ws2016-cuda9-cudnn7-py3-build\torch/csrc/autograd/profiler.h(172): error C2492: 'thread_id': data with thread storage duration may not have dll interface

seen here https://ci.pytorch.org/jenkins/job/pytorch-builds/job/pytorch-win-ws2016-cuda9-cudnn7-py3-build/12062/console.

In this diff, I've addressed this issue by moving all the relevant thread_local global variables out of the header files, and instead exposing accessor methods across the shared-object boundary. I'm concerned that this may be a perf problem, if function call overhead is too high a cost to pay to access these profiling-related variables. So, is this a problem, and if so is there a suggested alternative approach?
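The accessor approach described above can be sketched as follows. This is an illustration under assumptions rather than the code in this diff: `next_sequence_nr`, `take_sequence_nr`, and the single-file layout are hypothetical, and in a real build the declaration would sit in a header while the variable and the accessor's definition would sit in a source file compiled into the shared library.

```cpp
#include <cassert>
#include <cstdint>

// --- would live in a public header -----------------------------------
// Only a function crosses the shared-library boundary, so no thread_local
// data needs a dllexport/dllimport annotation.
std::uint64_t& next_sequence_nr();  // hypothetical accessor

// --- would live in a .cpp file inside the library --------------------
namespace {
// Each thread gets its own copy, initialized to zero.
thread_local std::uint64_t next_sequence_nr_ = 0;
}  // namespace

std::uint64_t& next_sequence_nr() { return next_sequence_nr_; }

// --- caller side ------------------------------------------------------
// Returns the calling thread's current counter value, then increments it.
std::uint64_t take_sequence_nr() { return next_sequence_nr()++; }

// Usage sketch: consecutive calls on one thread yield consecutive values.
void accessor_usage() {
  std::uint64_t first = take_sequence_nr();
  std::uint64_t second = take_sequence_nr();
  assert(second == first + 1);
}
```

Each access now goes through a function call that typically cannot be inlined across the library boundary, which is the overhead the question above is about.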

@ezyang @goldsborough as existing reviewers (also please bring in anyone else with relevant knowledge. E.g. I'm not sure who is most familiar with the profiling code.)

edit: also @Jorghi12 regarding the ROCM tests

facebook-github-bot

@anderspapitto has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

goldsborough

Curious about some things, especially the newly disabled warnings

tools/autograd/templates/VariableType.h


 #include <ATen/ATen.h>

+#include "torch/csrc/WindowsTorchApiMacro.h"


torch/csrc/autograd/function.cpp

-thread_local uint64_t Function::next_sequence_nr_ = 0;
+/// Monotonically incrementing (thread local!) counter to supply sequence
+/// numbers.
+thread_local uint64_t Function_next_sequence_nr_ = 0;


torch/csrc/autograd/profiler.h

-    getEventList().record(EventKind::PopRange, std::string(), thread_id, state == ProfilerState::CUDA);
-  }
-}
+RangeEventList& getEventList();


torch/csrc/autograd/init.cpp


  py::class_<torch::autograd::profiler::Event>(m,"ProfilerEvent")
  .def("kind",&torch::autograd::profiler::Event::kind)
-  .def("name",&torch::autograd::profiler::Event::name)


torch/csrc/autograd/grad_mode.h

 namespace torch { namespace autograd {

 struct GradMode {
-  static bool is_enabled() {


torch/CMakeLists.txt

+    # This is required for Python 2 declarations that are deprecated
+    # in 3.
+    -Wno-deprecated-declarations
+    # Python 2.6 requires -fno-strict-aliasing see


torch/CMakeLists.txt

+    -Wno-unused-parameter
+    -Wno-missing-field-initializers
+    -Wno-write-strings
+#    -Wno-zero-length-array


torch/CMakeLists.txt

+
+  if (APPLE)
+    target_compile_options(test_api PRIVATE
+      -Wno-error=unknown-warning-option


aten/src/ATen/ATenGeneral.h


 #ifdef _WIN32
-# if defined(ATen_cpu_EXPORTS) || defined(caffe2_EXPORTS)
+# if defined(torch_EXPORTS)


setup.py


 # we specify exact lib names to avoid conflict with lua-torch installs
-CAFFE2_LIBS = [os.path.join(lib_path, 'libcaffe2.so')]
+CAFFE2_LIBS = [os.path.join(lib_path, 'libcaffe2.so'), os.path.join(lib_path, 'libtorch.so')]


tools/autograd/templates/VariableType.h

  virtual at::Tensor unsafeTensorFromTH(void * th_pointer, bool retain) const override;

-  static at::Type* getType(const at::Type& baseType);
+  TORCH_API static at::Type* getType(const at::Type& baseType);


tools/build_pytorch_libs.bat

                  -DBUILD_CAFFE2=OFF ^
+                  -DBUILD_TORCH="%BUILD_TORCH%" ^
+                  -DNVTOOLEXT_HOME="%NVTOOLEXT_HOME%" ^
+                  -DNO_API=ON ^


torch/CMakeLists.txt

  target_compile_options(torch PRIVATE -Werror)
 endif()

+if (MSVC)


ezyang · 2018-07-13T15:42:24Z

Would it be possible to get a detailed breakdown of the semantic changes that were made, in plain English? E.g., something similar to the comment at #9358; it will help us reviewers out a lot.

torch/CMakeLists.txt

+  endif()
+
+  if(MSVC OR APPLE)
+    target_link_libraries(torch caffe2_gpu_library ${TORCH_CUDA_LIBRARIES})


torch/CMakeLists.txt

-if (TORCH_BUILD_TEST)
-  # JIT Tests. TODO: Put into test/cpp/jit folder
-
+# JIT Tests. TODO: Put into test/cpp/jit folder


torch/CMakeLists.txt

+endif()

-    target_link_libraries(test_api torch)
+if (NOT NO_API AND NOT USE_ROCM)


setup.py now compiles libtorch.so via CMake, and _C.so takes a dependency on libtorch.so. Most cpp files that were previously built directly into _C.so have been removed from _C.so and added to libtorch.so.

ebetica · 2018-07-26T18:47:29Z

@anderspapitto I'm using the PyTorch C++ API in csrc/api, and I used to rely on ./build_all.sh to build the libs before linking (in open source). However, this PR got rid of that, and the command seems quite cumbersome:

export BUILD_TYPE=relwithdebinfo; export LIBTORCH_BUILDPATH=build/libtorch; ./build_caffe2.sh; ./build_libtorch.sh

Any chance it could be brought back and supported in the tests again so it does not break?

anderspapitto requested review from apaszke, colesbury, ezyang, gchanan, soumith and zdevito as code owners June 22, 2018 17:11

anderspapitto force-pushed the build-libtorch-root-cmake branch 11 times, most recently from 1f7bcc0 to 88b0f78 Compare June 26, 2018 18:20

anderspapitto force-pushed the build-libtorch-root-cmake branch 4 times, most recently from 28de5e5 to 98c6a0a Compare June 26, 2018 19:14

anderspapitto changed the title ~~[WIP] Build libtorch root cmake~~ Introduce libtorch to setup.py build Jun 26, 2018

anderspapitto force-pushed the build-libtorch-root-cmake branch from 98c6a0a to d4c7bb3 Compare June 26, 2018 19:31

ezyang approved these changes Jun 26, 2018

anderspapitto force-pushed the build-libtorch-root-cmake branch from 78b6a9f to 3933272 Compare July 3, 2018 15:21

anderspapitto requested a review from ebetica as a code owner July 3, 2018 15:21

anderspapitto force-pushed the build-libtorch-root-cmake branch from 6965cde to 8b1d5cb Compare July 5, 2018 19:51

anderspapitto mentioned this pull request Jul 5, 2018

[WIP] Code changes in preparation for libtorch #9191

Closed

anderspapitto force-pushed the build-libtorch-root-cmake branch 2 times, most recently from 274f619 to 07c01b9 Compare July 6, 2018 19:59

anderspapitto mentioned this pull request Jul 13, 2018

Make JIT tracing a thread-local property #9414

Closed

anderspapitto mentioned this pull request Jul 14, 2018

[Feature Request] enable C++ frontend support on Windows #9450

Closed

anderspapitto added 2 commits July 18, 2018 12:58

Introduce libtorch to setup.py build

c1dcaab

setup.py now compiles libtorch.so via CMake, and _C.so takes a dependency on libtorch.so. Most cpp files that were previously built directly into _C.so have been removed from _C.so and added to libtorch.so.

separately compile InternedStrings::InternedStrings() with -O0

e07054a

ezyang mentioned this pull request Dec 13, 2019

Stop using ctypes to interface with CUDA libraries. #31160

Closed

