remove xla-specific stuff from codegen (minus CPU fallback)#58064

Closed
bdhirsh wants to merge 9 commits into gh/bdhirsh/115/base from gh/bdhirsh/115/head

Conversation

@bdhirsh
Collaborator

@bdhirsh bdhirsh commented May 11, 2021

Summary
This PR tries to remove all xla-specific logic from the codegen except for two places:

  • renaming the aten_xla_type.h/cpp template files. I'm going to do that in a separate PR just to keep this diff easier to understand.
  • CPU fallback logic (everything in aten_xla_type_default.h/cpp and gen_external_aten_fallbacks.py). I'm planning to kill all of that logic in a subsequent PR by making the CPU fallback a boxed kernel, so it felt unnecessary to go through it all and remove the xla references here.
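To illustrate the boxed-kernel idea mentioned above: instead of codegen emitting one CPU-fallback wrapper per operator, a single generic ("boxed") kernel handles every op by walking a uniform argument list. This is only a toy Python sketch of the concept; the names (`make_boxed_cpu_fallback`, the dict-based "backend tensor") are invented here and are not the actual PyTorch API.

```python
# Hypothetical sketch of a boxed CPU fallback: one generic kernel serves all
# ops, so no per-op wrapper needs to be code-generated.

def make_boxed_cpu_fallback(cpu_impls, to_cpu, to_backend):
    """Return a single fallback callable usable for any op name."""
    def fallback(op_name, *args):
        # "Boxed": we don't know the arity or types statically; we just
        # convert every argument to its CPU representation.
        cpu_args = [to_cpu(a) for a in args]
        result = cpu_impls[op_name](*cpu_args)
        # Move the result back to the backend's representation.
        return to_backend(result)
    return fallback

# Toy "backend tensor": a tagged value standing in for an XLA tensor.
cpu_impls = {"aten::add": lambda a, b: a + b}
fallback = make_boxed_cpu_fallback(
    cpu_impls,
    to_cpu=lambda x: x["data"] if isinstance(x, dict) else x,
    to_backend=lambda x: {"device": "xla", "data": x},
)

print(fallback("aten::add", {"device": "xla", "data": 2}, 3))
# → {'device': 'xla', 'data': 5}
```

The payoff is exactly what the PR description says: the per-op fallback codegen (gen_external_aten_fallbacks.py) becomes unnecessary once one boxed kernel can be registered as the fallback for the whole backend dispatch key.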

Notable changes
The xla codegen includes some custom logging in each kernel wrapper, so I added a few new knobs to the external yaml, which we now test. I have a corresponding xla-side PR with the new yaml changes, which look like this:

```
per_op_log: XLA_FN_TRACK(3)
per_argument_log: TF_VLOG(3)
cpu_fallback_counter: XLA_COUNTER("aten::{name}", 1)
extra_headers: >
     #include <tensorflow/compiler/xla/xla_client/debug_macros.h>
     #include <tensorflow/compiler/xla/xla_client/metrics.h>
     #include <tensorflow/compiler/xla/xla_client/tf_logging.h>
     #include <torch_xla/csrc/function_call_tracker.h>
     #include <torch_xla/csrc/aten_xla_type.h>
     #include <torch_xla/csrc/aten_xla_type_default.h>
```
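To make the knobs concrete: each yaml value is a C++ statement that the codegen splices verbatim into every generated wrapper, with `{name}` filled in per operator. The sketch below is a hypothetical illustration; `generate_wrapper` and the template text are invented for this example, not the real torchgen code.

```python
# Hypothetical sketch: how codegen might splice the yaml logging knobs into
# each generated kernel wrapper. The real codegen is more involved; this just
# shows the substitution mechanics.

def generate_wrapper(op_name, per_op_log=None, cpu_fallback_counter=None):
    lines = [f"Tensor {op_name}_wrapper(...) {{"]
    if per_op_log is not None:
        # Emitted verbatim at the top of the wrapper, e.g. XLA_FN_TRACK(3).
        lines.append(f"  {per_op_log};")
    if cpu_fallback_counter is not None:
        # {name} in the yaml value is filled in with the operator name.
        lines.append(f"  {cpu_fallback_counter.format(name=op_name)};")
    lines.append(f"  return AtenXlaTypeDefault::{op_name}(...);")
    lines.append("}")
    return "\n".join(lines)

src = generate_wrapper(
    "abs",
    per_op_log="XLA_FN_TRACK(3)",
    cpu_fallback_counter='XLA_COUNTER("aten::{name}", 1)',
)
print(src)
```

With the values from the yaml above, the emitted wrapper body contains `XLA_FN_TRACK(3);` and `XLA_COUNTER("aten::abs", 1);`.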

Stack from ghstack:

Differential Revision: D28711095

@facebook-github-bot
Contributor

facebook-github-bot commented May 11, 2021

💊 CI failures summary and remediations

As of commit 82188c4 (more details on the Dr. CI page):


  • 2/2 failures introduced in this PR

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_backward_compatibility_check_test (1/2)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

May 26 00:16:35 The PR is introducing backward ...m to confirm whether this change is wanted or not.
May 26 00:16:35 processing existing schema:  alltoall(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, Tensor[] _2) -> (__torch__.torch.classes.dist_c10d.Work _0)
May 26 00:16:35 processing existing schema:  send(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, int _2, int _3) -> (__torch__.torch.classes.dist_c10d.Work _0)
May 26 00:16:35 processing existing schema:  recv(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, int _2, int _3) -> (__torch__.torch.classes.dist_c10d.Work _0)
May 26 00:16:35 processing existing schema:  recv_anysource(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, int _2) -> (__torch__.torch.classes.dist_c10d.Work _0)
May 26 00:16:35 processing existing schema:  barrier(__torch__.torch.classes.dist_c10d.ProcessGroup _0) -> (__torch__.torch.classes.dist_c10d.Work _0)
May 26 00:16:35 processing existing schema:  __init__(__torch__.torch.classes.dist_c10d.frontend _0) -> (NoneType _0)
May 26 00:16:35 processing existing schema:  new_process_group_helper(__torch__.torch.classes.dist_c10d.frontend _0, int _1, int _2, int[] _3, str _4, __torch__.torch.classes.dist_c10d.Store _5, str? _6, int _7) -> (__torch__.torch.classes.dist_c10d.ProcessGroup _0)
May 26 00:16:35 processing existing schema:  get_process_group_by_name(__torch__.torch.classes.dist_c10d.frontend _0, str _1) -> (__torch__.torch.classes.dist_c10d.ProcessGroup _0)
May 26 00:16:35 processing existing schema:  get_name_of_process_group(__torch__.torch.classes.dist_c10d.frontend _0, __torch__.torch.classes.dist_c10d.ProcessGroup _1) -> (str _0)
May 26 00:16:35 processing existing schema:  __init__(__torch__.torch.classes.dist_rpc.WorkerInfo _0, str _1, int _2) -> (NoneType _0)
May 26 00:16:35 The PR is introducing backward incompatible changes to the operator library. Please contact PyTorch team to confirm whether this change is wanted or not. 
May 26 00:16:35 
May 26 00:16:35 Broken ops: [
May 26 00:16:35 	aten::repeat_interleave.Tensor(Tensor repeats, int? output_size=None) -> (Tensor)
May 26 00:16:35 	aten::repeat_interleave.self_Tensor(Tensor self, Tensor repeats, int? dim=None, int? output_size=None) -> (Tensor)
May 26 00:16:35 	aten::repeat_interleave.self_int(Tensor self, int repeats, int? dim=None, int? output_size=None) -> (Tensor)
May 26 00:16:35 ]
May 26 00:16:35 =================== sccache compilation log ===================
May 26 00:16:35 =========== If your build fails, please take a look at the log above for possible reasons ===========
May 26 00:16:35 Compile requests                      0
May 26 00:16:35 Compile requests executed             0

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_build (2/2)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

May 26 00:39:44 torch_xla/csrc/init_python_bind... error: use of undeclared identifier 'AtenXlaType'
May 26 00:39:23 /var/lib/jenkins/workspace/torch/csrc/utils/python_strings.h:105:19: warning: unused function 'PyObject_FastGetAttrString' [-Wunused-function]
May 26 00:39:23 static py::object PyObject_FastGetAttrString(PyObject *obj, char *name)
May 26 00:39:23                   ^
May 26 00:39:26 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/init_python_bindings.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/init_python_bindings.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 26 00:39:29 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/op_by_op_executor.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/op_by_op_executor.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 26 00:39:31 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/softmax_builder.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/softmax_builder.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 26 00:39:35 1 warning generated.
May 26 00:39:35 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/nll_loss.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/nll_loss.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 26 00:39:40 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/function_call_tracker.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/function_call_tracker.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 26 00:39:44 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/ir_dump_util.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/ir_dump_util.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 26 00:39:44 torch_xla/csrc/init_python_bindings.cpp:712:16: error: use of undeclared identifier 'AtenXlaType'
May 26 00:39:44         []() { AtenXlaType::InitializeAtenBindings(); });
May 26 00:39:44                ^
May 26 00:39:50 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/ir.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/ir.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 26 00:39:52 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/shape_builder.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/shape_builder.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 26 00:39:54 1 error generated.
May 26 00:39:54 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/aten_xla_type_default.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/aten_xla_type_default.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 26 00:39:54 /opt/conda/lib/python3.6/site-packages/torch/utils/cpp_extension.py:370: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
May 26 00:39:54   warnings.warn(msg.format('we could not find ninja.'))
May 26 00:39:54 error: command 'clang-9' failed with exit status 1
May 26 00:39:54 + cleanup

This comment was automatically generated by Dr. CI.

// Base ATEN Type class where the XLA specific overrides should be defined.
class AtenXlaType {
public:
static void InitializeAtenBindings();
Collaborator Author


I noticed that this function actually doesn't do anything, so I opted to kill it rather than leave it as a function in the template that all backends need to implement. @ailzhang does that seem fine?

Contributor


Yup this looks like a legacy that we can safely kill :D

@bdhirsh bdhirsh requested review from ailzhang, bhosmer and ezyang May 12, 2021 17:12
# Logging macros that are inserted into wrappers. Only really used by external backends.
per_op_log: Optional[str] = None
per_argument_log: Optional[str] = None
cpu_fallback_counter: Optional[str] = None
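The `Optional[str] = None` defaults in the diff mean "backend didn't ask for this hook, emit nothing". A minimal sketch of that behavior, assuming a container class like the one in the diff (the `ExternalBackendConfig` name is invented here):

```python
# Sketch of the optional-knob semantics: None means the external backend's
# yaml omitted the field, so codegen skips the corresponding logging line.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ExternalBackendConfig:
    # Field names mirror the diff above.
    per_op_log: Optional[str] = None
    per_argument_log: Optional[str] = None
    cpu_fallback_counter: Optional[str] = None

cfg = ExternalBackendConfig(per_op_log="XLA_FN_TRACK(3)")
# Only the knobs that were actually set in the yaml produce output.
emitted = [v for v in (cfg.per_op_log, cfg.per_argument_log,
                       cfg.cpu_fallback_counter) if v is not None]
print(emitted)
# → ['XLA_FN_TRACK(3)']
```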
Contributor


some tests exercising this?

Collaborator Author


welp, I added some expect tests but forgot to commit them. Adding them back soon.

Contributor

@ezyang ezyang left a comment


neato!

Contributor

@ailzhang ailzhang left a comment


Looks great! Thanks!

// Base ATEN Type class where the XLA specific overrides should be defined.
class AtenXlaType {
public:
static void InitializeAtenBindings();
Contributor


Yup this looks like a legacy that we can safely kill :D

index: Dict['OperatorName', BackendMetadata]

# Logging macros that are inserted into wrappers. Only really used by external backends.
per_op_log: Optional[str] = None
Contributor


nit: renaming these to per_op_setup/per_argument_setup/cpu_fallback_setup to make these fields more general, wdyt?

Collaborator Author


Actually, I'm planning on removing these from the PR (but let me know if you agree). From Jack's comment, it sounds fine to remove the custom logging from the codegen, which is the only reason I added it in.

bdhirsh added 3 commits May 19, 2021 13:30
bdhirsh added a commit that referenced this pull request May 24, 2021
bdhirsh added a commit that referenced this pull request May 25, 2021
@bdhirsh
Collaborator Author

bdhirsh commented May 26, 2021

@bdhirsh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@bdhirsh merged this pull request in 86ce295.

@bdhirsh bdhirsh mentioned this pull request May 27, 2021
deniskokarev pushed a commit to deniskokarev/pytorch that referenced this pull request Jun 9, 2021
…58064)

Summary:
Pull Request resolved: pytorch#58064

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D28711095

Pulled By: bdhirsh

fbshipit-source-id: 90a48440f2e865a948184e2fb167ea240ada47bb