Description
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Arch Linux
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: n/A
- TensorFlow installed from (source or binary): source
- TensorFlow version: 2.0.0
- Python version: 3.7.4
- Installed using virtualenv? pip? conda?: n/A
- Bazel version (if compiling from source): bazel 0.29.1- (@Non-Git)
- GCC/Compiler version (if compiling from source): gcc-8 (GCC) 8.3.0
- CUDA/cuDNN version: 10.1.243-1/7.6.4.38-1
- GPU model and memory: NVIDIA GeForce GTX 760 4GB
Describe the problem
Building TensorFlow 2.0.0 on a system with glibc 2.30 fails because of a function name clash in grpc: grpc defines its own static gettid(), which conflicts with the gettid() that glibc 2.30 declares via unistd.h. This is already fixed upstream (in grpc/grpc#18950), and grpc releases containing the fix are available. I believe updating the grpc dependency should fix the issue in TensorFlow.
Provide the exact sequence of commands / steps that you executed before running into the problem
On Arch Linux:
cd $(mktemp -d /tmp/tensorflow-test-build-XXX)
curl -L -o PKGBUILD 'https://git.archlinux.org/svntogit/community.git/plain/trunk/PKGBUILD?h=packages/tensorflow'
makepkg
If you're not on Arch Linux, have a look at the build script in the PKGBUILD file - it contains all the environment variable definitions and build commands, and is very readable.
Any other info / logs
End of build log:
...
INFO: From Compiling external/llvm/lib/DebugInfo/CodeView/TypeRecordMapping.cpp:
external/llvm/lib/DebugInfo/CodeView/TypeRecordMapping.cpp: In member function 'virtual llvm::Error llvm::codeview::TypeRecordMapping::visitKnownRecord(llvm::codeview::CVType&, llvm::codeview::VFTableShapeRecord&)':
external/llvm/lib/DebugInfo/CodeView/TypeRecordMapping.cpp:293:61: warning: 'Byte' may be used uninitialized in this function [-Wmaybe-uninitialized]
293 | Record.Slots.push_back(static_cast<VFTableSlotKind>(Byte >> 4));
| ^~~~
ERROR: /tmp/bazel/michiel/output/41c10338046435fcb3c7d7f27ec34951/external/grpc/BUILD:507:1: C++ compilation of rule '@grpc//:gpr_base' failed (Exit 1)
external/grpc/src/core/lib/gpr/log_linux.cc:43:13: error: ambiguating new declaration of 'long int gettid()'
43 | static long gettid(void) { return syscall(__NR_gettid); }
| ^~~~~~
In file included from /usr/include/unistd.h:1170,
from external/grpc/src/core/lib/gpr/log_linux.cc:41:
/usr/include/bits/unistd_ext.h:34:16: note: old declaration '__pid_t gettid()'
34 | extern __pid_t gettid (void) __THROW;
| ^~~~~~
external/grpc/src/core/lib/gpr/log_linux.cc:43:13: warning: 'long int gettid()' defined but not used [-Wunused-function]
43 | static long gettid(void) { return syscall(__NR_gettid); }
| ^~~~~~
INFO: Elapsed time: 744.187s, Critical Path: 41.65s
INFO: 3096 processes: 3096 local.
FAILED: Build did NOT complete successfully
==> ERROR: A failure occurred in build().
Aborting...
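The compile error above comes down to two declarations of gettid() with different return types. The following is a minimal sketch of the clash and of one way to avoid it by renaming the local helper. The name sys_gettid is chosen here for illustration and is an assumption; the actual upstream change is the one in grpc/grpc#18950.

```cpp
// Minimal sketch (Linux only): a local thread-id helper that does not
// collide with the gettid() declared by glibc >= 2.30.
#include <sys/syscall.h>  // __NR_gettid
#include <unistd.h>       // syscall(); newer glibc also declares gettid() here

// Writing `static long gettid(void)` at this point would be rejected on
// glibc 2.30, because <unistd.h> already declares
// `extern __pid_t gettid(void)` with a different return type.
// Using a distinct name (hypothetical: sys_gettid) sidesteps the clash
// and still compiles on older glibc versions.
static long sys_gettid(void) { return syscall(__NR_gettid); }
```

A patch along these lines would also need to update every call site of the old helper in the affected grpc sources.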
What I've tried so far to fix this
I tried to emulate the grpc version bump from commit 061c359 and update grpc to v1.24.3. However, since v1.19.x, which the currently referenced grpc version seems to come from, grpc has added a dependency on https://github.com/protocolbuffers/upb, and it wasn't clear to me how to add and initialise this dependency in a way that's consistent with TensorFlow's use of Bazel. Specifically, I wasn't sure where to call grpc_deps()/upb_deps() from, or what equivalent step is required instead.
Context
In case it matters, I'm trying to build the Arch Linux package from source so that I can add 3.0 to TF_CUDA_COMPUTE_CAPABILITIES, which my graphics card requires. I managed to do this for an earlier version of TensorFlow some time ago without much difficulty.