-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-8147: [C++] add GCS library to ThirdpartyToolchain #11268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-8147: [C++] add GCS library to ThirdpartyToolchain #11268
Conversation
|
|
|
The "Java JNI" test passes locally with The appveyor failure seems to be a race condition in a test. The test has a In any case, if I am wrong do let me know and I will investigate / change things as needed. |
This adds the GCS library to ThirdpartyToolchain, and enables the library in some of the builds. A future PR will add the library to the package manager configuration files and to any remaining builds where it needs to go.
| -DCMAKE_CXX_STANDARD=11 | ||
| "-DCMAKE_CXX_FLAGS=${EP_CXX_FLAGS}" | ||
| -DCMAKE_INSTALL_LIBDIR=lib | ||
| "-DCMAKE_INSTALL_PREFIX=<INSTALL_DIR>") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use EP_COMMON_CMAKE_ARGS here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Also did the same for nlohmann_json (add EP_COMMON_CMAKE_ARGS and sort the arguments), but maybe it has no effect there.
| -DCMAKE_INSTALL_RPATH=$ORIGIN | ||
| -DCMAKE_INSTALL_LIBDIR=lib | ||
| -DGOOGLE_CLOUD_CPP_ENABLE=storage | ||
| -DBUILD_TESTING=OFF) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you sort this list in alphabetical order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| if(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "AUTO") | ||
| find_package(google_cloud_cpp_storage QUIET) | ||
| if(NOT google_cloud_cpp_storage_FOUND) | ||
| build_google_cloud_cpp() | ||
| endif() | ||
| elseif(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "BUNDLED") | ||
| build_google_cloud_cpp() | ||
| elseif(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "SYSTEM") | ||
| find_package(google_cloud_cpp_storage REQUIRED) | ||
| endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use resolve_dependency() here?
| if(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "AUTO") | |
| find_package(google_cloud_cpp_storage QUIET) | |
| if(NOT google_cloud_cpp_storage_FOUND) | |
| build_google_cloud_cpp() | |
| endif() | |
| elseif(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "BUNDLED") | |
| build_google_cloud_cpp() | |
| elseif(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "SYSTEM") | |
| find_package(google_cloud_cpp_storage REQUIRED) | |
| endif() | |
| resolve_dependency(google_cloud_cpp_storage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry. My suggestion was insufficient.
We need to add google_cloud_cpp_storage entry to build_dependency():
diff --git a/cpp/cmake_modules/ThirdpartyToolchain.cmake b/cpp/cmake_modules/ThirdpartyToolchain.cmake
index 36c8c33ab..e12abefc5 100644
--- a/cpp/cmake_modules/ThirdpartyToolchain.cmake
+++ b/cpp/cmake_modules/ThirdpartyToolchain.cmake
@@ -151,6 +151,8 @@ macro(build_dependency DEPENDENCY_NAME)
build_gflags()
elseif("${DEPENDENCY_NAME}" STREQUAL "GLOG")
build_glog()
+ elseif("${DEPENDENCY_NAME}" STREQUAL "google_cloud_cpp_storage")
+ build_google_cloud_cpp_storage()
elseif("${DEPENDENCY_NAME}" STREQUAL "gRPC")
build_grpc()
elseif("${DEPENDENCY_NAME}" STREQUAL "GTest")And we should use google_cloud_cpp_storage_SOURCE and build_google_cloud_cpp_storage for variable/function names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, I think.
| -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} | ||
| "-DCMAKE_CXX_FLAGS=${EP_CXX_FLAGS}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove them because EP_COMMON_CMAKE_ARGS include them?
| -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} | |
| "-DCMAKE_CXX_FLAGS=${EP_CXX_FLAGS}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| PROPERTIES IMPORTED_LOCATION | ||
| "${GOOGLE_CLOUD_CPP_STATIC_LIBRARY_COMMON}" | ||
| INTERFACE_INCLUDE_DIRECTORIES | ||
| "${GOOGLE_CLOUD_CPP_INSTALL_PREFIX}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a typo?
| "${GOOGLE_CLOUD_CPP_INSTALL_PREFIX}") | |
| "${GOOGLE_CLOUD_CPP_INCLUDE_DIR}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks.
| Threads::Threads | ||
| OpenSSL::SSL | ||
| OpenSSL::Crypto) | ||
| add_dependencies(google-cloud-cpp::storage GOOGLE_CLOUD_CPP_ep) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use lowercase here?
| add_dependencies(google-cloud-cpp::storage GOOGLE_CLOUD_CPP_ep) | |
| add_dependencies(google-cloud-cpp::storage google_cloud_cpp_ep) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
| if(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "AUTO") | ||
| find_package(google_cloud_cpp_storage QUIET) | ||
| if(NOT google_cloud_cpp_storage_FOUND) | ||
| build_google_cloud_cpp() | ||
| endif() | ||
| elseif(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "BUNDLED") | ||
| build_google_cloud_cpp() | ||
| elseif(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "SYSTEM") | ||
| find_package(google_cloud_cpp_storage REQUIRED) | ||
| endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry. My suggestion was insufficient.
We need to add google_cloud_cpp_storage entry to build_dependency():
diff --git a/cpp/cmake_modules/ThirdpartyToolchain.cmake b/cpp/cmake_modules/ThirdpartyToolchain.cmake
index 36c8c33ab..e12abefc5 100644
--- a/cpp/cmake_modules/ThirdpartyToolchain.cmake
+++ b/cpp/cmake_modules/ThirdpartyToolchain.cmake
@@ -151,6 +151,8 @@ macro(build_dependency DEPENDENCY_NAME)
build_gflags()
elseif("${DEPENDENCY_NAME}" STREQUAL "GLOG")
build_glog()
+ elseif("${DEPENDENCY_NAME}" STREQUAL "google_cloud_cpp_storage")
+ build_google_cloud_cpp_storage()
elseif("${DEPENDENCY_NAME}" STREQUAL "gRPC")
build_grpc()
elseif("${DEPENDENCY_NAME}" STREQUAL "GTest")And we should use google_cloud_cpp_storage_SOURCE and build_google_cloud_cpp_storage for variable/function names.
| set(CRC32C_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/crc32c_ep-install") | ||
| set(CRC32C_CMAKE_ARGS | ||
| ${EP_COMMON_CMAKE_ARGS} | ||
| -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. FWIW, there are many other uses of this in the file, you may want to clean that up in a separate bug+PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right.
| ${EP_COMMON_CMAKE_ARGS} | ||
| -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} | ||
| -DCMAKE_CXX_STANDARD=11 | ||
| "-DCMAKE_CXX_FLAGS=${EP_CXX_FLAGS}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| "-DCMAKE_CXX_FLAGS=${EP_CXX_FLAGS}" |
| set(GOOGLE_CLOUD_CPP_CMAKE_ARGS | ||
| ${EP_COMMON_CMAKE_ARGS} | ||
| -DBUILD_TESTING=OFF | ||
| -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| if(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "AUTO") | ||
| find_package(google_cloud_cpp_storage QUIET) | ||
| if(NOT google_cloud_cpp_storage_FOUND) | ||
| build_google_cloud_cpp_storage() | ||
| endif() | ||
| elseif(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "BUNDLED") | ||
| build_google_cloud_cpp_storage() | ||
| elseif(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "SYSTEM") | ||
| find_package(google_cloud_cpp_storage REQUIRED) | ||
| endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolve_dependency() does this.
| if(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "AUTO") | |
| find_package(google_cloud_cpp_storage QUIET) | |
| if(NOT google_cloud_cpp_storage_FOUND) | |
| build_google_cloud_cpp_storage() | |
| endif() | |
| elseif(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "BUNDLED") | |
| build_google_cloud_cpp_storage() | |
| elseif(GOOGLE_CLOUD_CPP_SOURCE STREQUAL "SYSTEM") | |
| find_package(google_cloud_cpp_storage REQUIRED) | |
| endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat, thanks. Removed.
| ARROW_WITH_ZSTD=ON \ | ||
| ASAN_SYMBOLIZER_PATH=/usr/lib/llvm-${llvm}/bin/llvm-symbolizer \ | ||
| AWSSDK_SOURCE=BUNDLED \ | ||
| GOOGLE_CLOUD_CPP_SOURCE=BUNDLED \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| GOOGLE_CLOUD_CPP_SOURCE=BUNDLED \ | |
| google_cloud_cpp_storage_SOURCE=BUNDLED \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Note that this could be a nuisance if you ever want to use Bigtable or Spanner as data sources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may need to implement custom resolve_dependency() like feature for them when we also want to use other Google Cloud C++ libraries.
If Google Cloud C++ provides google_cloud_cpp CMake package with storage, bigtable and so on components, we can use google_cloud_cpp_SOURCE and find_package(google_cloud_cpp COMPONENTS storage bigtable ...).
ci/scripts/cpp_build.sh
Outdated
| -DCMAKE_INSTALL_PREFIX=${CMAKE_INSTALL_PREFIX:-${ARROW_HOME}} \ | ||
| -DCMAKE_UNITY_BUILD=${CMAKE_UNITY_BUILD:-OFF} \ | ||
| -Dgflags_SOURCE=${gflags_SOURCE:-} \ | ||
| -DGOOGLE_CLOUD_CPP_SOURCE=${GOOGLE_CLOUD_CPP_SOURCE:-} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| -DGOOGLE_CLOUD_CPP_SOURCE=${GOOGLE_CLOUD_CPP_SOURCE:-} \ | |
| -Dgoogle_cloud_cpp_storage_SOURCE=${google_cloud_cpp_storage_SOURCE:-} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| -DCRC32C_BUILD_TESTS=OFF | ||
| -DCRC32C_BUILD_BENCHMARKS=OFF | ||
| -DCRC32C_USE_GLOG=OFF) | ||
| set(CRC32C_BUILD_BYPRODUCTS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| set(CRC32C_BUILD_BYPRODUCTS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooops, thanks. Removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Thanks.
|
Oh, arm64 build on Travis CI is failed: https://app.travis-ci.com/github/apache/arrow/jobs/541265250#L2124 |
| ARROW_WITH_ZSTD=ON \ | ||
| ASAN_SYMBOLIZER_PATH=/usr/lib/llvm-${llvm}/bin/llvm-symbolizer \ | ||
| AWSSDK_SOURCE=BUNDLED \ | ||
| google_cloud_cpp_storage_SOURCE=BUNDLED \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was GCS_SOURCE too ambiguous? This is quite a long parameter name...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use CMake's package name for XXX_SOURCE.
If we don't follow the convention, we can choose GCS_SOURCE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know how to proceed, I can do either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's the CMake name then fair enough.
|
Forgot to update |
That seems to be fixed upstream in crc32c, but the fix is not in any tag or release. Would you be Okay with me pinning to a SHA of this dependency? Can I reproduce these builds myself to verify I have fixed such problems? |
Yes. We sometimes use this approach.
Yes.
BTW, it's strange that Travis CI jobs weren't run for this pull request... |
In #11268 I neglected to add `google_cloud_cpp_storage` to the third-party dependencies. Closes #11320 from coryan/ARROW-14223-add-google-cloud-cpp-to-thirdparty-dependencies Authored-by: Carlos O'Ryan <coryan@google.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
This adds the GCS library to ThirdpartyToolchain, and enables the library in some of the builds. A future PR will add the library to the package manager configuration files and to any remaining builds where it needs to go. Closes apache#11268 from coryan/ARROW-8147-add-google-cloud-cpp-to-ThirdpartyToolchain Authored-by: Carlos O'Ryan <coryan@google.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
In apache#11268 I neglected to add `google_cloud_cpp_storage` to the third-party dependencies. Closes apache#11320 from coryan/ARROW-14223-add-google-cloud-cpp-to-thirdparty-dependencies Authored-by: Carlos O'Ryan <coryan@google.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
This adds the GCS library to ThirdpartyToolchain, and enables the
library in some of the builds. A future PR will add the library to the
package manager configuration files and to any remaining builds where it
needs to go.