Skip to content

[Arrow] Update to 0.17.1#11472

Merged
dan-shaw merged 9 commits intomicrosoft:masterfrom
GPSnoopy:Arrow-0.17.1
Jun 9, 2020
Merged

[Arrow] Update to 0.17.1#11472
dan-shaw merged 9 commits intomicrosoft:masterfrom
GPSnoopy:Arrow-0.17.1

Conversation

@GPSnoopy
Copy link
Copy Markdown
Contributor

@GPSnoopy GPSnoopy commented May 20, 2020

Updates Arrow from 0.17.0 to the recently released 0.17.1.

Built and successfully tested on Windows (x64-static), Linux (x64) and macOS (x64) via ParquetSharp CI:
G-Research/ParquetSharp#136

Copy link
Copy Markdown
Contributor

@NancyLi1013 NancyLi1013 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed this port was set as fail with the following triplets:

arrow:arm64-windows=fail
arrow:x64-linux=fail
arrow:x86-windows=fail

So does this port also not support for arm or arm64 and linux?

I checked the build process on osx pipeline, it didn't build successfully.

Are you clear the specific supports?

@GPSnoopy
Copy link
Copy Markdown
Contributor Author

@NancyLi1013 Where do you see these status?

The ParquetSharp above shows that the arrow port does build from scratch for x64-windows-static, x64-linux and x64-osx on GitHub Actions / Azure.

I'm not sure if Apache Arrow supports x86 (e.g. PyArrow has no wheels for x86, https://pypi.org/project/pyarrow/#files). And I've never tried Arm64.

@GPSnoopy
Copy link
Copy Markdown
Contributor Author

GPSnoopy commented May 21, 2020

The real question is why does the vcpkg CI display a green build when it contains the following?
Error: Building package arrow:x64-linux failed with: BUILD_FAILED

@NancyLi1013
Copy link
Copy Markdown
Contributor

NancyLi1013 commented May 22, 2020

Hi @GPSnoopy

I got these info from ci.baseline.txt
https://github.com/microsoft/vcpkg/blob/master/scripts/ci.baseline.txt#L86-L88

If arrow doesn't support arm64, linux and osx, x86, we need to add vcpkg_fail_port_install(ON_ARCH "x86" "arm64" ON_TARGET "linux" "osx") to the top of portfile.cmake.

If not, we need to fix the problems on these triplets.

As for the CI question you mentioned above, since we have set arrow:x64-linux=fail in ci.baseline.txt.

The CI check results depend on the status of triplets for the ports in ci.baseline.txt and build results. If the two statuses are consistent, the final check will show pass, namely display a green.

@NancyLi1013 NancyLi1013 added the category:port-update The issue is with a library, which is requesting update new revision label May 22, 2020
GPSnoopy added 3 commits May 22, 2020 10:02
Add explicit tool dependencies on Flex and Bison for Linux and OSX.
…check.

Remove thrift:x64-osx=fail from ci.baseline.txt (we know arrow depends on it, and arrow:x64-osx has been shown to work in 3rd party project).
@GPSnoopy
Copy link
Copy Markdown
Contributor Author

GPSnoopy commented May 22, 2020

@NancyLi1013 I've removed the fail status for both linux and osx. I've also replaced the adhoc check on arch with vcpkg_fail_port_install, such that the build will fail with x86, arm and arm64.

The CI is properly showing the failures, as you explained. Now, we know that these targets all pass when run via ParquetSharp CI. So what's different with vcpkg CI?

x64-linux

This one really baffles me. It seems to go wrong in Arrow's FindBrotli.

-- Checking for modules 'libbrotlicommon;libbrotlienc;libbrotlidec'
--   Found libbrotlicommon, version 1.0.7
--   Found libbrotlienc, version 1.0.7
--   Found libbrotlidec, version 1.0.7
-- Configuring incomplete, errors occurred!
CMake Error at /mnt/_work/1/s/downloads/tools/cmake-3.17.2-linux/cmake-3.17.2-Linux-x86_64/share/cmake-3.17/Modules/FindPackageHandleStandardArgs.cmake:164 (message):
  Could NOT find Brotli (missing: BROTLI_COMMON_LIBRARY BROTLI_ENC_LIBRARY
  BROTLI_DEC_LIBRARY)
Call Stack (most recent call first):
  /mnt/_work/1/s/downloads/tools/cmake-3.17.2-linux/cmake-3.17.2-Linux-x86_64/share/cmake-3.17/Modules/FindPackageHandleStandardArgs.cmake:445 (_FPHSA_FAILURE_MESSAGE)
  cmake_modules/FindBrotli.cmake:107 (find_package_handle_standard_args)
  /mnt/_work/1/s/scripts/buildsystems/vcpkg.cmake:329 (_find_package)
  cmake_modules/ThirdpartyToolchain.cmake:156 (find_package)
  cmake_modules/ThirdpartyToolchain.cmake:874 (resolve_dependency)
  CMakeLists.txt:458 (include)

However I cannot reproduce this error. It works absolutely fine when run on my machine and via GitHub Actions (see PR link in the description); note that both are running Ubuntu 18.04. This is what I get in my logs instead:

-- Checking for modules 'libbrotlicommon;libbrotlienc;libbrotlidec'
--   Found libbrotlicommon, version 1.0.7
--   Found libbrotlienc, version 1.0.7
--   Found libbrotlidec, version 1.0.7
-- Found Brotli: /home/gpsnoopy/Development/vcpkg.tanguyf/packages/brotli_x64-linux/debug/lib/libbrotlicommon-static.a

Not being able to reproduce the issue locally is making this incredibly difficult to figure out. Any ideas?

x64-osx

[1/95] cd /Volumes/data/work/1/s/buildtrees/thrift/src/401ebbec17-573d9ca574/compiler/cpp && /usr/bin/bison -d -o /Volumes/data/buildtrees/thrift/x64-osx-dbg/compiler/cpp/thrift/thrifty.cc /Volumes/data/work/1/s/buildtrees/thrift/src/401ebbec17-573d9ca574/compiler/cpp/src/thrift/thrifty.yy
FAILED: compiler/cpp/thrift/thrifty.cc compiler/cpp/thrift/thrifty.hh 
cd /Volumes/data/work/1/s/buildtrees/thrift/src/401ebbec17-573d9ca574/compiler/cpp && /usr/bin/bison -d -o /Volumes/data/buildtrees/thrift/x64-osx-dbg/compiler/cpp/thrift/thrifty.cc /Volumes/data/work/1/s/buildtrees/thrift/src/401ebbec17-573d9ca574/compiler/cpp/src/thrift/thrifty.yy
/Volumes/data/work/1/s/buildtrees/thrift/src/401ebbec17-573d9ca574/compiler/cpp/src/thrift/thrifty.yy:1.1-5: invalid directive: `%code'
/Volumes/data/work/1/s/buildtrees/thrift/src/401ebbec17-573d9ca574/compiler/cpp/src/thrift/thrifty.yy:1.7-14: syntax error, unexpected identifier

I believe the issue here is the Thrift dependency on Bison. The default version included in macOS is super old (2007 IIRC). This is why ParquetSharp CI does brew install bison on macOS before running vcpkg.

What's the best way to solve this? Sort of feels like this should be in the CI rather than in Thrift port.

@jgiannuzzi
Copy link
Copy Markdown
Contributor

jgiannuzzi commented May 22, 2020

x64-osx

[1/95] cd /Volumes/data/work/1/s/buildtrees/thrift/src/401ebbec17-573d9ca574/compiler/cpp && /usr/bin/bison -d -o /Volumes/data/buildtrees/thrift/x64-osx-dbg/compiler/cpp/thrift/thrifty.cc /Volumes/data/work/1/s/buildtrees/thrift/src/401ebbec17-573d9ca574/compiler/cpp/src/thrift/thrifty.yy
FAILED: compiler/cpp/thrift/thrifty.cc compiler/cpp/thrift/thrifty.hh 
cd /Volumes/data/work/1/s/buildtrees/thrift/src/401ebbec17-573d9ca574/compiler/cpp && /usr/bin/bison -d -o /Volumes/data/buildtrees/thrift/x64-osx-dbg/compiler/cpp/thrift/thrifty.cc /Volumes/data/work/1/s/buildtrees/thrift/src/401ebbec17-573d9ca574/compiler/cpp/src/thrift/thrifty.yy
/Volumes/data/work/1/s/buildtrees/thrift/src/401ebbec17-573d9ca574/compiler/cpp/src/thrift/thrifty.yy:1.1-5: invalid directive: `%code'
/Volumes/data/work/1/s/buildtrees/thrift/src/401ebbec17-573d9ca574/compiler/cpp/src/thrift/thrifty.yy:1.7-14: syntax error, unexpected identifier

I believe the issue here is the Thrift dependency on Bison. The default version included in macOS is super old (2007 IIRC). This is why ParquetSharp CI does brew install bison on macOS before running vcpkg.

I can confirm that this issue is because the default Bison is too old on macOS. This is why a newer version of Bison is downloaded by vcpkg on Windows:

if(CMAKE_HOST_WIN32)
set(PROGNAME win_bison)
set(SUBDIR win_bison-2.5.16)
set(PATHS ${DOWNLOADS}/tools/win_bison/${SUBDIR})
set(URL "https://sourceforge.net/projects/winflexbison/files/winflexbison-2.5.16.zip/download")
set(ARCHIVE "win_flex_bison-2.5.16.zip")
set(HASH 0a14154bff5d998feb23903c46961528f8ccb4464375d5384db8c4a7d230c0c599da9b68e7a32f3217a0a0735742242eaf3769cb4f03e00931af8640250e9123)
if(NOT EXISTS "${PATHS}/data/m4sugar/m4sugar.m4")
file(REMOVE_RECURSE "${PATHS}")
endif()
else()

I wondered whether this approach could work for macOS too, but could not find any official build of Bison for macOS, except for Homebrew bottles (binary packages), but then we're probably better off just using Homebrew.

See also my comment here: #7533 (comment)

@NancyLi1013
Copy link
Copy Markdown
Contributor

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@adamreeve
Copy link
Copy Markdown
Contributor

I can reproduce the CI failure locally by using the vcpkg ci command: ./vcpkg ci x64-linux --x-xunit="$PWD/xml-results/x64-linux.xml" --exclude=${SKIPLIST} --binarycaching --purge-tombstones (after setting up SKIPLIST to skip most things except arrow and its dependencies). However, running ./vcpkg install arrow:x64-linux on the same machine from a separate clean checkout does run successfully.

It looks like the problem is that when the ci build runs, the packages directory gets cleared after each package is installed, whereas running the install command doesn’t do that, and the logs show that arrow is using Brotli from in the packages directory rather than the installed directory:

Found Brotli: /mnt/c/Users/adree/gr/vcpkg/packages/brotli_x64-linux/debug/lib/libbrotlicommon-static.a

I’m not very familiar with CMake, but FindBrotli.cmake in arrow appears to use pkg-config to locate Brotli with the pkg_check_modules function. However, the pkg-config file for Brotli at vcpkg/installed/x64-linux/lib/pkgconfig/libbrotlicommon.pc for example references the Brotli directories within the vcpkg/packages directory that no longer exist when using the ci command:

prefix=/mnt/c/Users/adree/gr/vcpkg/packages/brotli_x64-linux
exec_prefix=/mnt/c/Users/adree/gr/vcpkg/packages/brotli_x64-linux
libdir=${prefix}/lib
includedir=${prefix}/include

Name: libbrotlicommon
URL: https://github.com/google/brotli
Description: Brotli common dictionary library
Version: 1.0.7
Libs: -L${libdir} -lbrotlicommon
Cflags: -I${includedir}

So it seems that because vcpkg moves files from packages into installed, the pkg-config files are no longer correct and arrow then looks in the wrong place for Brotli.

I guess the fix here might be to patch FindBrotli.cmake in the arrow port so it uses a different method to locate the Brotli libraries, but it seems a bit sad to have to do that to work around vcpkg behaviour. Would it make sense for vcpkg to understand pkg-config files and fix paths in them when it moves files from packages into installed?

adamreeve and others added 4 commits May 29, 2020 10:39
This is incompatible with vcpkg as these files refer to paths in the
packages directory rather than the installed directory, so this only
works if the packages haven't been cleaned.
@GPSnoopy
Copy link
Copy Markdown
Contributor Author

@NancyLi1013 Thanks to @adamreeve changes, the x64-linux CI build is now working.

I've marked back x64-osx as still failing until a proper solution for Bison tooling on macOS can be devised. Open to ideas.

In the meantime, I'm happy for the PR to be merged if you are.

@GPSnoopy
Copy link
Copy Markdown
Contributor Author

GPSnoopy commented Jun 1, 2020

Hi @NancyLi1013, thanks for the review and the patch.

The osx failure seems to be unrelated to this PR, I think it's safe to merge.

@NancyLi1013
Copy link
Copy Markdown
Contributor

NancyLi1013 commented Jun 1, 2020

Thanks for your kindly reminder.
The failure on x64-osx is not related with this PR. We will try to fix this.
Once fixed, I will rerun this PR.

@NancyLi1013
Copy link
Copy Markdown
Contributor

All features have passed with the following triplets:

  • x64-windows
  • x64-windows-static

Note: This port only supports x64.

@NancyLi1013 NancyLi1013 added the info:needs-maintainer-attention Lets the current 'on rotation' vcpkg maintainer know they need to look at this. label Jun 5, 2020
@GPSnoopy
Copy link
Copy Markdown
Contributor Author

GPSnoopy commented Jun 5, 2020

x64-linux has passed too. Or did I miss something?

Let us all pray to the merger gods now :-)

@NancyLi1013
Copy link
Copy Markdown
Contributor

LGTM now, @GPSnoopy thanks for this PR.

@dan-shaw dan-shaw merged commit 8f34b4b into microsoft:master Jun 9, 2020
@GPSnoopy GPSnoopy deleted the Arrow-0.17.1 branch June 10, 2020 11:34
cenit pushed a commit to cenit/vcpkg that referenced this pull request Jun 11, 2020
* [Arrow] Update to 0.17.1

* Remove arrow:x64-linux=fail from ci.baseline.txt.
Add explicit tool dependencies on Flex and Bison for Linux and OSX.

* Revert arrow dependency on Flex/Bison, it's Thrift that needs them and its portfile is already fine.

* Use vcpkg_fail_port_install(ON_ARCH x86 arm arm64) instead of custom check.
Remove thrift:x64-osx=fail from ci.baseline.txt (we know arrow depends on it, and arrow:x64-osx has been shown to work in 3rd party project).

* Disable using pkg-config files to locate dependencies in arrow

This is incompatible with vcpkg as these files refer to paths in the
packages directory rather than the installed directory, so this only
works if the packages haven't been cleaned.

* Mark thrift:x64-osx as still failing until a proper solution for Bison can be found.

* Update ports/arrow/portfile.cmake

Co-authored-by: Adam Reeve <adreeve@gmail.com>
Co-authored-by: NancyLi1013 <46708020+NancyLi1013@users.noreply.github.com>
JangBoo pushed a commit to JangBoo/vcpkg that referenced this pull request Jun 18, 2020
* [Arrow] Update to 0.17.1

* Remove arrow:x64-linux=fail from ci.baseline.txt.
Add explicit tool dependencies on Flex and Bison for Linux and OSX.

* Revert arrow dependency on Flex/Bison, it's Thrift that needs them and its portfile is already fine.

* Use vcpkg_fail_port_install(ON_ARCH x86 arm arm64) instead of custom check.
Remove thrift:x64-osx=fail from ci.baseline.txt (we know arrow depends on it, and arrow:x64-osx has been shown to work in 3rd party project).

* Disable using pkg-config files to locate dependencies in arrow

This is incompatible with vcpkg as these files refer to paths in the
packages directory rather than the installed directory, so this only
works if the packages haven't been cleaned.

* Mark thrift:x64-osx as still failing until a proper solution for Bison can be found.

* Update ports/arrow/portfile.cmake

Co-authored-by: Adam Reeve <adreeve@gmail.com>
Co-authored-by: NancyLi1013 <46708020+NancyLi1013@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category:port-update The issue is with a library, which is requesting update new revision info:needs-maintainer-attention Lets the current 'on rotation' vcpkg maintainer know they need to look at this.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants