Skip to content

GH-32292: [R][Packaging] Use binaries built on CentOS 7 for Ubuntu < 22.04#34048

Merged
assignUser merged 21 commits intoapache:mainfrom
kou:r-binaries-centos-7
Mar 22, 2023
Merged

GH-32292: [R][Packaging] Use binaries built on CentOS 7 for Ubuntu < 22.04#34048
assignUser merged 21 commits intoapache:mainfrom
kou:r-binaries-centos-7

Conversation

@kou
Copy link
Copy Markdown
Member

@kou kou commented Feb 6, 2023

Rationale for this change

Binaries built on CentOS 7 uses _GLIBCXX_USE_CXX11_ABI=0. So we can't use them on Ubuntu 20.04 because
Ubuntu 20.04 uses _GLIBCXX_USE_CXX11_ABI=1. If we use them on Ubuntu 20.04, our R package must use
_GLIBCXX_USE_CXX11_ABI=0 too.

There is another problem for Ubuntu 20.04. CentOS 7 uses OpenSSL 1.0 but Ubuntu 20.04 uses OpenSSL 1.1.
OpenSSL 1.0 and 1.1 are incompatible. So we can't use binaries built on CentOS 7. We need binaries for OpenSSL
1.0 and OpenSSL 1.1. We can't use the same binaries on CentOS 7 (OpenSSL 1.0) and Ubuntu 20.04 (OpenSSL 1.1).

What changes are included in this PR?

This changes add -D_GLIBCXX_USE_CXX11_ABI=0 to Cflags in arrow.pc. Our R package uses Cflags in
arrow.pc. So users don't need to specify -D_GLIBCXX_USE_CXX11_ABI=0 explicitly.

This changes replace binaries built on Ubuntu 18.04 with binaries built on CentOS 7 with OpenSSL 1.1.
(The "ubuntu-18.04" binaries are replaced with the "linux-openssl-1.1" binaries.)

The "centos-7" binaries are renamed to the "linux-openssl-1.0" binaries.

The "ubuntu-22.04" binaries are renamed to the "linux-openssl-3.0" binaries.

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@kou
Copy link
Copy Markdown
Member Author

kou commented Feb 6, 2023

@github-actions crossbow submit r-binary-packages

@github-actions
Copy link
Copy Markdown

github-actions bot commented Feb 6, 2023

@github-actions
Copy link
Copy Markdown

github-actions bot commented Feb 6, 2023

⚠️ GitHub issue #32292 has been automatically assigned in GitHub to PR creator.

@github-actions

This comment was marked as outdated.

@kou
Copy link
Copy Markdown
Member Author

kou commented Feb 6, 2023

@github-actions crossbow submit r-binary-packages

@github-actions

This comment was marked as outdated.

@kou
Copy link
Copy Markdown
Member Author

kou commented Feb 6, 2023

@github-actions crossbow submit r-binary-packages

@github-actions

This comment was marked as outdated.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How common is it to be on ubuntu and not have pkg-config installed? Does this mean we would require pkg-config in order for the binary to be used?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How common is it to be on ubuntu and not have pkg-config installed?

Hmm. I'm not sure preference of R users on Ubuntu but I think that users who have GCC also have pkg-config.

But rocker/r-ver:4.0.0 and rocker/r-ver:3.6.3 doesn't have pkg-config. So we need this.

Does this mean we would require pkg-config in order for the binary to be used?

Yes.

We can fallback to source build when a user doesn't have pkg-config:

diff --git a/r/configure b/r/configure
index ff6a9dacc4..08fa58d327 100755
--- a/r/configure
+++ b/r/configure
@@ -176,6 +176,7 @@ else
       if [ "${ARROW_DEPENDENCY_SOURCE}" = "AUTO" ] && \
            [ "${PKG_CONFIG_AVAILABLE}" = "false" ]; then
         export ARROW_DEPENDENCY_SOURCE=BUNDLED
+        export LIBARROW_BINARY=false
         echo "**** pkg-config not installed, setting ARROW_DEPENDENCY_SOURCE=BUNDLED"
       fi
 

Do you want me to add this change to this PR?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather ship a binary that just works. Would it be terrible to grep for the CXX11_ABI in arrow.pc and add it to PKG_CFLAGS?

If we were to set LIBARROW_BINARY=false if no pkg-config, it should be in its own "${PKG_CONFIG_AVAILABLE}" = "false" check, not connected to whatever the value of ARROW_DEPENDENCY_SOURCE is since that's about building libarrow from source.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather ship a binary that just works. Would it be terrible to grep for the CXX11_ABI in arrow.pc and add it to PKG_CFLAGS?

It will work. I've added the logic but I'm afraid of increasing ad-hoc logics again.

If we were to set LIBARROW_BINARY=false if no pkg-config, it should be in its own "${PKG_CONFIG_AVAILABLE}" = "false" check, not connected to whatever the value of ARROW_DEPENDENCY_SOURCE is since that's about building libarrow from source.

It makes sense.

@kou
Copy link
Copy Markdown
Member Author

kou commented Feb 7, 2023

We can't use the centos-7 binary on Ubuntu 20.04 because centos-7 binary built with OpenSSL 1.0 and Ubuntu 20.04 uses OpenSSL 1.1.

https://github.com/ursacomputing/crossbow/actions/runs/4103886242/jobs/7078891973#step:12:266

  /opt/R/4.1.3/lib/R/library/00LOCK-arrow/00new/arrow/libs/arrow.so: undefined symbol: HMAC_CTX_cleanup

FYI: HMAC_CTX_cleanup() is referred by aws-sdk-cpp.

Here are solution candidates:

  1. Add centos-7-openssl-1.1 binary (We can use OpenSSL 1.1 on CentOS 7 because EPEL provides `openssl11-devel)
  2. Add ubuntu-20.04 binary
  3. Updating aws-sdk-cpp (GH-20272: [C++] Bump version of bundled AWS SDK #33808) may resolve this because OpenSSL related code is changed in the latest aws-sdk-cpp (hint: s2n-tls)

I'll try 3. first but is there any preference?

@nealrichardson
Copy link
Copy Markdown
Member

If upgrading aws-sdk-cpp solves it, great, but doesn't google-cloud also require openssl?

Could we bundle openssl rather than require it from the system? I think the answer from last time I asked this was "no because security", but also as I understand it, the python wheels include openssl in them.

@kou kou force-pushed the r-binaries-centos-7 branch from 0b8afc6 to 34c5ec7 Compare February 9, 2023 06:28
@kou
Copy link
Copy Markdown
Member Author

kou commented Feb 9, 2023

@github-actions crossbow submit r-binary-packages

@github-actions

This comment was marked as outdated.

@kou
Copy link
Copy Markdown
Member Author

kou commented Feb 9, 2023

Upgrading aws-sdk-cpp doesn't solve the problem.

Could we bundle openssl rather than require it from the system? I think the answer from last time I asked this was "no because security", but also as I understand it, the python wheels include openssl in them.

Hmm. I want to stop bundling OpenSSL in wheel because we can't release a fixed version in a timely manner for now.
For example, we released 10.0.1 for https://nvd.nist.gov/vuln/detail/CVE-2022-3786 .
CVE-2022-3786 was published on 2022-11-01,
the 10.0.1 vote https://lists.apache.org/thread/rlkrj9lnfmwgn7kq8hvmzf06l5z6w30k was started on 2022-11-17 and
the vote was carried on 2022-11-23 https://lists.apache.org/thread/ozo2k7l7jhc0wj3wsrpqgsvy4fosprv8 .

Anyway, I try bundling OpenSSL.

@kou
Copy link
Copy Markdown
Member Author

kou commented Feb 9, 2023

@github-actions crossbow submit r-binary-packages

@github-actions

This comment was marked as outdated.

@kou
Copy link
Copy Markdown
Member Author

kou commented Feb 10, 2023

@github-actions crossbow submit r-binary-packages

@github-actions

This comment was marked as outdated.

@kou
Copy link
Copy Markdown
Member Author

kou commented Feb 10, 2023

@github-actions crossbow submit r-binary-packages

@github-actions

This comment was marked as outdated.

@kou
Copy link
Copy Markdown
Member Author

kou commented Feb 10, 2023

We can't use static linking for OpenSSL:

https://github.com/ursacomputing/crossbow/actions/runs/4140935566/jobs/7160057196#step:6:1857

/arrow/r/libarrow/dist/s2n_tls_ep-prefix/src/s2n_tls_ep/crypto/s2n_fips.c:21:6: error: #error "Interning with OpenSSL fips-validated libcrypto is not currently supported. See https://github.com/aws/s2n-tls/issues/2741"
     #error "Interning with OpenSSL fips-validated libcrypto is not currently supported. See https://github.com/aws/s2n-tls/issues/2741"
      ^~~~~

@kou kou force-pushed the r-binaries-centos-7 branch from 07452f9 to 514f27c Compare March 21, 2023 14:22
@kou
Copy link
Copy Markdown
Member Author

kou commented Mar 21, 2023

@github-actions crossbow submit -g cpp r-binary-packages

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Mar 21, 2023
@github-actions

This comment was marked as outdated.

@kou
Copy link
Copy Markdown
Member Author

kou commented Mar 21, 2023

@github-actions crossbow submit -g cpp r-binary-packages

@github-actions
Copy link
Copy Markdown

Revision: b7dd9a4

Submitted crossbow builds: ursacomputing/crossbow @ actions-ce74ddcb8c

Task Status
r-binary-packages Github Actions
test-alpine-linux-cpp Github Actions
test-build-cpp-fuzz Github Actions
test-conda-cpp Github Actions
test-conda-cpp-valgrind Azure
test-cuda-cpp Github Actions
test-debian-10-cpp-amd64 Github Actions
test-debian-10-cpp-i386 Github Actions
test-debian-11-cpp-amd64 Github Actions
test-debian-11-cpp-i386 Github Actions
test-fedora-35-cpp Github Actions
test-ubuntu-18.04-cpp Github Actions
test-ubuntu-18.04-cpp-release Github Actions
test-ubuntu-18.04-cpp-static Github Actions
test-ubuntu-20.04-cpp Github Actions
test-ubuntu-20.04-cpp-20 Github Actions
test-ubuntu-20.04-cpp-bundled Github Actions
test-ubuntu-20.04-cpp-thread-sanitizer Github Actions
test-ubuntu-22.04-cpp Github Actions

@kou
Copy link
Copy Markdown
Member Author

kou commented Mar 21, 2023

@github-actions crossbow submit r-binary-packages

@github-actions
Copy link
Copy Markdown

Revision: 955aa34

Submitted crossbow builds: ursacomputing/crossbow @ actions-0f2d95fc8a

Task Status
r-binary-packages Github Actions

@kou
Copy link
Copy Markdown
Member Author

kou commented Mar 22, 2023

I'll merge this tomorrow if nobody objects it.

Copy link
Copy Markdown
Member

@assignUser assignUser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@assignUser assignUser merged commit 9288275 into apache:main Mar 22, 2023
@assignUser
Copy link
Copy Markdown
Member

assignUser commented Mar 22, 2023

(I merged now so the change affects the next packaging-nightly run)

@kou kou deleted the r-binaries-centos-7 branch March 22, 2023 04:07
@ursabot
Copy link
Copy Markdown

ursabot commented Mar 22, 2023

Benchmark runs are scheduled for baseline = 5b49d0f and contender = 9288275. 9288275 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.36% ⬆️0.0%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.03% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 92882751 ec2-t3-xlarge-us-east-2
[Finished] 92882751 test-mac-arm
[Finished] 92882751 ursa-i9-9960x
[Finished] 92882751 ursa-thinkcentre-m75q
[Finished] 5b49d0f1 ec2-t3-xlarge-us-east-2
[Failed] 5b49d0f1 test-mac-arm
[Finished] 5b49d0f1 ursa-i9-9960x
[Finished] 5b49d0f1 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

ArgusLi pushed a commit to Bit-Quill/arrow that referenced this pull request May 15, 2023
…ntu < 22.04 (apache#34048)

### Rationale for this change

Binaries built on CentOS 7 uses `_GLIBCXX_USE_CXX11_ABI=0`. So we can't use them on Ubuntu 20.04 because
Ubuntu 20.04 uses `_GLIBCXX_USE_CXX11_ABI=1`. If we use them on Ubuntu 20.04, our R package must use
`_GLIBCXX_USE_CXX11_ABI=0` too.

There is another problem for Ubuntu 20.04. CentOS 7 uses OpenSSL 1.0 but Ubuntu 20.04 uses OpenSSL 1.1.
OpenSSL 1.0 and 1.1 are incompatible. So we can't use binaries built on CentOS 7. We need binaries for OpenSSL
1.0 and OpenSSL 1.1. We can't use the same binaries on CentOS 7 (OpenSSL 1.0) and Ubuntu 20.04 (OpenSSL 1.1).

### What changes are included in this PR?

This changes add `-D_GLIBCXX_USE_CXX11_ABI=0` to `Cflags` in `arrow.pc`. Our R package uses `Cflags` in
`arrow.pc`. So users don't need to specify `-D_GLIBCXX_USE_CXX11_ABI=0` explicitly.

This changes replace binaries built on Ubuntu 18.04 with binaries built on CentOS 7 with OpenSSL 1.1.
(The "ubuntu-18.04" binaries are replaced with the "linux-openssl-1.1" binaries.)

The "centos-7" binaries are renamed to the "linux-openssl-1.0" binaries.

The "ubuntu-22.04" binaries are renamed to the "linux-openssl-3.0" binaries.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* Closes: apacheGH-33091
* Closes: apache#32292

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Jacob Wujciak-Jens <jacob@wujciak.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[R][CI] Remove ubuntu-18.04 from nixlibs & prebuilt binaries [R] Build linux binaries on older image (like manylinux2014)

4 participants