ci: Temporarily use clang in valgrind tasks#34589
ci: Temporarily use clang in valgrind tasks#34589maflcko wants to merge 3 commits intobitcoin:masterfrom
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please copy-paste ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
l0rinc
left a comment
There was a problem hiding this comment.
concept ACK
Could you please add a link to the bug for more context and split the change to commits that explain all the different changes.
And it's "temporary", can we add a todo to the code to make it official :)?
It is not clear what to do, so a tracking issue seems more appropriate than a vague todo.
The upstream bugs are https://bugs.kde.org/show_bug.cgi?id=485276 and https://bugs.kde.org/show_bug.cgi?id=472329 |
fa173ec to
fafbcc8
Compare
fafbcc8 to
fa978b6
Compare
|
rebased and ready for nack/ack again |
fa978b6 to
fa6fdff
Compare
This allows to run the test under valgrind:
./bld-cmake/test/functional/feature_dbcrash.py --timeout-factor=10 --valgrind
For testing, the same test can be run multiple times in parallel:
./bld-cmake/test/functional/test_runner.py -j 10 $( printf 'feature_dbcrash.py %.0s' {1..10} ) --timeout-factor=10 --valgrind
(Running the test under valgrind may take several hours!)
I found that before this commit, 9 out of the 10 runs failed via:
```
...
TestFramework (INFO): Iteration 36, generating 2500 transactions [11, 5, 6]
TestFramework (ERROR): Unexpected exception
Traceback (most recent call last):
File "/b-c/test/functional/test_framework/test_framework.py", line 142, in main
self.run_test()
~~~~~~~~~~~~~^^
File "/b-c/bld-cmake/test/functional/feature_dbcrash.py", line 262, in run_test
self.sync_node3blocks(block_hashes)
~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "/b-c/bld-cmake/test/functional/feature_dbcrash.py", line 151, in sync_node3blocks
nodei_utxo_hash = self.restart_node(i, block_hash)
File "/b-c/bld-cmake/test/functional/feature_dbcrash.py", line 102, in restart_node
raise AssertionError(f"Unable to successfully restart node {node_index} in allotted time")
AssertionError: Unable to successfully restart node 0 in allotted time
```
With this commit, all 10 runs passed.
valgrind currently does not work on GCC -O2 compiled executables, which contain std::optional use, due to an upstream bug. See https://bugs.kde.org/show_bug.cgi?id=472329 One workaround could be to use -O1. However, that seems brittle, as variantions of the bug were seen with -O1 as well. So temporarily use clang in the valgrind CI tasks, because this also allows to drop a false-positive suppression for: -DCMAKE_CXX_FLAGS='-Wno-error=array-bounds' Also, bump the container CI_IMAGE_NAME_TAG, while touching the config. This bumps the valgrind version to be more recent: https://packages.ubuntu.com/resolute/valgrind vs https://packages.debian.org/trixie/valgrind Also, update the comment in contrib/valgrind.supp to remove GCC and aarch64: * GCC wasn't tested with the suppressions file, due to the mentioned bug * Clang-17 (or later) on aarch64 wasn't tested due to bug bitcoin#29635 and the minimum supported clang version is clang-17 right now. This means the only tested config right now is the one mentioned in the suppression file: "Tested on x86_64 with Debian Trixie system libs, using clang, without gui."
fa6fdff to
fa6ba11
Compare
|
Code review ACK fa6ba11 |
| # Note that suppressions may depend on OS and/or library versions. | ||
| # Tested on aarch64 and x86_64 with Ubuntu Noble system libs, using clang-16 | ||
| # and GCC, without gui. | ||
| # Tested on x86_64 with Debian Trixie system libs, using clang, without gui. |
There was a problem hiding this comment.
fa6ba11 ci: Temporarily use clang in valgrind tasks:
Was it actually re-tested after the rebase with Debian Trixie
There was a problem hiding this comment.
No, I have not tested the latest push at all. I've only tested this on Ubuntu 26.04. It should be easy to re-test by manually spinning up a x86_64 box and running the two CI valgrind configs, which use Debian Trixie.
| @@ -17,7 +17,9 @@ export TEST_RUNNER_EXTRA="--exclude rpc_bind --exclude feature_bind_extra" | |||
| export GOAL="install" | |||
| # TODO enable GUI | |||
There was a problem hiding this comment.
I wonder why this isn't enabled. Let me try to enable it ...
Edit: Running with system libs gives:
$ valgrind --gen-suppressions=all --exit-on-first-error=yes --error-exitcode=1 --quiet ./bld-cmake/bin/test_bitcoin-qt
...
==906896== Conditional jump or move depends on uninitialised value(s)
==906896== at 0x243F0E26: ???
==906896== by 0x1F08261F: ???
==906896==
{
<insert_a_suppression_name_here>
Memcheck:Cond
obj:*
obj:*
}
==906896==
==906896== Exit program on first error (--exit-on-first-error=yes)
There was a problem hiding this comment.
Only seems to pass with DEBUG=1 depends.
For reference, the following command did pass for me:
( cd depends && make DEBUG=1 -j $(nproc) ) && cmake --fresh -B ./bld-cmake --toolchain depends/*/toolchain.cmake -DAPPEND_CXXFLAGS='-O3 -g2' -DAPPEND_CFLAGS='-O3 -g2' -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXE_LINKER_FLAGS=-fuse-ld=mold -DCMAKE_C_COMPILER='clang' -DCMAKE_CXX_COMPILER='clang++' --preset=dev-mode -DCMAKE_EXPORT_COMPILE_COMMANDS=ON && cmake --build ./bld-cmake --parallel $(nproc) && valgrind --exit-on-first-error=yes --error-exitcode=1 --quiet ./bld-cmake/bin/test_bitcoin-qt
Pushed a new commit to remove the TODO for now.
There was a problem hiding this comment.
I've also tried:
diff --git a/ci/test/00_setup_env_native_msan.sh b/ci/test/00_setup_env_native_msan.sh
index 4a17fb4..0e1697b 100755
--- a/ci/test/00_setup_env_native_msan.sh
+++ b/ci/test/00_setup_env_native_msan.sh
@@ -16,14 +16,13 @@ export MSAN_AND_LIBCXX_FLAGS="${MSAN_FLAGS} ${LIBCXX_FLAGS}"
export CONTAINER_NAME="ci_native_msan"
export PACKAGES="clang-${APT_LLVM_V} llvm-${APT_LLVM_V} llvm-${APT_LLVM_V}-dev libclang-${APT_LLVM_V}-dev libclang-rt-${APT_LLVM_V}-dev python3-pip"
export PIP_PACKAGES="--break-system-packages pycapnp"
-export DEP_OPTS="DEBUG=1 NO_QT=1 CC=clang CXX=clang++ CFLAGS='${MSAN_FLAGS}' CXXFLAGS='${MSAN_AND_LIBCXX_FLAGS}'"
+export DEP_OPTS="DEBUG=1 CC=clang CXX=clang++ CFLAGS='${MSAN_FLAGS}' CXXFLAGS='${MSAN_AND_LIBCXX_FLAGS}'"
export GOAL="install"
export CI_LIMIT_STACK_SIZE=1
# Setting CMAKE_{C,CXX}_FLAGS_DEBUG flags to an empty string ensures that the flags set in MSAN_FLAGS remain unaltered.
# _FORTIFY_SOURCE is not compatible with MSAN.
export BITCOIN_CONFIG="\
--preset=dev-mode \
- -DBUILD_GUI=OFF \
-DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_C_FLAGS_DEBUG='' \
-DCMAKE_CXX_FLAGS_DEBUG='' \Which prints:
All tests passed.
~InitExecutor : Stopping thread
~InitExecutor : Stopped thread
==87986==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x56145624af4f in QLibraryStore::cleanup() /usr/qtbase/src/corelib/plugin/qlibrary.cpp:364:26
#1 0x56145623d05c in qlibraryCleanup() /usr/qtbase/src/corelib/plugin/qlibrary.cpp:378:5
#2 0x56145623d05c in (anonymous namespace)::qlibraryCleanup_dtor_class_::~qlibraryCleanup_dtor_class_() /usr/qtbase/src/corelib/plugin/qlibrary.cpp:380:23
#3 0x561450f4884f in MSanCxaAtExitWrapper(void*) (/ci_container_base/ci/scratch/build-x86_64-pc-linux-gnu/bin/test_bitcoin-qt+0x2df84f) (BuildId: 0a3f6add7bb2dd2a7bfc097a6edf88a12aebb62f)
#4 0x7fb04940da75 (/lib/x86_64-linux-gnu/libc.so.6+0x47a75) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
#5 0x7fb04940dbbd in exit (/lib/x86_64-linux-gnu/libc.so.6+0x47bbd) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
#6 0x7fb0493f01d0 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1d0) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
#7 0x7fb0493f028a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
#8 0x561450f16fb4 in _start (/ci_container_base/ci/scratch/build-x86_64-pc-linux-gnu/bin/test_bitcoin-qt+0x2adfb4) (BuildId: 0a3f6add7bb2dd2a7bfc097a6edf88a12aebb62f)
Member fields were destroyed
#0 0x561450f47de1 in __sanitizer_dtor_callback_fields (/ci_container_base/ci/scratch/build-x86_64-pc-linux-gnu/bin/test_bitcoin-qt+0x2dede1) (BuildId: 0a3f6add7bb2dd2a7bfc097a6edf88a12aebb62f)
#1 0x561455b213ae in QLoggingCategory::~QLoggingCategory() /usr/qtbase/src/corelib/io/qloggingcategory.h:41:32
#2 0x561455b213ae in QLoggingCategory::~QLoggingCategory() /usr/qtbase/src/corelib/io/qloggingcategory.cpp:194:1
#3 0x561450f4884f in MSanCxaAtExitWrapper(void*) (/ci_container_base/ci/scratch/build-x86_64-pc-linux-gnu/bin/test_bitcoin-qt+0x2df84f) (BuildId: 0a3f6add7bb2dd2a7bfc097a6edf88a12aebb62f)
#4 0x7fb04940dbbd in exit (/lib/x86_64-linux-gnu/libc.so.6+0x47bbd) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
#5 0x7fb0493f01d0 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1d0) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
#6 0x7fb0493f028a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
#7 0x561450f16fb4 in _start (/ci_container_base/ci/scratch/build-x86_64-pc-linux-gnu/bin/test_bitcoin-qt+0x2adfb4) (BuildId: 0a3f6add7bb2dd2a7bfc097a6edf88a12aebb62f)
SUMMARY: MemorySanitizer: use-of-uninitialized-value /usr/qtbase/src/corelib/plugin/qlibrary.cpp:364:26 in QLibraryStore::cleanup()
Exiting
99% tests passed, 1 tests failed out of 157
Total Test time (real) = 608.27 sec
The following tests FAILED:
7 - test_bitcoin-qt (Failed)
Errors while running CTest
Command '['/ci_container_base/ci/test/03_test_script.sh']' returned non-zero exit status 8.
A build with system libs (or with a normal depends build) will fail with: ```sh $ valgrind --exit-on-first-error=yes --error-exitcode=1 --quiet ./bld-cmake/bin/test_bitcoin-qt Detected locale "C" with character encoding "ANSI_X3.4-1968", which is not UTF-8. Qt depends on a UTF-8 locale, and has switched to "C.UTF-8" instead. If this causes problems, reconfigure your locale. See the locale(1) manual for more information. ********* Start testing of AppTests ********* Config: Using QtTest library 6.10.2, Qt 6.10.2 (x86_64-little_endian-lp64 shared (dynamic) release build; by GCC 15.2.0), ubuntu 26.04 PASS : AppTests::initTestCase() QINFO : AppTests::appTests() Backing up GUI settings to "/tmp/test_common bitcoin/60d474ffae390f81657d/regtest/guisettings.ini.bak" ==18007== Conditional jump or move depends on uninitialised value(s) ==18007== at 0x12655E26: ??? ==18007== by 0xCB28E7F: ??? ==18007== ==18007== ==18007== Exit program on first error (--exit-on-first-error=yes) ``` A DEBUG=1 depends build would work, but that seems tedious for questionable benefit.
|
Code review ACK fac9ae4 |
valgrind currently does not work on GCC compiled executables, due to an upstream bug. https://bugs.kde.org/show_bug.cgi?id=472329
So temporarily switch to clang, so that a long term solution can be figured out in the meantime.