
win32_deps_build: skip patching removed boost files #52427

Merged: idryomov merged 2 commits into ceph:main from petrutlucian94:fix_win32_boost on Jul 20, 2023


@petrutlucian94
Contributor

petrutlucian94 commented Jul 13, 2023

We're attempting to patch some Python files that have been removed from recent Boost versions.

Now that the Boost version has been bumped, we'll need to address this in order to unblock the Windows build.

Note that we'll also have to skip test_back_trace when using mingw-gcc as it triggers an ICE with recent Boost versions. However, this issue does not affect mingw-llvm.


@petrutlucian94 petrutlucian94 added win32 Specifix changes for the windows platform and removed build/ops labels Jul 13, 2023
@petrutlucian94 petrutlucian94 mentioned this pull request Jul 13, 2023
14 tasks
@idryomov
Contributor

Triggered an ICE:

[571/963] /usr/bin/x86_64-w64-mingw32-g++-posix -DBOOST_ASIO_DISABLE_THREAD_KEYWORD_EXTENSION -DBOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT -DBOOST_CHRONO_NO_LIB -DBOOST_DATE_TIME_NO_LIB -DBOOST_IOSTREAMS_NO_LIB -DBOOST_PROGRAM_OPTIONS_NO_LIB -DBOOST_RANDOM_NO_LIB -DBOOST_SYSTEM_NO_LIB -DBOOST_THREAD_NO_LIB -DBOOST_THREAD_PROVIDES_GENERIC_SHARED_MUTEX_ON_WIN -DBOOST_THREAD_V2_SHARED_MUTEX -DFMT_USE_TZSET=0 -DHAVE_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2 -D_POSIX=1 -D_POSIX_=1 -D_POSIX_C_SOURCE=1 -D_POSIX_THREADS=1 -D_REENTRANT -D_THREAD_SAFE -D_WIN32_WINNT=0x0A00 -D__CEPH__ -D__STDC_FORMAT_MACROS -I/home/ubuntu/ceph/build/src/include -I/home/ubuntu/ceph/src -I/home/ubuntu/ceph/src/include/win32 -isystem /home/ubuntu/ceph/build.deps/mingw/boost/include -isystem /home/ubuntu/ceph/build/include -isystem /home/ubuntu/ceph/src/xxHash -isystem /home/ubuntu/ceph/src/fmt/include -isystem /home/ubuntu/ceph/src/googletest/googlemock/include -isystem /home/ubuntu/ceph/src/googletest/googlemock -isystem /home/ubuntu/ceph/src/googletest/googletest/include -isystem /home/ubuntu/ceph/src/googletest/googletest -isystem /home/ubuntu/ceph/build.deps/mingw/zlib/include -isystem /home/ubuntu/ceph/build.deps/mingw/openssl/include -g1 -O3 -DNDEBUG   -U_FORTIFY_SOURCE -include winsock_wrapper.h -include win32_errno.h -DBOOST_PHOENIX_STL_TUPLE_H_ -Wall -fno-strict-aliasing -fsigned-char -Wtype-limits -Wignored-qualifiers -Wpointer-arith -Werror=format-security -Winit-self -Wno-unknown-pragmas -Wnon-virtual-dtor -Wno-ignored-qualifiers -ftemplate-depth-1024 -Wpessimizing-move -Wredundant-move -fpermissive -Wstrict-null-sentinel -Woverloaded-virtual -fstack-protector-strong -fdiagnostics-color=auto -std=c++2a -fno-inline -MD -MT src/test/common/CMakeFiles/unittest_back_trace.dir/test_back_trace.cc.obj -MF src/test/common/CMakeFiles/unittest_back_trace.dir/test_back_trace.cc.obj.d -o src/test/common/CMakeFiles/unittest_back_trace.dir/test_back_trace.cc.obj -c 
/home/ubuntu/ceph/src/test/common/test_back_trace.cc
FAILED: src/test/common/CMakeFiles/unittest_back_trace.dir/test_back_trace.cc.obj 
/usr/bin/x86_64-w64-mingw32-g++-posix -DBOOST_ASIO_DISABLE_THREAD_KEYWORD_EXTENSION -DBOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT -DBOOST_CHRONO_NO_LIB -DBOOST_DATE_TIME_NO_LIB -DBOOST_IOSTREAMS_NO_LIB -DBOOST_PROGRAM_OPTIONS_NO_LIB -DBOOST_RANDOM_NO_LIB -DBOOST_SYSTEM_NO_LIB -DBOOST_THREAD_NO_LIB -DBOOST_THREAD_PROVIDES_GENERIC_SHARED_MUTEX_ON_WIN -DBOOST_THREAD_V2_SHARED_MUTEX -DFMT_USE_TZSET=0 -DHAVE_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2 -D_POSIX=1 -D_POSIX_=1 -D_POSIX_C_SOURCE=1 -D_POSIX_THREADS=1 -D_REENTRANT -D_THREAD_SAFE -D_WIN32_WINNT=0x0A00 -D__CEPH__ -D__STDC_FORMAT_MACROS -I/home/ubuntu/ceph/build/src/include -I/home/ubuntu/ceph/src -I/home/ubuntu/ceph/src/include/win32 -isystem /home/ubuntu/ceph/build.deps/mingw/boost/include -isystem /home/ubuntu/ceph/build/include -isystem /home/ubuntu/ceph/src/xxHash -isystem /home/ubuntu/ceph/src/fmt/include -isystem /home/ubuntu/ceph/src/googletest/googlemock/include -isystem /home/ubuntu/ceph/src/googletest/googlemock -isystem /home/ubuntu/ceph/src/googletest/googletest/include -isystem /home/ubuntu/ceph/src/googletest/googletest -isystem /home/ubuntu/ceph/build.deps/mingw/zlib/include -isystem /home/ubuntu/ceph/build.deps/mingw/openssl/include -g1 -O3 -DNDEBUG   -U_FORTIFY_SOURCE -include winsock_wrapper.h -include win32_errno.h -DBOOST_PHOENIX_STL_TUPLE_H_ -Wall -fno-strict-aliasing -fsigned-char -Wtype-limits -Wignored-qualifiers -Wpointer-arith -Werror=format-security -Winit-self -Wno-unknown-pragmas -Wnon-virtual-dtor -Wno-ignored-qualifiers -ftemplate-depth-1024 -Wpessimizing-move -Wredundant-move -fpermissive -Wstrict-null-sentinel -Woverloaded-virtual -fstack-protector-strong -fdiagnostics-color=auto -std=c++2a -fno-inline -MD -MT src/test/common/CMakeFiles/unittest_back_trace.dir/test_back_trace.cc.obj -MF src/test/common/CMakeFiles/unittest_back_trace.dir/test_back_trace.cc.obj.d -o src/test/common/CMakeFiles/unittest_back_trace.dir/test_back_trace.cc.obj -c 
/home/ubuntu/ceph/src/test/common/test_back_trace.cc
during IPA pass: inline
/home/ubuntu/ceph/src/test/common/test_back_trace.cc:44:1: internal compiler error: Segmentation fault
   44 | }
      | ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions.

@idryomov
Contributor

jenkins test windows

@idryomov
Contributor

jenkins test make check

@petrutlucian94
Contributor Author

jenkins test windows

@idryomov
Contributor

The ICE when compiling src/test/common/test_back_trace.cc persists.

@petrutlucian94
Contributor Author

The ICE when compiling src/test/common/test_back_trace.cc persists.

Indeed, I'll have to investigate it further.

@github-actions github-actions bot added the tests label Jul 14, 2023
@petrutlucian94
Contributor Author

petrutlucian94 commented Jul 14, 2023

@idryomov It looks like the ICE only occurs with mingw-gcc and recent Boost versions; mingw-llvm works fine.

Considering that we're planning on switching to mingw-llvm, I've added another commit that just skips the test when using mingw-gcc. Speaking of mingw-llvm, @adamemerson could you please review the timespan changes as requested by Ilya? #51197 (comment)
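
Editor's sketch: the thread does not say where the skip is implemented, and because the ICE happens while compiling test_back_trace.cc, the actual skip presumably lives in the build configuration rather than in C++. Purely to illustrate how the two toolchains named above can be told apart, the helpers below use predefined compiler macros. The function names `built_with_mingw_gcc` and `toolchain_name` are hypothetical and are not part of Ceph.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Ceph: true only when this translation
// unit is compiled by a MinGW toolchain that is not clang-based.
// __MINGW32__ is defined by both mingw-gcc and mingw-llvm; __clang__ is
// defined only by clang.
constexpr bool built_with_mingw_gcc() {
#if defined(__MINGW32__) && !defined(__clang__)
  return true;
#else
  return false;
#endif
}

// Hypothetical helper: a readable toolchain label, e.g. for a skip message.
inline std::string toolchain_name() {
#if defined(__MINGW32__) && defined(__clang__)
  return "mingw-llvm";
#elif defined(__MINGW32__)
  return "mingw-gcc";
#elif defined(__clang__)
  return "clang";
#elif defined(__GNUC__)
  return "gcc";
#else
  return "unknown";
#endif
}
```

A source file could wrap the code that triggers the ICE in `#if !(defined(__MINGW32__) && !defined(__clang__))`, although excluding the whole test target at build time avoids handing the file to the affected compiler at all.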

@petrutlucian94
Contributor Author

jenkins test windows

@idryomov
Contributor

jenkins test make check

@idryomov
Contributor

jenkins test windows

@idryomov
Contributor

The Windows check is repeatedly failing as follows:

Reading package lists...
Building dependency tree...
Reading state information...
dpkg-dev is already the newest version (1.21.1ubuntu2.2).
dpkg-dev set to manually installed.
0 upgraded, 0 newly installed, 0 to remove and 3 not upgraded.
504+ EXIT_CODE=22
+ [[ 22 -eq 124 ]]
+ return 22

real	1m6.647s
user	0m0.069s
sys	0m0.153s
Build step 'Execute shell' marked build as failure

@petrutlucian94
Contributor Author

@idryomov @adamemerson The new Boost shaman URL doesn't work; I reproduced the install-deps.sh failure locally:

244c5eb#diff-1335d27b188d672bf9a9b572f40df7c4699b8d9998fb16ba284f4acb4e77077c

sudo curl --silent --fail --write-out '%{http_code}' --location https://shaman.ceph.com/api/repos/libboost/master/2804368f5b807ba8334b0ccfeb8af191edeb996f/ubuntu/jammy/repo --output /etc/apt/sources.list.d/libboost.list
504

I don't see the new hash here: https://shaman.ceph.com/api/repos/libboost/master/

@idryomov
Contributor

I don't see the new hash here: https://shaman.ceph.com/api/repos/libboost/master/

I think this is supposed to come from https://github.com/ceph/ceph-boost and the last time this was done David and/or Kefu were involved. @adamemerson I see that you had made some changes to that repo as part of switching to 1.82, could you please follow up on the missing build?

@adamemerson
Contributor

I don't see the new hash here: https://shaman.ceph.com/api/repos/libboost/master/

I think this is supposed to come from https://github.com/ceph/ceph-boost and the last time this was done David and/or Kefu were involved. @adamemerson I see that you had made some changes to that repo as part of switching to 1.82, could you please follow up on the missing build?

Working on it. We had a build uploaded, I don't know why it went away. I'm talking to the infrastructure team right now.

@adamemerson
Contributor

I don't see the new hash here: https://shaman.ceph.com/api/repos/libboost/master/

I think this is supposed to come from https://github.com/ceph/ceph-boost and the last time this was done David and/or Kefu were involved. @adamemerson I see that you had made some changes to that repo as part of switching to 1.82, could you please follow up on the missing build?

Working on it. We had a build uploaded, I don't know why it went away. I'm talking to the infrastructure team right now.

The infrastructure team says that old builds are periodically removed; they said they'll have to look into how to protect them. I'm re-uploading.

@adamemerson
Contributor

jenkins test make check

@petrutlucian94
Contributor Author

jenkins test windows

@petrutlucian94
Contributor Author

jenkins test make check

@idryomov
Contributor

make check is impacted by an unrelated issue across PRs but the Windows check also failed:

[2023-07-18T07:40:30.000Z] [isolated][googletest] ceph_test_libcephfs failed. Error: Command returned non-zero code(3): "cmd /c 'C:\ceph\ceph_test_libcephfs.exe --gtest_output=xml:C:\workspace\test_results\out\ceph_test_libcephfs\ceph_test_libcephfs_results.xml --gtest_filter="*" >> C:\workspace\test_results\out\ceph_test_libcephfs\ceph_test_libcephfs_results.log 2>&1'".

@petrutlucian94
Contributor Author

petrutlucian94 commented Jul 18, 2023

make check is impacted by an unrelated issue across PRs but the Windows check also failed:

[2023-07-18T07:40:30.000Z] [isolated][googletest] ceph_test_libcephfs failed. Error: Command returned non-zero code(3): "cmd /c 'C:\ceph\ceph_test_libcephfs.exe --gtest_output=xml:C:\workspace\test_results\out\ceph_test_libcephfs\ceph_test_libcephfs_results.xml --gtest_filter="*" >> C:\workspace\test_results\out\ceph_test_libcephfs\ceph_test_libcephfs_results.log 2>&1'".

I'll trigger a recheck to see if it's a flaky test:

/home/ubuntu/ceph/src/test/libcephfs/deleg.cc:225: Failure
Expected equality of these values:
  opened.load()
    Which is: true
  false
terminate called without an active exception

LE: the test keeps failing, looking into it.

@petrutlucian94
Contributor Author

jenkins test windows

We're attempting to patch some Python files that have been removed
from recent Boost versions.

Now that the Boost version has been bumped, we'll need to address
this in order to unblock the Windows build.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>

We're getting an ICE when trying to compile this test using
mingw-gcc and recent Boost versions. Note that mingw-llvm works fine.

    during IPA pass: inline
    /mnt/data/workspace/ceph.pr/src/test/common/test_back_trace.cc:44:1:
    internal compiler error: Segmentation fault
       44 | }
          | ^
    0x7f9c4a86c51f ???
            ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
    0x7f9c4a853d8f __libc_start_call_main
            ../sysdeps/x86/libc-start.c:58
    0x7f9c4a853e3f __libc_start_main_impl
            ../sysdeps/nptl/libc_start_call_main.h:392

For now, we'll just skip the test when using mingw-gcc. Note
that we're planning to switch to mingw-llvm anyway:
ceph#51197

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>

@petrutlucian94
Contributor Author

petrutlucian94 commented Jul 18, 2023

ASSERT_EQ(opened.load(), false) occasionally fails when running LibCephFS.DelegMultiClient:

/home/ubuntu/ceph/src/test/libcephfs/deleg.cc:225: Failure
Expected equality of these values:
  opened.load()
    Which is: true
  false
terminate called without an active exception

The test performs an r/w open and expects the delegation to be recalled; so far so good. The problem is that it waits 1ms and then asserts that the opened flag hasn't been set, which is flaky. I have a few questions, maybe @jtlayton or @vshankar could help:

  • why are we asserting that opened is false after the delegation was recalled? I assume we're ensuring that the delegation is revoked before the open returns.
  • what's the purpose of the 1ms sleep in between? The test passes if we drop that line.

For what it's worth, we're running this test in a multi-node setup (well, separate VMs). It used to pass, so maybe the timing has changed slightly as a result of a recent commit. I guess we just have to update the assertions or maybe drop that usleep.

LE: I've added another commit which drops that particular usleep call, let me know if you're ok with it.
LLE: the job fails even without the usleep call (somehow worked locally). I've submitted a PR that skips the flaky tests for now, we need to unblock the Windows job as soon as possible: ceph/ceph-win32-tests#15.

// ensure r/o open does not break a r/o delegation
sprintf(filename, "deleg.rwro.%x", getpid());
ASSERT_EQ(ceph_ll_create(cmount, root, filename, 0666,
O_RDONLY|O_CREAT|O_EXCL, &file, &fh, &stx, 0, 0, perms), 0);
recalled.store(false);
ASSERT_EQ(ceph_ll_delegation_wait(cmount, fh, CEPH_DELEGATION_RD, dummy_deleg_cb, &recalled), 0);
std::thread breaker3(open_breaker_func, tcmount, filename, O_RDONLY, &opened);
breaker3.join();
ASSERT_EQ(recalled.load(), false);
// ensure that r/w open breaks r/o delegation
opened.store(false);
std::thread breaker4(open_breaker_func, tcmount, filename, O_WRONLY, &opened);
wait_for_atomic_bool(recalled);
usleep(1000);
ASSERT_EQ(opened.load(), false);
ASSERT_EQ(ceph_ll_delegation(cmount, fh, CEPH_DELEGATION_NONE, dummy_deleg_cb, &recalled), 0);
breaker4.join();
ASSERT_EQ(ceph_ll_close(cmount, fh), 0);
ASSERT_EQ(ceph_ll_unlink(cmount, root, filename, perms), 0);
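
Editor's sketch: the ordering that the `opened` assertion appears to rely on can be modelled without libcephfs. The model below encodes the assumption stated in the comment above, namely that the open cannot return until the delegation has been given back. `DelegationModel`, `Observed` and `run_model` are hypothetical names, and a condition variable stands in for the recall and return of the delegation. Under that assumption no sleep is needed and the check before the return is deterministic. The reported failure shows `opened` already true at that point, which suggests the assumption may not always hold in the real test.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a delegation: the "open" may only complete
// after the holder has returned the delegation.
struct DelegationModel {
  std::mutex m;
  std::condition_variable cv;
  bool recall_requested = false;  // set by the opener, read by the holder
  bool returned = false;          // set by the holder, read by the opener

  // Opener side: request the recall, then block until the holder returns it.
  void open_blocking(std::atomic<bool>& opened) {
    std::unique_lock<std::mutex> lk(m);
    recall_requested = true;
    cv.notify_all();
    cv.wait(lk, [this] { return returned; });
    opened.store(true);
  }

  // Holder side: wait until a recall has been requested.
  void wait_for_recall() {
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, [this] { return recall_requested; });
  }

  // Holder side: return the delegation and unblock the opener.
  void give_back() {
    {
      std::lock_guard<std::mutex> lk(m);
      returned = true;
    }
    cv.notify_all();
  }
};

// What the holder observed for `opened` at the two interesting points.
struct Observed {
  bool before_return;
  bool after_join;
};

inline Observed run_model() {
  DelegationModel d;
  std::atomic<bool> opened{false};
  std::thread opener([&] { d.open_blocking(opened); });
  d.wait_for_recall();
  bool before = opened.load();  // false: the opener is still blocked
  d.give_back();
  opener.join();
  bool after = opened.load();   // true: join orders the store before this read
  return {before, after};
}
```

Calling `run_model()` yields `before_return == false` and `after_join == true` on every run, because the store to `opened` can only happen after `give_back()`.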

@github-actions github-actions bot added the cephfs Ceph File System label Jul 18, 2023
@petrutlucian94
Contributor Author

The last Windows job failed because of git errors, issuing a recheck.

fatal: unable to access 'https://github.com/ceph/seastar.git/': The requested URL returned error: 500
fatal: clone of 'https://github.com/ceph/seastar.git' into submodule path '/home/ubuntu/ceph/src/seastar' failed
Failed to clone 'src/seastar' a second time, aborting
+ EXIT_CODE=1
+ [[ 1 -eq 124 ]]

@petrutlucian94
Contributor Author

jenkins test windows

@vshankar
Contributor

ASSERT_EQ(opened.load(), false) occasionally fails when running LibCephFS.DelegMultiClient:

/home/ubuntu/ceph/src/test/libcephfs/deleg.cc:225: Failure
Expected equality of these values:
  opened.load()
    Which is: true
  false
terminate called without an active exception

The test performs an r/w open and expects the delegation to be recalled; so far so good. The problem is that it waits 1ms and then asserts that the opened flag hasn't been set, which is flaky. I have a few questions, maybe @jtlayton or @vshankar could help:

  • why are we asserting that opened is false after the delegation was recalled? I assume we're ensuring that the delegation is revoked before the open returns.

Just before this assert there is a check

wait_for_atomic_bool(recalled);

which waits for the delegation to be recalled, and the test is probably assuming that the open is still in progress just after the recall. This is racy, especially since the earlier test does

// ensure r/o open breaks a r/w delegation
...
...
wait_for_atomic_bool(recalled);
ASSERT_EQ(opened.load(), false);
...
...
  • what's the purpose of the 1ms sleep in between? The test passes if we drop that line.

For what it's worth, we're running this test in a multi-node setup (well, separate VMs). It used to pass, so maybe the timing has changed slightly as a result of a recent commit. I guess we just have to update the assertions or maybe drop that usleep.

LE: I've added another commit which drops that particular usleep call, let me know if you're ok with it. LLE: the job fails even without the usleep call (somehow worked locally). I've submitted a PR that skips the flaky tests for now, we need to unblock the Windows job as soon as possible: ceph/ceph-win32-tests#15.

Maybe the check should be that opened is true. That's a racy check too, but we could first wait for the thread to be joined and then check whether the thread could successfully open() the file.

// ensure r/o open does not break a r/o delegation
sprintf(filename, "deleg.rwro.%x", getpid());
ASSERT_EQ(ceph_ll_create(cmount, root, filename, 0666,
O_RDONLY|O_CREAT|O_EXCL, &file, &fh, &stx, 0, 0, perms), 0);
recalled.store(false);
ASSERT_EQ(ceph_ll_delegation_wait(cmount, fh, CEPH_DELEGATION_RD, dummy_deleg_cb, &recalled), 0);
std::thread breaker3(open_breaker_func, tcmount, filename, O_RDONLY, &opened);
breaker3.join();
ASSERT_EQ(recalled.load(), false);
// ensure that r/w open breaks r/o delegation
opened.store(false);
std::thread breaker4(open_breaker_func, tcmount, filename, O_WRONLY, &opened);
wait_for_atomic_bool(recalled);
usleep(1000);
ASSERT_EQ(opened.load(), false);
ASSERT_EQ(ceph_ll_delegation(cmount, fh, CEPH_DELEGATION_NONE, dummy_deleg_cb, &recalled), 0);
breaker4.join();
ASSERT_EQ(ceph_ll_close(cmount, fh), 0);
ASSERT_EQ(ceph_ll_unlink(cmount, root, filename, perms), 0);
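
Editor's sketch: the join-then-check idea suggested above can be shown in isolation. `fake_open` is a hypothetical stand-in for `open_breaker_func` and does no real I/O; `opened_after_join` is likewise a made-up name. The point is only that reading the flag after `join()` is free of timing windows, because everything the thread did happens before `join()` returns.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for open_breaker_func: pretends the open succeeded
// and records that in the shared flag.
inline void fake_open(std::atomic<bool>* opened) {
  opened->store(true);
}

// Starts the opener, waits for it to finish, and only then reads the flag.
inline bool opened_after_join() {
  std::atomic<bool> opened{false};
  std::thread breaker(fake_open, &opened);
  breaker.join();        // the thread's store happens-before join() returns
  return opened.load();  // deterministic: no sleep, no race
}
```

In the excerpt above, `breaker4.join()` comes after the `ceph_ll_delegation(..., CEPH_DELEGATION_NONE, ...)` call, so an assertion on `opened` placed after that join would presumably be checked only once the opener has finished.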

@petrutlucian94
Contributor Author

jenkins test make check

@petrutlucian94
Contributor Author

jenkins test make check

@idryomov
Contributor

make check is blocked on https://tracker.ceph.com/issues/62082

@idryomov
Contributor

jenkins test make check

@idryomov idryomov merged commit cce5996 into ceph:main Jul 20, 2023
@vshankar vshankar mentioned this pull request Jul 21, 2023
14 tasks
@petrutlucian94
Contributor Author

petrutlucian94 commented Jul 21, 2023

Looks like we still have some racy libcephfs tests; I'm preparing a separate PR with the changes proposed by @vshankar.

/home/ubuntu/ceph/src/test/libcephfs/deleg.cc:392: Failure
Expected equality of these values:
  opened.load()
    Which is: true
  false

LE - submitted this: #52569.
