qa/workunits/rados: pull librados files from "main" instead of "master" #47642

Merged
ljflores merged 1 commit into ceph:main from ljflores:wip-librados-fix
Aug 23, 2022
@ljflores
Member

@ljflores ljflores commented Aug 16, 2022

Fixes: https://tracker.ceph.com/issues/57122
Signed-off-by: Laura Flores <lflores@redhat.com>

@ljflores ljflores requested a review from a team as a code owner August 16, 2022 18:06
@ljflores
Member Author

Testing the fix here on a main test branch that recently experienced this failure:
http://pulpito.front.sepia.ceph.com/lflores-2022-08-16_18:14:56-rados:singleton-nomsgr-wip-yuri4-testing-2022-08-15-0951-distro-default-smithi/

./teuthology/virtualenv/bin/teuthology-suite -v --suite-repo https://github.com/ljflores/ceph.git -c wip-yuri4-testing-2022-08-15-0951 -m smithi -s rados/singleton-nomsgr --suite-branch wip-librados-fix --filter-all "librados_hello_world" -p 70

@kamoltat
Member

jenkins test make check arm64

@ljflores
Member Author

On one of the tests (Ubuntu), we're still hitting this in the teuthology log:

2022-08-16T18:33:06.123 INFO:tasks.workunit.client.0.smithi153.stderr:+ ./hello_world_cpp -c /etc/ceph/ceph.conf
2022-08-16T18:33:06.136 INFO:tasks.workunit.client.0.smithi153.stdout:we just set up a rados cluster object
2022-08-16T18:33:06.138 INFO:tasks.workunit.client.0.smithi153.stdout:we just parsed our config options
2022-08-16T18:33:06.146 INFO:tasks.workunit.client.0.smithi153.stderr:free(): invalid pointer
2022-08-16T18:33:06.349 INFO:tasks.workunit.client.0.smithi153.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/rados/test_librados_build.sh: line 65: 16604 Aborted                 (core dumped) ./$b -c /etc/ceph/ceph.conf
2022-08-16T18:33:06.350 INFO:tasks.workunit.client.0.smithi153.stderr:+ cleanup
2022-08-16T18:33:06.350 INFO:tasks.workunit.client.0.smithi153.stderr:+ for f in $BINARIES$SOURCES
2022-08-16T18:33:06.351 INFO:tasks.workunit.client.0.smithi153.stderr:+ rm -f /home/ubuntu/cephtest/mnt.0/client.0/tmp/hello_world_c
2022-08-16T18:33:06.352 INFO:tasks.workunit.client.0.smithi153.stderr:+ for f in $BINARIES$SOURCES
2022-08-16T18:33:06.353 INFO:tasks.workunit.client.0.smithi153.stderr:+ rm -f /home/ubuntu/cephtest/mnt.0/client.0/tmp/hello_world_cpp
2022-08-16T18:33:06.353 INFO:tasks.workunit.client.0.smithi153.stderr:+ for f in $BINARIES$SOURCES
2022-08-16T18:33:06.354 INFO:tasks.workunit.client.0.smithi153.stderr:+ rm -f /home/ubuntu/cephtest/mnt.0/client.0/tmp/hello_radosstriper_cpp
2022-08-16T18:33:06.354 INFO:tasks.workunit.client.0.smithi153.stderr:+ for f in $BINARIES$SOURCES
2022-08-16T18:33:06.354 INFO:tasks.workunit.client.0.smithi153.stderr:+ rm -f /home/ubuntu/cephtest/mnt.0/client.0/tmp/hello_radosstriper.cc
2022-08-16T18:33:06.357 DEBUG:teuthology.orchestra.run:got remote process result: 134
2022-08-16T18:33:06.358 INFO:tasks.workunit.client.0.smithi153.stderr:+ for f in $BINARIES$SOURCES
2022-08-16T18:33:06.358 INFO:tasks.workunit.client.0.smithi153.stderr:+ rm -f /home/ubuntu/cephtest/mnt.0/client.0/tmp/hello_world_c.c
2022-08-16T18:33:06.358 INFO:tasks.workunit.client.0.smithi153.stderr:+ for f in $BINARIES$SOURCES
2022-08-16T18:33:06.358 INFO:tasks.workunit.client.0.smithi153.stderr:+ rm -f /home/ubuntu/cephtest/mnt.0/client.0/tmp/hello_world.cc
2022-08-16T18:33:06.359 INFO:tasks.workunit.client.0.smithi153.stderr:+ for f in $BINARIES$SOURCES
2022-08-16T18:33:06.359 INFO:tasks.workunit.client.0.smithi153.stderr:+ rm -f /home/ubuntu/cephtest/mnt.0/client.0/tmp/Makefile
2022-08-16T18:33:06.359 INFO:tasks.workunit:Stopping ['rados/test_librados_build.sh'] on client.0...

gdb shows:

# gdb hello_world_cpp /teuthology/lflores-2022-08-16_18:14:56-rados:singleton-nomsgr-wip-yuri4-testing-2022-08-15-0951-distro-default-smithi/6975743/remote/smithi153/coredump/1660674786.16604.core
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from hello_world_cpp...

warning: exec file is newer than core file.
[New LWP 16609]
[New LWP 16604]
[New LWP 16605]
[New LWP 16606]
[New LWP 16607]
[New LWP 16608]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./hello_world_cpp -c /etc/ceph/ceph.conf'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f67251d3700 (LWP 16609))]
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f672a072859 in __GI_abort () at abort.c:79
#2  0x00007f672a0dd26e in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f672a207298 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3  0x00007f672a0e52fc in malloc_printerr (str=str@entry=0x7f672a2054c1 "free(): invalid pointer") at malloc.c:5347
#4  0x00007f672a0e6b2c in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:4173
#5  0x00007f672a3198ca in std::locale::_Impl::~_Impl() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007f672a319b17 in std::locale::~locale() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007f672a4be848 in StackStringStream<4096ul>::~StackStringStream() () from /lib/libradosstriper.so.1
#8  0x00007f672967d2b9 in CachedStackStringStream::Cache::~Cache() () from /usr/lib/ceph/libceph-common.so.2
#9  0x00007f672a0972bf in __GI___call_tls_dtors () at cxa_thread_atexit_impl.c:155
#10 0x00007f67292d9617 in start_thread (arg=<optimized out>) at pthread_create.c:485
#11 0x00007f672a16f133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Somehow, a pointer is being misused: free() is called on an invalid pointer while the thread-local destructors run. This could be related to the recent C++20 changes.
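
For reference, the "got remote process result: 134" line in the log above is consistent with the SIGABRT that gdb reports: bash reports a process killed by a signal as 128 plus the signal number. A minimal sketch of that arithmetic follows; the function name `signal_from_status` is made up for illustration and is not part of teuthology or the workunit script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decode a shell exit status into a signal number.
# Statuses above 128 conventionally mean "killed by signal (status - 128)".
signal_from_status() {
    local status=$1
    if [ "$status" -gt 128 ]; then
        echo $((status - 128))
    else
        echo 0   # not a signal death
    fi
}

sig=$(signal_from_status 134)
echo "signal number: $sig"
# kill -l maps a signal number to its name (6 is ABRT on Linux)
echo "signal name: $(kill -l "$sig")"
```

This only decodes the status; it says nothing about why the abort happened, which is what the backtrace is for.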

@ljflores
Member Author

jenkins retest this please

@ljflores
Member Author

gdb also reveals:

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fb242eba859 in __GI_abort () at abort.c:79
#2  0x00007fb242f2526e in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7fb24304f298 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3  0x00007fb242f2d2fc in malloc_printerr (str=str@entry=0x7fb24304d4c1 "free(): invalid pointer") at malloc.c:5347
#4  0x00007fb242f2eb2c in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:4173
#5  0x00007fb2431618ca in char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007fb23e01a908 in ?? ()
#7  0x00007fb242e3d610 in ?? () from /usr/lib/ceph/libceph-common.so.2
#8  0x00007fb243161b17 in std::string::replace(unsigned long, unsigned long, char const*, unsigned long) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x0000000000000000 in ?? ()

@rzarzynski any ideas on how we can resolve this?

@Matan-B
Contributor

Matan-B commented Aug 18, 2022

I tested a hypothesis I had: when each test is run separately (hello_world_c and hello_world_cpp), no failure occurs.
hello_world_cpp: http://pulpito.front.sepia.ceph.com/matan-2022-08-18_09:22:44-rados:singleton-nomsgr-main-distro-default-smithi/
hello_world_c: http://pulpito.front.sepia.ceph.com/matan-2022-08-18_09:25:10-rados:singleton-nomsgr-main-distro-default-smithi/

@ljflores
Member Author

Ahh, interesting observation @Matan-B!

@ronen-fr
Contributor

@ljflores, @Matan-B : take a look at #46891 (comment)

It was always a dist-specific issue...

@ljflores
Member Author

ljflores commented Aug 18, 2022

Same test passes on quincy, when run on ubuntu:

The difference I'm seeing is that on main, we're installing librados-dev 17.0.0-14348-g370d960f-1focal:

2022-08-18T17:18:09.004 INFO:tasks.workunit.client.0.smithi078.stderr:++ source /etc/os-release
2022-08-18T17:18:09.004 INFO:tasks.workunit.client.0.smithi078.stderr:+++ NAME=Ubuntu
2022-08-18T17:18:09.005 INFO:tasks.workunit.client.0.smithi078.stderr:+++ VERSION='20.04.4 LTS (Focal Fossa)'
2022-08-18T17:18:09.006 INFO:tasks.workunit.client.0.smithi078.stderr:+++ ID=ubuntu
2022-08-18T17:18:09.006 INFO:tasks.workunit.client.0.smithi078.stderr:+++ ID_LIKE=debian
2022-08-18T17:18:09.006 INFO:tasks.workunit.client.0.smithi078.stderr:+++ PRETTY_NAME='Ubuntu 20.04.4 LTS'
2022-08-18T17:18:09.007 INFO:tasks.workunit.client.0.smithi078.stderr:+++ VERSION_ID=20.04
2022-08-18T17:18:09.007 INFO:tasks.workunit.client.0.smithi078.stderr:+++ HOME_URL=https://www.ubuntu.com/
2022-08-18T17:18:09.008 INFO:tasks.workunit.client.0.smithi078.stderr:+++ SUPPORT_URL=https://help.ubuntu.com/
2022-08-18T17:18:09.008 INFO:tasks.workunit.client.0.smithi078.stderr:+++ BUG_REPORT_URL=https://bugs.launchpad.net/ubuntu/
2022-08-18T17:18:09.009 INFO:tasks.workunit.client.0.smithi078.stderr:+++ PRIVACY_POLICY_URL=https://www.ubuntu.com/legal/terms-and-policies/privacy-policy
2022-08-18T17:18:09.009 INFO:tasks.workunit.client.0.smithi078.stderr:+++ VERSION_CODENAME=focal
2022-08-18T17:18:09.010 INFO:tasks.workunit.client.0.smithi078.stderr:+++ UBUNTU_CODENAME=focal
2022-08-18T17:18:09.010 INFO:tasks.workunit.client.0.smithi078.stderr:++ echo ubuntu
2022-08-18T17:18:09.011 INFO:tasks.workunit.client.0.smithi078.stderr:+ sudo env DEBIAN_FRONTEND=noninteractive apt-get install -y librados-dev
2022-08-18T17:18:09.067 INFO:tasks.workunit.client.0.smithi078.stdout:Reading package lists...
2022-08-18T17:18:09.221 INFO:tasks.workunit.client.0.smithi078.stdout:Building dependency tree...
2022-08-18T17:18:09.222 INFO:tasks.workunit.client.0.smithi078.stdout:Reading state information...
2022-08-18T17:18:09.363 INFO:tasks.workunit.client.0.smithi078.stdout:librados-dev is already the newest version (17.0.0-14348-g370d960f-1focal).
2022-08-18T17:18:09.363 INFO:tasks.workunit.client.0.smithi078.stdout:The following packages were automatically installed and are no longer required:
2022-08-18T17:18:09.364 INFO:tasks.workunit.client.0.smithi078.stdout:  libboost-iostreams1.71.0 libboost-thread1.71.0
2022-08-18T17:18:09.364 INFO:tasks.workunit.client.0.smithi078.stdout:Use 'sudo apt autoremove' to remove them.
2022-08-18T17:18:09.400 INFO:tasks.workunit.client.0.smithi078.stdout:0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
2022-08-18T17:18:09.401 INFO:tasks.workunit.client.0.smithi078.stderr:+ get_sources
2022-08-18T17:18:09.401 INFO:tasks.workunit.client.0.smithi078.stderr:+ for s in $SOURCES
2022-08-18T17:18:09.402 INFO:tasks.workunit.client.0.smithi078.stderr:+ curl --progress-bar --output hello_radosstriper.cc 'http://git.ceph.com/?p=ceph.git;a=blob_plain;hb=main;f=examples/librados/hello_radosstriper.cc'
2022-08-18T17:18:09.553 INFO:tasks.workunit.client.0.smithi078.stderr:#=#=#
2022-08-18T17:18:09.554 INFO:tasks.workunit.client.0.smithi078.stderr:+ for s in $SOURCES
2022-08-18T17:18:09.554 INFO:tasks.workunit.client.0.smithi078.stderr:+ curl --progress-bar --output hello_world_c.c 'http://git.ceph.com/?p=ceph.git;a=blob_plain;hb=main;f=examples/librados/hello_world_c.c'
2022-08-18T17:18:09.685 INFO:tasks.workunit.client.0.smithi078.stderr:#=#=#
2022-08-18T17:18:09.686 INFO:tasks.workunit.client.0.smithi078.stderr:+ for s in $SOURCES
2022-08-18T17:18:09.687 INFO:tasks.workunit.client.0.smithi078.stderr:+ curl --progress-bar --output hello_world.cc 'http://git.ceph.com/?p=ceph.git;a=blob_plain;hb=main;f=examples/librados/hello_world.cc'
2022-08-18T17:18:09.817 INFO:tasks.workunit.client.0.smithi078.stderr:#=#=#
2022-08-18T17:18:09.819 INFO:tasks.workunit.client.0.smithi078.stderr:+ for s in $SOURCES
2022-08-18T17:18:09.819 INFO:tasks.workunit.client.0.smithi078.stderr:+ curl --progress-bar --output Makefile 'http://git.ceph.com/?p=ceph.git;a=blob_plain;hb=main;f=examples/librados/Makefile'

While on quincy, we're installing librados-dev 17.2.3-482-g687c45bd-1focal:

2022-08-18T18:54:15.785 INFO:tasks.workunit.client.0.smithi182.stderr:++ source /etc/os-release
2022-08-18T18:54:15.785 INFO:tasks.workunit.client.0.smithi182.stderr:+++ NAME=Ubuntu
2022-08-18T18:54:15.786 INFO:tasks.workunit.client.0.smithi182.stderr:+++ VERSION='20.04.4 LTS (Focal Fossa)'
2022-08-18T18:54:15.786 INFO:tasks.workunit.client.0.smithi182.stderr:+++ ID=ubuntu
2022-08-18T18:54:15.787 INFO:tasks.workunit.client.0.smithi182.stderr:+++ ID_LIKE=debian
2022-08-18T18:54:15.787 INFO:tasks.workunit.client.0.smithi182.stderr:+++ PRETTY_NAME='Ubuntu 20.04.4 LTS'
2022-08-18T18:54:15.788 INFO:tasks.workunit.client.0.smithi182.stderr:+++ VERSION_ID=20.04
2022-08-18T18:54:15.788 INFO:tasks.workunit.client.0.smithi182.stderr:+++ HOME_URL=https://www.ubuntu.com/
2022-08-18T18:54:15.789 INFO:tasks.workunit.client.0.smithi182.stderr:+++ SUPPORT_URL=https://help.ubuntu.com/
2022-08-18T18:54:15.789 INFO:tasks.workunit.client.0.smithi182.stderr:+++ BUG_REPORT_URL=https://bugs.launchpad.net/ubuntu/
2022-08-18T18:54:15.790 INFO:tasks.workunit.client.0.smithi182.stderr:+++ PRIVACY_POLICY_URL=https://www.ubuntu.com/legal/terms-and-policies/privacy-policy
2022-08-18T18:54:15.790 INFO:tasks.workunit.client.0.smithi182.stderr:+++ VERSION_CODENAME=focal
2022-08-18T18:54:15.791 INFO:tasks.workunit.client.0.smithi182.stderr:+++ UBUNTU_CODENAME=focal
2022-08-18T18:54:15.791 INFO:tasks.workunit.client.0.smithi182.stderr:++ echo ubuntu
2022-08-18T18:54:15.791 INFO:tasks.workunit.client.0.smithi182.stderr:+ sudo env DEBIAN_FRONTEND=noninteractive apt-get install -y librados-dev
2022-08-18T18:54:15.848 INFO:tasks.workunit.client.0.smithi182.stdout:Reading package lists...
2022-08-18T18:54:16.026 INFO:tasks.workunit.client.0.smithi182.stdout:Building dependency tree...
2022-08-18T18:54:16.028 INFO:tasks.workunit.client.0.smithi182.stdout:Reading state information...
2022-08-18T18:54:16.171 INFO:tasks.workunit.client.0.smithi182.stdout:librados-dev is already the newest version (17.2.3-482-g687c45bd-1focal).
2022-08-18T18:54:16.172 INFO:tasks.workunit.client.0.smithi182.stdout:The following packages were automatically installed and are no longer required:
2022-08-18T18:54:16.172 INFO:tasks.workunit.client.0.smithi182.stdout:  libboost-iostreams1.71.0 libboost-thread1.71.0
2022-08-18T18:54:16.173 INFO:tasks.workunit.client.0.smithi182.stdout:Use 'sudo apt autoremove' to remove them.
2022-08-18T18:54:16.209 INFO:tasks.workunit.client.0.smithi182.stdout:0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
2022-08-18T18:54:16.211 INFO:tasks.workunit.client.0.smithi182.stderr:+ get_sources
2022-08-18T18:54:16.211 INFO:tasks.workunit.client.0.smithi182.stderr:+ for s in $SOURCES
2022-08-18T18:54:16.211 INFO:tasks.workunit.client.0.smithi182.stderr:+ curl --progress-bar --output hello_radosstriper.cc 'http://git.ceph.com/?p=ceph.git;a=blob_plain;hb=quincy;f=examples/librados/hello_radosstriper.cc'
2022-08-18T18:54:16.366 INFO:tasks.workunit.client.0.smithi182.stderr:#=#=#
2022-08-18T18:54:16.368 INFO:tasks.workunit.client.0.smithi182.stderr:+ for s in $SOURCES
2022-08-18T18:54:16.368 INFO:tasks.workunit.client.0.smithi182.stderr:+ curl --progress-bar --output hello_world_c.c 'http://git.ceph.com/?p=ceph.git;a=blob_plain;hb=quincy;f=examples/librados/hello_world_c.c'
2022-08-18T18:54:16.476 INFO:tasks.workunit.client.0.smithi182.stderr:
2022-08-18T18:54:16.478 INFO:tasks.workunit.client.0.smithi182.stderr:+ for s in $SOURCES
2022-08-18T18:54:16.478 INFO:tasks.workunit.client.0.smithi182.stderr:+ curl --progress-bar --output hello_world.cc 'http://git.ceph.com/?p=ceph.git;a=blob_plain;hb=quincy;f=examples/librados/hello_world.cc'
2022-08-18T18:54:16.589 INFO:tasks.workunit.client.0.smithi182.stderr:
2022-08-18T18:54:16.591 INFO:tasks.workunit.client.0.smithi182.stderr:+ for s in $SOURCES
2022-08-18T18:54:16.591 INFO:tasks.workunit.client.0.smithi182.stderr:+ curl --progress-bar --output Makefile 'http://git.ceph.com/?p=ceph.git;a=blob_plain;hb=quincy;f=examples/librados/Makefile'
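
The two runs above fetch the same example files and differ only in the `hb=` ref of the gitweb URL (`main` vs `quincy`). Here is a rough sketch of how such a URL can be assembled from a branch name, based only on the curl lines in these logs. The names `source_url` and `BRANCH` are illustrative assumptions, not the actual contents of test_librados_build.sh.

```shell
#!/usr/bin/env bash
# Illustrative only: rebuilds the download URLs seen in the teuthology logs.
# source_url and BRANCH are hypothetical names, not taken from the real script.
source_url() {
    local branch=$1 file=$2
    echo "http://git.ceph.com/?p=ceph.git;a=blob_plain;hb=${branch};f=examples/librados/${file}"
}

BRANCH=${BRANCH:-main}   # development builds pull from "main", not "master"
for s in hello_radosstriper.cc hello_world_c.c hello_world.cc Makefile; do
    # Print the URL instead of fetching it, so the sketch runs offline.
    # A real fetch would be:
    #   curl --progress-bar --output "$s" "$(source_url "$BRANCH" "$s")"
    echo "$s <- $(source_url "$BRANCH" "$s")"
done
```

Running it with `BRANCH=quincy` set in the environment prints the quincy variants of the same URLs.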

@ljflores
Member Author

ljflores commented Aug 18, 2022

Wow, I've looked into this for hours today and got nowhere except that I think this is a dependency issue. Here are some things I know for sure:

  1. The failure is "free(): invalid pointer" when the librados/hello_world.cc binary is run (./hello_world_cpp -c /etc/ceph/ceph.conf)
  2. This fails only on main, and only on ubuntu (quincy is working fine)
  3. Nothing has changed code-wise in librados/hello_world.cc between main and quincy

It would help if we could reproduce it locally, but when I tried running the test locally, it did not like parsing my vstart ceph.conf file. If anyone is able to reproduce locally, let me know.

One thing we could do in the meantime is merge this PR to address https://tracker.ceph.com/issues/57122, which will at least fix centos and rhel tests. And then address the "free(): invalid pointer" issue separately.

@Matan-B @ronen-fr what do you think?

(Btw Matan, I tried your suggestion of running one binary at a time, but the test still failed when run on Ubuntu).

@Matan-B
Contributor

Matan-B commented Aug 21, 2022

  1. Nothing has changed code-wise in librados/hello_world.cc between main and quincy

Related changes can also be found in the librados implementation (e.g., librados_cxx.cc, RadosClient.cc), with similar issues such as e465cfc.

It would help if we could reproduce it locally, but when I tried running the test locally, it did not like parsing my vstart ceph.conf file. If anyone is able to reproduce locally, let me know.

I tried reproducing locally but saw no errors. (Ping me if you need help with the vstart parsing)

One thing we could do in the meantime is merge this PR to address https://tracker.ceph.com/issues/57122, which will at least fix centos and rhel tests. And then address the "free(): invalid pointer" issue separately.

Since the free(): invalid pointer issue is not related to this change, I agree that we can address it separately.

Contributor

@ronen-fr ronen-fr left a comment

LGTM. We should follow up on that unrelated test failure, though.

@ronen-fr
Contributor

jenkins retest this please

@ljflores
Member Author

jenkins test api

@ljflores
Member Author

jenkins test make check

@ljflores
Member Author

jenkins test make check arm64

@Matan-B
Contributor

Matan-B commented Aug 22, 2022

jenkins test make check arm64

@Matan-B
Contributor

Matan-B commented Aug 22, 2022

jenkins test make check

@ljflores
Member Author

An instance of "free(): invalid pointer" was also recorded on a different librados test (see https://tracker.ceph.com/issues/57163#note-2), so we can safely say it is an entirely separate issue. I will move forward with merging this PR so we can focus on fixing that separately.

@ljflores ljflores merged commit 4cf097d into ceph:main Aug 23, 2022
@ljflores ljflores deleted the wip-librados-fix branch August 23, 2022 15:32
@ljflores ljflores restored the wip-librados-fix branch August 26, 2022 21:11