
Allow append with gap #2

Closed
xiaoxichen wants to merge 95 commits into liewegas:wip-newstore from xiaoxichen:patch-2

Conversation

@xiaoxichen

Allow append with gap

Currently we recover objects directly into position by deleting and then
overwriting the target object.  This means that we lose the object if we
are recovering in multiple steps and we fail partway through.

This is also the last user of collection_move(), which we would like to
deprecate.

Instead, generate a unique temp object name (pgid, object version, snap
is unique), and recover to that.  Use the existing temp object cleanup
machinery to throw out a partial recovery result.

Signed-off-by: Sage Weil <sage@redhat.com>
You will not be missed!

Signed-off-by: Sage Weil <sage@redhat.com>
Only call these if we have values to set on this iteration.  This is
rarely true for omap.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Previously, all temp objects had poolid == -1.  Instead, use -2 - poolid.
This, when combined with the PG hash, provides a unique temp namespace
per pg.

This has no impact on upgrade, since we delete all temp objects on startup
by collection (coll_t::is_temp()).

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
The temp objects have distinct pool ids.  Old temp objects are already
blown away on OSD restart.  This patch removes all the futzing with
temp_coll and puts the temp objects in the same collection as everything
else.

Interestingly, collection_move_rename now always uses the same source
and dest collection.  Hmm!

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
We are working with ghobject_t's; use that max!

Signed-off-by: Sage Weil <sage@redhat.com>
We are including pool in a (more) significant part of the sort and using
negative pool IDs; the default must be the most negative (min) so that
it is a useful starting point for a search.

Signed-off-by: Sage Weil <sage@redhat.com>
Make pool more significant than hash (and nspace, object, and everything
else).

This won't break anything because we only care about hobject_t sort order
within the context of a PG (or possibly split) where pool is always
fixed.

Signed-off-by: Sage Weil <sage@redhat.com>
Put the most significant fields to the left so that it matches the sort
order.  Also use unambiguous separator when the nspace is present
(like we do with the key).

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Go from (hobj, shard, gen) -> (max, shard, hobj, gen)

This makes the ghobject_t's sort in a way that groups them by PG, which
will likely be useful in the future for ObjectStore implementations.
Notably, we get a ghobject_t MAX value that is distinct from the
hobject_t one.

Signed-off-by: Sage Weil <sage@redhat.com>
The hash is no longer the most significant field; set everything that is
more significant, too.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Add a simple list test, and a second one that mixes different pool ids
into the same collection.  The latter confirms that we can deal with
ghobject_t's in the temp pool space within a single collection.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
These are no longer needed or called.

Signed-off-by: Sage Weil <sage@redhat.com>
Generate the temp collection for a pg collection.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Call out those creating collections with strings.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
The min ghobject has shard NO_SHARD and is the default-constructed value.
That initial value is assumed in countless places across the code base
where users do

  ghobject_t foo;
  foo.this = that;

such that changing it is dangerous.  It is safer to change the shard_id_t
sort order so that NO_SHARD compares as signed instead of unsigned.  The
value itself doesn't change (still 0xff), but the sorting does.  Note that
only a single comparison triggers a signed/unsigned warning from this
change; it assumes that the shard is not NO_SHARD (ec pool), and we cast
it to preserve the old behavior anyway.

In PGBackend we change the minimum value for the objects_list_partial()
method to start with NO_SHARD.

Signed-off-by: Sage Weil <sage@redhat.com>
liewegas pushed a commit that referenced this pull request Nov 4, 2016
liewegas pushed a commit that referenced this pull request Nov 18, 2016
Fix for:

CID 1297885 (#1 of 2): Result is not floating-point (UNINTENDED_INTEGER_DIVISION)
 integer_division: Dividing integer expressions g_conf->mon_pool_quota_warn_threshold
 and 100, and then converting the integer quotient to type float. Any remainder,
 or fractional part of the quotient, is ignored.

CID 1297885 (#2 of 2): Result is not floating-point (UNINTENDED_INTEGER_DIVISION)
 integer_division: Dividing integer expressions g_conf->mon_pool_quota_crit_threshold
 and 100, and then converting the integer quotient to type float. Any remainder,
 or fractional part of the quotient, is ignored.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
(cherry picked from commit be7e07a)
liewegas pushed a commit that referenced this pull request Feb 9, 2017
CID 1395784 (#2-1 of 2): Resource leak (RESOURCE_LEAK)
 leaked_storage: Variable state going out of scope
 leaks the storage it points to.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
liewegas pushed a commit that referenced this pull request Jun 26, 2017
…mageReplayer

Fixes the Coverity Scan Report:
CID 1412614 (#2-1 of 2): Uninitialized scalar field (UNINIT_CTOR)
7. uninit_member: Non-static class member m_do_resync is not initialized in this constructor nor in any functions that it calls.

Signed-off-by: Jos Collin <jcollin@redhat.com>
liewegas pushed a commit that referenced this pull request Aug 1, 2017
I'm seeing sporadic single thread deadlocks on fio stat_mutex during krbd
thrash runs:

  (gdb) info threads
    Id   Target Id         Frame
  * 1    Thread 0x7f89ee730740 (LWP 15604) 0x00007f89ed9f41bd in __lll_lock_wait () from /lib64/libpthread.so.0
  (gdb) bt
  #0  0x00007f89ed9f41bd in __lll_lock_wait () from /lib64/libpthread.so.0
  #1  0x00007f89ed9f17b2 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  #2  0x00000000004429b9 in fio_mutex_down (mutex=0x7f89ee72d000) at mutex.c:170
  #3  0x0000000000459704 in thread_main (data=<optimized out>) at backend.c:1639
  #4  0x000000000045b013 in fork_main (offset=0, shmid=<optimized out>, sk_out=0x0) at backend.c:1778
  #5  run_threads (sk_out=sk_out@entry=0x0) at backend.c:2195
  #6  0x000000000045b47f in fio_backend (sk_out=sk_out@entry=0x0) at backend.c:2400
  #7  0x000000000040cb0c in main (argc=2, argv=0x7fffad3e3888, envp=<optimized out>) at fio.c:63
  (gdb) up 2
  170                     pthread_cond_wait(&mutex->cond, &mutex->lock);
  (gdb) p mutex.lock.__data.__owner
  $1 = 15604

Upgrading to 2.21 seems to make these go away.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
liewegas pushed a commit that referenced this pull request Sep 27, 2017
** 1396154 Uninitialized pointer field
CID 1396154 (#1 of 1): Uninitialized pointer field (UNINIT_CTOR)
2. uninit_member: Non-static class member on_finish is not initialized
in this constructor nor in any functions that it calls.

** 1396158 Uninitialized pointer field
2. uninit_member: Non-static class member snap_id is not initialized
in this constructor nor in any functions that it calls.
4. uninit_member: Non-static class member force is not initialized
in this constructor nor in any functions that it calls.
CID 1396158 (#1 of 1): Uninitialized pointer field (UNINIT_CTOR)
6. uninit_member: Non-static class member on_finish is not initialized
in this constructor nor in any functions that it calls.

** 1399593 Uninitialized pointer field
2. uninit_member: Non-static class member locker is not initialized
in this constructor nor in any functions that it calls.
CID 1399593 (#2 of 2): Uninitialized pointer field (UNINIT_CTOR)
4. uninit_member: Non-static class member on_finish is not initialized
in this constructor nor in any functions that it calls.

Signed-off-by: Amit Kumar <amitkuma@redhat.com>
liewegas pushed a commit that referenced this pull request Nov 8, 2017
…letion

We have a race condition:

 1. RGW client #1: requests an object be deleted.
 2. RGW client #1: sends a prepare op to bucket index OSD #1.
 3. OSD #1:        prepares the op, adding pending ops to the bucket dir entry
 4. RGW client #2: sends a list bucket to OSD #1
 5. RGW client #2: sees that there are pending operations on bucket
                   dir entry, and calls check_disk_state
 6. RGW client #2: check_disk_state sees that the object still exists, so it
                   sends CEPH_RGW_UPDATE to bucket index OSD (#1)
 7. RGW client #1: sends a delete object to object OSD (#2)
 8. OSD #2:        deletes the object
 9. RGW client #2: sends a complete op to bucket index OSD (#1)
10. OSD #1:        completes the op
11. OSD #1:        receives the CEPH_RGW_UPDATE and updates the bucket index
                   entry, thereby **RECREATING** it

Solution implemented:

At step #5 the object's dir entry exists. If we get to the beginning of
step #11 and the object's dir entry no longer exists, we know that the
dir entry was actively being modified, so we ignore the
CEPH_RGW_UPDATE operation, thereby NOT recreating it.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
liewegas pushed a commit that referenced this pull request Nov 10, 2017
…letion

We have a race condition:

 1. RGW client #1: requests an object be deleted.
 2. RGW client #1: sends a prepare op to bucket index OSD #1.
 3. OSD #1:        prepares the op, adding pending ops to the bucket dir entry
 4. RGW client #2: sends a list bucket to OSD #1
 5. RGW client #2: sees that there are pending operations on bucket
                   dir entry, and calls check_disk_state
 6. RGW client #2: check_disk_state sees that the object still exists, so it
                   sends CEPH_RGW_UPDATE to bucket index OSD (#1)
 7. RGW client #1: sends a delete object to object OSD (#2)
 8. OSD #2:        deletes the object
 9. RGW client #2: sends a complete op to bucket index OSD (#1)
10. OSD #1:        completes the op
11. OSD #1:        receives the CEPH_RGW_UPDATE and updates the bucket index
                   entry, thereby **RECREATING** it

Solution implemented:

At step #5 the object's dir entry exists. If we get to the beginning of
step #11 and the object's dir entry no longer exists, we know that the
dir entry was actively being modified, so we ignore the
CEPH_RGW_UPDATE operation, thereby NOT recreating it.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit b33f529)
liewegas pushed a commit that referenced this pull request Nov 17, 2017
Fixes the coverity issue:

CID 1395303 (#2 of 2): Argument cannot be negative (NEGATIVE_RETURNS)
79. negative_returns: creat(pathname.c_str(), 384U) is passed
to a parameter that cannot be negative.

Signed-off-by: Amit Kumar <amitkuma@redhat.com>
liewegas pushed a commit that referenced this pull request Jul 6, 2018
…letion

We have a race condition:

 1. RGW client #1: requests an object be deleted.
 2. RGW client #1: sends a prepare op to bucket index OSD #1.
 3. OSD #1:        prepares the op, adding pending ops to the bucket dir entry
 4. RGW client #2: sends a list bucket to OSD #1
 5. RGW client #2: sees that there are pending operations on bucket
                   dir entry, and calls check_disk_state
 6. RGW client #2: check_disk_state sees that the object still exists, so it
                   sends CEPH_RGW_UPDATE to bucket index OSD (#1)
 7. RGW client #1: sends a delete object to object OSD (#2)
 8. OSD #2:        deletes the object
 9. RGW client #2: sends a complete op to bucket index OSD (#1)
10. OSD #1:        completes the op
11. OSD #1:        receives the CEPH_RGW_UPDATE and updates the bucket index
                   entry, thereby **RECREATING** it

Solution implemented:

At step #5 the object's dir entry exists. If we get to the beginning of
step #11 and the object's dir entry no longer exists, we know that the
dir entry was actively being modified, so we ignore the
CEPH_RGW_UPDATE operation, thereby NOT recreating it.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit b33f529)

Conflicts: (backported substantial changes only; omitted cleanups)
        src/cls/rgw/cls_rgw.cc
        src/rgw/rgw_rados.cc
liewegas added a commit that referenced this pull request Jun 19, 2019
I am seeing this trace, which matches except for the
'fun:_ZN15AsyncConnection7processEv' frame.

<error>
  <unique>0x2399</unique>
  <tid>11</tid>
  <threadname>msgr-worker-1</threadname>
  <kind>UninitCondition</kind>
  <what>Conditional jump or move depends on uninitialised value(s)</what>
  <stack>
    <frame>
      <ip>0x5366B18</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>ceph::crypto::onwire::AES128GCM_OnWireRxHandler::authenticated_decrypt_update_final(ceph::buffer::v14_2_0::list&amp;&amp;, unsigned int)</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>crypto_onwire.cc</file>
      <line>274</line>
    </frame>
    <frame>
      <ip>0x5355E60</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>ProtocolV2::handle_read_frame_epilogue_main(std::unique_ptr&lt;ceph::buffer::v14_2_0::ptr_node, ceph::buffer::v14_2_0::ptr_node::disposer&gt;&amp;&amp;, int)</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>ProtocolV2.cc</file>
      <line>1311</line>
    </frame>
    <frame>
      <ip>0x533E2A3</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>ProtocolV2::run_continuation(Ct&lt;ProtocolV2&gt;&amp;)</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>ProtocolV2.cc</file>
      <line>45</line>
    </frame>
    <frame>
      <ip>0x534FB1C</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>ProtocolV2::reuse_connection(boost::intrusive_ptr&lt;AsyncConnection&gt; const&amp;, ProtocolV2*)::{lambda(ConnectedSocket&amp;)#3}::operator()(ConnectedSocket&amp;)::{lambda()#2}::operator()()</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>ProtocolV2.cc</file>
      <line>2739</line>
    </frame>
    <frame>
      <ip>0x534FF57</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>ProtocolV2::reuse_connection(boost::intrusive_ptr&lt;AsyncConnection&gt; const&amp;, ProtocolV2*)::{lambda(ConnectedSocket&amp;)#3}::operator()(ConnectedSocket&amp;)</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>ProtocolV2.cc</file>
      <line>2745</line>
    </frame>
    <frame>
      <ip>0x535001E</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>__invoke_impl&lt;void, ProtocolV2::reuse_connection(const AsyncConnectionRef&amp;, ProtocolV2*)::&lt;lambda(ConnectedSocket&amp;)&gt;&amp;, ConnectedSocket&amp;&gt;</fn>
      <dir>/opt/rh/devtoolset-8/root/usr/include/c++/8/bits</dir>
      <file>invoke.h</file>
      <line>60</line>
    </frame>
    <frame>
      <ip>0x535001E</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>__invoke&lt;ProtocolV2::reuse_connection(const AsyncConnectionRef&amp;, ProtocolV2*)::&lt;lambda(ConnectedSocket&amp;)&gt;&amp;, ConnectedSocket&amp;&gt;</fn>
      <dir>/opt/rh/devtoolset-8/root/usr/include/c++/8/bits</dir>
      <file>invoke.h</file>
      <line>95</line>
    </frame>
    <frame>
      <ip>0x535001E</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>__call&lt;void, 0&gt;</fn>
      <dir>/opt/rh/devtoolset-8/root/usr/include/c++/8</dir>
      <file>functional</file>
      <line>400</line>
    </frame>
    <frame>
      <ip>0x535001E</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>operator()&lt;&gt;</fn>
      <dir>/opt/rh/devtoolset-8/root/usr/include/c++/8</dir>
      <file>functional</file>
      <line>484</line>
    </frame>
    <frame>
      <ip>0x535001E</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>EventCenter::C_submit_event&lt;std::_Bind&lt;ProtocolV2::reuse_connection(boost::intrusive_ptr&lt;AsyncConnection&gt; const&amp;, ProtocolV2*)::{lambda(ConnectedSocket&amp;)#3} (ConnectedSocket)&gt; &gt;::do_request(unsigned long)</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>Event.h</file>
      <line>227</line>
    </frame>
    <frame>
      <ip>0x535FCD6</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>EventCenter::process_events(unsigned int, std::chrono::duration&lt;unsigned long, std::ratio&lt;1l, 1000000000l&gt; &gt;*)</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>Event.cc</file>
      <line>441</line>
    </frame>
    <frame>
      <ip>0x5365086</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>operator()</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>Stack.cc</file>
      <line>53</line>
    </frame>
    <frame>
      <ip>0x5365086</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>std::_Function_handler&lt;void (), NetworkStack::add_thread(unsigned int)::{lambda()#1}&gt;::_M_invoke(std::_Any_data const&amp;)</fn>
      <dir>/opt/rh/devtoolset-8/root/usr/include/c++/8/bits</dir>
      <file>std_function.h</file>
      <line>297</line>
    </frame>
    <frame>
      <ip>0x55F519E</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>execute_native_thread_routine</fn>
    </frame>
    <frame>
      <ip>0x1076BDD4</ip>
      <obj>/usr/lib64/libpthread-2.17.so</obj>
      <fn>start_thread</fn>
    </frame>
    <frame>
      <ip>0x118E0EAC</ip>
      <obj>/usr/lib64/libc-2.17.so</obj>
      <fn>clone</fn>
    </frame>
  </stack>
</error>

Signed-off-by: Sage Weil <sage@redhat.com>
liewegas added a commit that referenced this pull request Jun 19, 2019
We just took the curmap ref above; do not call get_osdmap() again.

I think it may explain a weird segv I saw here in ~shared_ptr, although
I'm not quite certain.  Regardless, this change is correct and better.

(gdb) bt
#0  raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00005596e5a98261 in reraise_fatal (signum=11) at ./src/global/signal_handler.cc:326
#2  handle_fatal_signal(int) () at ./src/global/signal_handler.cc:326
#3  <signal handler called>
#4  0x00005596f4fe80e0 in ?? ()
#5  0x00005596e5464068 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x5596f4b7cf60) at /usr/include/c++/9/bits/shared_ptr_base.h:148
#6  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x5596f4b7cf60) at /usr/include/c++/9/bits/shared_ptr_base.h:148
#7  0x00005596e543377f in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7f2b25044e28, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:1169
#8  std::__shared_ptr<OSDMap const, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7f2b25044e20, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:1169
#9  std::shared_ptr<OSDMap const>::~shared_ptr (this=0x7f2b25044e20, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr.h:103
#10 OSD::handle_osd_ping(MOSDPing*) () at ./src/osd/OSD.cc:4662

Signed-off-by: Sage Weil <sage@redhat.com>
liewegas added a commit that referenced this pull request Jul 19, 2019
I am seeing this trace, which matches except for the
'fun:_ZN15AsyncConnection7processEv' frame.

<error>
  <unique>0x2399</unique>
  <tid>11</tid>
  <threadname>msgr-worker-1</threadname>
  <kind>UninitCondition</kind>
  <what>Conditional jump or move depends on uninitialised value(s)</what>
  <stack>
    <frame>
      <ip>0x5366B18</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>ceph::crypto::onwire::AES128GCM_OnWireRxHandler::authenticated_decrypt_update_final(ceph::buffer::v14_2_0::list&amp;&amp;, unsigned int)</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>crypto_onwire.cc</file>
      <line>274</line>
    </frame>
    <frame>
      <ip>0x5355E60</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>ProtocolV2::handle_read_frame_epilogue_main(std::unique_ptr&lt;ceph::buffer::v14_2_0::ptr_node, ceph::buffer::v14_2_0::ptr_node::disposer&gt;&amp;&amp;, int)</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>ProtocolV2.cc</file>
      <line>1311</line>
    </frame>
    <frame>
      <ip>0x533E2A3</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>ProtocolV2::run_continuation(Ct&lt;ProtocolV2&gt;&amp;)</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>ProtocolV2.cc</file>
      <line>45</line>
    </frame>
    <frame>
      <ip>0x534FB1C</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>ProtocolV2::reuse_connection(boost::intrusive_ptr&lt;AsyncConnection&gt; const&amp;, ProtocolV2*)::{lambda(ConnectedSocket&amp;)#3}::operator()(ConnectedSocket&amp;)::{lambda()#2}::operator()()</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>ProtocolV2.cc</file>
      <line>2739</line>
    </frame>
    <frame>
      <ip>0x534FF57</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>ProtocolV2::reuse_connection(boost::intrusive_ptr&lt;AsyncConnection&gt; const&amp;, ProtocolV2*)::{lambda(ConnectedSocket&amp;)#3}::operator()(ConnectedSocket&amp;)</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>ProtocolV2.cc</file>
      <line>2745</line>
    </frame>
    <frame>
      <ip>0x535001E</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>__invoke_impl&lt;void, ProtocolV2::reuse_connection(const AsyncConnectionRef&amp;, ProtocolV2*)::&lt;lambda(ConnectedSocket&amp;)&gt;&amp;, ConnectedSocket&amp;&gt;</fn>
      <dir>/opt/rh/devtoolset-8/root/usr/include/c++/8/bits</dir>
      <file>invoke.h</file>
      <line>60</line>
    </frame>
    <frame>
      <ip>0x535001E</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>__invoke&lt;ProtocolV2::reuse_connection(const AsyncConnectionRef&amp;, ProtocolV2*)::&lt;lambda(ConnectedSocket&amp;)&gt;&amp;, ConnectedSocket&amp;&gt;</fn>
      <dir>/opt/rh/devtoolset-8/root/usr/include/c++/8/bits</dir>
      <file>invoke.h</file>
      <line>95</line>
    </frame>
    <frame>
      <ip>0x535001E</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>__call&lt;void, 0&gt;</fn>
      <dir>/opt/rh/devtoolset-8/root/usr/include/c++/8</dir>
      <file>functional</file>
      <line>400</line>
    </frame>
    <frame>
      <ip>0x535001E</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>operator()&lt;&gt;</fn>
      <dir>/opt/rh/devtoolset-8/root/usr/include/c++/8</dir>
      <file>functional</file>
      <line>484</line>
    </frame>
    <frame>
      <ip>0x535001E</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>EventCenter::C_submit_event&lt;std::_Bind&lt;ProtocolV2::reuse_connection(boost::intrusive_ptr&lt;AsyncConnection&gt; const&amp;, ProtocolV2*)::{lambda(ConnectedSocket&amp;)#3} (ConnectedSocket)&gt; &gt;::do_request(unsigned long)</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>Event.h</file>
      <line>227</line>
    </frame>
    <frame>
      <ip>0x535FCD6</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>EventCenter::process_events(unsigned int, std::chrono::duration&lt;unsigned long, std::ratio&lt;1l, 1000000000l&gt; &gt;*)</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>Event.cc</file>
      <line>441</line>
    </frame>
    <frame>
      <ip>0x5365086</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>operator()</fn>
      <dir>/usr/src/debug/ceph-15.0.0-1717-g8d72af7/src/msg/async</dir>
      <file>Stack.cc</file>
      <line>53</line>
    </frame>
    <frame>
      <ip>0x5365086</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>std::_Function_handler&lt;void (), NetworkStack::add_thread(unsigned int)::{lambda()#1}&gt;::_M_invoke(std::_Any_data const&amp;)</fn>
      <dir>/opt/rh/devtoolset-8/root/usr/include/c++/8/bits</dir>
      <file>std_function.h</file>
      <line>297</line>
    </frame>
    <frame>
      <ip>0x55F519E</ip>
      <obj>/usr/lib64/ceph/libceph-common.so.0</obj>
      <fn>execute_native_thread_routine</fn>
    </frame>
    <frame>
      <ip>0x1076BDD4</ip>
      <obj>/usr/lib64/libpthread-2.17.so</obj>
      <fn>start_thread</fn>
    </frame>
    <frame>
      <ip>0x118E0EAC</ip>
      <obj>/usr/lib64/libc-2.17.so</obj>
      <fn>clone</fn>
    </frame>
  </stack>
</error>

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit f019fc0)
liewegas added a commit that referenced this pull request Jul 23, 2019
# This is the 1st commit message:

osd: add lease/readable tracking members, helpers

Signed-off-by: Sage Weil <sage@redhat.com>

# The commit message #2 will be skipped:

# fix ru logic
liewegas pushed a commit that referenced this pull request Jul 30, 2019
CID 174874 (#2 of 2): Dereference after null check (FORWARD_NULL)
30. var_deref_op: Dereference null pointer out2.

Signed-off-by: songweibin <song.weibin@zte.com.cn>
liewegas pushed a commit that referenced this pull request Feb 18, 2020
According to cppreference.com [1]:

  "If multiple threads of execution access the same std::shared_ptr
  object without synchronization and any of those accesses uses
  a non-const member function of shared_ptr then a data race will
  occur (...)"

[1]: https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic

One of the coredumps showed the `shared_ptr`-typed `OSD::osdmap`
with healthy looking content but damaged control block:

  ```
  [Current thread is 1 (Thread 0x7f7dcaf73700 (LWP 205295))]
  (gdb) bt
  #0  0x0000559cb81c3ea0 in ?? ()
  #1  0x0000559c97675b27 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148
  #2  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148
  #3  0x0000559c975ef8aa in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167
  #4  std::__shared_ptr<OSDMap const, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167
  #5  std::shared_ptr<OSDMap const>::~shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr.h:103
  #6  OSD::create_context (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9053
  #7  0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...)
      at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665
  #8  0x0000559c97886db6 in ceph::osd::scheduler::PGPeeringItem::run (this=<optimized out>, osd=<optimized out>, sdata=<optimized out>, pg=..., handle=...) at /usr/include/c++/8/ext/atomicity.h:96
  #9  0x0000559c9764862f in ceph::osd::scheduler::OpSchedulerItem::run (handle=..., pg=..., sdata=<optimized out>, osd=<optimized out>, this=0x7f7dcaf703f0) at /usr/include/c++/8/bits/unique_ptr.h:342
  #10 OSD::ShardedOpWQ::_process (this=<optimized out>, thread_index=<optimized out>, hb=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:10677
  #11 0x0000559c97c76094 in ShardedThreadPool::shardedthreadpool_worker (this=0x559ca22aca28, thread_index=14) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.cc:311
  #12 0x0000559c97c78cf4 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.h:706
  #13 0x00007f7df17852de in start_thread () from /lib64/libpthread.so.0
  #14 0x00007f7df052f133 in __libc_ifunc_impl_list () from /lib64/libc.so.6
  #15 0x0000000000000000 in ?? ()
  (gdb) frame 7
  #7  0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...)
      at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665
  9665      in /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc
  (gdb) print osdmap
  $24 = std::shared_ptr<const OSDMap> (expired, weak count 0) = {get() = 0x559cba028000}
  (gdb) print *osdmap
     # pretty sane OSDMap
  (gdb) print sizeof(osdmap)
  $26 = 16
  (gdb) x/2a &osdmap
  0x559ca22acef0:   0x559cba028000  0x559cba0ec900

  (gdb) frame 2
  #2  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148
  148       /usr/include/c++/8/bits/shared_ptr_base.h: No such file or directory.
  (gdb) disassemble
  Dump of assembler code for function std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release():
  ...
     0x0000559c97675b1e <+62>:      mov    (%rdi),%rax
     0x0000559c97675b21 <+65>:      mov    %rdi,%rbx
     0x0000559c97675b24 <+68>:      callq  *0x10(%rax)
  => 0x0000559c97675b27 <+71>:      test   %rbp,%rbp
  ...
  End of assembler dump.
  (gdb) info registers rdi rbx rax
  rdi            0x559cba0ec900      94131624790272
  rbx            0x559cba0ec900      94131624790272
  rax            0x559cba0ec8a0      94131624790176
  (gdb) x/a 0x559cba0ec8a0 + 0x10
  0x559cba0ec8b0:   0x559cb81c3ea0
  (gdb) bt
  #0  0x0000559cb81c3ea0 in ?? ()
  ...
  (gdb) p $_siginfo._sifields._sigfault.si_addr
  $27 = (void *) 0x559cb81c3ea0
  ```

Helgrind seems to agree:
  ```
  ==00:00:02:54.519 510301== Possible data race during write of size 8 at 0xF123930 by thread #90
  ==00:00:02:54.519 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8
  ==00:00:02:54.519 510301==    at 0x7218DD: operator= (shared_ptr_base.h:1078)
  ==00:00:02:54.519 510301==    by 0x7218DD: operator= (shared_ptr.h:103)
  ==00:00:02:54.519 510301==    by 0x7218DD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116)
  ==00:00:02:54.519 510301==    by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678)
  ==00:00:02:54.519 510301==    by 0x72A06C: Context::complete(int) (Context.h:77)
  ==00:00:02:54.519 510301==    by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66)
  ==00:00:02:54.519 510301==    by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
  ==00:00:02:54.519 510301==    by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
  ==00:00:02:54.519 510301==    by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so)
  ==00:00:02:54.519 510301==
  ==00:00:02:54.519 510301== This conflicts with a previous read of size 8 by thread #117
  ==00:00:02:54.519 510301== Locks held: 1, at address 0x2123E9A0
  ==00:00:02:54.519 510301==    at 0x6B5842: __shared_ptr (shared_ptr_base.h:1165)
  ==00:00:02:54.519 510301==    by 0x6B5842: shared_ptr (shared_ptr.h:129)
  ==00:00:02:54.519 510301==    by 0x6B5842: get_osdmap (OSD.h:1700)
  ==00:00:02:54.519 510301==    by 0x6B5842: OSD::create_context() (OSD.cc:9053)
  ==00:00:02:54.519 510301==    by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665)
  ==00:00:02:54.519 510301==    by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701)
  ==00:00:02:54.519 510301==    by 0x70E62E: run (OpSchedulerItem.h:148)
  ==00:00:02:54.519 510301==    by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677)
  ==00:00:02:54.519 510301==    by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311)
  ==00:00:02:54.519 510301==    by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706)
  ==00:00:02:54.519 510301==    by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
  ==00:00:02:54.519 510301==    by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
  ==00:00:02:54.519 510301==  Address 0xf123930 is 3,824 bytes inside a block of size 10,296 alloc'd
  ==00:00:02:54.519 510301==    at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433)
  ==00:00:02:54.519 510301==    by 0x66F766: main (ceph_osd.cc:688)
  ==00:00:02:54.519 510301==  Block was alloc'd by thread #1
  ```

Actually there are plenty of similar issues reported, like:
  ```
  ==00:00:05:04.903 510301== Possible data race during read of size 8 at 0x1E3E0588 by thread ceph#119
  ==00:00:05:04.903 510301== Locks held: 1, at address 0x1EAD41D0
  ==00:00:05:04.903 510301==    at 0x753165: clear (hashtable.h:2051)
  ==00:00:05:04.903 510301==    by 0x753165: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t>
  >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__deta
  il::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369)
  ==00:00:05:04.903 510301==    by 0x75331C: ~unordered_map (unordered_map.h:102)
  ==00:00:05:04.903 510301==    by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350)
  ==00:00:05:04.903 510301==    by 0x753606: operator() (shared_cache.hpp:100)
  ==00:00:05:04.903 510301==    by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr
  _base.h:471)
  ==00:00:05:04.903 510301==    by 0x73BB26: _M_release (shared_ptr_base.h:155)
  ==00:00:05:04.903 510301==    by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148)
  ==00:00:05:04.903 510301==    by 0x6B58A9: ~__shared_count (shared_ptr_base.h:728)
  ==00:00:05:04.903 510301==    by 0x6B58A9: ~__shared_ptr (shared_ptr_base.h:1167)
  ==00:00:05:04.903 510301==    by 0x6B58A9: ~shared_ptr (shared_ptr.h:103)
  ==00:00:05:04.903 510301==    by 0x6B58A9: OSD::create_context() (OSD.cc:9053)
  ==00:00:05:04.903 510301==    by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665)
  ==00:00:05:04.903 510301==    by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701)
  ==00:00:05:04.903 510301==    by 0x70E62E: run (OpSchedulerItem.h:148)
  ==00:00:05:04.903 510301==    by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677)
  ==00:00:05:04.903 510301==    by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311)
  ==00:00:05:04.903 510301==    by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706)
  ==00:00:05:04.903 510301==    by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
  ==00:00:05:04.903 510301==    by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
  ==00:00:05:04.903 510301==    by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so)
  ==00:00:05:04.903 510301==
  ==00:00:05:04.903 510301== This conflicts with a previous write of size 8 by thread ceph#90
  ==00:00:05:04.903 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8
  ==00:00:05:04.903 510301==    at 0x7531E1: clear (hashtable.h:2054)
  ==00:00:05:04.903 510301==    by 0x7531E1: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t> >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369)
  ==00:00:05:04.903 510301==    by 0x75331C: ~unordered_map (unordered_map.h:102)
  ==00:00:05:04.903 510301==    by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350)
  ==00:00:05:04.903 510301==    by 0x753606: operator() (shared_cache.hpp:100)
  ==00:00:05:04.903 510301==    by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr_base.h:471)
  ==00:00:05:04.903 510301==    by 0x73BB26: _M_release (shared_ptr_base.h:155)
  ==00:00:05:04.903 510301==    by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148)
  ==00:00:05:04.903 510301==    by 0x72191E: operator= (shared_ptr_base.h:747)
  ==00:00:05:04.903 510301==    by 0x72191E: operator= (shared_ptr_base.h:1078)
  ==00:00:05:04.903 510301==    by 0x72191E: operator= (shared_ptr.h:103)
  ==00:00:05:04.903 510301==    by 0x72191E: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116)
  ==00:00:05:04.903 510301==    by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678)
  ==00:00:05:04.903 510301==    by 0x72A06C: Context::complete(int) (Context.h:77)
  ==00:00:05:04.903 510301==    by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66)
  ==00:00:05:04.903 510301==  Address 0x1e3e0588 is 872 bytes inside a block of size 1,208 alloc'd
  ==00:00:05:04.903 510301==    at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433)
  ==00:00:05:04.903 510301==    by 0x6C7C0C: OSDService::try_get_map(unsigned int) (OSD.cc:1606)
  ==00:00:05:04.903 510301==    by 0x7213BD: get_map (OSD.h:699)
  ==00:00:05:04.903 510301==    by 0x7213BD: get_map (OSD.h:1732)
  ==00:00:05:04.903 510301==    by 0x7213BD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8076)
  ==00:00:05:04.903 510301==    by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678)
  ==00:00:05:04.903 510301==    by 0x72A06C: Context::complete(int) (Context.h:77)
  ==00:00:05:04.903 510301==    by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66)
  ==00:00:05:04.903 510301==    by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
  ==00:00:05:04.903 510301==    by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
  ==00:00:05:04.903 510301==    by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so)
  ```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
liewegas pushed a commit that referenced this pull request Mar 24, 2020
* no need to call discard_result(), as `output_stream::close()` already
  returns an empty future<>
* free the connected socket after the background task finishes, because:

we should not free the connected socket before the promise referencing it is fulfilled.

otherwise we get error messages from ASan, like

==287182==ERROR: AddressSanitizer: heap-use-after-free on address 0x611000019aa0 at pc 0x55e2ae2de882 bp 0x7fff7e2bf080 sp 0x7fff7e2bf078
READ of size 8 at 0x611000019aa0 thread T0
    #0 0x55e2ae2de881 in seastar::reactor_backend_aio::await_events(int, __sigset_t const*) ../src/seastar/src/core/reactor_backend.cc:396
    #1 0x55e2ae2dfb59 in seastar::reactor_backend_aio::reap_kernel_completions() ../src/seastar/src/core/reactor_backend.cc:428
    #2 0x55e2adbea397 in seastar::reactor::reap_kernel_completions_pollfn::poll() (/var/ssd/ceph/build/bin/crimson-osd+0x155e9397)
    #3 0x55e2adaec6d0 in seastar::reactor::poll_once() ../src/seastar/src/core/reactor.cc:2789
    #4 0x55e2adae7cf7 in operator() ../src/seastar/src/core/reactor.cc:2687
    #5 0x55e2adb7c595 in __invoke_impl<bool, seastar::reactor::run()::<lambda()>&> /usr/include/c++/10/bits/invoke.h:60
    #6 0x55e2adb699b0 in __invoke_r<bool, seastar::reactor::run()::<lambda()>&> /usr/include/c++/10/bits/invoke.h:113
    #7 0x55e2adb50222 in _M_invoke /usr/include/c++/10/bits/std_function.h:291
    #8 0x55e2adc2ba00 in std::function<bool ()>::operator()() const /usr/include/c++/10/bits/std_function.h:622
    #9 0x55e2adaea491 in seastar::reactor::run() ../src/seastar/src/core/reactor.cc:2713
    #10 0x55e2ad98f1c7 in seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) ../src/seastar/src/core/app-template.cc:199
    #11 0x55e2a9e57538 in main ../src/crimson/osd/main.cc:148
    #12 0x7fae7f20de0a in __libc_start_main ../csu/libc-start.c:308
    #13 0x55e2a9d431e9 in _start (/var/ssd/ceph/build/bin/crimson-osd+0x117421e9)

0x611000019aa0 is located 96 bytes inside of 240-byte region [0x611000019a40,0x611000019b30)
freed by thread T0 here:
    #0 0x7fae80a4e487 in operator delete(void*, unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.6+0xac487)
    #1 0x55e2ae302a0a in seastar::aio_pollable_fd_state::~aio_pollable_fd_state() ../src/seastar/src/core/reactor_backend.cc:458
    #2 0x55e2ae2e1059 in seastar::reactor_backend_aio::forget(seastar::pollable_fd_state&) ../src/seastar/src/core/reactor_backend.cc:524
    #3 0x55e2adab9b9a in seastar::pollable_fd_state::forget() ../src/seastar/src/core/reactor.cc:1396
    #4 0x55e2adab9d05 in seastar::intrusive_ptr_release(seastar::pollable_fd_state*) ../src/seastar/src/core/reactor.cc:1401
    #5 0x55e2ace1b72b in boost::intrusive_ptr<seastar::pollable_fd_state>::~intrusive_ptr() /opt/ceph/include/boost/smart_ptr/intrusive_ptr.hpp:98
    #6 0x55e2ace115a5 in seastar::pollable_fd::~pollable_fd() ../src/seastar/include/seastar/core/internal/pollable_fd.hh:109
    #7 0x55e2ae0ed35c in seastar::net::posix_server_socket_impl::~posix_server_socket_impl() ../src/seastar/include/seastar/net/posix-stack.hh:161
    #8 0x55e2ae0ed3cf in seastar::net::posix_server_socket_impl::~posix_server_socket_impl() ../src/seastar/include/seastar/net/posix-stack.hh:161
    #9 0x55e2ae0ed943 in std::default_delete<seastar::net::api_v2::server_socket_impl>::operator()(seastar::net::api_v2::server_socket_impl*) const /usr/include/c++/10/bits/unique_ptr.h:81
    #10 0x55e2ae0db357 in std::unique_ptr<seastar::net::api_v2::server_socket_impl, std::default_delete<seastar::net::api_v2::server_socket_impl> >::~unique_ptr() /usr/include/c++/10/bits/unique_ptr.h:357
    #11 0x55e2ae1438b7 in seastar::api_v2::server_socket::~server_socket() ../src/seastar/src/net/stack.cc:195
    #12 0x55e2aa1c7656 in std::_Optional_payload_base<seastar::api_v2::server_socket>::_M_destroy() /usr/include/c++/10/optional:260
    #13 0x55e2aa16c84b in std::_Optional_payload_base<seastar::api_v2::server_socket>::_M_reset() /usr/include/c++/10/optional:280
    #14 0x55e2ac24b2b7 in std::_Optional_base_impl<seastar::api_v2::server_socket, std::_Optional_base<seastar::api_v2::server_socket, false, false> >::_M_reset() /usr/include/c++/10/optional:432
    #15 0x55e2ac23f37b in std::optional<seastar::api_v2::server_socket>::reset() /usr/include/c++/10/optional:975
    #16 0x55e2ac21a2e7 in crimson::admin::AdminSocket::stop() ../src/crimson/admin/admin_socket.cc:265
    #17 0x55e2aa099825 in operator() ../src/crimson/osd/osd.cc:450
    #18 0x55e2aa0d4e3e in apply ../src/seastar/include/seastar/core/apply.hh:36

Signed-off-by: Kefu Chai <kchai@redhat.com>
liewegas pushed a commit that referenced this pull request Feb 19, 2021
According to cppreference.com [1]:

  "If multiple threads of execution access the same std::shared_ptr
  object without synchronization and any of those accesses uses
  a non-const member function of shared_ptr then a data race will
  occur (...)"

[1]: https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic

One of the coredumps showed the `shared_ptr`-typed `OSD::osdmap`
with healthy-looking content but a damaged control block:

  ```
  [Current thread is 1 (Thread 0x7f7dcaf73700 (LWP 205295))]
  (gdb) bt
  #0  0x0000559cb81c3ea0 in ?? ()
  #1  0x0000559c97675b27 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148
  #2  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148
  #3  0x0000559c975ef8aa in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167
  #4  std::__shared_ptr<OSDMap const, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167
  #5  std::shared_ptr<OSDMap const>::~shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr.h:103
  #6  OSD::create_context (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9053
  #7  0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...)
      at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665
  #8  0x0000559c97886db6 in ceph::osd::scheduler::PGPeeringItem::run (this=<optimized out>, osd=<optimized out>, sdata=<optimized out>, pg=..., handle=...) at /usr/include/c++/8/ext/atomicity.h:96
  #9  0x0000559c9764862f in ceph::osd::scheduler::OpSchedulerItem::run (handle=..., pg=..., sdata=<optimized out>, osd=<optimized out>, this=0x7f7dcaf703f0) at /usr/include/c++/8/bits/unique_ptr.h:342
  #10 OSD::ShardedOpWQ::_process (this=<optimized out>, thread_index=<optimized out>, hb=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:10677
  #11 0x0000559c97c76094 in ShardedThreadPool::shardedthreadpool_worker (this=0x559ca22aca28, thread_index=14) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.cc:311
  #12 0x0000559c97c78cf4 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.h:706
  #13 0x00007f7df17852de in start_thread () from /lib64/libpthread.so.0
  #14 0x00007f7df052f133 in __libc_ifunc_impl_list () from /lib64/libc.so.6
  #15 0x0000000000000000 in ?? ()
  (gdb) frame 7
  #7  0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...)
      at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665
  9665      in /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc
  (gdb) print osdmap
  $24 = std::shared_ptr<const OSDMap> (expired, weak count 0) = {get() = 0x559cba028000}
  (gdb) print *osdmap
     # pretty sane OSDMap
  (gdb) print sizeof(osdmap)
  $26 = 16
  (gdb) x/2a &osdmap
  0x559ca22acef0:   0x559cba028000  0x559cba0ec900

  (gdb) frame 2
  #2  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148
  148       /usr/include/c++/8/bits/shared_ptr_base.h: No such file or directory.
  (gdb) disassemble
  Dump of assembler code for function std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release():
  ...
     0x0000559c97675b1e <+62>:      mov    (%rdi),%rax
     0x0000559c97675b21 <+65>:      mov    %rdi,%rbx
     0x0000559c97675b24 <+68>:      callq  *0x10(%rax)
  => 0x0000559c97675b27 <+71>:      test   %rbp,%rbp
  ...
  End of assembler dump.
  (gdb) info registers rdi rbx rax
  rdi            0x559cba0ec900      94131624790272
  rbx            0x559cba0ec900      94131624790272
  rax            0x559cba0ec8a0      94131624790176
  (gdb) x/a 0x559cba0ec8a0 + 0x10
  0x559cba0ec8b0:   0x559cb81c3ea0
  (gdb) bt
  #0  0x0000559cb81c3ea0 in ?? ()
  ...
  (gdb) p $_siginfo._sifields._sigfault.si_addr
  $27 = (void *) 0x559cb81c3ea0
  ```

Helgrind seems to agree:
  ```
  ==00:00:02:54.519 510301== Possible data race during write of size 8 at 0xF123930 by thread ceph#90
  ==00:00:02:54.519 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8
  ==00:00:02:54.519 510301==    at 0x7218DD: operator= (shared_ptr_base.h:1078)
  ==00:00:02:54.519 510301==    by 0x7218DD: operator= (shared_ptr.h:103)
  ==00:00:02:54.519 510301==    by 0x7218DD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116)
  ==00:00:02:54.519 510301==    by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678)
  ==00:00:02:54.519 510301==    by 0x72A06C: Context::complete(int) (Context.h:77)
  ==00:00:02:54.519 510301==    by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66)
  ==00:00:02:54.519 510301==    by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
  ==00:00:02:54.519 510301==    by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
  ==00:00:02:54.519 510301==    by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so)
  ==00:00:02:54.519 510301==
  ==00:00:02:54.519 510301== This conflicts with a previous read of size 8 by thread ceph#117
  ==00:00:02:54.519 510301== Locks held: 1, at address 0x2123E9A0
  ==00:00:02:54.519 510301==    at 0x6B5842: __shared_ptr (shared_ptr_base.h:1165)
  ==00:00:02:54.519 510301==    by 0x6B5842: shared_ptr (shared_ptr.h:129)
  ==00:00:02:54.519 510301==    by 0x6B5842: get_osdmap (OSD.h:1700)
  ==00:00:02:54.519 510301==    by 0x6B5842: OSD::create_context() (OSD.cc:9053)
  ==00:00:02:54.519 510301==    by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665)
  ==00:00:02:54.519 510301==    by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701)
  ==00:00:02:54.519 510301==    by 0x70E62E: run (OpSchedulerItem.h:148)
  ==00:00:02:54.519 510301==    by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677)
  ==00:00:02:54.519 510301==    by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311)
  ==00:00:02:54.519 510301==    by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706)
  ==00:00:02:54.519 510301==    by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
  ==00:00:02:54.519 510301==    by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
  ==00:00:02:54.519 510301==  Address 0xf123930 is 3,824 bytes inside a block of size 10,296 alloc'd
  ==00:00:02:54.519 510301==    at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433)
  ==00:00:02:54.519 510301==    by 0x66F766: main (ceph_osd.cc:688)
  ==00:00:02:54.519 510301==  Block was alloc'd by thread #1
  ```

Actually there are plenty of similar issues reported, like:
  ```
  ==00:00:05:04.903 510301== Possible data race during read of size 8 at 0x1E3E0588 by thread ceph#119
  ==00:00:05:04.903 510301== Locks held: 1, at address 0x1EAD41D0
  ==00:00:05:04.903 510301==    at 0x753165: clear (hashtable.h:2051)
  ==00:00:05:04.903 510301==    by 0x753165: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t>
  >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__deta
  il::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369)
  ==00:00:05:04.903 510301==    by 0x75331C: ~unordered_map (unordered_map.h:102)
  ==00:00:05:04.903 510301==    by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350)
  ==00:00:05:04.903 510301==    by 0x753606: operator() (shared_cache.hpp:100)
  ==00:00:05:04.903 510301==    by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr
  _base.h:471)
  ==00:00:05:04.903 510301==    by 0x73BB26: _M_release (shared_ptr_base.h:155)
  ==00:00:05:04.903 510301==    by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148)
  ==00:00:05:04.903 510301==    by 0x6B58A9: ~__shared_count (shared_ptr_base.h:728)
  ==00:00:05:04.903 510301==    by 0x6B58A9: ~__shared_ptr (shared_ptr_base.h:1167)
  ==00:00:05:04.903 510301==    by 0x6B58A9: ~shared_ptr (shared_ptr.h:103)
  ==00:00:05:04.903 510301==    by 0x6B58A9: OSD::create_context() (OSD.cc:9053)
  ==00:00:05:04.903 510301==    by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665)
  ==00:00:05:04.903 510301==    by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701)
  ==00:00:05:04.903 510301==    by 0x70E62E: run (OpSchedulerItem.h:148)
  ==00:00:05:04.903 510301==    by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677)
  ==00:00:05:04.903 510301==    by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311)
  ==00:00:05:04.903 510301==    by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706)
  ==00:00:05:04.903 510301==    by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
  ==00:00:05:04.903 510301==    by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
  ==00:00:05:04.903 510301==    by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so)
  ==00:00:05:04.903 510301==
  ==00:00:05:04.903 510301== This conflicts with a previous write of size 8 by thread ceph#90
  ==00:00:05:04.903 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8
  ==00:00:05:04.903 510301==    at 0x7531E1: clear (hashtable.h:2054)
  ==00:00:05:04.903 510301==    by 0x7531E1: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t> >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369)
  ==00:00:05:04.903 510301==    by 0x75331C: ~unordered_map (unordered_map.h:102)
  ==00:00:05:04.903 510301==    by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350)
  ==00:00:05:04.903 510301==    by 0x753606: operator() (shared_cache.hpp:100)
  ==00:00:05:04.903 510301==    by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr_base.h:471)
  ==00:00:05:04.903 510301==    by 0x73BB26: _M_release (shared_ptr_base.h:155)
  ==00:00:05:04.903 510301==    by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148)
  ==00:00:05:04.903 510301==    by 0x72191E: operator= (shared_ptr_base.h:747)
  ==00:00:05:04.903 510301==    by 0x72191E: operator= (shared_ptr_base.h:1078)
  ==00:00:05:04.903 510301==    by 0x72191E: operator= (shared_ptr.h:103)
  ==00:00:05:04.903 510301==    by 0x72191E: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116)
  ==00:00:05:04.903 510301==    by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678)
  ==00:00:05:04.903 510301==    by 0x72A06C: Context::complete(int) (Context.h:77)
  ==00:00:05:04.903 510301==    by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66)
  ==00:00:05:04.903 510301==  Address 0x1e3e0588 is 872 bytes inside a block of size 1,208 alloc'd
  ==00:00:05:04.903 510301==    at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433)
  ==00:00:05:04.903 510301==    by 0x6C7C0C: OSDService::try_get_map(unsigned int) (OSD.cc:1606)
  ==00:00:05:04.903 510301==    by 0x7213BD: get_map (OSD.h:699)
  ==00:00:05:04.903 510301==    by 0x7213BD: get_map (OSD.h:1732)
  ==00:00:05:04.903 510301==    by 0x7213BD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8076)
  ==00:00:05:04.903 510301==    by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678)
  ==00:00:05:04.903 510301==    by 0x72A06C: Context::complete(int) (Context.h:77)
  ==00:00:05:04.903 510301==    by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66)
  ==00:00:05:04.903 510301==    by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
  ==00:00:05:04.903 510301==    by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
  ==00:00:05:04.903 510301==    by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so)
  ```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
(cherry picked from commit 80da5f9)

Conflicts:
	src/osd/OSD.cc in
		bool OSD::asok_command
		int OSD::shutdown
		void OSD::maybe_update_heartbeat_peers
		void OSD::_preboot
		void OSD::queue_want_up_thru
		void OSD::send_alive
		void OSD::send_failures
		void OSD::send_beacon
		MPGStats* OSD::collect_pg_stats
		void OSD::note_down_osd
		void OSD::consume_map
		void OSD::activate_map
	src/osd/OSD.h in
		private: dispatch_session_waiting

- also use the new const OSDMapRef in places that no longer exist in master
	src/osd/OSD.cc in
		void OSDService::share_map
		void OSDService::send_incremental_map
		int OSD::_do_command
		void OSD::note_up_osd
		int OSD::init_op_flags
	src/osd/OSD.h in
		void send_incremental_map
		void share_map
liewegas added a commit that referenced this pull request May 4, 2021
Otherwise, if we assert, we'll hang here:

Thread 1 (Thread 0x7f74eba79580 (LWP 1688617)):
#0  0x00007f74eb2aa529 in futex_wait (private=<optimized out>, expected=132, futex_word=0x7ffd642b4b54) at ../sysdeps/unix/sysv/linux/futex-internal.h:61
#1  futex_wait_simple (private=<optimized out>, expected=132, futex_word=0x7ffd642b4b54) at ../sysdeps/nptl/futex-internal.h:135
#2  __pthread_cond_destroy (cond=0x7ffd642b4b30) at pthread_cond_destroy.c:54

#3  0x0000563ff2e5a891 in LibRadosService_StatusFormat_Test::TestBody (this=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:78
#4  0x0000563ff2e9dc3a in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (location=0x563ff2ea72e4 "the test body", method=<optimized out>, object=0x563ff422a6d0)
    at ./src/googletest/googletest/src/gtest.cc:2605
#5  testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=object@entry=0x563ff422a6d0, method=<optimized out>, location=location@entry=0x563ff2ea72e4 "the test body")
    at ./src/googletest/googletest/src/gtest.cc:2641
#6  0x0000563ff2e908c3 in testing::Test::Run (this=0x563ff422a6d0) at ./src/googletest/googletest/src/gtest.cc:2680
#7  0x0000563ff2e90a25 in testing::TestInfo::Run (this=0x563ff41a3b70) at ./src/googletest/googletest/src/gtest.cc:2858
#8  0x0000563ff2e90ec1 in testing::TestSuite::Run (this=0x563ff41b6230) at ./src/googletest/googletest/src/gtest.cc:3012
#9  0x0000563ff2e92bdc in testing::internal::UnitTestImpl::RunAllTests (this=<optimized out>) at ./src/googletest/googletest/src/gtest.cc:5723
#10 0x0000563ff2e9e14a in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (location=0x563ff2ea8728 "auxiliary test code (environments or event listeners)",
    method=<optimized out>, object=0x563ff41a2d10) at ./src/googletest/googletest/src/gtest.cc:2605
#11 testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x563ff41a2d10, method=<optimized out>,
    location=location@entry=0x563ff2ea8728 "auxiliary test code (environments or event listeners)") at ./src/googletest/googletest/src/gtest.cc:2641
#12 0x0000563ff2e90ae8 in testing::UnitTest::Run (this=0x563ff30c0660 <testing::UnitTest::GetInstance()::instance>) at ./src/googletest/googletest/src/gtest.cc:5306

Signed-off-by: Sage Weil <sage@newdream.net>
liewegas added a commit that referenced this pull request May 5, 2021
Otherwise, if we assert, we'll hang here:

Thread 1 (Thread 0x7f74eba79580 (LWP 1688617)):
#0  0x00007f74eb2aa529 in futex_wait (private=<optimized out>, expected=132, futex_word=0x7ffd642b4b54) at ../sysdeps/unix/sysv/linux/futex-internal.h:61
#1  futex_wait_simple (private=<optimized out>, expected=132, futex_word=0x7ffd642b4b54) at ../sysdeps/nptl/futex-internal.h:135
#2  __pthread_cond_destroy (cond=0x7ffd642b4b30) at pthread_cond_destroy.c:54

#3  0x0000563ff2e5a891 in LibRadosService_StatusFormat_Test::TestBody (this=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:78
#4  0x0000563ff2e9dc3a in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (location=0x563ff2ea72e4 "the test body", method=<optimized out>, object=0x563ff422a6d0)
    at ./src/googletest/googletest/src/gtest.cc:2605
#5  testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=object@entry=0x563ff422a6d0, method=<optimized out>, location=location@entry=0x563ff2ea72e4 "the test body")
    at ./src/googletest/googletest/src/gtest.cc:2641
#6  0x0000563ff2e908c3 in testing::Test::Run (this=0x563ff422a6d0) at ./src/googletest/googletest/src/gtest.cc:2680
#7  0x0000563ff2e90a25 in testing::TestInfo::Run (this=0x563ff41a3b70) at ./src/googletest/googletest/src/gtest.cc:2858
#8  0x0000563ff2e90ec1 in testing::TestSuite::Run (this=0x563ff41b6230) at ./src/googletest/googletest/src/gtest.cc:3012
#9  0x0000563ff2e92bdc in testing::internal::UnitTestImpl::RunAllTests (this=<optimized out>) at ./src/googletest/googletest/src/gtest.cc:5723
#10 0x0000563ff2e9e14a in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (location=0x563ff2ea8728 "auxiliary test code (environments or event listeners)",
    method=<optimized out>, object=0x563ff41a2d10) at ./src/googletest/googletest/src/gtest.cc:2605
#11 testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x563ff41a2d10, method=<optimized out>,
    location=location@entry=0x563ff2ea8728 "auxiliary test code (environments or event listeners)") at ./src/googletest/googletest/src/gtest.cc:2641
#12 0x0000563ff2e90ae8 in testing::UnitTest::Run (this=0x563ff30c0660 <testing::UnitTest::GetInstance()::instance>) at ./src/googletest/googletest/src/gtest.cc:5306

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit ee5a0c9)
liewegas pushed a commit that referenced this pull request May 11, 2021
In 7f04700, we made the pg removal code
much more efficient. But it started marking the pgmeta object as an unexpected
onode, which in reality is expected to be removed after all the other objects.

This behavior is very easily reproducible in a vstart cluster:

ceph osd pool create test 1 1
rados -p test bench 10 write --no-cleanup
ceph osd pool delete test test  --yes-i-really-really-mean-it

Before this patch:

"do_delete_work additional unexpected onode list (new onodes has appeared
since PG removal started[#2:00000000::::head#]" seen in the OSD logs.

After this patch:

"do_delete_work removing pgmeta object #2:00000000::::head#" is seen.

Related to: https://tracker.ceph.com/issues/50466
Signed-off-by: Neha Ojha <nojha@redhat.com>
liewegas pushed a commit that referenced this pull request May 26, 2021
f7181ab optimized the client
parallelism. To achieve that, `PG::do_osd_ops()` was converted to
return, essentially, a future pair of futures. Unfortunately, the
lifetime management of `OpsExecuter` was kept intact. As a result,
the object was valid only until the outer future was fulfilled,
while, due to the `rollbacker` instances, it has to stay available
until `all_completed` becomes available.

This issue can explain the following problem observed in a
Teuthology job [1].

```
DEBUG 2021-05-20 08:03:22,617 [shard 0] osd - do_op_call: method returned ret=-17, outdata.length()=0 while num_read=1, num_write=0
DEBUG 2021-05-20 08:03:22,617 [shard 0] osd - rollback_obc_if_modified: object 19:e17d4708:test-rados-api-smithi095-38404-2::foo:head got erro
r generic:17, need_rollback=false
=================================================================
==33626==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d0000b9320 at pc 0x560f486b8222 bp 0x7fffc467a1e0 sp 0x7fffc467a1d0
READ of size 4 at 0x60d0000b9320 thread T0
    #0 0x560f486b8221  (/usr/bin/ceph-osd+0x2c610221)
    #1 0x560f4880c6b1 in seastar::continuation<seastar::internal::promise_base_with_type<boost::intrusive_ptr<MOSDOpReply> >, seastar::noncopy
able_function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_
wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > >
 > ()>, seastar::future<void>::then_impl_nrvo<seastar::noncopyable_function<crimson::interruptible::interruptible_future_detail<crimson::osd::
IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson:
:errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>, crimson::interruptible::interruptible_future_detail<crimson::osd::IOInte
rruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::error
ated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > >(seastar::noncopyable_function<crimson::interruptible::interruptible_future_detail
<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_
future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>&&)::{lambda(seastar::internal::promise_base_with_type<boos
t::intrusive_ptr<MOSDOpReply> >&&, seastar::noncopyable_function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>::run_and_dispose() (/usr/bin/ceph-osd+0x2c7646b1)
    #2 0x560f5352c3ae  (/usr/bin/ceph-osd+0x374843ae)
    #3 0x560f535318ef  (/usr/bin/ceph-osd+0x374898ef)
    #4 0x560f536e395a  (/usr/bin/ceph-osd+0x3763b95a)
    #5 0x560f532413d9  (/usr/bin/ceph-osd+0x371993d9)
    #6 0x560f476af95a in main (/usr/bin/ceph-osd+0x2b60795a)
    #7 0x7f7aa0af97b2 in __libc_start_main (/lib64/libc.so.6+0x237b2)
    #8 0x560f477d2e8d in _start (/usr/bin/ceph-osd+0x2b72ae8d)

```

[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-20_07:28:16-rados-master-distro-basic-smithi/6124735/

The commit deals with the problem by repacking the outer future.
An alternative would be switching from `std::unique_ptr` to
`seastar::shared_ptr` for managing `OpsExecuter`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
liewegas pushed a commit that referenced this pull request Jun 2, 2021
The `FuturizedStore` interface requires that `get_attr()` take
the `name` parameter as `std::string_view`, and thus burdens
implementations with extending the lifetime of the data the
view refers to.

Unfortunately, `AlienStore` overlooks that prolonging the life
of a `std::string_view` instance doesn't prolong the memory it
points to. This problem manifested in the following
use-after-free detected at Sepia:

```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136929$ less ./remote/smithi194/log/ceph-osd.7.log.gz
...
DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - do_osd_ops_execute: object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head - handling op
call
DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - handling op call on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - calling method lock.lock, num_read=0, num_write=0
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - handling op getxattr on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - getxattr on obj=14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head for attr=_lock.TestLockPP1
DEBUG 2021-05-26 20:24:54,078 [shard 0] bluestore - get_attr
=================================================================
==34068==ERROR: AddressSanitizer: heap-use-after-free on address 0x6030001851d0 at pc 0x7f824d6a5b27 bp 0x7f822b4201c0 sp 0x7f822b41f968
READ of size 17 at 0x6030001851d0 thread T28 (alien-store-tp)
...
    #0 0x7f824d6a5b26  (/lib64/libasan.so.5+0x40b26)
    #1 0x55e2cbb2e00b  (/usr/bin/ceph-osd+0x2b6dc00b)
    #2 0x55e2d31f086e  (/usr/bin/ceph-osd+0x32d9e86e)
    #3 0x55e2d3467607 in crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) (/usr/bin/ceph-osd+0x33015607)
    #4 0x55e2d346b14a  (/usr/bin/ceph-osd+0x3301914a)
    #5 0x7f8249d32ba2  (/lib64/libstdc++.so.6+0xc2ba2)
    #6 0x7f824a00d149 in start_thread (/lib64/libpthread.so.0+0x8149)
    #7 0x7f82486edf22 in clone (/lib64/libc.so.6+0xfcf22)

0x6030001851d0 is located 0 bytes inside of 31-byte region [0x6030001851d0,0x6030001851ef)
freed by thread T0 here:
    #0 0x7f824d757688 in operator delete(void*) (/lib64/libasan.so.5+0xf2688)

previously allocated by thread T0 here:
    #0 0x7f824d7567b0 in operator new(unsigned long) (/lib64/libasan.so.5+0xf17b0)

Thread T28 (alien-store-tp) created by T0 here:
    #0 0x7f824d6b7ea3 in __interceptor_pthread_create (/lib64/libasan.so.5+0x52ea3)

SUMMARY: AddressSanitizer: heap-use-after-free (/lib64/libasan.so.5+0x40b26)
Shadow bytes around the buggy address:
  0x0c06800289e0: fd fd fd fa fa fa fd fd fd fa fa fa 00 00 00 fa
  0x0c06800289f0: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
  0x0c0680028a00: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
  0x0c0680028a10: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
  0x0c0680028a20: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
=>0x0c0680028a30: fd fd fa fa fd fd fd fd fa fa[fd]fd fd fd fa fa
  0x0c0680028a40: fd fd fd fd fa fa fd fd fd fd fa fa 00 00 00 07
  0x0c0680028a50: fa fa 00 00 00 fa fa fa 00 00 00 fa fa fa fd fd
  0x0c0680028a60: fd fd fa fa fd fd fd fd fa fa fd fd fd fd fa fa
  0x0c0680028a70: 00 00 00 00 fa fa fd fd fd fd fa fa fd fd fd fd
  0x0c0680028a80: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==34068==ABORTING
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
liewegas pushed a commit that referenced this pull request Jun 8, 2021
This is AlienStore's counterpart of 9fbd601,
which became necessary because, since 6b66c30,
we call `open_collection()` to check for the existence of the OSD's superblock.

Without this fix the following crash happens:

```
[rzarzynski@o06 build]$ MDS=0 MGR=0 OSD=1 ../src/vstart.sh -l -n --without-dashboard --crimson
...
crimson-osd: boost/include/boost/smart_ptr/intrusive_ptr.hpp:199: T* boost::intrusive_ptr<T>::operator->() const [with T = ObjectStore::CollectionImpl]: Assertion `px != 0' failed.
Aborting on shard 0.
Backtrace:
 0# print_backtrace(std::basic_string_view<char, std::char_traits<char> >) at /home/rzarzynski/ceph1/build/boost/include/boost/stacktrace/stacktrace.hpp:129
 1# FatalSignal::signaled(int, siginfo_t const*) at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:80
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::operator()(int, siginfo_t*, void*) const at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:41
 3# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) at /home/rzarzynski/ceph1/build/../src/crimson/common/fatal_signal.cc:36
 4# 0x00007F8BF534DB30 in /lib64/libpthread.so.0
 5# gsignal in /lib64/libc.so.6
 6# abort in /lib64/libc.so.6
 7# 0x00007F8BF3948B19 in /lib64/libc.so.6
 8# 0x00007F8BF3956DF6 in /lib64/libc.so.6
 9# boost::intrusive_ptr<ObjectStore::CollectionImpl>::operator->() const at /home/rzarzynski/ceph1/build/boost/include/boost/smart_ptr/intrusive_ptr.hpp:200
10# crimson::os::AlienStore::open_collection(coll_t const&)::{lambda(boost::intrusive_ptr<ObjectStore::CollectionImpl>)#2}::operator()(boost::intrusive_ptr<ObjectStore::CollectionImpl>) const at /home/rzarzynski/ceph1/build/../src/crimson/os/alienstore/alien_store.cc:214
11# seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > seastar::futurize<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > >::invoke<crimson::os::AlienStore::open_collection(coll_t const&)::{lambda(boost::intrusive_ptr<ObjectStore::CollectionImpl>)#2}&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >(crimson::os::AlienStore::open_collection(coll_t const&)::{lambda(boost::intrusive_ptr<ObjectStore::CollectionImpl>)#2}&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:2135
12# auto seastar::futurize_invoke<crimson::os::AlienStore::open_collection(coll_t const&)::{lambda(boost::intrusive_ptr<ObjectStore::CollectionImpl>)#2}&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >(crimson::os::AlienStore::open_collection(coll_t const&)::{lambda(boost::intrusive_ptr<ObjectStore::CollectionImpl>)#2}&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:2167
13# _ZZN7seastar6futureIN5boost13intrusive_ptrIN11ObjectStore14CollectionImplEEEE4thenIZN7crimson2os10AlienStore15open_collectionERK6coll_tEUlS5_E0_NS0_INS2_INS9_19FuturizedCollectionEEEEEEET0_OT_ENUlDpOT_E_clIJS5_EEEDaSN_ at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:1526
14# _ZN7seastar20noncopyable_functionIFNS_6futureIN5boost13intrusive_ptrIN7crimson2os19FuturizedCollectionEEEEEONS3_IN11ObjectStore14CollectionImplEEEEE17direct_vtable_forIZNS1_ISB_E4thenIZNS5_10AlienStore15open_collectionERK6coll_tEUlSB_E0_S8_EET0_OT_EUlDpOT_E_E4callEPKSE_SC_ at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/util/noncopyable_function.hh:125
15# seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>::operator()(boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) const at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/util/noncopyable_function.hh:210
16# seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > std::__invoke_impl<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> >, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >(std::__invoke_other, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) at /opt/rh/gcc-toolset-9/root/usr/include/c++/9/bits/invoke.h:60
17# std::__invoke_result<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >::type std::__invoke<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) at /opt/rh/gcc-toolset-9/root/usr/include/c++/9/bits/invoke.h:97
18# std::invoke_result<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >::type std::invoke<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) at /opt/rh/gcc-toolset-9/root/usr/include/c++/9/functional:83
19# auto seastar::internal::future_invoke<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl> >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&&) at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:1213
20# seastar::future<boost::intrusive_ptr<ObjectStore::CollectionImpl> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>, seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&)#1}::operator()(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&) const::{lambda()#1}::operator()() const at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:1575
21# void seastar::futurize<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > >::satisfy_with_result_of<seastar::future<boost::intrusive_ptr<ObjectStore::CollectionImpl> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>, seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&)#1}::operator()(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&) const::{lambda()#1}>(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&&) at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:2120
22# seastar::future<boost::intrusive_ptr<ObjectStore::CollectionImpl> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>, seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&)#1}::operator()(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&) const at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:1571
23# seastar::continuation<seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>, seastar::future<boost::intrusive_ptr<ObjectStore::CollectionImpl> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>, seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > >(seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<boost::intrusive_ptr<crimson::os::FuturizedCollection> >&&, seastar::noncopyable_function<seastar::future<boost::intrusive_ptr<crimson::os::FuturizedCollection> > (boost::intrusive_ptr<ObjectStore::CollectionImpl>&&)>&, seastar::future_state<boost::intrusive_ptr<ObjectStore::CollectionImpl> >&&)#1}, boost::intrusive_ptr<ObjectStore::CollectionImpl> >::run_and_dispose() at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/core/future.hh:771
24# seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /home/rzarzynski/ceph1/build/../src/seastar/src/core/reactor.cc:2237
25# seastar::reactor::run_some_tasks() at /home/rzarzynski/ceph1/build/../src/seastar/src/core/reactor.cc:2646 (discriminator 1)
26# seastar::reactor::run() at /home/rzarzynski/ceph1/build/../src/seastar/src/core/reactor.cc:2805
27# seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at /home/rzarzynski/ceph1/build/../src/seastar/src/core/app-template.cc:207 (discriminator 7)
28# seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) at /home/rzarzynski/ceph1/build/../src/seastar/src/core/app-template.cc:115 (discriminator 2)
29# main at /home/rzarzynski/ceph1/build/../src/crimson/osd/main.cc:206 (discriminator 1)
30# __libc_start_main in /lib64/libc.so.6
31# _start in /home/rzarzynski/ceph1/build/bin/crimson-osd
Reactor stalled for 33157 ms on shard 0. Backtrace: 0xb14ab 0xa6cd418 0xa6a496d 0xa54fd22 0xa566e3d 0xa56383c 0xa5639fc 0xa566808 0x12b2f 0x3780e 0x21c44 0x21b18 0x2fdf5 0x7cfdf93 0x7b78622 0x7c3fcb7 0x7c1bcb2 0x7c1bdf1 0x7c4004a 0x7d913c7 0x7d875ed 0x7d777d7 0x7d5f626 0x7d47ca4 0x7d47b5a 0x7d5f770 0x7d47931 0x7d9bb74 0xa5998b5 0xa59e1ff 0xa5a3a95 0xa43546c 0xa433331 0x3f01992 0x237c2 0x3c7f9bd
../src/vstart.sh: line 28: 665993 Aborted                 (core dumped) PATH=$CEPH_BIN:$PATH "$@"
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
liewegas pushed a commit that referenced this pull request Jun 30, 2021
In 7f04700, we made the pg removal code
much more efficient. But it started marking the pgmeta object as an unexpected
onode, which in reality is expected to be removed after all the other objects.

This behavior is very easily reproducible in a vstart cluster:

```
ceph osd pool create test 1 1
rados -p test bench 10 write --no-cleanup
ceph osd pool delete test test --yes-i-really-really-mean-it
```

Before this patch:

"do_delete_work additional unexpected onode list (new onodes has appeared
since PG removal started[#2:00000000::::head#]" seen in the OSD logs.

After this patch:

"do_delete_work removing pgmeta object #2:00000000::::head#" is seen.

Related to: https://tracker.ceph.com/issues/50466
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 0e917f1)
liewegas pushed a commit that referenced this pull request Jul 28, 2021
In 7f04700, we made the pg removal code
much more efficient. But it started marking the pgmeta object as an unexpected
onode, which in reality is expected to be removed after all the other objects.

This behavior is very easily reproducible in a vstart cluster:

```
ceph osd pool create test 1 1
rados -p test bench 10 write --no-cleanup
ceph osd pool delete test test --yes-i-really-really-mean-it
```

Before this patch:

"do_delete_work additional unexpected onode list (new onodes has appeared
since PG removal started[#2:00000000::::head#]" seen in the OSD logs.

After this patch:

"do_delete_work removing pgmeta object #2:00000000::::head#" is seen.

Related to: https://tracker.ceph.com/issues/50466
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 0e917f1)
liewegas pushed a commit that referenced this pull request Sep 30, 2021
`PG::request_{local,remote}_recovery_reservation()` dynamically allocates
up to 2 instances of `LambdaContext<T>` and transfers their ownership to
the `AsyncReserver<T, F>`. This is expressed with raw pointers (`new`
and `delete`). Further analysis shows that the only place where `delete`
is called for these objects is `AsyncReserver::cancel_reservation()`.
In contrast to the classical OSD, crimson doesn't invoke that method
when stopping a PG during the shutdown sequence. This would explain the
following ASan issue observed at Sepia:

```
Direct leak of 576 byte(s) in 24 object(s) allocated from:
    #0 0x7fa108fc57b0 in operator new(unsigned long) (/lib64/libasan.so.5+0xf17b0)
    #1 0x55723d8b0b56 in non-virtual thunk to crimson::osd::PG::request_local_background_io_reservation(unsigned int, std::unique_ptr<PGPeeringEvent, std::default_delete<PGPeeringEvent> >, std::unique_ptr<PGPeeringEvent, std::default_delete<PGPeeringEvent> >) (/usr/bin/ceph-osd+0x24d95b56)
    #2 0x55723f1f66ef in PeeringState::WaitDeleteReserved::WaitDeleteReserved(boost::statechart::state<PeeringState::WaitDeleteReserved, PeeringState::ToDelete, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::my_context) (/usr/bin/ceph-osd+0x266db6ef)
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
alfonsomthd pushed a commit that referenced this pull request Oct 18, 2021
exit() calls pthread_cond_destroy() to destroy dpdk::eal::cond;
destroying a condition variable upon which other threads are currently
blocked results in undefined behavior. Linking against different libc
versions gives different outcomes: with libc-2.17 the process can exit,
while with libc-2.27 it deadlocks. The call stack is as follows:

Thread 3 (Thread 0xffff7e5749f0 (LWP 62213)):
 #0  0x0000ffff7f3c422c in futex_wait_cancelable (private=<optimized out>, expected=0,
    futex_word=0xaaaadc0e30f4 <dpdk::eal::cond+44>) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
 #1  __pthread_cond_wait_common (abstime=0x0, mutex=0xaaaadc0e30f8 <dpdk::eal::lock>, cond=0xaaaadc0e30c8 <dpdk::eal::cond>)
    at pthread_cond_wait.c:502
 #2  __pthread_cond_wait (cond=0xaaaadc0e30c8 <dpdk::eal::cond>, mutex=0xaaaadc0e30f8 <dpdk::eal::lock>)
    at pthread_cond_wait.c:655
 #3  0x0000ffff7f1f1f80 in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
   from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
 #4  0x0000aaaad37f5078 in dpdk::eal::<lambda()>::operator()(void) const (__closure=<optimized out>, __closure=<optimized out>)
    at ./src/msg/async/dpdk/dpdk_rte.cc:136
 #5  0x0000ffff7f1f7ed4 in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
 #6  0x0000ffff7f3be088 in start_thread (arg=0xffffe73e197f) at pthread_create.c:463
 #7  0x0000ffff7efc74ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Thread 1 (Thread 0xffff7ee3b010 (LWP 62200)):
 #0  0x0000ffff7f3c3c38 in futex_wait (private=<optimized out>, expected=12, futex_word=0xaaaadc0e30ec <dpdk::eal::cond+36>)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:61
 #1  futex_wait_simple (private=<optimized out>, expected=12, futex_word=0xaaaadc0e30ec <dpdk::eal::cond+36>)
    at ../sysdeps/nptl/futex-internal.h:135
 #2  __pthread_cond_destroy (cond=0xaaaadc0e30c8 <dpdk::eal::cond>) at pthread_cond_destroy.c:54
 #3  0x0000ffff7ef2be34 in __run_exit_handlers (status=-6, listp=0xffff7f04a5a0 <__exit_funcs>, run_list_atexit=255,
    run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
 #4  0x0000ffff7ef2bf6c in __GI_exit (status=<optimized out>) at exit.c:139
 #5  0x0000ffff7ef176e4 in __libc_start_main (main=0x0, argc=0, argv=0x0, init=<optimized out>, fini=<optimized out>,
    rtld_fini=<optimized out>, stack_end=<optimized out>) at ../csu/libc-start.c:344
 #6  0x0000aaaad2939db0 in _start () at ./src/include/buffer.h:642

Fixes: https://tracker.ceph.com/issues/42890
Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
Signed-off-by: luo rixin <luorixin@huawei.com>
liewegas pushed a commit that referenced this pull request Nov 5, 2021
Support wildcard subdomains
liewegas pushed a commit that referenced this pull request Nov 30, 2021
Initially, we assumed that no pointer obtained from SharedLRU
could outlive the LRU itself. However, since adopting the interruption
concept for handling shutdowns, this assumption no longer holds.

The patch is supposed to deal with crashes like the following one:

```
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8898-ge57ad63c/rpm/el8/BUILD/ceph-17.0.
0-8898-ge57ad63c/src/crimson/common/shared_lru.h:46: SharedLRU<K, V>::~SharedLRU() [with K = unsigned int; V = OSDMap]: Assertion `weak_refs.empty()' failed.
Aborting on shard 0.
Backtrace:
Reactor stalled for 1162 ms on shard 0. Backtrace: 0xb14ab 0x46e57428 0x46bc450d 0x46be03bd 0x46be0782 0x46be0946 0x46be0bf6 0x12b1f 0xc8e3b 0x3fdd77e2 0x3fddccdb 0x3fdde1ee 0x3fdde8b3 0x3fdd3f2b 0x3fdd4442 0x3f
dd4c3a 0x12b1f 0x3737e 0x21db4 0x21c88 0x2fa75 0x3a5ae1b9 0x3a38c5e2 0x3a0c823d 0x3a1771f1 0x3a1796f5 0x46ff92c9 0x46ff9525 0x46ff9e93 0x46ff8eae 0x46ff8bd9 0x3a160e67 0x39f50c83 0x39f51cd0 0x46b96271 0x46bde51a
 0x46d6891b 0x46d6a8f0 0x4681a7d2 0x4681f03b 0x39fd50f2 0x23492 0x39b7a7dd
 0# gsignal in /lib64/libc.so.6
 1# abort in /lib64/libc.so.6
 2# 0x00007F9535E04C89 in /lib64/libc.so.6
 3# 0x00007F9535E12A76 in /lib64/libc.so.6
 4# crimson::osd::OSD::~OSD() in ceph-osd
 5# seastar::shared_ptr_count_for<crimson::osd::OSD>::~shared_ptr_count_for() in ceph-osd
 6# seastar::shared_ptr<crimson::osd::OSD>::~shared_ptr() in ceph-osd
 7# seastar::futurize<std::result_of<seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) co
nst::{lambda()#1} ()>::type>::type seastar::smp::submit_to<seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::opera
tor()(unsigned int) const::{lambda()#1}>(unsigned int, seastar::smp_submit_to_options, seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}::operator()(seastar::future<void>) const::{la
mbda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&&) in ceph-osd
 8# std::_Function_handler<seastar::future<void> (unsigned int), seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}>
::_M_invoke(std::_Any_data const&, unsigned int&&) in ceph-osd
 9# 0x0000562DA18162CA in ceph-osd
10# 0x0000562DA1816526 in ceph-osd
11# 0x0000562DA1816E94 in ceph-osd
12# 0x0000562DA1815EAF in ceph-osd
13# 0x0000562DA1815BDA in ceph-osd
14# seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)>::direct_vtable_for<seastar::future<void>::then_wrapped_maybe_erase<true, seastar::future<void>, seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}>(seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}&&)::{lambda(seastar::future<void>&&)#1}>::call(seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)> const*, seastar::future<void>&&) in ceph-osd
15# 0x0000562D9476DC84 in ceph-osd
16# 0x0000562D9476ECD1 in ceph-osd
17# 0x0000562DA13B3272 in ceph-osd
18# 0x0000562DA13FB51B in ceph-osd
19# 0x0000562DA158591C in ceph-osd
20# 0x0000562DA15878F1 in ceph-osd
21# 0x0000562DA10377D3 in ceph-osd
22# 0x0000562DA103C03C in ceph-osd
23# main in ceph-osd
24# __libc_start_main in /lib64/libc.so.6
25# _start in ceph-osd
```

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>