Bug #66478

closed

crimson: crash during shutdown due to use-after-free in erasing OSDMap cache entry

Added by Samuel Just almost 2 years ago. Updated 5 months ago.

Status:
Resolved
Priority:
High
Category:
-
Target version:
-
% Done:

0%

Source:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Tags (freeform):
Fixed In:
v19.3.0-6707-g87648edee2
Released In:
v20.2.0~1405
Upkeep Timestamp:
2025-11-01T01:17:36+00:00

Description

kernel callstack:
    #0 0x45d733c in std::_Rb_tree<unsigned int, std::pair<unsigned int const, std::pair<boost::weak_ptr<OSDMap>, OSDMap*> >, std::_Select1st<std::pair<unsigned int const, std::pair<boost::weak_ptr<OSDMap>, OSDMap*> > >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, std::pair<boost::weak_ptr<OSDMap>, OSDMap*> > > >::equal_range(unsigned int const&) (/usr/bin/ceph-osd+0x45d733c) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #1 0x45d855a in std::_Rb_tree<unsigned int, std::pair<unsigned int const, std::pair<boost::weak_ptr<OSDMap>, OSDMap*> >, std::_Select1st<std::pair<unsigned int const, std::pair<boost::weak_ptr<OSDMap>, OSDMap*> > >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, std::pair<boost::weak_ptr<OSDMap>, OSDMap*> > > >::erase(unsigned int const&) (/usr/bin/ceph-osd+0x45d855a) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #2 0x45d88db in SharedLRU<unsigned int, OSDMap>::Deleter::operator()(OSDMap*) (/usr/bin/ceph-osd+0x45d88db) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #3 0x45dab6b in boost::detail::sp_counted_impl_pd<OSDMap*, boost::detail::local_sp_deleter<SharedLRU<unsigned int, OSDMap>::Deleter> >::dispose() (/usr/bin/ceph-osd+0x45dab6b) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #4 0x44c1cbc in boost::detail::sp_counted_base::release() (/usr/bin/ceph-osd+0x44c1cbc) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #5 0x44c1e5e in boost::detail::shared_count::~shared_count() (/usr/bin/ceph-osd+0x44c1e5e) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #6 0x44c2c14 in boost::detail::local_counted_impl_em::local_cb_destroy() (/usr/bin/ceph-osd+0x44c2c14) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #7 0x359690c in boost::detail::local_counted_base::release() (/usr/bin/ceph-osd+0x359690c) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #8 0x35a338b in boost::local_shared_ptr<OSDMap const>::~local_shared_ptr() (/usr/bin/ceph-osd+0x35a338b) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #9 0x35f7924 in boost::local_shared_ptr<OSDMap const>::operator=(decltype(nullptr)) (/usr/bin/ceph-osd+0x35f7924) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #10 0x36203b1 in seastar::foreign_ptr<boost::local_shared_ptr<OSDMap const> >::destroy_on(boost::local_shared_ptr<OSDMap const>, unsigned int) (/usr/bin/ceph-osd+0x36203b1) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #11 0x36204dc in seastar::foreign_ptr<boost::local_shared_ptr<OSDMap const> >::destroy(boost::local_shared_ptr<OSDMap const>, unsigned int) (/usr/bin/ceph-osd+0x36204dc) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #12 0x362071e in seastar::foreign_ptr<boost::local_shared_ptr<OSDMap const> >::~foreign_ptr() (/usr/bin/ceph-osd+0x362071e) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #13 0x36208d8 in seastar::internal::lw_shared_ptr_accessors_no_esft<seastar::foreign_ptr<boost::local_shared_ptr<OSDMap const> > >::dispose(seastar::lw_shared_ptr_counter_base*) (/usr/bin/ceph-osd+0x36208d8) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #14 0x3620ab3 in crimson::local_shared_foreign_ptr<boost::local_shared_ptr<OSDMap const> >::~local_shared_foreign_ptr() (/usr/bin/ceph-osd+0x3620ab3) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #15 0x3860e77 in crimson::osd::OSD::~OSD() (/usr/bin/ceph-osd+0x3860e77) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #16 0x364ad48 in main::{lambda()#1}::operator()() const::{lambda()#1}::operator()() const (/usr/bin/ceph-osd+0x364ad48) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #17 0x364c190 in seastar::future<int> seastar::futurize<int>::apply<main::{lambda()#1}::operator()() const::{lambda()#1}>(main::{lambda()#1}::operator()() const::{lambda()#1}&&, std::tuple<>&&) (/usr/bin/ceph-osd+0x364c190) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #18 0x364c483 in seastar::async<main::{lambda()#1}::operator()() const::{lambda()#1}>(seastar::thread_attributes, main::{lambda()#1}::operator()() const::{lambda()#1}&&)::{lambda()#1}::operator()() const (/usr/bin/ceph-osd+0x364c483) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #19 0x364c694 in seastar::noncopyable_function<void ()>::direct_vtable_for<seastar::async<main::{lambda()#1}::operator()() const::{lambda()#1}>(seastar::thread_attributes, main::{lambda()#1}::operator()() const::{lambda()#1}&&)::{lambda()#1}>::call(seastar::noncopyable_function<void ()> const*) (/usr/bin/ceph-osd+0x364c694) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #20 0xb737f0e in seastar::noncopyable_function<void ()>::operator()() const (/usr/bin/ceph-osd+0xb737f0e) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)
    #21 0xbc6dc3c in seastar::thread_context::main() (/usr/bin/ceph-osd+0xbc6dc3c) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)

0x61b000005548 is located 200 bytes inside of 1584-byte region [0x61b000005480,0x61b000005ab0)
freed by thread T0 here:
    #0 0x7fec2e6e0ea0 in operator delete(void*, unsigned long) (/lib64/libasan.so.8+0xe0ea0) (BuildId: e5a5081a368c8ab83d3c385d322e3c190cca7e1d)
    #1 0x3cffd13 in seastar::shared_ptr_count_for<crimson::osd::OSDSingletonState>::~shared_ptr_count_for() (/usr/bin/ceph-osd+0x3cffd13) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)

previously allocated by thread T0 here:
    #0 0x7fec2e6dfdf8 in operator new(unsigned long) (/lib64/libasan.so.8+0xdfdf8) (BuildId: e5a5081a368c8ab83d3c385d322e3c190cca7e1d)
    #1 0x3af65f8 in seastar::shared_ptr<crimson::osd::OSDSingletonState> seastar::shared_ptr_make_helper<crimson::osd::OSDSingletonState, false>::make<int const&, std::reference_wrapper<crimson::net::Messenger>, std::reference_wrapper<crimson::net::Messenger>, std::reference_wrapper<crimson::mon::Client>, std::reference_wrapper<crimson::mgr::Client> >(int const&, std::reference_wrapper<crimson::net::Messenger>&&, std::reference_wrapper<crimson::net::Messenger>&&, std::reference_wrapper<crimson::mon::Client>&&, std::reference_wrapper<crimson::mgr::Client>&&) (/usr/bin/ceph-osd+0x3af65f8) (BuildId: dcf093a77611281d02b9db353ab69f899978df23)

SUMMARY: AddressSanitizer: heap-use-after-free (/usr/bin/ceph-osd+0x45d733c) (BuildId: dcf093a77611281d02b9db353ab69f899978df23) in std::_Rb_tree<unsigned int, std::pair<unsigned int const, std::pair<boost::weak_ptr<OSDMap>, OSDMap*> >, std::_Select1st<std::pair<unsigned int const, std::pair<boost::weak_ptr<OSDMap>, OSDMap*> > >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, std::pair<boost::weak_ptr<OSDMap>, OSDMap*> > > >::equal_range(unsigned int const&)

sjust-2024-06-14_00:04:08-crimson-rados:thrash-wip-sjust-crimson-testing-2024-06-12-distro-default-smithi/7754907
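
Reading the frames above: ~OSD() drops an OSDMap reference, its SharedLRU deleter tries to erase the entry from the cache's map, and ASan reports that the map sits inside a region already freed along with OSDSingletonState. The following is a minimal, self-contained sketch of one way to make that kind of teardown order harmless, using only the standard library. TinySharedCache, Index and entry_outlives_cache are hypothetical names invented for illustration; this is not Ceph's SharedLRU and is not meant to describe how the actual fix was implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>

// Hypothetical, simplified stand-in for a shared cache (not Ceph's SharedLRU).
template <typename K, typename V>
class TinySharedCache {
  // Bookkeeping of live entries. It is owned through a shared_ptr so that
  // deleters can check, via a weak_ptr, whether the cache still exists.
  struct Index {
    std::map<K, std::weak_ptr<V>> live;
  };
  std::shared_ptr<Index> index = std::make_shared<Index>();

public:
  std::shared_ptr<V> insert(const K& key, std::unique_ptr<V> value) {
    std::weak_ptr<Index> weak_index = index;
    // The deleter captures a weak handle instead of a raw pointer to the cache.
    std::shared_ptr<V> ref(value.release(), [weak_index, key](V* p) {
      if (auto idx = weak_index.lock()) {
        idx->live.erase(key);  // cache still alive: drop the bookkeeping entry
      }
      delete p;                // the value itself is always freed
    });
    index->live[key] = ref;
    return ref;
  }

  std::size_t size() const { return index->live.size(); }
};

// Returns true if an entry can outlive its cache and still be released safely.
bool entry_outlives_cache() {
  std::shared_ptr<int> survivor;
  {
    TinySharedCache<unsigned, int> cache;
    survivor = cache.insert(1, std::make_unique<int>(42));
    auto temp = cache.insert(2, std::make_unique<int>(7));
    temp.reset();  // deleter runs while the cache is alive and erases key 2
    if (cache.size() != 1) {
      return false;
    }
  }  // cache destroyed here while `survivor` is still held
  const bool ok = (*survivor == 42);
  survivor.reset();  // deleter runs after the cache is gone and skips the erase
  return ok;
}
```

Under ASan, a variant of this sketch whose deleter captured a raw pointer to the cache would be expected to report the same kind of heap-use-after-free at the final `reset()`. The other general remedy is to release every outstanding reference before destroying the owner of the cache.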


Related issues 1 (0 open, 1 closed)

Related to crimson - Bug #64935: crimson: heap use after free during ~OSD() (Resolved, Samuel Just)

Actions #1

Updated by Laura Flores almost 2 years ago

  • Project changed from RADOS to crimson
Actions #2

Updated by Nitzan Mordechai over 1 year ago

/a/nmordech-2024-07-09_05:22:27-crimson-rados-wip-nitzan-wait-osd-admin-command-distro-default-smithi/7793384

Actions #3

Updated by Matan Breizman over 1 year ago

  • Related to Bug #64935: crimson: heap use after free during ~OSD() added
Actions #4

Updated by Matan Breizman over 1 year ago

  • Priority changed from Normal to High

Reproducible by stopping a vstart OSD without redirecting the output:

SUMMARY: AddressSanitizer: heap-use-after-free /usr/include/c++/11/bits/stl_tree.h:735 in std::_Rb_tree<unsigned int, std::pair<unsigned int const, std::pair<boost::weak_ptr<OSDMap>, OSDMap*> >, std::_Select1st<std::pair<unsigned int const, std::pair<boost::weak_ptr<OSDMap>, OSDMap*> > >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, std::pair<boost::weak_ptr<OSDMap>, OSDMap*> > > >::_M_mbegin() const

    #5 0x3ce9c34 in SharedLRU<unsigned int, OSDMap>::_erase_weak(unsigned int const&) /home/matan/ceph/src/crimson/common/shared_lru.h:37                                                                                                                     
    #6 0x3ce9c34 in SharedLRU<unsigned int, OSDMap>::Deleter::operator()(OSDMap*) /home/matan/ceph/src/crimson/common/shared_lru.h:32                                                                                                                         
    #7 0x3ce9c34 in void boost::detail::local_sp_deleter<SharedLRU<unsigned int, OSDMap>::Deleter>::operator()<OSDMap>(OSDMap*) /home/matan/ceph/build/boost/include/boost/smart_ptr/detail/local_sp_deleter.hpp:60                                           
    #8 0x3ce9c34 in boost::detail::sp_counted_impl_pd<OSDMap*, boost::detail::local_sp_deleter<SharedLRU<unsigned int, OSDMap>::Deleter> >::dispose() /home/matan/ceph/build/boost/include/boost/smart_ptr/detail/sp_counted_impl.hpp:179                     
    #9 0x3b552a3 in boost::detail::sp_counted_base::release() /home/matan/ceph/build/boost/include/boost/smart_ptr/detail/sp_counted_base_gcc_atomic.hpp:120                                                                                                  
    #10 0x3b552a3 in boost::detail::shared_count::~shared_count() /home/matan/ceph/build/boost/include/boost/smart_ptr/detail/shared_count.hpp:432                                                                                                            
    #11 0x3b552a3 in boost::detail::local_counted_impl_em::local_cb_destroy() /home/matan/ceph/build/boost/include/boost/smart_ptr/detail/local_counted_base.hpp:135                                                                                          
    #12 0x1ff2472 in boost::detail::local_counted_base::release() /home/matan/ceph/build/boost/include/boost/smart_ptr/detail/local_counted_base.hpp:82                                                                                                       
    #13 0x1ff2472 in boost::local_shared_ptr<OSDMap const>::~local_shared_ptr() /home/matan/ceph/build/boost/include/boost/smart_ptr/local_shared_ptr.hpp:134                                                                                                 
    #14 0x1ff2472 in boost::local_shared_ptr<OSDMap const>::operator=(decltype(nullptr)) /home/matan/ceph/build/boost/include/boost/smart_ptr/local_shared_ptr.hpp:357
    #15 0x1ff2472 in seastar::foreign_ptr<boost::local_shared_ptr<OSDMap const> >::destroy_on(boost::local_shared_ptr<OSDMap const>, unsigned int) /home/matan/ceph/src/seastar/include/seastar/core/sharded.hh:867                                           
    #16 0x1ff2472 in seastar::foreign_ptr<boost::local_shared_ptr<OSDMap const> >::destroy(boost::local_shared_ptr<OSDMap const>, unsigned int) /home/matan/ceph/src/seastar/include/seastar/core/sharded.hh:851                                              
    #17 0x1ff2472 in seastar::foreign_ptr<boost::local_shared_ptr<OSDMap const> >::~foreign_ptr() /home/matan/ceph/src/seastar/include/seastar/core/sharded.hh:895                                                                                            
    #18 0x1ff2472 in seastar::lw_shared_ptr_no_esft<seastar::foreign_ptr<boost::local_shared_ptr<OSDMap const> > >::~lw_shared_ptr_no_esft() /home/matan/ceph/src/seastar/include/seastar/core/shared_ptr.hh:170                                              
    #19 0x1ff2472 in seastar::internal::lw_shared_ptr_accessors_no_esft<seastar::foreign_ptr<boost::local_shared_ptr<OSDMap const> > >::dispose(seastar::lw_shared_ptr_counter_base*) /home/matan/ceph/src/seastar/include/seastar/core/shared_ptr.hh:226     
    #20 0x1ff2472 in seastar::lw_shared_ptr<seastar::foreign_ptr<boost::local_shared_ptr<OSDMap const> > >::~lw_shared_ptr() /home/matan/ceph/src/seastar/include/seastar/core/shared_ptr.hh:324                                                              
    #21 0x1ff2472 in crimson::local_shared_foreign_ptr<boost::local_shared_ptr<OSDMap const> >::~local_shared_foreign_ptr() /home/matan/ceph/src/crimson/common/local_shared_foreign_ptr.h:59                                                                 
    #22 0x24e0dbf in crimson::osd::OSD::~OSD() /home/matan/ceph/src/crimson/osd/osd.cc:136                                                                                                                                                                    
    #23 0x206560d in operator() /home/matan/ceph/src/crimson/osd/main.cc:244                                                                                                                                                                                  
    #24 0x2066f75 in __invoke_impl<int, main(int, char const**)::<lambda()>::<lambda()> > /usr/include/c++/11/bits/invoke.h:61                                                                                                                                
    #25 0x2066f75 in __invoke<main(int, char const**)::<lambda()>::<lambda()> > /usr/include/c++/11/bits/invoke.h:96                                                                                                                                          
    #26 0x2066f75 in __apply_impl<main(int, char const**)::<lambda()>::<lambda()>, std::tuple<> > /usr/include/c++/11/tuple:1868                                                                                                                              
    #27 0x2066f75 in apply<main(int, char const**)::<lambda()>::<lambda()>, std::tuple<> > /usr/include/c++/11/tuple:1879                                                                                                                                     
    #28 0x2066f75 in apply<main(int, char const**)::<lambda()>::<lambda()> > /home/matan/ceph/src/seastar/include/seastar/core/future.hh:2004                                                                                                                 
    #29 0x2066f75 in operator() /home/matan/ceph/src/seastar/include/seastar/core/thread.hh:260                                                                                                                                                               
    #30 0x2066f75 in call /home/matan/ceph/src/seastar/include/seastar/util/noncopyable_function.hh:129                                                                                                                                                       
    #31 0x12d39c7e in seastar::noncopyable_function<void ()>::operator()() const /home/matan/ceph/src/seastar/include/seastar/util/noncopyable_function.hh:215                                                                                                
    #32 0x12d39c7e in seastar::thread_context::main() /home/matan/ceph/src/seastar/src/core/thread.cc:311                                                                                                                                                     

0x61b000004048 is located 200 bytes inside of 1584-byte region [0x61b000003f80,0x61b0000045b0)                                                                                                                                                                
freed by thread T0 here:                                                                                                                                                                                                                                      
    #0 0x7f930d4b73cf in operator delete(void*, unsigned long) (/lib64/libasan.so.6+0xb73cf)                                                                                                                                                                  
    #1 0x2cf78be in seastar::shared_ptr_count_for<crimson::osd::OSDSingletonState>::~shared_ptr_count_for() (/home/matan/ceph/build/bin/crimson-osd+0x2cf78be)

Actions #5

Updated by Matan Breizman about 1 year ago

  • Status changed from New to Fix Under Review
  • Assignee changed from Samuel Just to Matan Breizman
  • Pull request ID set to 61213
Actions #6

Updated by Matan Breizman about 1 year ago

https://pulpito.ceph.com/matan-2025-01-07_14:57:06-crimson-rados-main-distro-crimson-smithi/8066282/

Tested with main (4b0d458ab0c155c0d167417730d2a3d59b161d98) before the fix was merged.

Actions #7

Updated by Matan Breizman 12 months ago

  • Status changed from Fix Under Review to Resolved
Actions #8

Updated by Upkeep Bot 8 months ago

  • Merge Commit set to 87648edee2ab3b21a0081cfecfa9380fdeac32f3
  • Fixed In set to v19.3.0-6707-g87648edee2a
  • Upkeep Timestamp set to 2025-07-10T23:38:37+00:00
Actions #9

Updated by Upkeep Bot 8 months ago

  • Fixed In changed from v19.3.0-6707-g87648edee2a to v19.3.0-6707-g87648edee2
  • Upkeep Timestamp changed from 2025-07-10T23:38:37+00:00 to 2025-07-14T22:43:01+00:00
Actions #10

Updated by Upkeep Bot 5 months ago

  • Released In set to v20.2.0~1405
  • Upkeep Timestamp changed from 2025-07-14T22:43:01+00:00 to 2025-11-01T01:17:36+00:00