Description
I hit a coredump when rebooting the system.
I think the cause is that m_timer (a member of class CrmOrch) is held by a smart pointer.
The SelectableTimer object that m_timer points to is also referenced by ExecutableTimer through a raw pointer (Executor::m_selectable).
When CrmOrch is destructed, m_timer is destructed first and releases the SelectableTimer object's memory. The base class member m_consumerMap is destructed afterwards, so ~Executor() tries to delete the same SelectableTimer object a second time.
I am not sure how the invalid config makes orchagent jump out of its while loop. Presumably an exception occurred.
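To make the destruction order above concrete, here is a minimal sketch of one possible way to avoid the double delete. It is not the actual swss code or its fix: every type and function name with a `Sketch` suffix, plus `destroy_and_record` and `g_events`, is hypothetical, and only the member names `m_timer`, `m_selectable` and `m_consumerMap` mirror the report. The assumption is that the executor could share ownership of the timer through `std::shared_ptr` instead of deleting a raw pointer.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the classes named in the report.
std::vector<std::string> g_events;  // records the order of destruction

struct TimerSketch {
    ~TimerSketch() { g_events.push_back("timer freed"); }
};

// Assumption: the executor shares ownership of the timer rather than
// holding a raw pointer that its destructor deletes.
struct ExecutorSketch {
    explicit ExecutorSketch(std::shared_ptr<TimerSketch> t)
        : m_selectable(std::move(t)) {}
    ~ExecutorSketch() { g_events.push_back("executor destroyed"); }
    std::shared_ptr<TimerSketch> m_selectable;
};

struct OrchSketch {
    ~OrchSketch() { g_events.push_back("base destructor body"); }
    std::map<std::string, std::shared_ptr<ExecutorSketch>> m_consumerMap;
};

struct CrmOrchSketch : OrchSketch {
    CrmOrchSketch() : m_timer(std::make_shared<TimerSketch>()) {
        m_consumerMap.emplace("CRM_TIMER",
                              std::make_shared<ExecutorSketch>(m_timer));
    }
    ~CrmOrchSketch() { g_events.push_back("derived destructor body"); }
    // Destroyed before the base class members, as in the report.
    std::shared_ptr<TimerSketch> m_timer;
};

// Builds and destroys one object, returning the recorded order.
std::vector<std::string> destroy_and_record() {
    g_events.clear();
    { CrmOrchSketch orch; }
    return g_events;
}
```

In this sketch the derived member m_timer is still destroyed before the base member m_consumerMap, but it only drops one reference. The timer is freed exactly once, when the last owner (the executor inside the map) goes away.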
Steps to reproduce the issue:
- Edit the configuration file and set multiple IP addresses on one loopback interface.
"LOOPBACK_INTERFACE": {
"Loopback2|101.101.101.101/32": {},
"Loopback2|101.101.1.1/32": {},
"Loopback2|101.101.1.2/32": {},
"Loopback3|101.101.1.3/32": {},
...
},
- Run the system with this invalid config.
- Reboot the system; a coredump file is produced.
Describe the results you received:
Describe the results you expected:
Additional information you deem important (e.g. issue happens only occasionally):
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/orchagent -d /var/log/swss -b 8192 -m 6c:ec:5a:08:18:67'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000000071 in ?? ()
(gdb) bt
#0 0x0000000000000071 in ?? ()
#1 0x000000000048e811 in ~Executor (this=0x1a8f330, __in_chrg=<optimized out>) at orch.h:77
#2 ~ExecutableTimer (this=0x1a8f330, __in_chrg=<optimized out>) at timer.h:8
#3 swss::ExecutableTimer::~ExecutableTimer (this=0x1a8f330, __in_chrg=<optimized out>) at timer.h:8
#4 0x000000000041d8ed in _M_dispose (this=0x1a8c380) at /usr/include/c++/4.9/bits/shared_ptr_base.h:373
#5 _M_release (this=0x1a8c380) at /usr/include/c++/4.9/bits/shared_ptr_base.h:149
#6 ~__shared_count (this=0x1a8dd80, __in_chrg=<optimized out>) at /usr/include/c++/4.9/bits/shared_ptr_base.h:666
#7 ~__shared_ptr (this=0x1a8dd78, __in_chrg=<optimized out>) at /usr/include/c++/4.9/bits/shared_ptr_base.h:914
#8 ~shared_ptr (this=0x1a8dd78, __in_chrg=<optimized out>) at /usr/include/c++/4.9/bits/shared_ptr.h:93
#9 ~pair (this=0x1a8dd70, __in_chrg=<optimized out>) at /usr/include/c++/4.9/bits/stl_pair.h:96
#10 destroy<std::pair<std::basic_string<char> const, std::shared_ptr<Executor> > > (this=<optimized out>, __p=0x1a8dd70) at /usr/include/c++/4.9/ext/new_allocator.h:124
#11 _S_destroy<std::pair<std::basic_string<char> const, std::shared_ptr<Executor> > > (__p=0x1a8dd70, __a=...) at /usr/include/c++/4.9/bits/alloc_traits.h:282
#12 destroy<std::pair<std::basic_string<char> const, std::shared_ptr<Executor> > > (__a=..., __p=0x1a8dd70) at /usr/include/c++/4.9/bits/alloc_traits.h:411
#13 _M_destroy_node (this=0x1a8b518, __p=0x1a8dd50) at /usr/include/c++/4.9/bits/stl_tree.h:436
#14 std::_Rb_tree<std::string, std::pair<std::string const, std::shared_ptr<Executor> >, std::_Select1st<std::pair<std::string const, std::shared_ptr<Executor> > >, std::less<std::string>, std::allocator<std::pair<std::string const, std::shared_ptr<Executor> > > >::_M_erase (this=this@entry=0x1a8b518, __x=0x1a8dd50)
at /usr/include/c++/4.9/bits/stl_tree.h:1247
#15 0x000000000041d8a1 in std::_Rb_tree<std::string, std::pair<std::string const, std::shared_ptr<Executor> >, std::_Select1st<std::pair<std::string const, std::shared_ptr<Executor> > >, std::less<std::string>, std::allocator<std::pair<std::string const, std::shared_ptr<Executor> > > >::_M_erase (this=0x1a8b518, __x=0x1a8abd0)
at /usr/include/c++/4.9/bits/stl_tree.h:1245
#16 0x00000000004b8dc7 in ~CrmOrch (this=0x1a8b510, __in_chrg=<optimized out>) at crmorch.h:37
#17 CrmOrch::~CrmOrch (this=0x1a8b510, __in_chrg=<optimized out>) at crmorch.h:37
#18 0x0000000000412e46 in OrchDaemon::~OrchDaemon (this=0x1a82900, __in_chrg=<optimized out>) at orchdaemon.cpp:43
#19 0x0000000000408c5e in main (argc=<optimized out>, argv=<optimized out>) at main.cpp:288
(gdb)