Bug #73822
open
Rocky10 - rados/verify - valgrind error: MismatchedFree operator delete[](void*, unsigned long, std::align_val_t) RocksDBStore::close() RocksDBStore::~RocksDBStore()
Description
/a/sjust-2025-11-11_04:48:46-rados-wip-rocky10-branch-of-the-day-2025-11-10-1762829866-distro-default-smithi/ ['8594563', '8594659', '8594510']
Please check, for example, /a/sjust-2025-11-11_04:48:46-rados-wip-rocky10-branch-of-the-day-2025-11-10-1762829866-distro-default-smithi/8594563/remote/smithi121/log/valgrind/osd.2.log.gz for the full valgrind error output.
<error>
<unique>0x675</unique>
<tid>1</tid>
<kind>MismatchedFree</kind>
<what>Mismatched free() / delete / delete []</what>
<stack>
<frame>
<ip>0x48485EC</ip>
<obj>/usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so</obj>
<fn>operator delete[](void*, unsigned long, std::align_val_t)</fn>
<dir>/builddir/build/BUILD/valgrind-3.24.0/coregrind/m_replacemalloc</dir>
<file>vg_replace_malloc.c</file>
<line>1504</line>
</frame>
<frame>
<ip>0x1050182</ip>
<obj>/usr/bin/ceph-osd</obj>
<fn>RocksDBStore::close()</fn>
<dir>/usr/src/debug/ceph-20.3.0-4047.gd005727a.el10.x86_64/src/kv</dir>
<file>RocksDBStore.cc</file>
<line>1330</line>
</frame>
<frame>
<ip>0x105021D</ip>
<obj>/usr/bin/ceph-osd</obj>
<fn>RocksDBStore::~RocksDBStore()</fn>
<dir>/usr/src/debug/ceph-20.3.0-4047.gd005727a.el10.x86_64/src/kv</dir>
<file>RocksDBStore.cc</file>
<line>1291</line>
</frame>
<frame>
<ip>0xB541B3</ip>
<obj>/usr/bin/ceph-osd</obj>
<fn>UnknownInlinedFun</fn>
<dir>/usr/src/debug/ceph-20.3.0-4047.gd005727a.el10.x86_64/src/kv</dir>
<file>RocksDBStore.cc</file>
<line>1295</line>
</frame>
<frame>
<ip>0xB541B3</ip>
<obj>/usr/bin/ceph-osd</obj>
<fn>BlueStore::_close_db()</fn>
<dir>/usr/src/debug/ceph-20.3.0-4047.gd005727a.el10.x86_64/src/os/bluestore</dir>
<file>BlueStore.cc</file>
<line>8248</line>
</frame>
<frame>
<ip>0xB56023</ip>
<obj>/usr/bin/ceph-osd</obj>
<fn>BlueStore::_open_db_and_around(bool, bool)</fn>
<dir>/usr/src/debug/ceph-20.3.0-4047.gd005727a.el10.x86_64/src/os/bluestore</dir>
<file>BlueStore.cc</file>
<line>7891</line>
</frame>
<frame>
<ip>0xB589DB</ip>
<obj>/usr/bin/ceph-osd</obj>
<fn>BlueStore::_mount()</fn>
<dir>/usr/src/debug/ceph-20.3.0-4047.gd005727a.el10.x86_64/src/os/bluestore</dir>
<file>BlueStore.cc</file>
<line>9426</line>
</frame>
Updated by Adam Kupczyk 4 months ago · Edited
I am unable to replicate the problem.
The DB is removed by a simple delete:
void RocksDBStore::close() {
...
delete db;
}
That translates into a virtual table call; I guess the slot at offset 0x18 is the destructor ("~"):
10201: 48 8b bd 88 00 00 00 mov 0x88(%rbp),%rdi
delete db;
10213: 48 85 ff test %rdi,%rdi
10216: 74 06 je 1021e <RocksDBStore::close()+0x28e>
10218: 48 8b 07 mov (%rdi),%rax
1021b: ff 50 18 callq *0x18(%rax)
The code in DBImplReadOnly::~DBImplReadOnly() is:
74: e9 00 00 00 00 jmpq 79 <_ZN7rocksdb14DBImplReadOnlyD0Ev+0x29>
75: R_X86_64_PLT32 _ZdlPvmSt11align_val_t-0x4
Note: because this is a jmpq (a tail call) rather than a call, ~DBImplReadOnly does not appear in the valgrind callstack.
$ echo _ZdlPvmSt11align_val_t | c++filt
operator delete(void*, unsigned long, std::align_val_t)
I tried this with gcc toolset 11 and gcc toolset 13.3.1-2 (I think the latter is what the builder uses).
BUT:
valgrind callstack (suppression proposal part) clearly has:
$ echo _ZdaPvmSt11align_val_t | c++filt
operator delete[](void*, unsigned long, std::align_val_t)
I suspect that the compiler somehow emitted an invalid call to `delete[]` instead of `delete`.
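The two c++filt lookups above can also be reproduced from C++ when comparing many symbols from a suppression proposal. This is a minimal sketch, assuming a toolchain that ships `<cxxabi.h>` (GCC or Clang on Linux); the `demangle` helper name is hypothetical and not part of Ceph.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangle an Itanium C++ ABI symbol name.
// Returns the input unchanged if the name cannot be demangled.
std::string demangle(const char* mangled) {
  assert(mangled != nullptr);
  int status = 0;
  // With null buffer arguments, __cxa_demangle returns a malloc()ed string.
  char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || out == nullptr) {
    return mangled;
  }
  std::string result(out);
  std::free(out);
  return result;
}
```

Calling `demangle("_ZdlPvmSt11align_val_t")` and `demangle("_ZdaPvmSt11align_val_t")` should give the same scalar and array `operator delete` signatures that c++filt printed; the two mangled names differ only in `dl` versus `da`.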
Updated by Radoslaw Zarzynski 4 months ago · Edited
@Adam Kupczyk: How about suppressing it?
Updated by Nitzan Mordechai 4 months ago
I added some comparisons from centos9 and rocky10 rpms:
Centos9 shows:
/opt/rh/gcc-toolset-13
GLIBCXX_3.4.29 - GCC 13
Rocky10:
GLIBCXX_3.4.32 - GCC 14
Compare the delete:
Rocky 10 (0xf48166):
f48166: mov 0x88(%rbp),%rdi
f48178: test %rdi,%rdi
f4817d: mov (%rdi),%rax
f48180: call *0x18(%rax)
CentOS 9 (0xf8d616):
f8d616: mov 0x88(%rbp),%rdi
f8d628: test %rdi,%rdi
f8d62d: mov (%rdi),%rax
f8d630: call *0x18(%rax)
The call sites look identical and the RocksDB version didn't change, so this looks like a GCC 14 issue with RocksDB rather than with Ceph.
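Since the call site is only an indirect call through the vtable, the deallocation function that finally runs is chosen inside the deleting destructor. The sketch below is one way to check that in isolation: it replaces the global over-aligned operators with counting versions and deletes an over-aligned object through a base pointer. `FakeDB`, `FakeDBImpl` and `delete_through_base_pointer` are hypothetical stand-ins, not RocksDB or Ceph names, and C++17 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Counters recording which aligned deallocation function actually ran.
static int g_scalar_deletes = 0;
static int g_array_deletes = 0;

static void* aligned_allocate(std::size_t size, std::align_val_t al) {
  const std::size_t a = static_cast<std::size_t>(al);
  assert(a != 0 && (a & (a - 1)) == 0);  // alignment must be a power of two
  if (size == 0) size = 1;
  // std::aligned_alloc expects the size to be a multiple of the alignment.
  const std::size_t rounded = (size + a - 1) / a * a;
  void* p = std::aligned_alloc(a, rounded);
  if (p == nullptr) throw std::bad_alloc();
  return p;
}

// Replace the global over-aligned allocation functions so calls can be counted.
void* operator new(std::size_t size, std::align_val_t al) { return aligned_allocate(size, al); }
void* operator new[](std::size_t size, std::align_val_t al) { return aligned_allocate(size, al); }
void operator delete(void* p, std::align_val_t) noexcept { ++g_scalar_deletes; std::free(p); }
void operator delete(void* p, std::size_t, std::align_val_t) noexcept { ++g_scalar_deletes; std::free(p); }
void operator delete[](void* p, std::align_val_t) noexcept { ++g_array_deletes; std::free(p); }
void operator delete[](void* p, std::size_t, std::align_val_t) noexcept { ++g_array_deletes; std::free(p); }

// Hypothetical stand-ins for an abstract DB interface and an over-aligned implementation.
struct FakeDB { virtual ~FakeDB() = default; };
struct alignas(64) FakeDBImpl : FakeDB { char payload[64]; };

struct DeleteCounts { int scalar; int array; };

// Allocates one object with scalar new, deletes it through the base pointer,
// and reports how many scalar and array aligned deletes that triggered.
DeleteCounts delete_through_base_pointer() {
  const int scalar_before = g_scalar_deletes;
  const int array_before = g_array_deletes;
  // volatile keeps the optimizer from eliding the new/delete pair.
  FakeDB* volatile db = new FakeDBImpl;
  delete db;
  return {g_scalar_deletes - scalar_before, g_array_deletes - array_before};
}
```

On a conforming toolchain `delete_through_base_pointer()` is expected to report one scalar aligned delete and no array delete. That is the behaviour the Valgrind report contradicts, so building this with the Rocky 10 toolchain and running it under Valgrind may help separate a compiler issue from a tool issue.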
Updated by Yaarit Hatuka 4 months ago
- Related to Bug #73930: ceph-mgr modules rely on deprecated python subinterpreters added
Updated by Radoslaw Zarzynski 4 months ago
Do we observe any other effect of this bug beyond making Valgrind angry?
Updated by Nitzan Mordechai 4 months ago
I didn't see any other effect. Valgrind complained only during shutdown, while memory was being released.
Updated by Radoslaw Zarzynski 4 months ago
OK, let's update the suppression file.
Updated by Radoslaw Zarzynski 3 months ago
@Nitzan Mordechai: are you going to update the suppression file?
Updated by Nitzan Mordechai 3 months ago
- Status changed from New to Fix Under Review
- Pull request ID set to 66651
Updated by Nitzan Mordechai 3 months ago
Radoslaw Zarzynski wrote in #note-10:
@Nitzan Mordechai: are you going to update the suppression file?
Done
Updated by Laura Flores about 2 months ago
- Related to Bug #74604: Rocky10 - MismatchedFree delete coming from ceph-osd-classic code added
Updated by Laura Flores about 2 months ago · Edited
There is also a leak in the OSD along with the Mon RocksDB leak: https://tracker.ceph.com/issues/74604
Is it a duplicate of this one? For now, I linked it as related.
Updated by Nitzan Mordechai about 1 month ago
/a/yaarit-2026-02-08_02:25:21-rados-wip-rocky10-branch-of-the-day-2026-02-06-1770413686-distro-default-trial/39937
Updated by Radoslaw Zarzynski about 1 month ago
Oops, it looks like the fixing commit was already present in the branch mentioned in the previous comment:
$ git log ceph-ci/wip-rocky10-branch-of-the-day-2026-02-06-1770413686
...
commit 56de49411b1c1f1e837f7694c653118f1145fafe
Author: NitzanMordhai <nmordech@ibm.com>
Date: Thu Feb 5 11:48:39 2026 +0000
qa: suppress false positive delete map mismatch errors
Valgrind reports "Mismatched free() / delete / delete []" errors during
OSD startup.
Standard library containers (like std::map) correctly call delete, but
Valgrind falsely interprets this as a call to delete[] because GCC 14
folds the identical aligned delete operators into a single symbol. This
causes Valgrind to flag a mismatch against the non-array allocation.
Fixes: https://tracker.ceph.com/issues/74604
Signed-off-by: Nitzan Mordechai <nmordech@ibm.com>
Updated by Nitzan Mordechai about 1 month ago
That one is a bit different: it comes from main and has two RocksDBStore::~RocksDBStore frames.
{
<insert_a_suppression_name_here>
Memcheck:Free
fun:_ZdaPvmSt11align_val_t
fun:_ZN12RocksDBStore5closeEv
fun:_ZN12RocksDBStoreD1Ev
fun:_ZN12RocksDBStoreD0Ev
fun:main
}
The current suppression is:
{
rocksdb mismatched free bluestore close
Memcheck:Free
fun:_ZdaPvmSt11align_val_t
fun:_ZN12RocksDBStore5closeEv
fun:_ZN12RocksDBStoreD*Ev
fun:_ZN9BlueStore9_close_dbEv
}
I'll combine them into one:
{
rocksdb mismatched free bluestore close
Memcheck:Free
fun:_ZdaPvmSt11align_val_t
fun:_ZN12RocksDBStore5closeEv
fun:_ZN12RocksDBStoreD*Ev
fun:_ZN12RocksDBStoreD*Ev
...
}
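To sanity-check a candidate suppression against a reported stack before pushing it, the matching rules can be modelled in a few lines. This is a simplified sketch, not Valgrind's implementation: it assumes that `*` and `?` are wildcards inside a `fun:` line, that `...` matches zero or more frames, and that callers below the last suppression line are ignored. The function names `glob_match` and `suppression_matches` are hypothetical, and the `fun:` prefixes are dropped from the inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Glob match supporting '*' (any run of characters) and '?' (any single character).
bool glob_match(const std::string& pat, const std::string& s) {
  const std::size_t npos = std::string::npos;
  std::size_t p = 0, i = 0, star = npos, mark = 0;
  while (i < s.size()) {
    if (p < pat.size() && pat[p] == '*') {
      star = p++;  // remember the star and try matching zero characters first
      mark = i;
    } else if (p < pat.size() && (pat[p] == '?' || pat[p] == s[i])) {
      ++p;
      ++i;
    } else if (star != npos) {
      p = star + 1;  // backtrack: let the last star absorb one more character
      i = ++mark;
    } else {
      return false;
    }
  }
  while (p < pat.size() && pat[p] == '*') ++p;
  return p == pat.size();
}

// Returns true if the suppression frames match the top of the stack.
// Each ordinary line consumes exactly one frame; "..." consumes zero or more.
bool suppression_matches(const std::vector<std::string>& frames,
                         const std::vector<std::string>& stack,
                         std::size_t fi = 0, std::size_t si = 0) {
  assert(fi <= frames.size() && si <= stack.size());
  if (fi == frames.size()) return true;  // remaining callers are ignored
  if (frames[fi] == "...") {
    for (std::size_t next = si; next <= stack.size(); ++next) {
      if (suppression_matches(frames, stack, fi + 1, next)) return true;
    }
    return false;
  }
  if (si == stack.size()) return false;
  return glob_match(frames[fi], stack[si]) &&
         suppression_matches(frames, stack, fi + 1, si + 1);
}
```

Fed the five frames of the stack from main above, the combined entry matches in this model while the entry ending in `_ZN9BlueStore9_close_dbEv` does not. Because each `fun:` line consumes exactly one frame here, a stack with only a single destructor frame would not satisfy two consecutive `D*Ev` lines, so it may be worth checking both stack shapes.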
Updated by Nitzan Mordechai about 1 month ago
/a/yaarit-2026-02-10_23:48:52-rados-wip-rocky10-branch-of-the-day-2026-02-09-1770676549-distro-default-trial/
['44504', '44382', '44329']
Updated by Laura Flores 25 days ago
This PR is under test in https://tracker.ceph.com/issues/74811.
Updated by Laura Flores 22 days ago
/a/nmordech-2026-02-25_11:36:23-rados-wip-rocky10-branch-of-the-day-2026-02-24-1771941190-distro-default-trial/70160
Updated by Radoslaw Zarzynski 11 days ago
The associated PR fixes multiple tickets. Reapproved after a change.