Making PeerReplayer multithreaded for file transfers for single directories#4
Making PeerReplayer multithreaded for file transfers for single directories#4sajibreadd wants to merge 1 commit intomainfrom
Conversation
ae8d8c2 to
486cfce
Compare
| std::scoped_lock locker(m_lock); | ||
| ceph_assert(!m_stopping); | ||
| m_stopping = true; | ||
| ceph_assert(!m_stopping.load()); |
There was a problem hiding this comment.
AFAIU no need to use load/store for atomics explicitly - direct access/assignment should work as well
| std::condition_variable pick_task; | ||
| std::condition_variable give_task; | ||
| std::mutex pmtx; | ||
| int num_threads; |
There was a problem hiding this comment.
apparently redundant - workers.size() is enough
| int num_threads; | ||
| int max_concurrent_large_file; | ||
| uint64_t file_size_threshold; | ||
| uint64_t max_token_sum; |
There was a problem hiding this comment.
nit: no much sense to have as a class member - could be calculated at the beginning of run() method
| // K * L + (S - L) * min(file_size, K) | ||
| uint64_t | ||
| PeerReplayer::FileTransferHandlerThreadPool::get_token(uint64_t file_size) { | ||
| return file_size_threshold * max_concurrent_large_file + |
There was a problem hiding this comment.
May be I'm missing something in my math below but what;s about the followintg case (all the default settings):
- max_token_sum = 3 * fst (aka file_size_threshold)
- we have 3 files of 1 byte size to transfer. Each file token is 1*fst + 2
This means one will get just 2 files transferred in parallel. While I'd expect we;d like to have 3 transfers...
There was a problem hiding this comment.
S = limit for small transfers
L = limit for large transfers
K = file size cap (1 object)
M = max token capacity of the queue
E = tokens for file existence
P = tokens per full object
Accounted tokens for a file: E + P * min(file_size, K)/K
Case 1: only small files
S * E = M
Case 2: only large files
L * (E + P) = M
Arbitrarily set M (e.g., to 1 million)
Solve for E: E = M / S
Solve for P: P = M / L - E
Set token capacity of the queue to M = K * S * L
Accounted tokens for a file: K * L + (S - L) * min(file_size, K)
Here is the math. This math is done by @patrakov and seems convincing to me.
There was a problem hiding this comment.
Sorry, I forgot to remove one confusing remark from the above: (e.g, to 1 million). Correct version: set M to K * S * L to avoid non-integer math. You can check the math as follows: if cephfs-mirror is asked to transfer very small (e.g, zero-byte) files, the accounted tokens would be K * L and so S files will fit in the queue. Analogously, when the file size is K, then the expression for the accounted tokens per file simplifies to K * S and therefore L files fit in the queue. The rest is just some smooth behavior in between.
What the text above misses is the justification for the whole business with different limits for large and small files. So, here it is.
The implicit assumption here is that the destination cluster is not the bottleneck. This is indeed the case even if the servers in the destination cluster have much worse specs than the source one, as the user is effectively required to treat the destination cluster as read-only; thus, there will be no background load, which I found to be a limiting factor on the source cluster.
So, we have to consider two bottlenecks: the IOPS limit for a single-threaded workload due to HDDs in the source cluster, and network bandwidth. With a single thread, different bottlenecks are hit and thus different customer complaints are provoked:
- Small files: "your old cephfs-mirror can only transfer 100 files per second, that's way too slow, please parallelize way way more" - but with too much parallelism, the HDDs in the source cluster would become overloaded. So, the small-files concurrency limit prevents the HDD overload in the source cluster.
- Large files: "your old cephfs-mirror is almost good, but please parallelize it by a factor of 2 so that it can better fill the network" - if we parallelize too much, we'll overload the network and provoke complaints about write latency on the destination cluster.
In the intended cluster for deployment, I will probably set the small-files concurrency to 64 or 128 and large-file concurrency to 4.
There was a problem hiding this comment.
This means one will get just 2 files transferred in parallel. While I'd expect we;d like to have 3 transfers...
The limit is indeed reached for zero-sized files only, and that's intentional. Maybe we should rename the setting.
I would be open to adding some floor to the file size, but the problem is, there is no good value for such a floor, and I don't want to breed tunable parameters. So I would prefer documenting that the limit applies to zero-sized files, and that even one byte would subtract 1 from the concurrency.
There was a problem hiding this comment.
OK, this is the minimal fix of the math to address the "one-byte files only reach the S - 1 concurrency level" objection. The idea is to aim for S + 1 or L + 1 concurrency but stop one token short.
Set token capacity of the queue to M = K * (S + 1) * (L + 1) - 1
Accounted tokens for a file: K * (L + 1) + (S - L) * min(file_size, K)
There was a problem hiding this comment.
Isn't it something like that we are considering 1 byte as the minimal byte size for a file. So we want to run S, 1 byte file parallel and L, K byte file in parallel, then shouldn't be it just simply,
Set token capacity of the queue to M = L * S * (K - 1)
Accounted tokens for a file: K * L - S + min(file_size, K) * (S - L)
There was a problem hiding this comment.
Isn't it something like that we are considering 1 byte as the minimal byte size for a file. So we want to run S, 1 byte file parallel and L, K byte file in parallel, then shouldn't be it just simply,
Set token capacity of the queue to M = L * S * (K - 1) Accounted tokens for a file: K * L - S + min(file_size, K) * (S - L)
Well, my "minimal fix" proposal is a bit more aggressive. Your proposal would apply the parallelism of S threads only to 1-byte files, but balk at 2-byte files, which would hit the same objection. My formula would extend the use of S threads to files as "big" as it is possible with a linear formula.
There was a problem hiding this comment.
I understood now. Idea is to set a lower bound for number of small files to be transferred such that if they are (K-1) length bytes, even then it should transfer >= S files
There was a problem hiding this comment.
Just trying to understand with your new formula does it transfer S, (k - 1) size byte file concurrently?
There was a problem hiding this comment.
I don't understand the question. Let me try to explain the formula in a different way.
The formula is:
Set token capacity of the queue to M = K * (S + 1) * (L + 1) - 1
Accounted tokens for a file: K * (L + 1) + (S - L) * min(file_size, K)
Let's suppose that somebody tries to transfer large (K bytes) files. Then, if an attempt is made to transfer L + 1 files at once, this would consume (L + 1) * (K * (L + 1) + (S - L) * K) tokens, which simplifies down to (L + 1) * K * (S + 1). But the queue can only contain K * (S + 1) * (L + 1) - 1 tokens, i.e., one token too few, so the L + 1st transfer does not happen, i.e., the concurrency for K-byte sized files is still L, as designed.
Now consider zero-sized files. Assume that one attempts to transfer S + 1 files at once. Then, the total number of tokens is (S + 1) * K * (L + 1), which is, again, one token more than what can fit in the queue. So, the achievable concurrency is S, not S + 1.
And for the one-byte files, although the formula is not really designed for them, let's consider S transfers. Then, the total number of used tokens is S * K * (L + 1) + S - L, which is not that nice. Let's see how many free tokens remain in the queue:
K * (S + 1) * (L + 1) - 1 - (S * K * (L + 1) + S - L)
= K * S * L + K * S + K * L + K - 1 - K * S * L - K * S - S + L
= K * L + K + L - 1 - S
= (K + 1) * (L + 1) - S - 2
This is non-negative if S <= (K + 1) * (L + 1) - 2. For realistic numbers, e.g., K = 4194304, S = 128, L = 4, this is of course positive, and therefore, for 1-byte files, there will be S concurrent transfers.
Another question is how large the "small" file can still be while still allowing S concurrent transfers. The amount of free tokens is:
K * (S + 1) * (L + 1) - 1 - S * (K * (L + 1) + (S - L) * file_size)
= K * S * L + K * S + K * L + K - 1 - K * S * L - K * S - S * (S - L) * file_size
= K * L + K - 1 - S * (S - L) * file_size
So the critical file_size on the lower end is:
(K * L + K - 1) / S / (S - L)
Which, for the example above, is 1321 bytes.
Now, consider the concurrency when transferring files of size K - 1. Conjecture: the concurrency for S > L in realistic scenarios would be L + 1. Indeed, the number of free tokens is then:
K * (S + 1) * (L + 1) - 1 - (L + 1) * (K * (L + 1) + (S - L) * (K - 1))
= (L + 1) * (S - L) - 1
... which is non-negative if both L + 1 and S - L are positive integers.
| worker.join(); | ||
| } | ||
| } | ||
| drain_queue(); |
There was a problem hiding this comment.
do we really want to drain the queue, why not stopping immediately? Mirroring should be able to recover after any terminatiion happened at arbitrary point - hence IMO there is no need for such a draining.
There was a problem hiding this comment.
current_token_sum to be reset anyway though.
There was a problem hiding this comment.
Problem is not about the current_token_sum. I maintained a atomic counter rem_file_transfer for each directory locally and send reference of that counter to the file transfer threads. My goal is to even all the file transfer's are posted to the queue, do_synchronize function should wait until all the file associated with this directory are executed.
counter++;
task_queue.emplace(
[f = std::forward<F>(f), ... args = std::forward<Args>(args), &counter,
&dir_cv](bool dont_execute) mutable {
if (!dont_execute) {
f(std::move(args)...);
}
--counter;
if (counter == 0) {
dir_cv.notify_one();
}
},
token);
Here is the counter. But when a stop_flag is raised, it doesn't know about the counter, so I though best way is to just drain the queue by calling the function f(dont_execute = true). It will don't execute the transfer just decrease the counter and let do_synchronize exit safely.
| } | ||
| task(false); | ||
| { | ||
| std::scoped_lock<std::mutex> lock(pmtx); |
There was a problem hiding this comment.
Looks like this block's ops could be placed under the same lock above, before pick_task.wait() call. E.g. by performing them if token is non-negative (-1 to assigned on initialization then). Hence avoiding duplicate lock aquisition.
There was a problem hiding this comment.
Yeah, good observation, thanks
| std::tie(task, token) = std::move(task_queue.front()); | ||
| task_queue.pop(); | ||
| give_task.notify_one(); | ||
| exec_task.wait(lock, [&token, this] { |
There was a problem hiding this comment.
Do we really need that dual stage (pick + exec) task processing. Couldn't we pick the task when there are enough "resources" (i.e. sum + token < max) for its execution only?
The only questionable benefit I can see is the ability to proceed with running subsequent "small" transfers when some "big" one is pending. But it's not clear why do we need something like that.
There was a problem hiding this comment.
Here is my observation for dual stage checking(I might be nit-picking or overthinking)
Let's say there is a very large task in the front of the queue, if will only be executed if full resource is available. My idea was let other lazy threads to pick some other tasks and let's see whether they can fit in with the current available resource.
Advantage:
- better resource utilization.
- differentiate 2 state when a thread is waiting for when queue is non-empty and when it is waiting for the resource available, otherwise even if queue is empty, it will be notified whenever some thread has finished it's execution, but as queue is empty and it will wake up for no-reason and sleep again(I know this is very rare case)
There was a problem hiding this comment.
off-by-one in the comment: sum + token <= max
There was a problem hiding this comment.
@patrakov it is in the code, may be It was a typo of Igor.
| void *ptr; | ||
| struct iovec iov[NR_IOVECS]; | ||
|
|
||
| int r = ceph_openat(m_local_mount, fh.c_fd, epath.c_str(), O_RDONLY | O_NOFOLLOW, 0); |
There was a problem hiding this comment.
Why not opening source/target files inside transfer()?
Not doing that might result in tons of file descriptors being opened while pending transfers in the queue...
|
|
||
| int PeerReplayer::cleanup_remote_dir(const std::string &dir_root, | ||
| const std::string &epath, const FHandles &fh) { | ||
| int PeerReplayer::cleanup_remote_dir( |
There was a problem hiding this comment.
Potentially cleanup_remote_dir & propagate_deleted_entries could benefit from multi-threaded processing as well. One might e.g. collect all the to-be-processed entries at specific level and form their processing as a task or make a task from a subfolder or something. There might be some performance benefit on large directory trees as well.
Perhaps could be implemented as a second stage, not with this patch?
There was a problem hiding this comment.
I would expect metadata-only operations to be fast enough anyway, and exclude the proposed optimization from the first iteration of the patch. The big question here is: what do we do with directories containing millions of deleted subdirs? Wouldn't collecting them in advance balloon the memory usage? Can we queue such mass deletions piece-wise without storing the whole list of stuff to delete?
There was a problem hiding this comment.
For deletion, I agree it is not possible without keeping a list in memory but for creation of dirs we can do that as, while creation we are going from top to down.
There was a problem hiding this comment.
Also in the current implementation of main branch it is doing bfs for syncing snapshot having id >= 2. So it is actually consuming memory of at least proportional to maximum number of nodes in the same level.
ccb068e to
09adc20
Compare
…tory. Signed-off-by: Md Mahamudur Rahaman Sajib <mahamudur.sajib@croit.io>
09adc20 to
f680297
Compare
… overflow()
When sanitizer is enabled, unittest_log fails as following
```
[ RUN ] Log.StderrPipeBig
=================================================================
==3302372==ERROR: AddressSanitizer: heap-use-after-free on address 0xffff96e01d00 at pc 0xaaaadd3db754 bp 0xffffd9ebffa0 sp 0xffffd9ebf790
READ of size 4096 at 0xffff96e01d00 thread T0
#0 0xaaaadd3db750 in __asan_memmove (/root/ceph-19.0.0/build/bin/unittest_log+0x3fb750) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
#1 0xffffafc23734 in char const* boost::container::dtl::memmove_n_source<char const*, char*>(char const*, unsigned long, char*) /root/ceph-19.0.0/build/boost/include/boost/container/detail/copy_move_algo.hpp:261:10
#2 0xffffafc23734 in boost::container::dtl::enable_if_memtransfer_copy_constructible<char const*, char*, char const*>::type boost::container::uninitialized_copy_alloc_n_source<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*, char*>(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char const*, unsigned long, char*) /root/ceph-19.0.0/build/boost/include/boost/container/detail/copy_move_algo.hpp:600:11
#3 0xffffafc23734 in void boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>::uninitialized_copy_n_and_update<char*>(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/detail/advanced_insert_int.hpp:85:22
#4 0xffffafc23734 in void boost::container::expand_forward_and_insert_alloc<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char*, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*> >(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char*, char*, unsigned long, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>) /root/ceph-19.0.0/build/boost/include/boost/container/detail/copy_move_algo.hpp:1469:23
#5 0xffffafc23734 in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_expand_forward<boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*> >(char*, unsigned long, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>, boost::move_detail::integral_constant<bool, false>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:3058:7
ceph#6 0xffffafc23734 in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range<boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*> >(char* const&, unsigned long, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2890:16
ceph#7 0xffffafc23734 in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::insert<char const*>(boost::container::vec_iterator<char*, true>, char const*, char const*, boost::move_detail::disable_if_or<void, boost::move_detail::is_convertible<char const*, unsigned long>, boost::container::dtl::is_input_iterator<char const*, has_iterator_category<char const*>::value>, boost::move_detail::bool_<false>, boost::move_detail::bool_<false> >::type*) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2088:20
ceph#8 0xffffafc23734 in ceph::logging::ConcreteEntry::ConcreteEntry(ceph::logging::Entry const&) /root/ceph-19.0.0/src/log/Entry.h:84:9
ceph#9 0xffffafc21a88 in decltype(new ((void*)(0))ceph::logging::ConcreteEntry(std::declval<ceph::logging::Entry>())) std::construct_at<ceph::logging::ConcreteEntry, ceph::logging::Entry>(ceph::logging::ConcreteEntry*, ceph::logging::Entry&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
ceph#10 0xffffafc21198 in void std::allocator_traits<std::allocator<ceph::logging::ConcreteEntry> >::construct<ceph::logging::ConcreteEntry, ceph::logging::Entry>(std::allocator<ceph::logging::ConcreteEntry>&, ceph::logging::ConcreteEntry*, ceph::logging::Entry&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
ceph#11 0xffffafc16464 in ceph::logging::ConcreteEntry& std::vector<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::emplace_back<ceph::logging::Entry>(ceph::logging::Entry&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:115:6
ceph#12 0xffffafc0dcbc in ceph::logging::Log::submit_entry(ceph::logging::Entry&&) /root/ceph-19.0.0/src/log/Log.cc:265:9
ceph#13 0xaaaadd41a404 in Log_StderrPipeBig_Test::TestBody() /root/ceph-19.0.0/src/log/test.cc:280:9
ceph#14 0xaaaade0b4338 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
ceph#15 0xaaaade061244 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
ceph#16 0xaaaade012680 in testing::Test::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2680:5
ceph#17 0xaaaade0145c4 in testing::TestInfo::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2858:11
ceph#18 0xaaaade015bc4 in testing::TestSuite::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:3012:28
ceph#19 0xaaaade031988 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5723:44
ceph#20 0xaaaade0be24c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
ceph#21 0xaaaade0687dc in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
ceph#22 0xaaaade030e00 in testing::UnitTest::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5306:10
ceph#23 0xaaaadd425c48 in RUN_ALL_TESTS() /root/ceph-19.0.0/src/googletest/googletest/include/gtest/gtest.h:2486:46
ceph#24 0xaaaadd4207a0 in main /root/ceph-19.0.0/src/log/test.cc:503:10
ceph#25 0xffffac3473f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
ceph#26 0xffffac3474c8 in __libc_start_main csu/../csu/libc-start.c:392:3
ceph#27 0xaaaadd364d6c in _start (/root/ceph-19.0.0/build/bin/unittest_log+0x384d6c) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
0xffff96e01d00 is located 0 bytes inside of 6553-byte region [0xffff96e01d00,0xffff96e03699)
freed by thread T0 here:
#0 0xaaaadd4136f0 in operator delete(void*) (/root/ceph-19.0.0/build/bin/unittest_log+0x4336f0) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
#1 0xaaaadd434968 in boost::container::new_allocator<char>::deallocate(char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/new_allocator.hpp:171:7
#2 0xaaaadd434934 in boost::container::allocator_traits<boost::container::new_allocator<char> >::deallocate(boost::container::new_allocator<char>&, char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:308:9
#3 0xaaaadd434934 in boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>::deallocate(char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/small_vector.hpp:255:10
#4 0xaaaadd43911c in boost::container::allocator_traits<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void> >::deallocate(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:308:9
#5 0xaaaadd43911c in boost::container::vector_alloc_holder<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, unsigned long, boost::move_detail::integral_constant<unsigned int, 1u> >::deallocate(char* const&, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:487:7
ceph#6 0xaaaadd43911c in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_new_allocation<boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&> >(char*, unsigned long, char*, unsigned long, boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:3080:25
ceph#7 0xaaaadd438aec in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_no_capacity<boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&> >(char*, unsigned long, boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&>, boost::move_detail::integral_constant<unsigned int, 1u>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2830:13
ceph#8 0xaaaadd4328bc in char& boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::emplace_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1888:24
ceph#9 0xaaaadd4328bc in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_push_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2746:13
ceph#10 0xaaaadd4328bc in boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::push_back(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1996:4
ceph#11 0xaaaadd4328bc in StackStringBuf<4096ul>::overflow(int) /root/ceph-19.0.0/src/common/StackStringStream.h:79:11
ceph#12 0xffffac6d3dac in std::ostream::put(char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x133dac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
ceph#13 0xffffac6d4aac in std::basic_ostream<char, std::char_traits<char> >& std::operator<<<std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x134aac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
ceph#14 0xaaaadd41a3c8 in Log_StderrPipeBig_Test::TestBody() /root/ceph-19.0.0/src/log/test.cc:278:9
ceph#15 0xaaaade0b4338 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
ceph#16 0xaaaade061244 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
ceph#17 0xaaaade012680 in testing::Test::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2680:5
ceph#18 0xaaaade0145c4 in testing::TestInfo::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2858:11
ceph#19 0xaaaade015bc4 in testing::TestSuite::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:3012:28
ceph#20 0xaaaade031988 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5723:44
ceph#21 0xaaaade0be24c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
ceph#22 0xaaaade0687dc in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
ceph#23 0xaaaade030e00 in testing::UnitTest::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5306:10
ceph#24 0xaaaadd425c48 in RUN_ALL_TESTS() /root/ceph-19.0.0/src/googletest/googletest/include/gtest/gtest.h:2486:46
ceph#25 0xaaaadd4207a0 in main /root/ceph-19.0.0/src/log/test.cc:503:10
ceph#26 0xffffac3473f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
ceph#27 0xffffac3474c8 in __libc_start_main csu/../csu/libc-start.c:392:3
ceph#28 0xaaaadd364d6c in _start (/root/ceph-19.0.0/build/bin/unittest_log+0x384d6c) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
previously allocated by thread T0 here:
#0 0xaaaadd412e88 in operator new(unsigned long) (/root/ceph-19.0.0/build/bin/unittest_log+0x432e88) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
#1 0xaaaadd433ec0 in boost::container::new_allocator<char>::allocate(unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/new_allocator.hpp:160:30
#2 0xaaaadd438a68 in boost::container::allocator_traits<boost::container::new_allocator<char> >::priv_allocate(boost::move_detail::integral_constant<bool, false>, boost::container::new_allocator<char>&, unsigned long, void const*) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:395:16
#3 0xaaaadd438a68 in boost::container::allocator_traits<boost::container::new_allocator<char> >::allocate(boost::container::new_allocator<char>&, unsigned long, void const*) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:318:14
#4 0xaaaadd438a68 in boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>::allocate(unsigned long, void const*) /root/ceph-19.0.0/build/boost/include/boost/container/small_vector.hpp:248:14
#5 0xaaaadd438a68 in boost::container::allocator_traits<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void> >::allocate(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:302:16
ceph#6 0xaaaadd438a68 in boost::container::vector_alloc_holder<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, unsigned long, boost::move_detail::integral_constant<unsigned int, 1u> >::allocate(unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:482:14
ceph#7 0xaaaadd438a68 in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_no_capacity<boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&> >(char*, unsigned long, boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&>, boost::move_detail::integral_constant<unsigned int, 1u>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2826:73
ceph#8 0xaaaadd4328bc in char& boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::emplace_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1888:24
ceph#9 0xaaaadd4328bc in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_push_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2746:13
ceph#10 0xaaaadd4328bc in boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::push_back(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1996:4
ceph#11 0xaaaadd4328bc in StackStringBuf<4096ul>::overflow(int) /root/ceph-19.0.0/src/common/StackStringStream.h:79:11
ceph#12 0xffffac6d3dac in std::ostream::put(char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x133dac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
ceph#13 0xffffac6d4aac in std::basic_ostream<char, std::char_traits<char> >& std::operator<<<std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x134aac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
ceph#14 0xaaaadd41a3c8 in Log_StderrPipeBig_Test::TestBody() /root/ceph-19.0.0/src/log/test.cc:278:9
ceph#15 0xaaaade0b4338 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
ceph#16 0xaaaade061244 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
ceph#17 0xaaaade012680 in testing::Test::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2680:5
ceph#18 0xaaaade0145c4 in testing::TestInfo::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2858:11
ceph#19 0xaaaade015bc4 in testing::TestSuite::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:3012:28
ceph#20 0xaaaade031988 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5723:44
ceph#21 0xaaaade0be24c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
ceph#22 0xaaaade0687dc in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
ceph#23 0xaaaade030e00 in testing::UnitTest::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5306:10
ceph#24 0xaaaadd425c48 in RUN_ALL_TESTS() /root/ceph-19.0.0/src/googletest/googletest/include/gtest/gtest.h:2486:46
ceph#25 0xaaaadd4207a0 in main /root/ceph-19.0.0/src/log/test.cc:503:10
ceph#26 0xffffac3473f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
ceph#27 0xffffac3474c8 in __libc_start_main csu/../csu/libc-start.c:392:3
ceph#28 0xaaaadd364d6c in _start (/root/ceph-19.0.0/build/bin/unittest_log+0x384d6c) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
SUMMARY: AddressSanitizer: heap-use-after-free (/root/ceph-19.0.0/build/bin/unittest_log+0x3fb750) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409) in __asan_memmove
Shadow bytes around the buggy address:
0x200ff2dc0350: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff2dc0360: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff2dc0370: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff2dc0380: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff2dc0390: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x200ff2dc03a0:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x200ff2dc03b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x200ff2dc03c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x200ff2dc03d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x200ff2dc03e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x200ff2dc03f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==3302372==ABORTING
```
vec.push_back(str) will allocate memory and release the old one once
there is insufficient memory which causing the old one to be invalid. So
streambuf's data pointer and insertion position should be updated to
newly allocated memory's address in vec.
Fixes: https://tracker.ceph.com/issues/65805
Signed-off-by: Rongqi Sun <sunrongqi@huawei.com>
(cherry picked from commit c8d51b9)
|
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days. |
|
This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution! |
… overflow()
When sanitizer is enabled, unittest_log fails as following
```
[ RUN ] Log.StderrPipeBig
=================================================================
==3302372==ERROR: AddressSanitizer: heap-use-after-free on address 0xffff96e01d00 at pc 0xaaaadd3db754 bp 0xffffd9ebffa0 sp 0xffffd9ebf790
READ of size 4096 at 0xffff96e01d00 thread T0
#0 0xaaaadd3db750 in __asan_memmove (/root/ceph-19.0.0/build/bin/unittest_log+0x3fb750) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
#1 0xffffafc23734 in char const* boost::container::dtl::memmove_n_source<char const*, char*>(char const*, unsigned long, char*) /root/ceph-19.0.0/build/boost/include/boost/container/detail/copy_move_algo.hpp:261:10
#2 0xffffafc23734 in boost::container::dtl::enable_if_memtransfer_copy_constructible<char const*, char*, char const*>::type boost::container::uninitialized_copy_alloc_n_source<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*, char*>(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char const*, unsigned long, char*) /root/ceph-19.0.0/build/boost/include/boost/container/detail/copy_move_algo.hpp:600:11
#3 0xffffafc23734 in void boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>::uninitialized_copy_n_and_update<char*>(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/detail/advanced_insert_int.hpp:85:22
#4 0xffffafc23734 in void boost::container::expand_forward_and_insert_alloc<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char*, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*> >(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char*, char*, unsigned long, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>) /root/ceph-19.0.0/build/boost/include/boost/container/detail/copy_move_algo.hpp:1469:23
#5 0xffffafc23734 in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_expand_forward<boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*> >(char*, unsigned long, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>, boost::move_detail::integral_constant<bool, false>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:3058:7
ceph#6 0xffffafc23734 in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range<boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*> >(char* const&, unsigned long, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2890:16
ceph#7 0xffffafc23734 in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::insert<char const*>(boost::container::vec_iterator<char*, true>, char const*, char const*, boost::move_detail::disable_if_or<void, boost::move_detail::is_convertible<char const*, unsigned long>, boost::container::dtl::is_input_iterator<char const*, has_iterator_category<char const*>::value>, boost::move_detail::bool_<false>, boost::move_detail::bool_<false> >::type*) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2088:20
ceph#8 0xffffafc23734 in ceph::logging::ConcreteEntry::ConcreteEntry(ceph::logging::Entry const&) /root/ceph-19.0.0/src/log/Entry.h:84:9
ceph#9 0xffffafc21a88 in decltype(new ((void*)(0))ceph::logging::ConcreteEntry(std::declval<ceph::logging::Entry>())) std::construct_at<ceph::logging::ConcreteEntry, ceph::logging::Entry>(ceph::logging::ConcreteEntry*, ceph::logging::Entry&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
ceph#10 0xffffafc21198 in void std::allocator_traits<std::allocator<ceph::logging::ConcreteEntry> >::construct<ceph::logging::ConcreteEntry, ceph::logging::Entry>(std::allocator<ceph::logging::ConcreteEntry>&, ceph::logging::ConcreteEntry*, ceph::logging::Entry&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
ceph#11 0xffffafc16464 in ceph::logging::ConcreteEntry& std::vector<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::emplace_back<ceph::logging::Entry>(ceph::logging::Entry&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:115:6
ceph#12 0xffffafc0dcbc in ceph::logging::Log::submit_entry(ceph::logging::Entry&&) /root/ceph-19.0.0/src/log/Log.cc:265:9
ceph#13 0xaaaadd41a404 in Log_StderrPipeBig_Test::TestBody() /root/ceph-19.0.0/src/log/test.cc:280:9
ceph#14 0xaaaade0b4338 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
ceph#15 0xaaaade061244 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
ceph#16 0xaaaade012680 in testing::Test::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2680:5
ceph#17 0xaaaade0145c4 in testing::TestInfo::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2858:11
ceph#18 0xaaaade015bc4 in testing::TestSuite::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:3012:28
ceph#19 0xaaaade031988 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5723:44
ceph#20 0xaaaade0be24c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
ceph#21 0xaaaade0687dc in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
ceph#22 0xaaaade030e00 in testing::UnitTest::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5306:10
ceph#23 0xaaaadd425c48 in RUN_ALL_TESTS() /root/ceph-19.0.0/src/googletest/googletest/include/gtest/gtest.h:2486:46
ceph#24 0xaaaadd4207a0 in main /root/ceph-19.0.0/src/log/test.cc:503:10
ceph#25 0xffffac3473f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
ceph#26 0xffffac3474c8 in __libc_start_main csu/../csu/libc-start.c:392:3
ceph#27 0xaaaadd364d6c in _start (/root/ceph-19.0.0/build/bin/unittest_log+0x384d6c) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
0xffff96e01d00 is located 0 bytes inside of 6553-byte region [0xffff96e01d00,0xffff96e03699)
freed by thread T0 here:
#0 0xaaaadd4136f0 in operator delete(void*) (/root/ceph-19.0.0/build/bin/unittest_log+0x4336f0) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
#1 0xaaaadd434968 in boost::container::new_allocator<char>::deallocate(char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/new_allocator.hpp:171:7
#2 0xaaaadd434934 in boost::container::allocator_traits<boost::container::new_allocator<char> >::deallocate(boost::container::new_allocator<char>&, char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:308:9
#3 0xaaaadd434934 in boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>::deallocate(char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/small_vector.hpp:255:10
#4 0xaaaadd43911c in boost::container::allocator_traits<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void> >::deallocate(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:308:9
#5 0xaaaadd43911c in boost::container::vector_alloc_holder<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, unsigned long, boost::move_detail::integral_constant<unsigned int, 1u> >::deallocate(char* const&, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:487:7
ceph#6 0xaaaadd43911c in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_new_allocation<boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&> >(char*, unsigned long, char*, unsigned long, boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:3080:25
ceph#7 0xaaaadd438aec in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_no_capacity<boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&> >(char*, unsigned long, boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&>, boost::move_detail::integral_constant<unsigned int, 1u>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2830:13
ceph#8 0xaaaadd4328bc in char& boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::emplace_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1888:24
ceph#9 0xaaaadd4328bc in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_push_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2746:13
ceph#10 0xaaaadd4328bc in boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::push_back(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1996:4
ceph#11 0xaaaadd4328bc in StackStringBuf<4096ul>::overflow(int) /root/ceph-19.0.0/src/common/StackStringStream.h:79:11
ceph#12 0xffffac6d3dac in std::ostream::put(char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x133dac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
ceph#13 0xffffac6d4aac in std::basic_ostream<char, std::char_traits<char> >& std::operator<<<std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x134aac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
ceph#14 0xaaaadd41a3c8 in Log_StderrPipeBig_Test::TestBody() /root/ceph-19.0.0/src/log/test.cc:278:9
ceph#15 0xaaaade0b4338 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
ceph#16 0xaaaade061244 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
ceph#17 0xaaaade012680 in testing::Test::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2680:5
ceph#18 0xaaaade0145c4 in testing::TestInfo::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2858:11
ceph#19 0xaaaade015bc4 in testing::TestSuite::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:3012:28
ceph#20 0xaaaade031988 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5723:44
ceph#21 0xaaaade0be24c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
ceph#22 0xaaaade0687dc in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
ceph#23 0xaaaade030e00 in testing::UnitTest::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5306:10
ceph#24 0xaaaadd425c48 in RUN_ALL_TESTS() /root/ceph-19.0.0/src/googletest/googletest/include/gtest/gtest.h:2486:46
ceph#25 0xaaaadd4207a0 in main /root/ceph-19.0.0/src/log/test.cc:503:10
ceph#26 0xffffac3473f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
ceph#27 0xffffac3474c8 in __libc_start_main csu/../csu/libc-start.c:392:3
ceph#28 0xaaaadd364d6c in _start (/root/ceph-19.0.0/build/bin/unittest_log+0x384d6c) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
previously allocated by thread T0 here:
#0 0xaaaadd412e88 in operator new(unsigned long) (/root/ceph-19.0.0/build/bin/unittest_log+0x432e88) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
#1 0xaaaadd433ec0 in boost::container::new_allocator<char>::allocate(unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/new_allocator.hpp:160:30
#2 0xaaaadd438a68 in boost::container::allocator_traits<boost::container::new_allocator<char> >::priv_allocate(boost::move_detail::integral_constant<bool, false>, boost::container::new_allocator<char>&, unsigned long, void const*) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:395:16
#3 0xaaaadd438a68 in boost::container::allocator_traits<boost::container::new_allocator<char> >::allocate(boost::container::new_allocator<char>&, unsigned long, void const*) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:318:14
#4 0xaaaadd438a68 in boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>::allocate(unsigned long, void const*) /root/ceph-19.0.0/build/boost/include/boost/container/small_vector.hpp:248:14
#5 0xaaaadd438a68 in boost::container::allocator_traits<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void> >::allocate(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:302:16
ceph#6 0xaaaadd438a68 in boost::container::vector_alloc_holder<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, unsigned long, boost::move_detail::integral_constant<unsigned int, 1u> >::allocate(unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:482:14
ceph#7 0xaaaadd438a68 in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_no_capacity<boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&> >(char*, unsigned long, boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&>, boost::move_detail::integral_constant<unsigned int, 1u>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2826:73
ceph#8 0xaaaadd4328bc in char& boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::emplace_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1888:24
ceph#9 0xaaaadd4328bc in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_push_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2746:13
ceph#10 0xaaaadd4328bc in boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::push_back(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1996:4
ceph#11 0xaaaadd4328bc in StackStringBuf<4096ul>::overflow(int) /root/ceph-19.0.0/src/common/StackStringStream.h:79:11
ceph#12 0xffffac6d3dac in std::ostream::put(char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x133dac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
ceph#13 0xffffac6d4aac in std::basic_ostream<char, std::char_traits<char> >& std::operator<<<std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x134aac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
ceph#14 0xaaaadd41a3c8 in Log_StderrPipeBig_Test::TestBody() /root/ceph-19.0.0/src/log/test.cc:278:9
ceph#15 0xaaaade0b4338 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
ceph#16 0xaaaade061244 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
ceph#17 0xaaaade012680 in testing::Test::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2680:5
ceph#18 0xaaaade0145c4 in testing::TestInfo::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2858:11
ceph#19 0xaaaade015bc4 in testing::TestSuite::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:3012:28
ceph#20 0xaaaade031988 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5723:44
ceph#21 0xaaaade0be24c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
ceph#22 0xaaaade0687dc in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
ceph#23 0xaaaade030e00 in testing::UnitTest::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5306:10
ceph#24 0xaaaadd425c48 in RUN_ALL_TESTS() /root/ceph-19.0.0/src/googletest/googletest/include/gtest/gtest.h:2486:46
ceph#25 0xaaaadd4207a0 in main /root/ceph-19.0.0/src/log/test.cc:503:10
ceph#26 0xffffac3473f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
ceph#27 0xffffac3474c8 in __libc_start_main csu/../csu/libc-start.c:392:3
ceph#28 0xaaaadd364d6c in _start (/root/ceph-19.0.0/build/bin/unittest_log+0x384d6c) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
SUMMARY: AddressSanitizer: heap-use-after-free (/root/ceph-19.0.0/build/bin/unittest_log+0x3fb750) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409) in __asan_memmove
Shadow bytes around the buggy address:
0x200ff2dc0350: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff2dc0360: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff2dc0370: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff2dc0380: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x200ff2dc0390: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x200ff2dc03a0:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x200ff2dc03b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x200ff2dc03c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x200ff2dc03d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x200ff2dc03e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x200ff2dc03f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==3302372==ABORTING
```
vec.push_back(str) will allocate memory and release the old one once
there is insufficient memory which causing the old one to be invalid. So
streambuf's data pointer and insertion position should be updated to
newly allocated memory's address in vec.
Fixes: https://tracker.ceph.com/issues/65805
Signed-off-by: Rongqi Sun <sunrongqi@huawei.com>
(cherry picked from commit c8d51b9)
before this change, we allocate coefficients table with malloc() in ErasureCodeIsaDefault::prepare(), but free them using delete. so, in this change, we use new [] and delete [] to allocate and free them, to be more consistent, and to silence this warning. this issue was identified by AddressSanitizer: ``` 2025-03-22T04:27:58.210 INFO:tasks.ceph.osd.3.smithi102.stderr:==42899==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs free) on 0x611000570d40 2025-03-22T04:27:58.253 DEBUG:teuthology.orchestra.run.smithi102:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 env ASAN_OPTIONS=detect_leaks=0,detect_odr_violation=0,alloc_dealloc_mismatch=0 LD_PRELOAD=/lib64/libasan.so.6 ceph --cluster ceph osd pool set cephfs_data_ec allow_ec_overwrites true 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: #0 0x7efdc30b46b7 in free (/lib64/libasan.so.6+0xb46b7) 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: #1 0x7efd9e2e5f26 in ErasureCodeIsaTableCache::setEncodingTable(int, int, int, unsigned char*) (/usr/lib64/ceph/erasure-code/libec_isa.so+0x3bf26) 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: #2 0x7efd9e2f0fae in ErasureCodeIsaDefault::prepare() (/usr/lib64/ceph/erasure-code/libec_isa.so+0x46fae) 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: #3 0x7efd9e2bfa3e in ErasureCodeIsa::init(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::ostream*) (/usr/lib64/ceph/erasure-code/libec_isa.so+0x15a3e) 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: #4 0x7efd9e2ff510 in ErasureCodePluginIsa::factory(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::shared_ptr<ceph::ErasureCodeInterface>*, std::ostream*) (/usr/lib64/ceph/erasure-code/libec_isa.so+0x55510) 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: #5 0x55a7ef19b395 in ceph::ErasureCodePluginRegistry::factory(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::shared_ptr<ceph::ErasureCodeInterface>*, std::ostream*) (/usr/bin/ceph-osd+0x4dcb395) 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#6 0x55a7eb9089d8 in PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, ceph::common::CephContext*) (/usr/bin/ceph-osd+0x15389d8) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#7 0x55a7eb4b63ec in PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, spg_t) (/usr/bin/ceph-osd+0x10e63ec) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#8 0x55a7eaef1cc7 in OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t) (/usr/bin/ceph-osd+0xb21cc7) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#9 0x55a7eaef8d69 in OSD::handle_pg_create_info(std::shared_ptr<OSDMap const> const&, PGCreateInfo const*) (/usr/bin/ceph-osd+0xb28d69) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#10 0x55a7eafeb9ed in OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (/usr/bin/ceph-osd+0xc1b9ed) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#11 0x55a7ed04c9aa in ShardedThreadPool::shardedthreadpool_worker(unsigned int) (/usr/bin/ceph-osd+0x2c7c9aa) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#12 0x55a7ed04d338 in ShardedThreadPool::WorkThreadSharded::entry() (/usr/bin/ceph-osd+0x2c7d338) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#13 0x55a7ecf92b46 in Thread::entry_wrapper() (/usr/bin/ceph-osd+0x2bc2b46) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#14 0x55a7ecf92b82 in Thread::_entry_func(void*) (/usr/bin/ceph-osd+0x2bc2b82) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#15 0x7efdc1e8a3b1 in start_thread (/lib64/libc.so.6+0x8a3b1) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#16 0x7efdc1f0f42f in __GI___clone3 (/lib64/libc.so.6+0x10f42f) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr:0x611000570d40 is located 0 bytes inside of 256-byte region [0x611000570d40,0x611000570e40) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr:allocated by thread T817 here: 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: #0 0x7efdc30b64d7 in operator new[](unsigned long) (/lib64/libasan.so.6+0xb64d7) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: #1 0x7efd9e2f0e32 in ErasureCodeIsaDefault::prepare() (/usr/lib64/ceph/erasure-code/libec_isa.so+0x46e32) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: #2 0x7efd9e2bfa3e in ErasureCodeIsa::init(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::ostream*) (/usr/lib64/ceph/erasure-code/libec_isa.so+0x15a3e) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: #3 0x7efd9e2ff510 in ErasureCodePluginIsa::factory(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::shared_ptr<ceph::ErasureCodeInterface>*, std::ostream*) (/usr/lib64/ceph/erasure-code/libec_isa.so+0x55510) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: #4 0x55a7ef19b395 in ceph::ErasureCodePluginRegistry::factory(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::shared_ptr<ceph::ErasureCodeInterface>*, std::ostream*) (/usr/bin/ceph-osd+0x4dcb395) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: #5 0x55a7eb9089d8 in PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, ceph::common::CephContext*) (/usr/bin/ceph-osd+0x15389d8) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#6 0x55a7eb4b63ec in PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, spg_t) (/usr/bin/ceph-osd+0x10e63ec) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#7 0x55a7eaef1cc7 in OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t) (/usr/bin/ceph-osd+0xb21cc7) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#8 0x55a7eaef8d69 in OSD::handle_pg_create_info(std::shared_ptr<OSDMap const> const&, PGCreateInfo const*) (/usr/bin/ceph-osd+0xb28d69) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#9 0x55a7eafeb9ed in OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (/usr/bin/ceph-osd+0xc1b9ed) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#10 0x55a7ed04c9aa in ShardedThreadPool::shardedthreadpool_worker(unsigned int) (/usr/bin/ceph-osd+0x2c7c9aa) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#11 0x55a7ed04d338 in ShardedThreadPool::WorkThreadSharded::entry() (/usr/bin/ceph-osd+0x2c7d338) 2025-03-22T04:27:58.445 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#12 0x55a7ecf92b46 in Thread::entry_wrapper() (/usr/bin/ceph-osd+0x2bc2b46) 2025-03-22T04:27:58.445 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#13 0x55a7ecf92b82 in Thread::_entry_func(void*) (/usr/bin/ceph-osd+0x2bc2b82) 2025-03-22T04:27:58.445 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#14 0x7efdc1e8a3b1 in start_thread (/lib64/libc.so.6+0x8a3b1) 2025-03-22T04:27:58.445 INFO:tasks.ceph.osd.3.smithi102.stderr: ``` Fixes: https://tracker.ceph.com/issues/70619 Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Previously, in __ceph_abort and related abort handlers, we allocated
ClibBackTrace instances using raw pointers without proper cleanup. Since
these handlers terminate execution, the leaks didn't affect production
systems but were correctly flagged by ASan during testing:
```
Direct leak of 288 byte(s) in 1 object(s) allocated from:
#0 0x55aefe8cb65d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_ceph_assert+0x1f465d) (BuildId: a4faeddac80b0d81062bd53ede3388c0c10680bc)
#1 0x7f3b84da988d in ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/assert.cc:157:21
#2 0x55aefe8cf04b in supressed_assertf_line22() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/ceph_assert.cc:22:3
#3 0x55aefe8ce4e6 in CephAssertDeathTest_ceph_assert_supresssions_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/ceph_assert.cc:31:3
#4 0x55aefe99135d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2653:10
#5 0x55aefe94f015 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2689:14
...
```
This commit resolves the issue by using std::unique_ptr to manage the
lifecycle of backtrace objects, ensuring proper cleanup even in
non-returning functions. While these leaks had no practical impact in
production (as the process terminates anyway), fixing them improves code
quality and eliminates false positives in memory analysis tools.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
…ms in destructor
Fix memory leak in test_not_before_queue identified by AddressSanitizer.
Previously, the test was terminating without properly dequeueing all elements,
causing resource leaks during teardown.
This change implements a proper cleanup mechanism that:
1. Advances the queue's time until all elements become eligible for processing
2. Dequeues all remaining elements to ensure proper destruction
3. Guarantees clean teardown even for elements with future timestamps
The fix eliminates ASan-reported leaks occurring in the not_before_queue_t::enqueue
method when test cases were torn down prematurely.
A sample of the error from ASan:
```
Direct leak of 1800 byte(s) in 15 object(s) allocated from:
#0 0x7f71b1f1ab5b in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:86
#1 0x55fb4c977058 in void not_before_queue_t<tv_t, test_time_t>::enqueue<tv_t const&>(tv_t const&) /home/kefu/dev/ceph/src/common/not_before_queue.h:164
#2 0x55fb4c97748b in NotBeforeTest::load_test_data(std::vector<tv_t, std::allocator<tv_t> > const&) /home/kefu/dev/ceph/src/test/test_not_before_queue.cc:67
#3 0x55fb4c961112 in NotBeforeTest_RemoveIfByClass_no_cond_Test::TestBody() /home/kefu/dev/ceph/src/test/test_not_before_queue.cc:213
#4 0x55fb4ca05f67 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
#5 0x55fb4ca1c4f7 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
ceph#6 0x55fb4c9e1104 in testing::Test::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2728
ceph#7 0x55fb4c9e16e2 in testing::TestInfo::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2874
ceph#8 0x55fb4c9e73b4 in testing::TestSuite::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:3052
ceph#9 0x55fb4c9f059b in testing::internal::UnitTestImpl::RunAllTests() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:6004
ceph#10 0x55fb4ca064ff in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
ceph#11 0x55fb4ca1d1bf in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
ceph#12 0x55fb4c9e124d in testing::UnitTest::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:5583
ceph#13 0x55fb4c97a0b6 in RUN_ALL_TESTS() /home/kefu/dev/ceph/src/googletest/googletest/include/gtest/gtest.h:2334
ceph#14 0x55fb4c979ffc in main /home/kefu/dev/ceph/src/googletest/googlemock/src/gmock_main.cc:71
ceph#15 0x7f71ae833ca7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
SUMMARY: AddressSanitizer: 1800 byte(s) leaked in 15 allocation(s).
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Previously, we had memory leak in the test_bluestore_types.cc tests where
`BufferCacheShard` and `OnodeCacheShard` objects were allocated with
raw pointers but never freed, causing leaks detected by AddressSanitizer.
ASan rightly pointed this out:
```
Direct leak of 224 byte(s) in 1 object(s) allocated from:
#0 0x55a7432a079d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_bluestore_types+0xf2e79d) (BuildId: c3bec647afa97df6bb147bc82eac937531fc6272)
#1 0x55a743523340 in BlueStore::BufferCacheShard::create(BlueStore*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, ceph::common::PerfCounters*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/os/bluestore/Bl
ueStore.cc:1678:9
#2 0x55a74330b617 in ExtentMap_seek_lextent_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/test_bluestore_types.cc:1077:7
#3 0x55a7434f2b2d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.
cc:2653:10
#4 0x55a7434b5775 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:
2689:14
#5 0x55a74347005d in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2728:5
```
```
Direct leak of 9928 byte(s) in 1 object(s) allocated from:
#0 0x7ff249d21a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
#1 0x6048ed878b76 in BlueStore::OnodeCacheShard::create(ceph::common::CephContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::common::PerfCounters*) /home/kefu/dev/ceph/src/os/bluestore/BlueStore.cc:1219
#2 0x6048ed66d4f9 in GarbageCollector_BasicTest_Test::TestBody() /home/kefu/dev/ceph/src/test/objectstore/test_bluestore_types.cc:2662
#3 0x6048ed820555 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
#4 0x6048ed80c78a in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
#5 0x6048ed7b8bfa in testing::Test::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2728
```
In this change, we replace raw pointer allocation with unique_ptr to
ensure automatic cleanup when the objects go out of scope.
`
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Memory leak was detected by AddressSanitizer in dbstore tests. Objects
inserted into a static objectmap were not being properly freed when
tests completed without explicitly deleting buckets.
The leak occurred because:
- Objects were preserved in a static map after insertion
- Objects were only freed by DB::objectmapDelete() when deleting the
corresponding bucket
- In tests, buckets weren't always deleted after insertion, leaving
objects allocated
ASan rightly pointed this out:
```
Direct leak of 200 byte(s) in 1 object(s) allocated from:
#0 0x55c420f5274d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_dbstore_tests+0x4df74d) (BuildId: c19ed2d2b1ead306fdc59fc311f546a287300010)
#1 0x55c42143cfaf in SQLGetBucket::Execute(DoutPrefixProvider const*, rgw::store::DBOpParams*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/sqlite/sqliteDB.cc:1689:11
#2 0x55c42102751c in rgw::store::DB::ProcessOp(DoutPrefixProvider const*, std::basic_string_view<char, std::char_traits<char>>, rgw::store::DBOpParams*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/common/dbstore.cc:251:16
#3 0x55c4210328c9 in rgw::store::DB::get_bucket_info(DoutPrefixProvider const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, RGWBucketInfo&, std::map<std::__cxx11
::basic_string<char, std::char_traits<char>, std::allocator<char>>, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::c
har_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, ceph::buffer::v15_2_0::list>>>*, std::chrono::time_point<ceph::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l>>>*, obj_version*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/common/dbstore.cc:450:9
#4 0x55c421034357 in rgw::store::DB::create_bucket(DoutPrefixProvider const*, std::variant<rgw_user, rgw_account_id> const&, rgw_bucket const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, rgw_placement_rule const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, ceph::buffer::v15_2_0::list>>> const&, std::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>> const&, std::optional<RGWQuotaInfo> const&, std::optional<std::chrono::time_point<ceph::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l>>>>, obj_version*, RGWBucketInfo&, optional_yield) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/common/dbstore.cc:504:9
#5 0x55c420f6c3ed in DBStoreTest_CreateBucket_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/tests/dbstore_tests.cc:495:13
ceph#6 0x55c42174fa0d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2653:10
ceph#7 0x55c421717f25 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2689:14
ceph#8 0x55c4216d4bbd in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2728:5
```
In this change, we:
- Free objectmap entries in DB::Destroy() to ensure cleanup on shutdown
- Call DB::Destroy() in DBStoreTest::TearDown() to guarantee cleanup after each test
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Previously, we had Resource leak in Directory::for_each() where
directory streams opened with fdopendir() were never properly closed,
causing a 32KB leak per unclosed directory handle.
In this change, we add closedir() call to properly release directory stream
resources after enumeration completes.
ASan report:
```
Direct leak of 32816 byte(s) in 1 object(s) allocated from:
#0 0x738387320e15 in malloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:67
#1 0x738381383514 (/usr/lib/libc.so.6+0xe3514) (BuildId: 468e3585c794491a48ea75fceb9e4d6b1464fc35)
#2 0x738381383418 in fdopendir (/usr/lib/libc.so.6+0xe3418) (BuildId: 468e3585c794491a48ea75fceb9e4d6b1464fc35)
#3 0x6433e1aefac4 in for_each<rgw::sal::Directory::fill_cache(const DoutPrefixProvider*, optional_yield, rgw::sal::fill_cache_cb_t&)::<lambda(char const*)> > /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:874
#4 0x6433e1aa10ab in rgw::sal::Directory::fill_cache(DoutPrefixProvider const*, optional_yield, fu2::abi_310::detail::function<fu2::abi_310::detail::config<true, false, 16ul>, fu2::abi_310::detail::property<true, false, int (DoutPrefixProvider const*, rgw_bucket_dir_ent
ry&) const> > const&) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:1077
#5 0x6433e1aa8974 in rgw::sal::MPDirectory::fill_cache(DoutPrefixProvider const*, optional_yield, fu2::abi_310::detail::function<fu2::abi_310::detail::config<true, false, 16ul>, fu2::abi_310::detail::property<true, false, int (DoutPrefixProvider const*, rgw_bucket_dir_e
ntry&) const> > const&) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:1384
ceph#6 0x6433e1abf1ee in rgw::sal::POSIXBucket::fill_cache(DoutPrefixProvider const*, optional_yield, fu2::abi_310::detail::function<fu2::abi_310::detail::config<true, false, 16ul>, fu2::abi_310::detail::property<true, false, int (DoutPrefixProvider const*, rgw_bucket_dir_e
ntry&) const> > const&) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:2236
ceph#7 0x6433e1c43956 in file::listing::BucketCache<rgw::sal::POSIXDriver, rgw::sal::POSIXBucket>::fill(DoutPrefixProvider const*, file::listing::BucketCacheEntry<rgw::sal::POSIXDriver, rgw::sal::POSIXBucket>*, rgw::sal::POSIXBucket*, unsigned int, optional_yield) /home/kef
u/dev/ceph/src/rgw/driver/posix/bucket_cache.h:368
ceph#8 0x6433e1c1c71c in file::listing::BucketCache<rgw::sal::POSIXDriver, rgw::sal::POSIXBucket>::list_bucket(DoutPrefixProvider const*, optional_yield, rgw::sal::POSIXBucket*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, fu2::abi_310::d
etail::function<fu2::abi_310::detail::config<true, false, 16ul>, fu2::abi_310::detail::property<true, false, bool (rgw_bucket_dir_entry const&) const> >) /home/kefu/dev/ceph/src/rgw/driver/posix/bucket_cache.h:410
ceph#9 0x6433e1ac1829 in rgw::sal::POSIXBucket::list(DoutPrefixProvider const*, rgw::sal::Bucket::ListParams&, int, rgw::sal::Bucket::ListResults&, optional_yield) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:2256
ceph#10 0x6433e1ae1302 in rgw::sal::POSIXMultipartUpload::list_parts(DoutPrefixProvider const*, ceph::common::CephContext*, int, int, int*, bool*, optional_yield, bool) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:3692
ceph#11 0x6433e1ae2f3b in rgw::sal::POSIXMultipartUpload::complete(DoutPrefixProvider const*, optional_yield, ceph::common::CephContext*, std::map<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<int>, std::allocator<std::pair<
int const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::__cxx11::list<rgw_obj_index_key, std::allocator<rgw_obj_index_key> >&, unsigned long&, bool&, RGWCompressionInfo&, long&, std::__cxx11::basic_string<char, std::char_trait
s<char>, std::allocator<char> >&, ACLOwner&, unsigned long, rgw::sal::Object*, boost::container::flat_map<unsigned int, boost::container::flat_set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std
::char_traits<char>, std::allocator<char> > >, void>, std::less<unsigned int>, void>&) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:3770
ceph#12 0x6433e0fffb73 in POSIXMPObjectTest::create_MPObj(rgw::sal::MultipartUpload*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) /home/kefu/dev/ceph/src/test/rgw/test_rgw_posix_driver.cc:1717
ceph#13 0x6433e0e95ab9 in POSIXMPObjectTest_BucketList_Test::TestBody() /home/kefu/dev/ceph/src/test/rgw/test_rgw_posix_driver.cc:1863
ceph#14 0x6433e1308d17 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
```
Fixes https://tracker.ceph.com/issues/71505
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Previously, error messages passed to luaL_error() were formatted using
std::string concatenation. Since luaL_error() never returns (it throws
a Lua exception via longjmp), the allocated std::string memory was
leaked, as detected by AddressSanitizer:
```
Direct leak of 105 byte(s) in 1 object(s) allocated from:
#0 0x7fc5f1921a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
#1 0x563bd89144c7 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/include/c++/15.1.1/bits/new_allocator.h:151
#2 0x563bd89144c7 in std::allocator<char>::allocate(unsigned long) /usr/include/c++/15.1.1/bits/allocator.h:203
#3 0x563bd89144c7 in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/alloc_traits.h:614
#4 0x563bd89144c7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.h:142
#5 0x563bd89144c7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:164
ceph#6 0x563bd896ae1b in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:363
ceph#7 0x563bd896b256 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:455
ceph#8 0x563bd896b2bb in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(char const*) /usr/include/c++/15.1.1/bits/basic_string.h:1585
ceph#9 0x563bd943c2f2 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, char const*) /usr/include/c++/15.1.1/bits/basic_string.h:3977
ceph#10 0x563bd943c2f2 in rgw::lua::lua_state_guard::runtime_hook(lua_State*, lua_Debug*) /home/kefu/dev/ceph/src/rgw/rgw_lua_utils.cc:245
ceph#11 0x7fc5f139f8ef (/usr/lib/liblua.so.5.4+0xe8ef) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
ceph#12 0x7fc5f139fbfe (/usr/lib/liblua.so.5.4+0xebfe) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
ceph#13 0x7fc5f13b26fe (/usr/lib/liblua.so.5.4+0x216fe) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
ceph#14 0x7fc5f139f581 (/usr/lib/liblua.so.5.4+0xe581) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
ceph#15 0x7fc5f139b735 (/usr/lib/liblua.so.5.4+0xa735) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
ceph#16 0x7fc5f139ba8f (/usr/lib/liblua.so.5.4+0xaa8f) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
ceph#17 0x7fc5f139f696 in lua_pcallk (/usr/lib/liblua.so.5.4+0xe696) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
ceph#18 0x563bd8a925ef in rgw::lua::request::execute(rgw::sal::Driver*, RGWREST*, OpsLogSink*, req_state*, RGWOp*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/kefu/dev/ceph/src/rgw/rgw_lua_request.cc:824
ceph#19 0x563bd8952e3d in TestRGWLua_LuaRuntimeLimit_Test::TestBody() /home/kefu/dev/ceph/src/test/rgw/test_rgw_lua.cc:1628
ceph#20 0x563bd8a37817 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
ceph#21 0x563bd8a509b5 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (/home/kefu/dev/ceph/build/bin/unittest_rgw_lua+0x11199b5) (BuildId: b2628caba5290d882d25f7bea166f058b682bc85)`
```
This change replaces std::string formatting with stack-allocated buffer
and std::to_chars() to eliminate the memory leak.
Note: We cannot format int64_t directly through luaL_error() because
lua_pushfstring() does not support long long or int64_t format specifiers,
even in Lua 5.4 (see https://www.lua.org/manual/5.4/manual.html#lua_pushfstring).
Since libstdc++ uses int64_t for std::chrono::milliseconds::rep, we use
std::to_chars() for safe, efficient conversion without heap allocation.
The maximum runtime limit was a configuration introduced by 3e3cb15.
Fixes: https://tracker.ceph.com/issues/71595
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
…write
Fix 4KB memory leak in ErasureCodePlugin_parity_delta_write_Test caused by
unmanaged raw buffer allocation. The test was allocating a 4096-byte raw
buffer to replace shard 4 for delta encoding validation, but the buffer::ptr
constructed from the raw pointer did not manage the buffer's lifecycle.
Detected by AddressSanitizer:
```
Direct leak of 4096 byte(s) in 1 object(s) allocated from:
#0 0x7fb73a720e15 in malloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:67
#1 0x5562f4062ccc in ErasureCodePlugin_parity_delta_write_Test::TestBody() /home/kefu/dev/ceph/src/test/erasure-code/TestErasureCodePluginJerasure.cc:122
#2 0x5562f41081a1 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
#3 0x5562f40f3004 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
#4 0x5562f409cbba in testing::Test::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2728```
```
In this change, we replace raw pointer allocation with
create_bufferptr() to ensure proper memory management by buffer::ptr.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Previously, SyncPoint allocated two C_Gather instances tracked by raw
pointers but failed to properly clean them up when only a single sync
point existed, causing memory leaks detected by AddressSanitizer.
This change fixes the leak by modifying AbstractWriteLog::shut_down()
to check for prior sync points in the chain. When the current sync point
is the only one present, we now activate the m_prior_log_entries_persisted
context to ensure:
- The onfinish callback executes and releases the captured strong
reference to the enclosing SyncPoint
- The parent m_sync_point_persist context completes and gets properly
released
This ensures all allocated contexts are cleaned up correctly during
shutdown, eliminating the memory leak.
The ASan report:
```
Indirect leak of 2064 byte(s) in 1 object(s) allocated from:
#0 0x56440919ae2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librbd+0x2f3de2d) (BuildId: 6a04677c6ee5235f1a41815df807f97c5b96d4cd)
#1 0x56440bd67751 in __gnu_cxx::new_allocator<Context*>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
#2 0x56440bd676e0 in std::allocator<Context*>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
#3 0x56440bd676e0 in std::allocator_traits<std::allocator<Context*>>::allocate(std::allocator<Context*>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
#4 0x56440bd6730b in std::_Vector_base<Context*, std::allocator<Context*>>::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
#5 0x7fd33e00e8d1 in std::vector<Context*, std::allocator<Context*>>::reserve(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
ceph#6 0x7fd33e00c51c in librbd::cache::pwl::SyncPoint::SyncPoint(unsigned long, ceph::common::CephContext*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/SyncPoint.cc:20:27
ceph#7 0x56440bd65f26 in decltype(::new((void*)(0)) librbd::cache::pwl::SyncPoint(std::declval<unsigned long&>(), std::declval<ceph::common::CephContext*&>())) std::construct_at<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
ceph#8 0x56440bd65b98 in void std::allocator_traits<std::allocator<librbd::cache::pwl::SyncPoint>>::construct<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>&, librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
ceph#9 0x56440bd657d3 in std::_Sp_counted_ptr_inplace<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:519:4
ceph#10 0x56440bd65371 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*&, std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:651:6
ceph#11 0x56440bd65163 in std::__shared_ptr<librbd::cache::pwl::SyncPoint, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
ceph#12 0x56440bd650e6 in std::shared_ptr<librbd::cache::pwl::SyncPoint>::shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
ceph#13 0x56440bd65057 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::allocate_shared<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint> const&, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
ceph#14 0x56440bca97e7 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::make_shared<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
ceph#15 0x56440bd443c8 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::new_sync_point(librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1905:20
ceph#16 0x56440bd42e4c in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1951:3
ceph#17 0x56440bd9cbf2 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point_if_needed(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1990:5
ceph#18 0x56440bd9c636 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&)::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2152:9
ceph#19 0x56440bd9b9b4 in boost::detail::function::void_function_obj_invoker<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&), void, librbd::cache::pwl::GuardedRequestFunctionContext&>::invoke(boost::detail::function::function_buffer&, librbd::cache::pwl::GuardedRequestFunctionContext&) /opt/ceph/include/boost/function/function_template.hpp:100:11
ceph#20 0x56440bd29321 in boost::function_n<void, librbd::cache::pwl::GuardedRequestFunctionContext&>::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /opt/ceph/include/boost/function/function_template.hpp:789:14
ceph#21 0x56440bd28d85 in librbd::cache::pwl::GuardedRequestFunctionContext::finish(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/Request.h:335:5
ceph#22 0x5644091e0fe0 in Context::complete(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/include/Context.h:102:5
ceph#23 0x56440bd9b378 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::detain_guarded_request(librbd::cache::pwl::C_BlockIORequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::GuardedRequestFunctionContext*, bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1202:20
ceph#24 0x56440bd96c50 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2154:3
ceph#25 0x56440bd1e4b5 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::shut_down(Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:703:3
ceph#26 0x56440bdb9022 in librbd::cache::pwl::TestMockCacheSSDWriteLog_compare_and_write_compare_matched_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/cache/pwl/test_mock_SSDWriteLog.cc:403:7
```
Fixes: https://tracker.ceph.com/issues/71335
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Removing vxattr 'ceph.dir.subvolume' on a directory without it being set causes the mds to crash. This is because the snaprealm would be null for the directory and the null check is missing. Setting the vxattr, creates the snaprealm for the directory as part of it. Hence, mds doesn't crash when the vxattr is set and then removed. This patch fixes the same. Reproducer: $mkdir /mnt/dir1 $setfattr -x "ceph.dir.subvolume" /mnt/dir1 Traceback: ------- Core was generated by `./ceph/build/bin/ceph-mds -i a -c ./ceph/build/ceph.conf'. Program terminated with signal SIGSEGV, Segmentation fault. (gdb) bt #0 0x00007f33f1aa8034 in __pthread_kill_implementation () from /lib64/libc.so.6 #1 0x00007f33f1a4ef1e in raise () from /lib64/libc.so.6 #2 0x0000562b148a6fd0 in reraise_fatal (signum=signum@entry=11) at /ceph/src/global/signal_handler.cc:88 #3 0x0000562b148a83d9 in handle_oneshot_fatal_signal (signum=11) at /ceph/src/global/signal_handler.cc:367 #4 <signal handler called> #5 Server::handle_client_setvxattr (this=0x562b4ee3f800, mdr=..., cur=0x562b4ef9cc00) at /ceph/src/mds/Server.cc:6406 ceph#6 0x0000562b145fadc2 in Server::handle_client_removexattr (this=0x562b4ee3f800, mdr=...) at /ceph/src/mds/Server.cc:7022 ceph#7 0x0000562b145fbff0 in Server::dispatch_client_request (this=0x562b4ee3f800, mdr=...) at /ceph/src/mds/Server.cc:2825 ceph#8 0x0000562b145fcfa2 in Server::handle_client_request (this=0x562b4ee3f800, req=...) at /ceph/src/mds/Server.cc:2676 ceph#9 0x0000562b1460063c in Server::dispatch (this=0x562b4ee3f800, m=...) at /ceph/src/mds/Server.cc:382 ceph#10 0x0000562b1450eb22 in MDSRank::handle_message (this=this@entry=0x562b4ef42008, m=...) at /ceph/src/mds/MDSRank.cc:1222 ceph#11 0x0000562b14510c93 in MDSRank::_dispatch (this=this@entry=0x562b4ef42008, m=..., new_msg=new_msg@entry=true) at /ceph/src/mds/MDSRank.cc:1045 ceph#12 0x0000562b14511620 in MDSRankDispatcher::ms_dispatch (this=this@entry=0x562b4ef42000, m=...) at /ceph/src/mds/MDSRank.cc:1019 ceph#13 0x0000562b144ff117 in MDSDaemon::ms_dispatch2 (this=0x562b4ee64000, m=...) at /ceph/src/common/RefCountedObj.h:56 ceph#14 0x00007f33f2f4974a in Messenger::ms_deliver_dispatch (this=0x562b4ee70000, m=...) at /ceph/src/msg/Messenger.h:746 ceph#15 0x00007f33f2f467e2 in DispatchQueue::entry (this=0x562b4ee703b8) at /ceph/src/msg/DispatchQueue.cc:202 ceph#16 0x00007f33f2ff61cb in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /ceph/src/msg/DispatchQueue.h:101 ceph#17 0x00007f33f2df4b5d in Thread::entry_wrapper (this=0x562b4ee70518) at /ceph/src/common/Thread.cc:87 ceph#18 0x00007f33f2df4b6f in Thread::_entry_func (arg=<optimized out>) at /ceph/src/common/Thread.cc:74 ceph#19 0x00007f33f1aa6088 in start_thread () from /lib64/libc.so.6 ceph#20 0x00007f33f1b29f8c in clone3 () from /lib64/libc.so.6 --------- Fixes: https://tracker.ceph.com/issues/70794 Signed-off-by: Kotresh HR <khiremat@redhat.com>
Fix a memory leak where HybridAllocator's embedded BitmapAllocator
(bmap_alloc) was not properly cleaned up during destruction. The issue
occurred because virtual function calls in destructors don't dispatch
to derived class implementations - when AvlAllocator::~AvlAllocator()
calls shutdown(), it invokes AvlAllocator::shutdown() rather than
HybridAllocatorBase::shutdown(), leaving bmap_alloc unfreed.
This issue only affects unit tests and does not impact production,
as BlueFS always calls shutdown() explicitly when stopping allocators
in BlueFS::_stop_alloc()
This change has minimal impact on existing code that already calls
shutdown() explicitly, since shutdown() has idempotent behavior and
can be safely called multiple times. Additionally, destructors are
only invoked during system teardown, so the potential double shutdown()
call does not affect runtime performance.
Changes:
- Replace raw pointer bmap_alloc with unique_ptr for automatic cleanup
- Add explicit shutdown() call in BitmapAllocator destructor for RAII
- Replace manual bmap_alloc->shutdown() with bmap_alloc.reset()
- Add explicit default destructor to HybridAllocatorBase for clarity
This eliminates the need for manual shutdown() calls and ensures
proper resource cleanup through RAII principles.
Fixes ASan-reported indirect leak of BitmapAllocator resources.
Please see the leak reported by ASan for more details:
```
Indirect leak of 23 byte(s) in 1 object(s) allocated from:
#0 0x7fe30b121a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
#1 0x5614e057ee11 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/include/c++/15.1.1/bits/new_allocator.h:151
#2 0x5614e057d279 in std::allocator<char>::allocate(unsigned long) /usr/include/c++/15.1.1/bits/allocator.h:203
#3 0x5614e057d279 in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/alloc_traits.h:614
#4 0x5614e057d279 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.h:142
#5 0x5614e057cf27 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:164
ceph#6 0x5614e05fe91b in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<true>(char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:291
ceph#7 0x5614e05f3cd4 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /usr/include/c++/15.1.
1/bits/basic_string.h:617
ceph#8 0x5614e069b892 in ceph::mutex_debug_detail::mutex_debug_impl<false>::mutex_debug_impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) /home/kefu/dev/ceph/src/common/mutex_debug.
h:103
ceph#9 0x5614e06b8586 in ceph::mutex_debug_detail::mutex_debug_impl<false> ceph::make_mutex<char const (&) [23]>(char const (&) [23]) /home/kefu/dev/ceph/src/common/ceph_mutex.h:118
ceph#10 0x5614e06b8350 in AllocatorLevel02<AllocatorLevel01Loose>::AllocatorLevel02() /home/kefu/dev/ceph/src/os/bluestore/fastbmap_allocator_impl.h:526
ceph#11 0x5614e06b393b in BitmapAllocator::BitmapAllocator(ceph::common::CephContext*, long, long, std::basic_string_view<char, std::char_traits<char> >) /home/kefu/dev/ceph/src/os/bluestore/BitmapAllocator.cc:16
ceph#12 0x5614e072755f in HybridAllocatorBase<AvlAllocator>::_spillover_range(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/HybridAllocator_impl.h:123
ceph#13 0x5614e06c9045 in AvlAllocator::_try_insert_range(unsigned long, unsigned long, boost::intrusive::tree_iterator<boost::intrusive::mhtraits<range_seg_t, boost::intrusive::avl_set_member_hook<>, &range_seg_t::offset_hook>,
false>*) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.h:210
ceph#14 0x5614e06c0104 in AvlAllocator::_add_to_tree(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.cc:136
ceph#15 0x5614e0586e5f in HybridAllocatorBase<AvlAllocator>::_add_to_tree(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/HybridAllocator.h:110
ceph#16 0x5614e06c6559 in AvlAllocator::init_add_free(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.cc:494
ceph#17 0x5614e0582943 in HybridAllocator_fragmentation_Test::TestBody() /home/kefu/dev/ceph/src/test/objectstore/hybrid_allocator_test.cc:285
ceph#18 0x5614e061bee3 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
ceph#19 0x5614e0606562 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689`
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Previously, SyncPoint allocated two C_Gather instances tracked by raw
pointers but failed to properly clean them up when only a single sync
point existed, causing memory leaks detected by AddressSanitizer.
This change fixes the leak by modifying AbstractWriteLog::shut_down()
to check for prior sync points in the chain. When the current sync point
is the only one present, we now activate the m_prior_log_entries_persisted
context to ensure:
- The onfinish callback executes and releases the captured strong
reference to the enclosing SyncPoint
- The parent m_sync_point_persist context completes and gets properly
released
This ensures all allocated contexts are cleaned up correctly during
shutdown, eliminating the memory leak.
The ASan report:
```
Indirect leak of 2064 byte(s) in 1 object(s) allocated from:
#0 0x56440919ae2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librbd+0x2f3de2d) (BuildId: 6a04677c6ee5235f1a41815df807f97c5b96d4cd)
#1 0x56440bd67751 in __gnu_cxx::new_allocator<Context*>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
#2 0x56440bd676e0 in std::allocator<Context*>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
#3 0x56440bd676e0 in std::allocator_traits<std::allocator<Context*>>::allocate(std::allocator<Context*>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
#4 0x56440bd6730b in std::_Vector_base<Context*, std::allocator<Context*>>::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
#5 0x7fd33e00e8d1 in std::vector<Context*, std::allocator<Context*>>::reserve(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
ceph#6 0x7fd33e00c51c in librbd::cache::pwl::SyncPoint::SyncPoint(unsigned long, ceph::common::CephContext*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/SyncPoint.cc:20:27
ceph#7 0x56440bd65f26 in decltype(::new((void*)(0)) librbd::cache::pwl::SyncPoint(std::declval<unsigned long&>(), std::declval<ceph::common::CephContext*&>())) std::construct_at<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
ceph#8 0x56440bd65b98 in void std::allocator_traits<std::allocator<librbd::cache::pwl::SyncPoint>>::construct<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>&, librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
ceph#9 0x56440bd657d3 in std::_Sp_counted_ptr_inplace<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:519:4
ceph#10 0x56440bd65371 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*&, std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:651:6
ceph#11 0x56440bd65163 in std::__shared_ptr<librbd::cache::pwl::SyncPoint, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
ceph#12 0x56440bd650e6 in std::shared_ptr<librbd::cache::pwl::SyncPoint>::shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
ceph#13 0x56440bd65057 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::allocate_shared<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint> const&, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
ceph#14 0x56440bca97e7 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::make_shared<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
ceph#15 0x56440bd443c8 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::new_sync_point(librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1905:20
ceph#16 0x56440bd42e4c in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1951:3
ceph#17 0x56440bd9cbf2 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point_if_needed(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1990:5
ceph#18 0x56440bd9c636 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&)::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2152:9
ceph#19 0x56440bd9b9b4 in boost::detail::function::void_function_obj_invoker<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&), void, librbd::cache::pwl::GuardedRequestFunctionContext&>::invoke(boost::detail::function::function_buffer&, librbd::cache::pwl::GuardedRequestFunctionContext&) /opt/ceph/include/boost/function/function_template.hpp:100:11
ceph#20 0x56440bd29321 in boost::function_n<void, librbd::cache::pwl::GuardedRequestFunctionContext&>::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /opt/ceph/include/boost/function/function_template.hpp:789:14
ceph#21 0x56440bd28d85 in librbd::cache::pwl::GuardedRequestFunctionContext::finish(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/Request.h:335:5
ceph#22 0x5644091e0fe0 in Context::complete(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/include/Context.h:102:5
ceph#23 0x56440bd9b378 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::detain_guarded_request(librbd::cache::pwl::C_BlockIORequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::GuardedRequestFunctionContext*, bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1202:20
ceph#24 0x56440bd96c50 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2154:3
ceph#25 0x56440bd1e4b5 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::shut_down(Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:703:3
ceph#26 0x56440bdb9022 in librbd::cache::pwl::TestMockCacheSSDWriteLog_compare_and_write_compare_matched_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/cache/pwl/test_mock_SSDWriteLog.cc:403:7
```
Fixes: https://tracker.ceph.com/issues/71335
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 05fd6f9)
Previously, run-cli-tests ignored all environment variables from the parent
process to ensure a clean test environment. However, this also dropped
sanitizer settings (ASAN_OPTIONS and LSAN_OPTIONS) needed when AddressSanitizer
is enabled.
This causes test failures with TCMalloc due to false-positive leak reports
from TCMalloc's internal objects, which is a known issue documented in
Google's C++ style guide. While recent gperftools releases have fixed this,
Ubuntu Jammy still ships with an older version that triggers these warnings.
This change preserves sanitizer environment variables while maintaining
the clean test environment for other variables.
Note: Once we upgrade to newer gperftools, we can remove the related
suppression rule in qa/lsan.supp.
The test failure with TCMalloc looks like:
```
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/ceph-kvstore-tool/help.t: failed
--- /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/ceph-kvstore-tool/help.t
+++ /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/ceph-kvstore-tool/help.t.err
@@ -21,3 +21,19 @@
stats
histogram [prefix]
+
+ =================================================================
+ ==87908==ERROR: LeakSanitizer: detected memory leaks
+
+ Direct leak of 45 byte(s) in 1 object(s) allocated from:
+ #0 0x562fd797265d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/ceph-kvstore-tool+0xe5e65d) (BuildId: 7eb56077b615aeb3c7aedafa0818ad89fdf3702d)
+ #1 0x562fd79815c8 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/new_allocator.h:137:27
+ #2 0x562fd7981520 in std::allocator<char>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/allocator.h:188:32
+ #3 0x562fd7981520 in std::allocator_traits<std::allocator<char>>::allocate(std::allocator<char>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/alloc_traits.h:464:20
+ #4 0x562fd798115a in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_create(unsigned long&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/basic_string.tcc:155:14
+ #5 0x562fd798787f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/basic_string.tcc:328:21
+ ceph#6 0x562fd79876a7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_append(char const*, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/basic_string.tcc:420:8
+ ceph#7 0x7fa1aa0286f0 in MallocExtension::Initialize() (/lib/x86_64-linux-gnu/libtcmalloc.so.4+0x2a6f0) (BuildId: eeef3d1257388a806e122398dbce3157ee568ef4)
+
+ SUMMARY: AddressSanitizer: 45 byte(s) leaked in 1 allocation(s).
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Previously, SyncPoint allocated two C_Gather instances tracked by raw
pointers but failed to properly clean them up when only a single sync
point existed, causing memory leaks detected by AddressSanitizer.
This change fixes the leak by modifying AbstractWriteLog::shut_down()
to check for prior sync points in the chain. When the current sync point
is the only one present, we now activate the m_prior_log_entries_persisted
context to ensure:
- The onfinish callback executes and releases the captured strong
reference to the enclosing SyncPoint
- The parent m_sync_point_persist context completes and gets properly
released
This ensures all allocated contexts are cleaned up correctly during
shutdown, eliminating the memory leak.
The ASan report:
```
Indirect leak of 2064 byte(s) in 1 object(s) allocated from:
#0 0x56440919ae2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librbd+0x2f3de2d) (BuildId: 6a04677c6ee5235f1a41815df807f97c5b96d4cd)
#1 0x56440bd67751 in __gnu_cxx::new_allocator<Context*>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
#2 0x56440bd676e0 in std::allocator<Context*>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
#3 0x56440bd676e0 in std::allocator_traits<std::allocator<Context*>>::allocate(std::allocator<Context*>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
#4 0x56440bd6730b in std::_Vector_base<Context*, std::allocator<Context*>>::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
#5 0x7fd33e00e8d1 in std::vector<Context*, std::allocator<Context*>>::reserve(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
ceph#6 0x7fd33e00c51c in librbd::cache::pwl::SyncPoint::SyncPoint(unsigned long, ceph::common::CephContext*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/SyncPoint.cc:20:27
ceph#7 0x56440bd65f26 in decltype(::new((void*)(0)) librbd::cache::pwl::SyncPoint(std::declval<unsigned long&>(), std::declval<ceph::common::CephContext*&>())) std::construct_at<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
ceph#8 0x56440bd65b98 in void std::allocator_traits<std::allocator<librbd::cache::pwl::SyncPoint>>::construct<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>&, librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
ceph#9 0x56440bd657d3 in std::_Sp_counted_ptr_inplace<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:519:4
ceph#10 0x56440bd65371 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*&, std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:651:6
ceph#11 0x56440bd65163 in std::__shared_ptr<librbd::cache::pwl::SyncPoint, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
ceph#12 0x56440bd650e6 in std::shared_ptr<librbd::cache::pwl::SyncPoint>::shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
ceph#13 0x56440bd65057 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::allocate_shared<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint> const&, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
ceph#14 0x56440bca97e7 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::make_shared<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
ceph#15 0x56440bd443c8 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::new_sync_point(librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1905:20
ceph#16 0x56440bd42e4c in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1951:3
ceph#17 0x56440bd9cbf2 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point_if_needed(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1990:5
ceph#18 0x56440bd9c636 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&)::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2152:9
ceph#19 0x56440bd9b9b4 in boost::detail::function::void_function_obj_invoker<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&), void, librbd::cache::pwl::GuardedRequestFunctionContext&>::invoke(boost::detail::function::function_buffer&, librbd::cache::pwl::GuardedRequestFunctionContext&) /opt/ceph/include/boost/function/function_template.hpp:100:11
ceph#20 0x56440bd29321 in boost::function_n<void, librbd::cache::pwl::GuardedRequestFunctionContext&>::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /opt/ceph/include/boost/function/function_template.hpp:789:14
ceph#21 0x56440bd28d85 in librbd::cache::pwl::GuardedRequestFunctionContext::finish(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/Request.h:335:5
ceph#22 0x5644091e0fe0 in Context::complete(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/include/Context.h:102:5
ceph#23 0x56440bd9b378 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::detain_guarded_request(librbd::cache::pwl::C_BlockIORequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::GuardedRequestFunctionContext*, bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1202:20
ceph#24 0x56440bd96c50 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2154:3
ceph#25 0x56440bd1e4b5 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::shut_down(Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:703:3
ceph#26 0x56440bdb9022 in librbd::cache::pwl::TestMockCacheSSDWriteLog_compare_and_write_compare_matched_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/cache/pwl/test_mock_SSDWriteLog.cc:403:7
```
Fixes: https://tracker.ceph.com/issues/71335
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 05fd6f9)
Replace unsafe string construction with bufferlist::length() to avoid reading beyond buffer boundaries. In commit 92ccbff, we introduced a bug when checking if a digest was empty by constructing a std::string from bufferlist: ``` std::string(digest.second.c_str()).empty() ``` This is unsafe because bufferlist data is not guaranteed to be null- terminated. The std::string constructor searches for a null terminator and may read beyond the bufferlist's allocated memory, causing a heap-buffer-overflow detected by AddressSanitizer: ``` ==66092==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7e0c65215004 at pc 0x7fbc6e27c597 bp 0x7ffe29fb6100 sp 0x7ffe29fb58b8 READ of size 5 at 0x7e0c65215004 thread T0 #0 0x7fbc6e27c596 in strlen /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:425 #1 0x562c75fad91a in std::char_traits<char>::length(char const*) /usr/include/c++/15.2.1/bits/char_traits.h:393 #2 0x562c75fb4222 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<std::allocator<char> >(char const*, std::allocator<char> const&) /usr/include/c++/15.2.1/bits/b asic_string.h:713 #3 0x562c761b81ae in operator() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:1300 #4 0x562c761d7d53 in operator()<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false> > /usr/include/c++/15.2.1/bits/predefined_ops.h:318 #5 0x562c761d789c in __find_if<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, __gnu_cxx::__ops::_Iter_pred<ScrubBackend::match_in_shards(const hobject_t&, auth_selection_ t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > > /usr/include/c++/15.2.1/bits/stl_algobase.h:2095 ceph#6 0x562c761d72b2 in find_if<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:3921 ceph#7 0x562c761d5f6f in none_of<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:431 ceph#8 0x562c761d4a50 in any_of<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, s td::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:450 ceph#9 0x562c761bb84b in ScrubBackend::match_in_shards(hobject_t const&, auth_selection_t&, inconsistent_obj_wrapper&, std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >&) /home/k efu/dev/ceph/src/osd/scrubber/scrub_backend.cc:1297 ceph#10 0x562c761b4282 in ScrubBackend::compare_obj_in_maps[abi:cxx11](hobject_t const&) /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:941 ceph#11 0x562c761d44af in operator()<hobject_t> /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:887 ceph#12 0x562c761d4836 in for_each<std::_Rb_tree_const_iterator<hobject_t>, ScrubBackend::compare_smaps()::<lambda(const auto:422&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:3798 ceph#13 0x562c761b3259 in ScrubBackend::compare_smaps() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:884 ceph#14 0x562c761a478d in ScrubBackend::update_authoritative() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:315` ``` Fix by using bufferlist::length() which tells if the given buffer is empty instead of converting the buffer content to a string. Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Previously, SyncPoint allocated two C_Gather instances tracked by raw
pointers but failed to properly clean them up when only a single sync
point existed, causing memory leaks detected by AddressSanitizer.
This change fixes the leak by modifying AbstractWriteLog::shut_down()
to check for prior sync points in the chain. When the current sync point
is the only one present, we now activate the m_prior_log_entries_persisted
context to ensure:
- The onfinish callback executes and releases the captured strong
reference to the enclosing SyncPoint
- The parent m_sync_point_persist context completes and gets properly
released
This ensures all allocated contexts are cleaned up correctly during
shutdown, eliminating the memory leak.
The ASan report:
```
Indirect leak of 2064 byte(s) in 1 object(s) allocated from:
#0 0x56440919ae2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librbd+0x2f3de2d) (BuildId: 6a04677c6ee5235f1a41815df807f97c5b96d4cd)
#1 0x56440bd67751 in __gnu_cxx::new_allocator<Context*>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
#2 0x56440bd676e0 in std::allocator<Context*>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
#3 0x56440bd676e0 in std::allocator_traits<std::allocator<Context*>>::allocate(std::allocator<Context*>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
#4 0x56440bd6730b in std::_Vector_base<Context*, std::allocator<Context*>>::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
#5 0x7fd33e00e8d1 in std::vector<Context*, std::allocator<Context*>>::reserve(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
ceph#6 0x7fd33e00c51c in librbd::cache::pwl::SyncPoint::SyncPoint(unsigned long, ceph::common::CephContext*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/SyncPoint.cc:20:27
ceph#7 0x56440bd65f26 in decltype(::new((void*)(0)) librbd::cache::pwl::SyncPoint(std::declval<unsigned long&>(), std::declval<ceph::common::CephContext*&>())) std::construct_at<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
ceph#8 0x56440bd65b98 in void std::allocator_traits<std::allocator<librbd::cache::pwl::SyncPoint>>::construct<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>&, librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
ceph#9 0x56440bd657d3 in std::_Sp_counted_ptr_inplace<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:519:4
ceph#10 0x56440bd65371 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*&, std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:651:6
ceph#11 0x56440bd65163 in std::__shared_ptr<librbd::cache::pwl::SyncPoint, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
ceph#12 0x56440bd650e6 in std::shared_ptr<librbd::cache::pwl::SyncPoint>::shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
ceph#13 0x56440bd65057 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::allocate_shared<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint> const&, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
ceph#14 0x56440bca97e7 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::make_shared<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
ceph#15 0x56440bd443c8 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::new_sync_point(librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1905:20
ceph#16 0x56440bd42e4c in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1951:3
ceph#17 0x56440bd9cbf2 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point_if_needed(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1990:5
ceph#18 0x56440bd9c636 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&)::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2152:9
ceph#19 0x56440bd9b9b4 in boost::detail::function::void_function_obj_invoker<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&), void, librbd::cache::pwl::GuardedRequestFunctionContext&>::invoke(boost::detail::function::function_buffer&, librbd::cache::pwl::GuardedRequestFunctionContext&) /opt/ceph/include/boost/function/function_template.hpp:100:11
ceph#20 0x56440bd29321 in boost::function_n<void, librbd::cache::pwl::GuardedRequestFunctionContext&>::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /opt/ceph/include/boost/function/function_template.hpp:789:14
ceph#21 0x56440bd28d85 in librbd::cache::pwl::GuardedRequestFunctionContext::finish(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/Request.h:335:5
ceph#22 0x5644091e0fe0 in Context::complete(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/include/Context.h:102:5
ceph#23 0x56440bd9b378 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::detain_guarded_request(librbd::cache::pwl::C_BlockIORequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::GuardedRequestFunctionContext*, bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1202:20
ceph#24 0x56440bd96c50 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2154:3
ceph#25 0x56440bd1e4b5 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::shut_down(Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:703:3
ceph#26 0x56440bdb9022 in librbd::cache::pwl::TestMockCacheSSDWriteLog_compare_and_write_compare_matched_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/cache/pwl/test_mock_SSDWriteLog.cc:403:7
```
Fixes: https://tracker.ceph.com/issues/71335
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 05fd6f9)
…ives
Add suppression rules for two categories of false positive warnings
encountered during ASan-enabled testing:
1. PyModule_ExecDef memory leaks: ASan incorrectly interprets Python's
module loading behavior as memory leaks when the interpreter loads
extension modules.
2. __cxa_throw interception failures: ASan's interceptor cannot properly
intercept exception handling when libstdc++.so is loaded after the
ASan shared library, causing CHECK failures.
3. ErasureCodePluginRegistry::load:
`ceph::ErasureCodePluginRegistry::load()` is known to leak, as we
don't free the memory allocated by the ec plugins which are
registered in the `ErasureCodePluginRegistry` singleton. this is a
known issue, but since the `ErasureCodePluginRegistry` instance is a
singleton. we can live with it. in this change, we add the rule to
suppress the leak report from LeakSanitizer. this rule also exist in
qa/valgrind.supp.
All warnings are confirmed false positives that should be suppressed
to reduce noise in test output.
Example warnings:
```
Direct leak of 3264 byte(s) in 1 object(s) allocated from:
#0 0x7f6027d20cb5 in malloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:67
#1 0x7f60277557ad (/usr/lib/libpython3.13.so.1.0+0x1557ad) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#2 0x7f6027756067 (/usr/lib/libpython3.13.so.1.0+0x156067) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#3 0x7f60278471a0 (/usr/lib/libpython3.13.so.1.0+0x2471a0) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#4 0x7f602774d031 (/usr/lib/libpython3.13.so.1.0+0x14d031) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
#5 0x7b60234093bb in __Pyx_modinit_type_init_code.constprop.0 /home/kefu/dev/ceph/build/src/pybind/rados/rados.c:82066
ceph#6 0x7b602340a826 in __pyx_pymod_exec_rados /home/kefu/dev/ceph/build/src/pybind/rados/rados.c:82755
ceph#7 0x7f6027856777 in PyModule_ExecDef (/usr/lib/libpython3.13.so.1.0+0x256777) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#8 0x7f602785baa3 (/usr/lib/libpython3.13.so.1.0+0x25baa3) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#9 0x7f6027793df2 (/usr/lib/libpython3.13.so.1.0+0x193df2) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#10 0x7f6027777cbe in _PyEval_EvalFrameDefault (/usr/lib/libpython3.13.so.1.0+0x177cbe) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#11 0x7f60277957de (/usr/lib/libpython3.13.so.1.0+0x1957de) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#12 0x7f60277d11b9 in PyObject_CallMethodObjArgs (/usr/lib/libpython3.13.so.1.0+0x1d11b9) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#13 0x7f60277d0ee4 in PyImport_ImportModuleLevelObject (/usr/lib/libpython3.13.so.1.0+0x1d0ee4) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#14 0x7f6027779c0c in _PyEval_EvalFrameDefault (/usr/lib/libpython3.13.so.1.0+0x179c0c) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#15 0x7f602784e2c8 in PyEval_EvalCode (/usr/lib/libpython3.13.so.1.0+0x24e2c8) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#16 0x7f602788c88b (/usr/lib/libpython3.13.so.1.0+0x28c88b) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#17 0x7f602788985c (/usr/lib/libpython3.13.so.1.0+0x28985c) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#18 0x7f6027886f57 (/usr/lib/libpython3.13.so.1.0+0x286f57) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#19 0x7f6027886211 (/usr/lib/libpython3.13.so.1.0+0x286211) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#20 0x7f6027885b82 (/usr/lib/libpython3.13.so.1.0+0x285b82) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#21 0x7f6027883e50 in Py_RunMain (/usr/lib/libpython3.13.so.1.0+0x283e50) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#22 0x7f602783bbea in Py_BytesMain (/usr/lib/libpython3.13.so.1.0+0x23bbea) (BuildId: bea05fc2c8bd66145b159f10dcd810ebe813af39)
ceph#23 0x7f6027227674 (/usr/lib/libc.so.6+0x27674) (BuildId: 4fe011c94a88e8aeb6f2201b9eb369f42b4a1e9e)
ceph#24 0x7f6027227728 in __libc_start_main (/usr/lib/libc.so.6+0x27728) (BuildId: 4fe011c94a88e8aeb6f2201b9eb369f42b4a1e9e)
ceph#25 0x55dae17e6044 in _start (/usr/bin/python3.13+0x1044) (BuildId: 8c0dc848f5b978c56ebeb07255bb332b4b37ae4e)
```
```
AddressSanitizer: CHECK failed: asan_interceptors.cpp:335 "((__interception::real___cxa_throw)) != (0)" (0x0, 0x0) (tid=3246455)
#0 0x7f345ea81979 in CheckUnwind ../../../../src/libsanitizer/asan/asan_rtl.cpp:69
#1 0x7f345eaa790d in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) ../../../../src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:86
#2 0x7f345e9e1d54 in __interceptor___cxa_throw ../../../../src/libsanitizer/asan/asan_interceptors.cpp:335
#3 0x7f345e9e1d54 in __interceptor___cxa_throw ../../../../src/libsanitizer/asan/asan_interceptors.cpp:334
#4 0x7f3458623def in void boost::throw_exception<boost::bad_lexical_cast>(boost::bad_lexical_cast const&) /opt/ceph/include/boost/throw_exception.hpp:165
#5 0x7f345997ad3b in void boost::conversion::detail::throw_bad_cast<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned long>() /opt/ceph/include/boost/lexical_cast/bad_lexical_cast.hpp:93
ceph#6 0x7f3459979d35 in unsigned long boost::lexical_cast<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /opt/ceph/include/boost/lexical_cast.hpp:43`
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
The static std::map max_prio_map was defined in the osd_types.h header
file, causing every translation unit that included this header to get
its own copy of the variable. This led to One Definition Rule (ODR)
violations where multiple instances of the same variable existed at
runtime.
During program cleanup, destructors for these multiple instances would
attempt to free the same memory regions, resulting in segmentation
faults in tcmalloc/memory allocator as seen with ceph-dencoder.
This issue surfaced after a yet-merged-change which converts erasure_code
and json_spirit to OBJECT libraries. Before that change, these were
STATIC libraries that were linked via target_link_libraries. The
incorrect linkage meant their object files (and thus their copies of
max_prio_map) were kept separate and didn't conflict at runtime.
After converting to OBJECT libraries and properly incorporating them
into libceph-common.so (commit 8b0e3fb2c23), the multiple copies of
max_prio_map from different translation units all ended up in the same
shared library, exposing the ODR violation. During program exit, the
dynamic linker attempted to run destructors for all instances, leading
to double-free crashes.
Fix by moving the map into a static helper function in PeeringState.cc
(the only file that uses it). The map is now a function-local static
const variable, ensuring a single instance that is properly initialized
and destructed.
Backtrace before fix:
```
#0 0x00007ffff7dbb1a0 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned int, int) () from /lib/x86_64-linux-gnu/libtcmalloc.so.4
#1 0x00007ffff7dbb57f in tcmalloc::ThreadCache::Scavenge() () from /lib/x86_64-linux-gnu/libtcmalloc.so.4
#2 0x00007ffff6bc8aa2 in std::__new_allocator<std::_Rb_tree_node<std::pair<int const, int> > >::deallocate (this=0x7ffff7d48f78 <max_prio_map>, __p=0x555555f43890, __n=1)
#3 0x00007ffff6bc89f9 in std::allocator<std::_Rb_tree_node<std::pair<int const, int> > >::deallocate (this=0x7ffff7d48f78 <max_prio_map>, __p=0x555555f43890, __n=1)
#4 std::allocator_traits<std::allocator<std::_Rb_tree_node<std::pair<int const, int> > > >::deallocate (__a=..., __p=0x555555f43890, __n=1)
#5 std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_put_node (this=0x7ffff7d48f78 <max_prio_map>, __p=0x555555f43890)
ceph#6 0x00007ffff6bc892e in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_drop_node (this=0x7ffff7d48f78 <max_prio_map>, __p=0x555555f43890)
ceph#7 0x00007ffff6bc886e in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_erase (this=0x7ffff7d48f78 <max_prio_map>, __x=0x555555f43890)
ceph#8 0x00007ffff6bc8854 in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_erase (this=0x7ffff7d48f78 <max_prio_map>, __x=0x555555f43cb0)
ceph#9 0x00007ffff6bc8854 in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_erase (this=0x7ffff7d48f78 <max_prio_map>, __x=0x555555f43ad0)
ceph#10 0x00007ffff6bc8805 in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::~_Rb_tree (this=0x7ffff7d48f78 <max_prio_map>)
ceph#11 0x00007ffff6bc7345 in std::map<int, int, std::less<int>, std::allocator<std::pair<int const, int> > >::~map (this=0x7ffff7d48f78 <max_prio_map>)
ceph#12 0x00007ffff484bd51 in __cxa_finalize (d=0x7ffff7d3f440) at ./stdlib/cxa_finalize.c:97
ceph#13 0x00007ffff6af9487 in __do_global_dtors_aux () from /home/kefu/dev/ceph/build/lib/libceph-common.so.2
ceph#14 0x00007ffff7fbfd20 in ?? ()
ceph#15 0x00007ffff7fc8fc2 in _dl_call_fini (closure_map=0x7fffffffd0f0, closure_map@entry=0x7ffff7fbfd20) at ./elf/dl-call_fini.c:43
ceph#16 0x00007ffff7fcbe72 in _dl_fini () at ./elf/dl-fini.c:120
ceph#17 0x00007ffff484c291 in __run_exit_handlers (status=0, listp=0x7ffff49f1680 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at ./stdlib/exit.c:118
ceph#18 0x00007ffff484c35a in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:148
ceph#19 0x00007ffff4833caf in __libc_start_call_main (main=main@entry=0x55555556cd90 <main(int, char const**)>, argc=argc@entry=2, argv=argv@entry=0x7fffffffd488) at ../sysdeps/nptl/libc_start_call_main.h:74
ceph#20 0x00007ffff4833d65 in __libc_start_main_impl (main=0x55555556cd90 <main(int, char const**)>, argc=2, argv=0x7fffffffd488, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffd478) at ../csu/libc-start.c:360
ceph#21 0x00005555555695e1 in _start ()
```
Signed-off-by: Kefu Chai <k.chai@proxmox.com>
…yed static" ``` Jan 20 09:27:16 ceph-node-0 ceph-e818662e-f5e1-11f0-b263-525400908ba7-osd-1[12300]: AddressSanitizer:DEADLYSIGNAL Jan 20 09:27:16 ceph-node-0 ceph-e818662e-f5e1-11f0-b263-525400908ba7-osd-1[12300]: ================================================================= Jan 20 09:27:16 ceph-node-0 ceph-e818662e-f5e1-11f0-b263-525400908ba7-osd-1[12300]: ==3==ERROR: AddressSanitizer: stack-overflow on address 0x7b512f6c8dd8 (pc 0x0000046e7a72 bp 0x7b512de7c900 sp 0x7b512f6c8dd8 T0) Jan 20 09:27:17 ceph-node-0 ceph-e818662e-f5e1-11f0-b263-525400908ba7-osd-1[12300]: #0 0x0000046e7a72 in get_global_options() (/usr/bin/ceph-osd-crimson+0x46e7a72) (BuildId: 2a86043f51c9be9cb19801e276fb3ee36239556a) Jan 20 09:27:17 ceph-node-0 ceph-e818662e-f5e1-11f0-b263-525400908ba7-osd-1[12300]: #1 0x0000046e540e in build_options() (/usr/bin/ceph-osd-crimson+0x46e540e) (BuildId: 2a86043f51c9be9cb19801e276fb3ee36239556a) Jan 20 09:27:17 ceph-node-0 ceph-e818662e-f5e1-11f0-b263-525400908ba7-osd-1[12300]: #2 0x0000033b7949 in get_ceph_options() (/usr/bin/ceph-osd-crimson+0x33b7949) (BuildId: 2a86043f51c9be9cb19801e276fb3ee36239556a) Jan 20 09:27:17 ceph-node-0 ceph-e818662e-f5e1-11f0-b263-525400908ba7-osd-1[12300]: #3 0x000003440540 in md_config_t::md_config_t(ConfigValues&, ConfigTracker const&, bool) (/usr/bin/ceph-osd-crimson+0x3440540) (BuildId: 2a860> Jan 20 09:27:17 ceph-node-0 ceph-e818662e-f5e1-11f0-b263-525400908ba7-osd-1[12300]: #4 0x0000046856a8 in crimson::common::ConfigProxy::ConfigProxy(EntityName const&, std::basic_string_view<char, std::char_traits<char> >) (/usr> Jan 20 09:27:17 ceph-node-0 ceph-e818662e-f5e1-11f0-b263-525400908ba7-osd-1[12300]: #5 0x000000eb6cb5 in seastar::shared_ptr_count_for<crimson::common::ConfigProxy>::shared_ptr_count_for<EntityName&, std::__cxx11::basic_string> .. Jan 20 09:27:17 ceph-node-0 ceph-e818662e-f5e1-11f0-b263-525400908ba7-osd-1[12300]: ceph#40 0x000000ed6434 in seastar::future<int> seastar::futurize<int>::apply<crimson::osd::_get_early_config(int, char const**)::{lambda()#1}::ope> Jan 20 09:27:17 ceph-node-0 ceph-e818662e-f5e1-11f0-b263-525400908ba7-osd-1[12300]: ceph#41 0x000000ed672b in seastar::async<crimson::osd::_get_early_config(int, char const**)::{lambda()#1}::operator()() const::{lambda()#1}>(seast> ``` This reverts commit 1ab0a8c. Fixes: https://tracker.ceph.com/issues/74481 Signed-off-by: Matan Breizman <mbreizma@redhat.com>
The test creates OSD messages (MOSDPGLease, MOSDPGNotify2,
MBackfillReserve, etc.) via ceph::make_message but the mock
PGListener accumulates them in a messages map without clearing them
at test teardown. This caused ~100 leak reports in CI.
Fix by explicitly clearing undispatched messages in TearDown() before
destroying the listeners. This ensures all MessageRef objects are
properly released even if not all messages were dispatched during the
test.
Part of the leak report:
```
==77847==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 86400 byte(s) in 48 object(s) allocated from:
#0 0x61806af7c2bd in operator new(unsigned long) (/ceph/build/bin/unittest_peeringstate+0x9092bd) (BuildId: 9a900986804eedf4e9290ec705e6444f4bf0ba94)
#1 0x61806b4bc9b9 in boost::intrusive_ptr<MOSDPGInfo2> ceph::make_message<MOSDPGInfo2, spg_t&, pg_info_t const&, unsigned int&, unsigned int&, std::optional<pg_lease_t>&, std::optional<pg_lease_ack_t>&>(spg_t&, pg_info_t const&, unsigned int&, unsigned int&, std::optional<pg_lease_t>&, std::optional<pg_lease_ack_t>&) /ceph/src/msg/Message.h:597:11
#2 0x61806b3e182a in BufferedRecoveryMessages::send_info(int, spg_t, unsigned int, unsigned int, pg_info_t const&, std::optional<pg_lease_t>, std::optional<pg_lease_ack_t>) /ceph/src/osd/PeeringState.cc:86:5
#3 0x61806b4c86c9 in PeeringCtxWrapper::send_info(int, spg_t, unsigned int, unsigned int, pg_info_t const&, std::optional<pg_lease_t>, std::optional<pg_lease_ack_t>) /ceph/src/osd/PeeringState.h:269:10
#4 0x61806b48a56c in PeeringState::ReplicaActive::react(PeeringState::ActivateCommitted const&) /ceph/src/osd/PeeringState.cc:7011:8
#5 0x61806b5f7cb6 in boost::statechart::detail::reaction_result boost::statechart::custom_reaction<PeeringState::ActivateCommitted>::react<PeeringState::ReplicaActive, boost::statechart::event_base, void const*>(PeeringState::ReplicaActive&, boost::statechart::event_base const&, void const* const&) /opt/ceph/include/boost/statechart/custom_reaction.hpp:42:15
ceph#6 0x61806b5f7b1d in boost::statechart::detail::reaction_result boost::statechart::simple_state<PeeringState::ReplicaActive, PeeringState::Started, PeeringState::RepNotRecovering, (boost::statechart::history_mode)0>::local_react_impl_non_empty::local_react_impl<boost::mpl::list10<boost::statechart::custom_reaction<PeeringState::ActivateCommitted>, boost::statechart::custom_reaction<DeferRecovery>, boost::statechart::custom_reaction<DeferBackfill>, boost::statechart::custom_reaction<PeeringState::UnfoundRecovery>, boost::statechart::custom_reaction<PeeringState::UnfoundBackfill>, boost::statechart::custom_reaction<PeeringState::RemoteBackfillPreempted>, boost::statechart::custom_reaction<PeeringState::RemoteRecoveryPreempted>, boost::statechart::custom_reaction<RecoveryDone>, boost::statechart::transition<PeeringState::DeleteStart, PeeringState::ToDelete, boost::statechart::detail::no_context<PeeringState::DeleteStart>, &boost::statechart::detail::no_context<PeeringState::DeleteStart>::no_function(PeeringState::DeleteStart const&)>, boost::statechart::custom_reaction<MLease>>, boost::statechart::simple_state<PeeringState::ReplicaActive, PeeringState::Started, PeeringState::RepNotRecovering, (boost::statechart::history_mode)0>>(boost::statechart::simple_state<PeeringState::ReplicaActive, PeeringState::Started, PeeringState::RepNotRecovering, (boost::statechart::history_mode)0>&, boost::statechart::event_base const&, void const*) /opt/ceph/include/boost/statechart/simple_state.hpp:814:11
ceph#7 0x61806b5f79b4 in boost::statechart::detail::reaction_result boost::statechart::simple_state<PeeringState::ReplicaActive, PeeringState::Started, PeeringState::RepNotRecovering, (boost::statechart::history_mode)0>::local_react<boost::mpl::list10<boost::statechart::custom_reaction<PeeringState::ActivateCommitted>, boost::statechart::custom_reaction<DeferRecovery>, boost::statechart::custom_reaction<DeferBackfill>, boost::statechart::custom_reaction<PeeringState::UnfoundRecovery>, boost::statechart::custom_reaction<PeeringState::UnfoundBackfill>, boost::statechart::custom_reaction<PeeringState::RemoteBackfillPreempted>, boost::statechart::custom_reaction<PeeringState::RemoteRecoveryPreempted>, boost::statechart::custom_reaction<RecoveryDone>, boost::statechart::transition<PeeringState::DeleteStart, PeeringState::ToDelete, boost::statechart::detail::no_context<PeeringState::DeleteStart>, &boost::statechart::detail::no_context<PeeringState::DeleteStart>::no_function(PeeringState::DeleteStart const&)>, boost::statechart::custom_reaction<MLease>>>(boost::statechart::event_base const&, void const*) /opt/ceph/include/boost/statechart/simple_state.hpp:850:14
```
Signed-off-by: Kefu Chai <k.chai@proxmox.com>
Signed-off-by: Md Mahamudur Rahaman Sajib mahamudur.sajib@croit.io
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windowsjenkins test rook e2e