squid: client: Fix libcephfs aio metadata corruption.#60245

Closed
vshankar wants to merge 7 commits into ceph:squid from vshankar:wip-68495-squid

Conversation

@vshankar
Contributor

@vshankar vshankar commented Oct 10, 2024

backport tracker: https://tracker.ceph.com/issues/68495
backport tracker: https://tracker.ceph.com/issues/68494
backport tracker: https://tracker.ceph.com/issues/68493
backport tracker: https://tracker.ceph.com/issues/69401


backport of #59987
backport of #60411
parent tracker: https://tracker.ceph.com/issues/68146
parent tracker: https://tracker.ceph.com/issues/68308
parent tracker: https://tracker.ceph.com/issues/68309
parent tracker: https://tracker.ceph.com/issues/68641

this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh

Fixes: https://tracker.ceph.com/issues/68146
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 6d8f610)
Fixes: https://tracker.ceph.com/issues/68146
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 59b996f)
The same bufferlist was being reused across
multiple calls without being cleared, which made
the 'LlreadvLlwritev' test fail. This patch fixes that.

Fixes: https://tracker.ceph.com/issues/68146
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit b5af1c1)
Problem:
With cephfs nfs-ganesha, the following asserts
were hit while writing to a file:

1. FAILED ceph_assert((bool)_front == (bool)_size)
2. FAILED ceph_assert(cap_refs[c] > 0)

Cause:
In the aio path, the client_lock was not being held
in the internal callback that runs after the io is done,
where it is expected to be taken, leading to corruption.

Fix:
Take the client_lock in the callback.

Fixes: https://tracker.ceph.com/issues/68146
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 3ebe974)
When the libcephfs aio tests (src/test/client) are run
with the objectcacher disabled (ceph_test_client --client_oc=false),
TestClient.LlreadvLlwritev fails and dumps core. The client
hits the 'ceph_assert(cap_refs[c] > 0)' assert.

This patch fixes that. There is no need to release the cap_ref
and take it again between the multiple reads issued due to short reads.
In some cases, get_caps would fail in C_Read_Sync_NonBlocking::finish,
causing the cap_ref count to go negative when put_cap_ref is finally
called in C_Read_Finish::finish_io.

Fixes: https://tracker.ceph.com/issues/68308
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 10c8330)
The following test fails when run with objectcacher
disabled.

TestClient.LlreadvLlwritevZeroBytes Failure - nonblocking.cc

ceph/src/osdc/Striper.cc: 186: FAILED ceph_assert(len > 0)

Traceback:
 ceph version Development (no_version) squid (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x125) [0x7fc0a340aafe]
 2: (ceph::register_assert_context(ceph::common::CephContext*)+0) [0x7fc0a340ad20]
 3: (Striper::file_to_extents(ceph::common::CephContext*, file_layout_t const*, ...)+0x184) [0x562727e13ab4]
 4: (Striper::file_to_extents(ceph::common::CephContext*, char const*, ...)+0x97) [0x562727e145d1]
 5: (Striper::file_to_extents(ceph::common::CephContext*, inodeno_t, ...)+0x75) [0x562727d29520]
 6: (Filer::read_trunc(inodeno_t, file_layout_t const*, snapid_t, ...)+0x61) [0x562727d66ea5]
 7: (Client::C_Read_Sync_NonBlocking::retry()+0x10c) [0x562727cd8a8e]
 8: (Client::_read(Fh*, long, unsigned long, ceph::buffer::v15_2_0::list*, Context*)+0x578) [0x562727d10cb6]
 9: (Client::_preadv_pwritev_locked(Fh*, iovec const*, int, long, bool, ...)+0x3a7) [0x562727d18159]
 10: (Client::ll_preadv_pwritev(Fh*, iovec const*, int, long, bool, ...)+0x179) [0x562727d18b99]
 11: (TestClient_LlreadvLlwritevZeroBytes_Test::TestBody()+0x592) [0x562727ca5352]
 12: (void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, ...)+0x1b) [0x562727d9dea3]
 13: (void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, ...)+0x80) [0x562727da2b26]
 14: (testing::Test::Run()+0xb4) [0x562727d927ae]
 15: (testing::TestInfo::Run()+0x104) [0x562727d92988]
 16: (testing::TestSuite::Run()+0xb2) [0x562727d92b34]
 17: (testing::internal::UnitTestImpl::RunAllTests()+0x36b) [0x562727d95303]
 18: (bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, ...)(), char const*)+0x1b) [0x562727d9e15f]
 19: (bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, ...)+0x80) [0x562727da3083]
 20: (testing::UnitTest::Run()+0x63) [0x562727d92813]
 21: (RUN_ALL_TESTS()+0x11) [0x562727c828d9]
 22: main()

This patch fixes the issue.

Fixes: https://tracker.ceph.com/issues/68309
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 942474c)
@vshankar vshankar added this to the squid milestone Oct 10, 2024
@vshankar vshankar added the cephfs Ceph File System label Oct 10, 2024
@vshankar vshankar requested a review from a team October 14, 2024 09:59
@vshankar
Contributor Author

vshankar commented Nov 6, 2024

jenkins retest this please

@vshankar vshankar added the wip-vshankar-testing2-squid Squid Backport Testing #2 label Nov 6, 2024
@vshankar
Contributor Author

vshankar commented Nov 6, 2024

This PR is under test in https://tracker.ceph.com/issues/68851.

@github-actions

github-actions bot commented Jan 5, 2025

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Jan 5, 2025
@vshankar
Contributor Author

vshankar commented Jan 5, 2025

This PR is under test in https://tracker.ceph.com/issues/68851.

@mchangir status of the QA run?

@vshankar
Contributor Author

vshankar commented Jan 5, 2025

jenkins retest this please

@github-actions github-actions bot removed the stale label Jan 5, 2025
@kotreshhr
Contributor

@vshankar @mchangir I was waiting for this to merge before backporting #60411
I see that the QA Tracker talks about re-running the tests? Can I add #60411 into this ?

@vshankar
Contributor Author

vshankar commented Mar 7, 2025

@vshankar @mchangir I was waiting for this to merge before backporting #60411 I see that the QA Tracker talks about re-running the tests? Can I add #60411 into this ?

Suggest including this change in PR #60411.

Problem:
 When an osd is full, the client receives the notification
and cancels the ongoing writes. If the ongoing writes
are async, this can cause a deadlock: the registered async
callback also takes the 'client_lock', which
handle_osd_map takes at the beginning.

 op_cancel_writes invokes the callback registered for
the async write synchronously while holding the 'client_lock',
causing the deadlock.

Earlier approach:
  An attempt was made to solve this by calling 'op_cancel_writes'
without holding 'client_lock'. But this violated the lock-ordering
dependency between the objecter's 'rwlock' and the async write's
callback taking 'client_lock': the 'client_lock' must always be
taken before 'rwlock'. So that approach was dropped in favour of
the current one.

Solution:
 Use C_OnFinisher for the objecter async write callback, i.e., wrap
the async write's callback with the Finisher. This queues the
callback onto the Finisher's context queue, which the finisher
thread picks up and executes, thus avoiding the deadlock.

Testing:
 The fix was tested in a vstart cluster with the following reproducer:
 1. Mount the cephfs volume using nfs-ganesha at /mnt
 2. Run fio on /mnt in one terminal
 3. In another terminal, blocklist the nfs client session
 4. Without the fix, fio hangs

 The issue reproduces in the vstart cluster most of the time; I think
that is because it is slow. The same test written for teuthology does
not reproduce the issue. The test requires one or more writes to be
in flight in rados when the client is blocklisted for the deadlock
to be hit.

Stripped down version of Traceback:
----------
0  0x00007f4d77274960 in __lll_lock_wait ()
1  0x00007f4d7727aff2 in pthread_mutex_lock@@GLIBC_2.2.5 ()
2  0x00007f4d7491b0a1 in __gthread_mutex_lock (__mutex=0x7f4d200f99b0)
3  std::mutex::lock (this=<optimized out>)
4  std::scoped_lock<std::mutex>::scoped_lock (__m=..., this=<optimized out>, this=<optimized out>, __m=...)
5  Client::C_Lock_Client_Finisher::finish (this=0x7f4ca0103550, r=-28)
6  0x00007f4d74888dfd in Context::complete (this=0x7f4ca0103550, r=<optimized out>)
7  0x00007f4d7498850c in std::__do_visit<...>(...) (__visitor=...)
8  std::visit<Objecter::Op::complete(...) (__visitor=...)
9  Objecter::Op::complete(...) (e=..., e=..., r=-28, ec=..., f=...)
10 Objecter::Op::complete (e=..., r=-28, ec=..., this=0x7f4ca022c7f0)
11 Objecter::op_cancel (this=0x7f4d200fab20, s=<optimized out>, tid=<optimized out>, r=-28)
12 0x00007f4d7498ea12 in Objecter::op_cancel_writes (this=0x7f4d200fab20, r=-28, pool=103)
13 0x00007f4d748e1c8e in Client::_handle_full_flag (this=0x7f4d200f9830, pool=103)
14 0x00007f4d748ed20c in Client::handle_osd_map (m=..., this=0x7f4d200f9830)
15 Client::ms_dispatch2 (this=0x7f4d200f9830, m=...)
16 0x00007f4d75b8add2 in Messenger::ms_deliver_dispatch (m=..., this=0x7f4d200ed3e0)
17 DispatchQueue::entry (this=0x7f4d200ed6f0)
18 0x00007f4d75c27fa1 in DispatchQueue::DispatchThread::entry (this=<optimized out>)
19 0x00007f4d77277c02 in start_thread ()
20 0x00007f4d772fcc40 in clone3 ()
--------

Fixes: https://tracker.ceph.com/issues/68641
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 60c5801)
@kotreshhr
Contributor

kotreshhr commented Mar 7, 2025

@vshankar @mchangir I was waiting for this to merge before backporting #60411 I see that the QA Tracker talks about re-running the tests? Can I add #60411 into this ?

Suggest including this change in PR #60411.

Included PR #60411 into this PR

@kotreshhr
Contributor

jenkinst test make check


@kotreshhr
Contributor

jenkins test make check

@vshankar
Contributor Author

Superseded by PR #64090

@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@batrick
Member

batrick commented Feb 24, 2026

Superseded by PR #64090

close this?

@vshankar
Contributor Author

Superseded by PR #64090

close this?

Yes.

@vshankar vshankar closed this Feb 25, 2026