Skip to content

SimpleRADOSStriper: Avoid moving bufferlists by using deque in read()#47841

Merged
yuriw merged 2 commits intoceph:mainfrom
Matan-B:wip-matanb-rados-striper
Sep 19, 2022
Merged

SimpleRADOSStriper: Avoid moving bufferlists by using deque in read()#47841
yuriw merged 2 commits intoceph:mainfrom
Matan-B:wip-matanb-rados-striper

Conversation

@Matan-B
Copy link
Contributor

@Matan-B Matan-B commented Aug 28, 2022

From the tracker's core dump:

(gdb) thread 3
..
#5  librados::AioCompletionImpl::wait_for_complete (this=0x55d397944720) at ./src/librados/AioCompletionImpl.h:63
#6  librados::v14_2_0::AioCompletion::wait_for_complete (this=<optimized out>) at ./src/librados/librados_cxx.cc:1050
#7  0x00007fc1bae649c0 in SimpleRADOSStriper::read (this=0x55d3974e6c70, data=data@entry=0x55d3974eee38, len=len@entry=65536, off=off@entry=4129788)
    at /usr/include/c++/9/bits/unique_ptr.h:360
[Current thread is 1 ..
(gdb) bt
#0  0x00007fc1bacfac5b in ceph::buffer::v15_2_0::list::buffers_t::clear_and_dispose (this=0x55d3974d3ed0) at ./src/include/buffer.h:599
#1  ceph::buffer::v15_2_0::list::buffers_t::operator= (other=..., this=0x55d3974d3ed0) at ./src/include/buffer.h:505
#2  ceph::buffer::v15_2_0::list::operator= (this=0x55d3974d3ed0, other=...) at ./src/include/buffer.h:970
#3  0x00007fc1bad35606 in Message::claim_data (bl=..., this=0x7fc1a001b680) at ./src/msg/Message.h:431
#4  Objecter::handle_osd_op_reply (this=0x55d397435b90, m=0x7fc1a001b680) at ./src/osdc/Objecter.cc:3509

Reproduced by attempting to make multithreaded aio requests similarly to SimpleRADOSStriper:

std::vector<std::pair<librados::bufferlist, std::unique_ptr<librados::AioCompletion>>> reads;
  for (i = 0; i < 5; i++) {
    auto& [bl_ok, aiocp] = reads.emplace_back();
    aiocp = std::unique_ptr<librados::AioCompletion>(librados::Rados::aio_create_completion());
    ret = io_ctx.aio_read(object_name, aiocp.get(), &bl_ok, read_len, 0);
  }
  for (auto& [bl, aiocp] : reads) {
    if (int rc = aiocp->wait_for_complete(); rc < 0) {
      return rc;
    }
  }

Same segfault:

(gdb) bt 5
#0  ceph::buffer::v15_2_0::list::buffers_t::clear_and_dispose (this=0x7b0c000200a0) at ../src/include/buffer.h:599
#1  ceph::buffer::v15_2_0::list::buffers_t::operator= (other=..., this=0x7b0c000200a0) at ../src/include/buffer.h:505
#2  ceph::buffer::v15_2_0::list::operator= (this=0x7b0c000200a0, other=...) at ../src/include/buffer.h:972
#3  0x00007ffff7f12ff7 in Message::claim_data (bl=..., this=0x7b540000ff00) at ../src/msg/Message.h:434
#4  Objecter::handle_osd_op_reply (this=this@entry=0x7b6c00000700, m=m@entry=0x7b540000ff00) at ../src/osdc/Objecter.cc:3513

Assumptions

It looks like we are deallocating the bufferlist before all requests using it are finished.
When using unique_ptr<librados::bufferlist>std::deque instead of vector to avoid moves with significantly more requests, the test passes succefully.

  • Test added:
     [----------] 30 tests from LibRadosAio
     [ RUN      ] LibRadosAio.MultiReads
     [       OK ] LibRadosAio.MultiReads (2251 ms)
    
    Same test using std::vector (not included in the commit)
    [----------] 30 tests from LibRadosAio
    [ RUN      ] LibRadosAio.MultiReadsVector
    [1]    662753 segmentation fault (core dumped)  ceph_test_rados_api_aio_pp
    

Fixes: https://tracker.ceph.com/issues/57152

Signed-off-by: Matan Breizman mbreizma@redhat.com

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@Matan-B Matan-B requested a review from batrick August 28, 2022 10:41
@github-actions github-actions bot added the libcephsqlite ceph SQLite3 VFS label Aug 28, 2022
@ronen-fr
Copy link
Contributor

LGTM. Not sure we need two commits, but that's your call.

@batrick
Copy link
Member

batrick commented Aug 30, 2022

From the tracker's core dump:

(gdb) thread 3
..
#5  librados::AioCompletionImpl::wait_for_complete (this=0x55d397944720) at ./src/librados/AioCompletionImpl.h:63
#6  librados::v14_2_0::AioCompletion::wait_for_complete (this=<optimized out>) at ./src/librados/librados_cxx.cc:1050
#7  0x00007fc1bae649c0 in SimpleRADOSStriper::read (this=0x55d3974e6c70, data=data@entry=0x55d3974eee38, len=len@entry=65536, off=off@entry=4129788)
    at /usr/include/c++/9/bits/unique_ptr.h:360
[Current thread is 1 ..
(gdb) bt
#0  0x00007fc1bacfac5b in ceph::buffer::v15_2_0::list::buffers_t::clear_and_dispose (this=0x55d3974d3ed0) at ./src/include/buffer.h:599
#1  ceph::buffer::v15_2_0::list::buffers_t::operator= (other=..., this=0x55d3974d3ed0) at ./src/include/buffer.h:505
#2  ceph::buffer::v15_2_0::list::operator= (this=0x55d3974d3ed0, other=...) at ./src/include/buffer.h:970
#3  0x00007fc1bad35606 in Message::claim_data (bl=..., this=0x7fc1a001b680) at ./src/msg/Message.h:431
#4  Objecter::handle_osd_op_reply (this=0x55d397435b90, m=0x7fc1a001b680) at ./src/osdc/Objecter.cc:3509

Reproduced by attempting to make multithreaded aio requests similarly to SimpleRADOSStriper:

std::vector<std::pair<librados::bufferlist, std::unique_ptr<librados::AioCompletion>>> reads;
  for (i = 0; i < 5; i++) {
    auto& [bl_ok, aiocp] = reads.emplace_back();
    aiocp = std::unique_ptr<librados::AioCompletion>(librados::Rados::aio_create_completion());
    ret = io_ctx.aio_read(object_name, aiocp.get(), &bl_ok, read_len, 0);
  }
  for (auto& [bl, aiocp] : reads) {
    if (int rc = aiocp->wait_for_complete(); rc < 0) {
      return rc;
    }
  }

Same segfault:

(gdb) bt 5
#0  ceph::buffer::v15_2_0::list::buffers_t::clear_and_dispose (this=0x7b0c000200a0) at ../src/include/buffer.h:599
#1  ceph::buffer::v15_2_0::list::buffers_t::operator= (other=..., this=0x7b0c000200a0) at ../src/include/buffer.h:505
#2  ceph::buffer::v15_2_0::list::operator= (this=0x7b0c000200a0, other=...) at ../src/include/buffer.h:972
#3  0x00007ffff7f12ff7 in Message::claim_data (bl=..., this=0x7b540000ff00) at ../src/msg/Message.h:434
#4  Objecter::handle_osd_op_reply (this=this@entry=0x7b6c00000700, m=m@entry=0x7b540000ff00) at ../src/osdc/Objecter.cc:3513

Assumptions

It looks like we are deallocating the bufferlist before all requests using it are finished. When using unique_ptr<librados::bufferlist> with significantly more requests, the test passes succefully.

Well, both structures should work, right? I wonder if you'll see invalid access when running with valgrind, before and after your patch.

@Matan-B Matan-B force-pushed the wip-matanb-rados-striper branch from c0de339 to 23a54a9 Compare August 31, 2022 08:28
@Matan-B Matan-B changed the title SimpleRADOSStriper: Use bufferlist unique ptr in read() SimpleRADOSStriper: Avoid moving bufferlists by using deque in read() Aug 31, 2022
@Matan-B Matan-B marked this pull request as ready for review August 31, 2022 08:51
@Matan-B Matan-B requested a review from cbodley August 31, 2022 08:51
Copy link
Contributor

@cbodley cbodley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good if it works 👍

@github-actions github-actions bot added the tests label Sep 1, 2022
@Matan-B
Copy link
Contributor Author

Matan-B commented Sep 1, 2022

looks good if it works +1

Test added for verification.

my_lock.unlock();
}

TEST(LibRadosAio, MultiReads) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i like the idea of this test, but we really want it to test the actual code in SimpleRADOSStriper::read() instead of a copy of its logic. that way if we change SimpleRADOSStriper in the future, this test can catch issues there

@batrick is test/libcephsqlite/main.cc the right place for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not both? :)

It should go in a new file src/test/libcephsqlite/striper.cc or something instead though.

my_lock.unlock();
}

TEST(LibRadosAio, MultiReads) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not both? :)

It should go in a new file src/test/libcephsqlite/striper.cc or something instead though.

@batrick
Copy link
Member

batrick commented Sep 1, 2022

@yuriw this should be ready to merge soon -- let's try to get the forthcoming backport into the next quincy release. This bug is very disruptive.

@Matan-B Matan-B force-pushed the wip-matanb-rados-striper branch 2 times, most recently from 70b3b22 to c5475bc Compare September 1, 2022 15:44
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
@Matan-B Matan-B force-pushed the wip-matanb-rados-striper branch from c5475bc to c4a2c38 Compare September 1, 2022 15:50
@ljflores
Copy link
Member

ljflores commented Sep 6, 2022

@Matan-B can you take a look at https://tracker.ceph.com/issues/57310 and confirm whether this PR will address it? @NitzanMordhai left a comment about that tracker being related to the same issue you've documented here.

@batrick
Copy link
Member

batrick commented Sep 6, 2022

@Matan-B can you take a look at https://tracker.ceph.com/issues/57310 and confirm whether this PR will address it? @NitzanMordhai left a comment about that tracker being related to the same issue you've documented here.

That's a different RADOS striper (the original one). So I don't think so.

@rzarzynski
Copy link
Contributor

jenkins test make check

@rzarzynski
Copy link
Contributor

jenkins test api

@Matan-B
Copy link
Contributor Author

Matan-B commented Sep 7, 2022

jenkins test make check

@Matan-B
Copy link
Contributor Author

Matan-B commented Sep 18, 2022

Tests LGTM.

0 Related Failures.
6 Tracked unrelated Failures.
0 Newly Tracked Failures.
0 Tracked dead Jobs.

http://pulpito.front.sepia.ceph.com/yuriw-2022-09-14_20:04:05-rados-wip-yuri3-testing-2022-09-13-0928-distro-default-smithi/

Rerun of:
http://pulpito.front.sepia.ceph.com/yuriw-2022-09-13_20:42:45-rados-wip-yuri3-testing-2022-09-13-0928-distro-default-smithi/

Failures:
7033172, 7033177: rook: ensure CRDs are installed first (https://tracker.ceph.com/issues/57311)
7033173, 7033179: free(): invalid pointer (https://tracker.ceph.com/issues/57163)
7033174, 7033178: 'check osd count' reached maximum tries (https://tracker.ceph.com/issues/52321)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants