
rgwlc/sync: avoid calling merge-and-store-attrs from remove_bucket_co… #47411

Merged
cbodley merged 1 commit into ceph:main from linuxbox2:wip-56997 on Aug 15, 2022


@mattbenjamin
Contributor

mattbenjamin commented Aug 2, 2022

…nfig()

Calling merge-and-store-attrs turns out to be unsafe from the context of
the metadata sync handler, although I am doubtful that this really should be
the case.

Fixes: https://tracker.ceph.com/issues/56997

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)

@yuvalif
Contributor

yuvalif commented Aug 2, 2022

local multisite tests are mostly passing, with 2 failures: "test_version_suspended_incremental_sync" and "test_bucket_sync_run_basic_incremental". Neither is related to the fix; both also fail on the main branch.

yuvalif self-requested a review August 2, 2022 15:26
@yuvalif
Contributor

yuvalif commented Aug 4, 2022

multisite tests in teuthology are mostly passing: http://qa-proxy.ceph.com/teuthology/yuvalif-2022-08-04_10:50:49-rgw:multisite-wip-yuval-remove-metadata-entry-distro-default-smithi/6957808/teuthology.log

(ignore the branch name, I tested multiple commits there)

I have seen the following failures in other runs:

  • test_version_suspended_incremental_sync
  • test_datalog_autotrim
  • test_bucket_reshard_index_log_trim

@mattbenjamin
Contributor Author

thanks @yuvalif -- maybe we can merge today

@yuvalif
Contributor

yuvalif commented Aug 4, 2022

> thanks @yuvalif -- maybe we can merge today

I was only running multisite. Does it need to run LC tests?

@mattbenjamin
Contributor Author

it should pass an ordinary rgw suite run, sure

@yuvalif
Contributor

yuvalif commented Aug 8, 2022

> it should pass an ordinary rgw suite run, sure

teuthology passing with valgrind issues:
http://pulpito.front.sepia.ceph.com/yuvalif-2022-08-07_18:13:52-rgw:verify-wip-56997-distro-default-smithi/

  • 19 valgrind failures
  • 3 "pg degraded" failures
  • 1 s3 test: test_progress_expressions failed with:
    ReadTimeoutError: AWSHTTPSConnectionPool(host='smithi055.front.sepia.ceph.com', port=443): Read timed out.

yuvalif removed the needs-qa label Aug 8, 2022
@cbodley
Contributor

cbodley commented Aug 8, 2022

> > it should pass an ordinary rgw suite run, sure
>
> teuthology passing with valgrind issues: http://pulpito.front.sepia.ceph.com/yuvalif-2022-08-07_18:13:52-rgw:verify-wip-56997-distro-default-smithi/

@yuvalif thanks very much for testing, but it looks like this is still not a full rgw suite, just rgw/verify

@yuvalif
Contributor

yuvalif commented Aug 9, 2022

> > > it should pass an ordinary rgw suite run, sure
> >
> > teuthology passing with valgrind issues: http://pulpito.front.sepia.ceph.com/yuvalif-2022-08-07_18:13:52-rgw:verify-wip-56997-distro-default-smithi/
>
> @yuvalif thanks very much for testing, but it looks like this is still not a full rgw suite, just rgw/verify

Should I use the "--subset" option?

@yuvalif
Contributor

yuvalif commented Aug 15, 2022

teuthology results: http://pulpito.front.sepia.ceph.com/yuvalif-2022-08-14_12:52:31-rgw-wip-yuval-test-aug-14-1-distro-default-smithi/

some failures are expected, but LC failures require more investigation

 1: /lib64/libpthread.so.0(+0x12ce0) [0x7f06ae28cce0]
 2: radosgw(+0xa507ce) [0x562108a9a7ce]
 3: radosgw(+0xa507da) [0x562108a9a7da]
 4: (rgw::notify::publish_commit(rgw::sal::Object*, unsigned long, std::chrono::time_point<ceph::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw::notify::EventType, rgw::notify::reservation_t&, DoutPrefixProvider const*)+0xb92) [0x562108a9dc92]
 5: (rgw::sal::RadosNotification::publish_commit(DoutPrefixProvider const*, unsigned long, std::chrono::time_point<ceph::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x2f) [0x562108ceb9bf]
 6: (RGWPutObj::execute(optional_yield)+0x3127) [0x562108b492d7]
 7: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, optional_yield, rgw::sal::Store*, bool)+0xb83) [0x562108726523]
 8: (process_request(rgw::sal::Store*, RGWREST*, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSink*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*, std::shared_ptr<RateLimiter>, int*, rgw::lua::Background*)+0xf27) [0x562108727d37]
 9: radosgw(+0x639e64) [0x562108683e64]
 10: radosgw(+0x63aa01) [0x562108684a01]
 11: make_fcontext()
