rgw: qpid-proton amqp1.0 bucket notification by wangxuw · Pull Request #1 · wangxuw/ceph

wangxuw · 2021-06-10T15:28:48Z

Checklist

References tracker ticket
Updates documentation if necessary
Includes tests for new functionality or reproducer for bug

Show available Jenkins commands

jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox

hash(str) is non-deterministic, probably because it is using the internal object ID or something and not the string content? In any case, explicitly hash the string content and use that instead. Also, sort the input pre-shuffle to ensure that variations in the original host list ordering don't screw with the result. Signed-off-by: Sage Weil <sage@newdream.net>
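A minimal sketch of the idea in the commit message above, not the actual cephadm code: in CPython the built-in `hash()` of a `str` is salted per process (unless `PYTHONHASHSEED` is fixed), so a seed derived from it changes between runs. Hashing the string content with `hashlib` and sorting the input before shuffling gives a stable result. The function name `stable_shuffle` and the `seed_text` parameter are hypothetical.

```python
import hashlib
import random
from typing import List


def stable_shuffle(hosts: List[str], seed_text: str) -> List[str]:
    """Shuffle hosts reproducibly (hypothetical helper, for illustration only)."""
    # Derive the seed from the string *content*, not from hash(str),
    # which CPython randomizes per process.
    digest = hashlib.sha256(seed_text.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    # Sort first so the caller's original ordering cannot affect the result.
    ordered = sorted(hosts)
    random.Random(seed).shuffle(ordered)
    return ordered
```

Calling `stable_shuffle(["b", "a", "c"], "nfs.foo")` and `stable_shuffle(["c", "b", "a"], "nfs.foo")` should return the same ordering, because both the seed and the pre-shuffle order are independent of the input order.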

If we are passed a rank_map, use it to maintain one daemon per rank, where the ranks are consecutive non-negative integers starting from 0. A bit of refactoring in place() so that we only do the rank allocations on slots we are going to use (no more than count). Signed-off-by: Sage Weil <sage@newdream.net>
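The rank allocation described above can be sketched roughly as follows. This is an illustration under assumptions, not the cephadm implementation: the `rank_map` is assumed to map rank to generation to daemon_id (with `None` marking a generation whose daemon is gone), and `allocate_rank` is a hypothetical helper name.

```python
from typing import Dict, Optional, Tuple

# Assumed shape: rank -> generation -> daemon_id (None once that daemon is gone).
RankMap = Dict[int, Dict[int, Optional[str]]]


def allocate_rank(rank_map: RankMap, daemon_id: str) -> Tuple[int, int]:
    """Give daemon_id the lowest unoccupied rank (hypothetical helper)."""
    rank = 0
    # A rank is occupied while any of its generations still names a daemon.
    while rank in rank_map and any(d is not None for d in rank_map[rank].values()):
        rank += 1
    generations = rank_map.setdefault(rank, {})
    # Reusing a rank bumps its generation; a brand-new rank starts at 0.
    generation = max(generations) + 1 if generations else 0
    generations[generation] = daemon_id
    return rank, generation
```

Because the search always starts at 0, ranks stay consecutive as long as callers request no more ranks than the service count.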

- we need to assign all names and update the rank_map before we start creating daemons.
- if we are using ranks, we should delete old daemons first, and fence them from the cluster (where possible).

Signed-off-by: Sage Weil <sage@newdream.net>

Use ranked daemons for NFS. Ganesha does not like it if multiple instances start up with the same rank, but we need stable ranks so that a rank can "fail over" to a new instance of a new daemon on another host (with the same rank) for NFS client reclaim to work. Specify a nodeid of '{service_name}.{rank}' for ganesha. Include a unique id in the daemon_id just because this avoids some issues with the create/destroy ordering, and because the daemon_id doesn't matter much anymore since we are using a stable rank. Signed-off-by: Sage Weil <sage@newdream.net>

Do the grace file manipulation from the mgr module. For add, this isn't especially important, but for remove it is very important. Clean out old ranks from the grace table before we record that the rank has been purged from the rank_map. Signed-off-by: Sage Weil <sage@newdream.net>

just for the sake of correctness: they don't need a full-blown std::string, all they need is a string-like object, and they always create a std::string instance as a member variable if they want to keep a copy of it. Signed-off-by: Kefu Chai <kchai@redhat.com>

The following crash occurred at Sepia [1]:

```
INFO 2021-05-26 20:16:32,872 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] ProtocolV2::start_accept(): target_addr=172.21.15.119:55220/0
DEBUG 2021-05-26 20:16:32,872 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] TRIGGER ACCEPTING, was NONE
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] SEND(26) banner: len_payload=16, supported=1, required=0, banner="ceph v2 "
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] RECV(10) banner: "ceph v2 "
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] GOT banner: payload_len=16
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] RECV(16) banner features: supported=1 required=0
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] WRITE HelloFrame: my_type=osd, peer_addr=172.21.15.119:55220/0
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] GOT HelloFrame: my_type=client peer_addr=v2:172.21.15.119:6803/31733
INFO 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> client.? -@55220] UPDATE: peer_type=client, policy(lossy=true server=true standby=false resetcheck=false)
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> client.? -@55220] GOT AuthRequestFrame: method=2, preferred_modes={1, 2}, payload_len=174
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-4622-gaa1dc559/rpm/el8/BUILD/ceph-17.0.0-4622-gaa1dc559/src/crimson/mon/MonClient.cc:399:10: runtime error: member access within null pointer of type 'struct Connection'
Segmentation fault on shard 0.
Backtrace:
0# 0x000055E84CF44C1F in ceph-osd
1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007F2BC88C0B20 in /lib64/libpthread.so.0
4# crimson::mon::Connection::get_conn() in ceph-osd
5# crimson::mon::Client::handle_auth_request(seastar::shared_ptr<crimson::net::Connection>, seastar::lw_shared_ptr<AuthConnectionMeta>, bool, unsigned int, ceph::buffer::v15_2_0::list const&, ceph::buffer::v15_2_0::list*) in ceph-osd
6# crimson::net::ProtocolV2::_handle_auth_request(ceph::buffer::v15_2_0::list&, bool) in ceph-osd
7# 0x000055E84DF67669 in ceph-osd
8# 0x000055E84DF68775 in ceph-osd
9# 0x000055E846F47F60 in ceph-osd
10# 0x000055E85296770F in ceph-osd
11# 0x000055E85296CC50 in ceph-osd
12# 0x000055E852B1ECBB in ceph-osd
13# 0x000055E85267C73A in ceph-osd
14# main in ceph-osd
15# __libc_start_main in /lib64/libc.so.6
16# _start in ceph-osd
Fault at location: 0x98
```

[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136907

When `handle_auth_request()` happens, there is no guarantee that `active_con` is available. This is reflected in the classical implementation:

```cpp
int MonClient::handle_auth_request(
  Connection *con,
  // ...
  ceph::buffer::list *reply)
{
  // ...
  bool isvalid = ah->verify_authorizer(
    cct,
    *rotating_secrets,
    payload,
    auth_meta->get_connection_secret_length(),
    reply,
    &con->peer_name,
    &con->peer_global_id,
    &con->peer_caps_info,
    &auth_meta->session_key,
    &auth_meta->connection_secret,
    ac);
```

The patch transplants the same logic to crimson.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

- Use a centralized method get_mgr_ip()
- Look up the hostname via DNS. This is a bit more reliable than getfqdn() since it will work even when podman adds the container name to /etc/hosts.

Signed-off-by: Sage Weil <sage@newdream.net>

Previously we allowed the host.addr to be a DNS name (short or fqdn). This is problematic because of the inconsistent way that docker and podman handle /etc/hosts, and undesirable because relying on external DNS is an external source of failure for the cluster without any benefit in return (simply updating DNS is not sufficient to make ceph behave). So: update any non-IP to an IP as soon as we start up (presumably on upgrade). If we get a loopback address (127.0.0.1 or 127.0.1.1), then wait and hope that the next instance of the manager has better luck. Signed-off-by: Sage Weil <sage@newdream.net>
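The conversion described above could look roughly like the sketch below. It is an assumption-laden illustration rather than the mgr/cephadm code: `normalize_host_addr` is a hypothetical name, and the resolver is injectable only so the example can be exercised without real DNS.

```python
import ipaddress
import socket
from typing import Callable, Optional


def is_ip(addr: str) -> bool:
    """Return True if addr parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(addr)
        return True
    except ValueError:
        return False


def normalize_host_addr(
    addr: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Optional[str]:
    """Return an IP for addr, or None if only a loopback address is found.

    Hypothetical helper: None means "leave the stored addr alone and let a
    later mgr instance try again".
    """
    if is_ip(addr):
        return addr  # already an IP, nothing to convert
    resolved = resolve(addr)
    if ipaddress.ip_address(resolved).is_loopback:
        return None  # e.g. 127.0.0.1 or 127.0.1.1 from /etc/hosts
    return resolved
```

Returning `None` for loopback results mirrors the "wait and hope" behaviour in the message: a name that resolves to 127.0.0.1 or 127.0.1.1 says nothing useful about how other hosts can reach this one.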

This reverts cfc1f91, which is no longer necessary because (1) we don't use socket.getfqdn(), and (2) we generally do not rely on DNS or /etc/hosts at all anymore (with the exception of the upgrade transition). Signed-off-by: Sage Weil <sage@newdream.net>

* refs/pull/41483/head:
  cephadm: stop passing --no-hosts to podman
  mgr/nfs: use host.addr for backend IP where possible
  mgr/cephadm: convert host addr if non-IP to IP
  mgr/dashboard,prometheus: new method of getting mgr IP
  doc/cephadm: remove any reference to the use of DNS or /etc/hosts
  mgr/cephadm: use known host addr
  mgr/cephadm: resolve IP at 'orch host add' time

Reviewed-by: Sebastian Wagner <swagner@suse.com>

doc: 15.2.13 Release Notes Reviewed-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Casey Bodley <cbodley@redhat.com> Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com> Reviewed-by: Ramana Raja <rraja@redhat.com> Reviewed-by: Neha Ojha <nojha@redhat.com>

`OpSequencer` assumes that the ID of a previous client request is always lower than the ID of the current one. This is reflected by the assertion in `OpSequencer::start_op()`. It triggered the following failure [1] in Teuthology:

```
DEBUG 2021-05-07 08:01:41,227 [shard 0] osd - client_request(id=1, detail=osd_op(client.4171.0:1 2.2 2.7c339972 (undecoded) ondisk+retry+read+known_if_redirected e29) v8) same_interval_since: 31
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3910-g1b18e076/rpm/el8/BUILD/ceph-17.0.0-3910-g1b18e076/src/crimson/osd/osd_operation_sequencer.h:38: seastar::futurize_t<Result> crimson::osd::OpSequencer::start_op(HandleT&, uint64_t, uint64_t, FuncT&&) [with HandleT = crimson::PipelineHandle; FuncT = crimson::interruptible::interruptor<InterruptCond>::wrap_function(Func&&) [with Func = crimson::osd::ClientRequest::start()::<lambda()> mutable::<lambda(Ref<crimson::osd::PG>)> mutable::<lambda()> mutable::<lambda()>; InterruptCond = crimson::osd::IOInterruptCondition]::<lambda()>; Result = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<> >; seastar::futurize_t<Result> = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<> >; uint64_t = long unsigned int]: Assertion `prev_op < this_op' failed.
Aborting on shard 0.
Backtrace:
Segmentation fault.
Backtrace:
0# 0x00005592B028932F in ceph-osd
1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007F57B72E7B20 in /lib64/libpthread.so.0
4# gsignal in /lib64/libc.so.6
5# abort in /lib64/libc.so.6
6# 0x00007F57B58E2B09 in /lib64/libc.so.6
7# 0x00007F57B58F0DE6 in /lib64/libc.so.6
8# 0x00005592ABB8484D in ceph-osd
9# 0x00005592ABB8ACB3 in ceph-osd
10# seastar::continuation<seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<boost::intrusive_ptr<crimson::osd::PG> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > >(seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >&&, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&, seastar::future_state<boost::intrusive_ptr<crimson::osd::PG> >&&)#1}, boost::intrusive_ptr<crimson::osd::PG> >::run_and_dispose() in ceph-osd
11# 0x00005592B357F88F in ceph-osd
12# 0x00005592B3584DD0 in ceph-osd
```

[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-07_07:41:02-rados-master-distro-basic-smithi/6104530

Crash analysis resulted in two observations:

1. during the request execution the acting set changed, the request was interrupted, and an attempt to re-execute it followed;
2. the interrupted request was the very first client request the OSD had ever seen.

Code analysis showed a problem in how `ClientRequest` establishes `prev_op_id`: although it is supposed to be set only once per request, the assignment can be executed twice, but only for the very first request `OpSequencer` saw.

```cpp
void ClientRequest::may_set_prev_op()
{
  // set prev_op_id if it's not set yet
  if (__builtin_expect(prev_op_id == 0, true)) {
    prev_op_id = sequencer.get_last_issued();
  }
}
```

Unfortunately, `0` isn't a distinct value that cannot be returned by `get_last_issued()`:

```cpp
class OpSequencer {
  // ...
  uint64_t get_last_issued() const {
    return last_issued;
  }
  // ...
  // the id of last op which is issued
  uint64_t last_issued = 0;
```

As a result, on the second call `OpSequencer` returned a new value (actually `this_op`), violating the assertion. The commit fixes the problem by switching from a designated value to `std::optional`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
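The sentinel problem from the `OpSequencer` commit message can be reproduced in a few lines. This is a Python analogue for illustration only, not the crimson C++ code: `Sequencer`, `SentinelRequest`, `OptionalRequest`, and `run_twice` are hypothetical names, and `None` plays the role that `std::optional` plays in the actual fix.

```python
from typing import Optional


class Sequencer:
    """Toy stand-in for OpSequencer: remembers the id of the last issued op."""

    def __init__(self) -> None:
        self.last_issued = 0  # 0 is also what "nothing issued yet" looks like

    def issue(self, op_id: int) -> None:
        self.last_issued = op_id


class SentinelRequest:
    """Uses 0 as 'not set yet', like the code before the fix."""

    def __init__(self, seq: Sequencer) -> None:
        self.seq = seq
        self.prev_op_id = 0

    def may_set_prev_op(self) -> None:
        if self.prev_op_id == 0:  # cannot tell "unset" from "set to 0"
            self.prev_op_id = self.seq.last_issued


class OptionalRequest:
    """Uses None as 'not set yet', mirroring the std::optional approach."""

    def __init__(self, seq: Sequencer) -> None:
        self.seq = seq
        self.prev_op_id: Optional[int] = None

    def may_set_prev_op(self) -> None:
        if self.prev_op_id is None:
            self.prev_op_id = self.seq.last_issued


def run_twice(request_cls, this_op: int = 1) -> int:
    """Simulate the very first request being interrupted and retried."""
    seq = Sequencer()
    req = request_cls(seq)
    req.may_set_prev_op()  # first attempt: last_issued is still 0
    seq.issue(this_op)     # the sequencer records this request as issued
    req.may_set_prev_op()  # retry after the interruption
    return req.prev_op_id
```

With the sentinel version the retry overwrites `prev_op_id` with `this_op`, so a `prev_op < this_op` check would fail; with the optional version the value recorded on the first attempt is preserved.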

f7181ab optimized the client parallelism. To achieve that, `PG::do_osd_ops()` was converted to return basically a future pair of futures. Unfortunately, the lifetime management of `OpsExecuter` was kept intact. As a result, the object was valid only until the outer future was fulfilled while, due to the `rollbacker` instances, it should remain available until `all_completed` becomes available. This issue can explain the following problem observed in a Teuthology job [1].

```
DEBUG 2021-05-20 08:03:22,617 [shard 0] osd - do_op_call: method returned ret=-17, outdata.length()=0 while num_read=1, num_write=0
DEBUG 2021-05-20 08:03:22,617 [shard 0] osd - rollback_obc_if_modified: object 19:e17d4708:test-rados-api-smithi095-38404-2::foo:head got error generic:17, need_rollback=false
=================================================================
==33626==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d0000b9320 at pc 0x560f486b8222 bp 0x7fffc467a1e0 sp 0x7fffc467a1d0
READ of size 4 at 0x60d0000b9320 thread T0
#0 0x560f486b8221 (/usr/bin/ceph-osd+0x2c610221)
#1 0x560f4880c6b1 in seastar::continuation<seastar::internal::promise_base_with_type<boost::intrusive_ptr<MOSDOpReply> >, seastar::noncopyable_function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>, seastar::future<void>::then_impl_nrvo<seastar::noncopyable_function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>, crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > >(seastar::noncopyable_function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>&&)::{lambda(seastar::internal::promise_base_with_type<boost::intrusive_ptr<MOSDOpReply> >&&, seastar::noncopyable_function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>::run_and_dispose() (/usr/bin/ceph-osd+0x2c7646b1)
#2 0x560f5352c3ae (/usr/bin/ceph-osd+0x374843ae)
#3 0x560f535318ef (/usr/bin/ceph-osd+0x374898ef)
#4 0x560f536e395a (/usr/bin/ceph-osd+0x3763b95a)
#5 0x560f532413d9 (/usr/bin/ceph-osd+0x371993d9)
#6 0x560f476af95a in main (/usr/bin/ceph-osd+0x2b60795a)
#7 0x7f7aa0af97b2 in __libc_start_main (/lib64/libc.so.6+0x237b2)
#8 0x560f477d2e8d in _start (/usr/bin/ceph-osd+0x2b72ae8d)
```

[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-20_07:28:16-rados-master-distro-basic-smithi/6124735/

The commit deals with the problem by repacking the outer future. An alternative would be to switch from `std::unique_ptr` to `seastar::shared_ptr` for managing `OpsExecuter`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

sebastian-philipp and others added 30 commits May 11, 2021 13:08

mgr/cephadm: Don't call _check_host without hosts

8211927

Fixes: https://tracker.ceph.com/issues/50691 Signed-off-by: Sebastian Wagner <sewagner@redhat.com>

doc/cephadm: recommend redeploying monitoring stack daemon after chan…

9be8465

…ging image Fixes: https://tracker.ceph.com/issues/50687 Signed-off-by: Adam King <adking@redhat.com>

rgw: crash on multipart upload to bucket with policy

413b23a

crash on multipart upload to bucket with policy Fixes: https://tracker.ceph.com/issues/50556 Signed-off-by: Or Friedmann <ofriedma@redhat.com>

cephadm: raise an error when --config file is not found

0e44419

extend the common logic used by the deploy, ceph-volume, and shell commands for validating the `--config` arg during bootstrap Signed-off-by: Michael Fritch <mfritch@suse.com>

cephadm: clean-up error message

870e9bd

use the standard error message from FileNotFound:

```
cephadm bootstrap --mon-ip 192.168.1.1 --config ~/foobar
ERROR: [Errno 2] No such file or directory: '/root/foobar'
```

Signed-off-by: Michael Fritch <mfritch@suse.com>

doc/releases/pacific: add note about rgw on upgrade

0a9d4b3

Fixes: https://tracker.ceph.com/issues/50113 Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: document CephadmService flags

2fa80d8

Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: simplify

991515a

Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm/inventory: fix deleted check

a282c3c

Look in dict, not encoded JSON string Signed-off-by: Sage Weil <sage@newdream.net>

mgr/orchestrator: include service_name in DaemonDescription dump

15e5c0a

('orch ps') Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: include service_name is generated DaemonDescription

11ff484

This makes 'orch ls' match up daemons to services (and probably cleans up other bits and pieces) when the old daemon id -> service name calc code can't do its thing. Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm/inventory: store optional rank_map along with specs

d0d2232

The rank_map is a bit of state to keep track of which ranks are occupied by which generation and daemon_id. Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: add rank[_generation] properties

5e8f184

DaemonDescription CephadmDaemonDeploySpec DaemonPlacement unit.meta get_unique_name() (we include it in the daemon_id) Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: make _plan show removed daemon names

1b09807

This is more informative than just the hostnames. Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: nfs: bind ganesha to appropriate ip:port

ae4ab5d

Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: nfs: shell out to rados tool for conf creation

917fb59

This avoids any hangs due to rados. Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: do not reconfigure daemons on deleted services

444663b

Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: ingress: support nfs

51f0ded

- use consistent hashing
- statically map across ranks
- disable backend checks so that clients don't move

Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: nfs: add purge

15bdaa7

Remove the grace object if we purge the service. Signed-off-by: Sage Weil <sage@newdream.net>

mgr/orchestrator: add --port arg to 'orch apply nfs'

205bf35

Signed-off-by: Sage Weil <sage@newdream.net>

qa/tasks/vip: add 'vip.exec' task

54542fd

Signed-off-by: Sage Weil <sage@newdream.net>

cephadm: add -v arg to shell

693fd30

Signed-off-by: Sage Weil <sage@newdream.net>

qa/tasks/cephadm: allow mounting volumes in shell

b711a75

Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: ingress: remove eth0 default

9dba27e

Better to raise an error; eth0 will never be correct. Signed-off-by: Sage Weil <sage@newdream.net>

python-common: fix IngressSpec yaml dump

05fdbc8

Signed-off-by: Sage Weil <sage@newdream.net>

tchaikov and others added 26 commits May 27, 2021 23:07

osd: pass unique_ptr<ObjectStore> to ctor of OSD

b04b2f4

less error-prone, and it's simpler to manage the resource using RAII Signed-off-by: Kefu Chai <kchai@redhat.com>

tools/ceph_objectstore_tool: destruct ObjectStore using unique_ptr<>

d5445b8

before this change, cot never destructs the created ObjectStore instances. after this change, they are destructed upon returning from main(). Signed-off-by: Kefu Chai <kchai@redhat.com>

mgr/cephadm: use known host addr

9008800

If the host IP/addr is known, use that. The addr might even be a FQDN instead of an IP address, in which case we want to look that up instead of the bare hostname. Signed-off-by: Sage Weil <sage@newdream.net>

doc/cephadm: remove any reference to the use of DNS or /etc/hosts

872668a

Signed-off-by: Sage Weil <sage@newdream.net>

mgr/nfs: use host.addr for backend IP where possible

7e9f4ac

Signed-off-by: Sage Weil <sage@newdream.net>

Merge pull request ceph#41561 from zdover23/wip-doc-cephadm-s-mgmt-se…

fe258aa

…rvice-status-improvement-2021-05-26 doc/cephadm: enrich "service status" Reviewed-by: Sebastian Wagner <sewagner@redhat.com>

doc/mgr: use confval directive to define options

dfdcf2c

less repeating this way Signed-off-by: Kefu Chai <kchai@redhat.com>

Merge pull request ceph#41544 from tchaikov/wip-doc-confval

9091261

doc/mgr: use confval directive to define options Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request ceph#41578 from rzarzynski/wip-crimson-monc-auth-req

596ae33

crimson/monc: handle_auth_request() doesn't depend on active_con. Reviewed-by: Kefu Chai <kchai@redhat.com>

crimson/seastore: add stub to introduce get_mapping() without length

6f4b296

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>

crimson/seastore: implement and test get_mapping(t, laddr)

c165a28

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>

crimson/seastore: adopt get_mapping(t, offset) interface

88a41c3

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>

Merge pull request ceph#41582 from cyx1231st/wip-seastore-swap-read-e…

0331281

…xtent crimson/seastore: introduce and adopt LBAManager::get_mapping(t, offset) Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request ceph#41573 from tchaikov/wip-allocat-ctor

2ba0f48

os/bluestore: pass string_view to ctor of Allocator Reviewed-by: Igor Fedotov <ifedotov@suse.com>

Merge pull request ceph#41520 from tchaikov/wip-osd-unique-ptr

2a35c56

os: let ObjectStore::create() return unique_ptr<> Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Merge pull request ceph#41278 from sebastian-philipp/mgr-cephadm-set-…

2ecb738

…user-no-hosts mgr/cephadm: Don't call _check_host without hosts Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com> Reviewed-by: Adam King <adking@redhat.com>

Merge pull request ceph#41563 from cybozu/rgw-add-the-description-of-…

0cebfae

…blocking-io-during-index-resharding rgw: add the description of blocking io during index resharding Reviewed-by: Matt Benjamin <mbenjamin@redhat.com> Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>

feat: add proton library to amqp1.0 bucket notification

ec5935c

fix: unittest dependency and build errors

5fc6886

wangxuw closed this Jun 15, 2021

wangxuw wants to merge 214 commits into master from gsoc-amqp1

20 participants
