common,mon,osd: unify 'ceph tell' and 'ceph daemon' command sets#30217

Merged

liewegas merged 51 commits intoceph:masterfrom

liewegas:wip-asok-tell

Oct 6, 2019

Member

liewegas commented Sep 6, 2019 •

edited

Loading

Allow 'ceph tell' and 'ceph daemon' to access the same set of commands, making things more flexible for the user and getting rid of a lot of duplication of commands and surrounding infrastructure in the daemons.

prereqs

future?

rev asok protocol to handle return value/error code
MgrClient: let mgr tell target any mgr? this doesn't work now because MgrStandby doesn't even bind.. we only bind when becoming the active mgr.

liewegas added the core label

liewegas changed the title ~~unify 'ceph tell' and 'ceph daemon' command sets~~ WIP: unify 'ceph tell' and 'ceph daemon' command sets

liewegas force-pushed the wip-asok-tell branch from 54f8f30 to 1d438ce Compare

September 6, 2019 21:03

liewegas added the wip-sage-testing label

liewegas force-pushed the wip-asok-tell branch from 28fc47f to 38a986b Compare

September 9, 2019 18:13

tchaikov self-requested a review

September 10, 2019 14:34

liewegas force-pushed the wip-asok-tell branch 3 times, most recently from 5f5dafa to 519d1ef Compare

September 11, 2019 15:43

batrick previously requested changes

View reviewed changes

src/common/admin_socket.cc Outdated Show resolved Hide resolved

src/common/admin_socket.cc Outdated Show resolved Hide resolved

src/common/admin_socket.h Outdated

+                  std::function<void(int,const std::string&,bufferlist&)> on_finish) {
+                  // by default, call the synchronous handler and then finish
+                  bufferlist out;
+                  int r = call(command, cmdmap, format, out);

Member

batrick Sep 11, 2019

This would seem like a good opportunity to standardize on JSON (as discussed in CLT) and transition to some standard output format. We could have a standard return code (as an int), error message, ceph version, etc. Even if it's not trivial to convert all commands to take a JSONFormatter, we could try to decode the bufferlist as json and then stick it in some output:

{rc=0,errmsg=null,output={"custom":"output"},version="ceph version..."}

otherwise if it doesn't decode:

{rc=0,errmsg=null,output="output string that is not JSON",version="ceph version..."}

What do you think?

It'd also be interesting to consider that some commands may run asynchronously and we could eventually have a way to query running commands and wait to get output.

Member Author

liewegas Sep 11, 2019

I could see us going so far as to pass a Formatter* to these call methods. My main reservation is that JSON (at least) isn't fully expressive, so if we ever want to return binary data (say, a binary dump of the osdmap), we can't.

I'm inclined to go command-by-command and make sure what we have now is dumping JSON wherever possible?

Member

batrick Oct 2, 2019

Sorry didn't see the notification for your response.

I could see us going so far as to pass a Formatter* to these call methods. My main reservation is that JSON (at least) isn't fully expressive, so if we ever want to return binary data (say, a binary dump of the osdmap), we can't.

Agreed. Formatter makes more sense in that case.

I'm inclined to go command-by-command and make sure what we have now is dumping JSON wherever possible?

I'd also advocate for a better data description language, maybe Lua? I could whip together a formatter for Lua if there's interest.

liewegas force-pushed the wip-asok-tell branch from 0f70bda to f20dd59 Compare

September 12, 2019 12:32

liewegas removed the wip-sage-testing label

liewegas force-pushed the wip-asok-tell branch 5 times, most recently from 8cad770 to 2a92646 Compare

September 24, 2019 18:34

liewegas force-pushed the wip-asok-tell branch 2 times, most recently from 5f58ff7 to 7b586b9 Compare

September 29, 2019 19:57

liewegas changed the title ~~WIP: unify 'ceph tell' and 'ceph daemon' command sets~~ common: unify 'ceph tell' and 'ceph daemon' command sets

liewegas changed the title ~~common: unify 'ceph tell' and 'ceph daemon' command sets~~ common,mon,osd: unify 'ceph tell' and 'ceph daemon' command sets

liewegas force-pushed the wip-asok-tell branch 2 times, most recently from aab9c63 to 6cf2d6c Compare

October 1, 2019 21:29

liewegas requested a review from batrick

October 1, 2019 21:30

liewegas added 6 commits

October 1, 2019 16:30


          common/admin_socket: drop unused args

675bbf3

Signed-off-by: Sage Weil <sage@redhat.com>


          include/compat: add flags arg to pipe_cloexec

058aa90

This matches pipe(2).

Signed-off-by: Sage Weil <sage@redhat.com>


          common/admin_socket: use pipe for general wakeup

aed0fe4

Use the pipe to wake up the thread.  Use a separate bool to signal a
shutdown.

Signed-off-by: Sage Weil <sage@redhat.com>


          common/admin_socket: pass cmdvec to execute_command

32ad669

Signed-off-by: Sage Weil <sage@redhat.com>


          common/admin_socket: add ability to process MCommand via asok queue

a3bf24d

Signed-off-by: Sage Weil <sage@redhat.com>


          common/admin_socket: simplify command routing

e616bec

Back in e30e937 we made it possible to
route a command via any prefix. This worked when we wanted to pass
arguments but were just dealing with a vector<string>.  These days we have
an actual prefix followed by named arguments, so we don't need this
ad hoc routing.

Derive the prefix from the cmddesc at registration time, and match that
explicitly against the prefix at execution time.

Signed-off-by: Sage Weil <sage@redhat.com>

liewegas added 20 commits

October 4, 2019 09:07


          mgr/MgrClient: empty target string for 'tell' means active mgr

22b0a99

This lets the caller say "tell the active mgr", whoever it may be.

Signed-off-by: Sage Weil <sage@redhat.com>


          PendingReleaseNotes: notes about asok changes

5d4fe85

Signed-off-by: Sage Weil <sage@redhat.com>


          mon: print entity_name along with caps to debug log

e1163b4

Signed-off-by: Sage Weil <sage@redhat.com>


          test/pybind/test_rados: fix accidental mon tell test

215d267

This test assumed 'mon dump' (a CLI command) would work when targetted at
a specific mon (i.e., a tell command).

Signed-off-by: Sage Weil <sage@redhat.com>


          common/ceph_context: return error code from asok commands

Several of these commands (config set, injectargs, etc.) can return errors.
Do so.

Signed-off-by: Sage Weil <sage@redhat.com>


          rgw: fix asok return value

ee4ba58

Signed-off-by: Sage Weil <sage@redhat.com>


          os/bluestore: fix asok hook return value

7aab81c

Signed-off-by: Sage Weil <sage@redhat.com>


          common/admin_socket: pass ostream to call() for error output

a4c0773

This separates the error stream from the output stream for the synchronous
hook.

This patch includes misc cleanup in the various implementations to make
use of the new stream for errors.

Add a test to unittest_context to ensure we're getting the error stream.

Signed-off-by: Sage Weil <sage@redhat.com>


          common/admin_socket: pass Formatter from generic infrastructure

adf1486

The implementation can choose to either use the provided Formatter, or
put something directly into outbl.  The implementation may choose to
flush the formatter to the output buffer|stream, or let the caller do it
for them (usually the latter).

Lots of fiddling/cleanup in the implementations to make this build,
including dropping the (seeminlyg unused?) ostream& output mode for
the librbd asok implementations.

Signed-off-by: Sage Weil <sage@redhat.com>


          qa/workunits/cephtool/test.sh: fix tell output error check

05bc99f

It goes to stderr instead of stdout now.

Signed-off-by: Sage Weil <sage@redhat.com>


          common/config: EPERM setting config option after startup

0f47d10

Not sure why ENOSYS was chosen before...

Signed-off-by: Sage Weil <sage@redhat.com>


          qa/suites/rados/singleton/all/admin-socket: fix test

7b644f5

We can't set the filestore setting because filestore isn't active and so
the option isn't observed, so it isn't changeable.

Signed-off-by: Sage Weil <sage@redhat.com>


          qa/tasks/dump_stuck: mon.a, not mon.0

4a48974

Signed-off-by: Sage Weil <sage@redhat.com>


          common/ceph_context: make 'config unset' idempotent

9585dd0

Signed-off-by: Sage Weil <sage@redhat.com>


          qa/workunits/cephtool: fix errpr checks for 'ceph daemon' commands

99f9406

Error output has changed slightly.

Signed-off-by: Sage Weil <sage@redhat.com>


          mgr,mgr/MgrClient: use fsid to signal mon-mgr vs cli MCommands

6f35d28

We can't use the feature bit for the MCommand connection to tell whether
it is a tell or CLI command because new clients may have to send CLI
commands via MCommand for old clusters, and they don't always know whether
this mgr is new or old yet.

Prior to octopus, MCommand contained a mon/mgr CLI command, and did not
have the fsid field set.  Start populating the fsid field, and use this
to signal whether a client is a new MgrClient that knows MCommand vs
MMgrCommand.  If we get an MCommand with the fsid set, that means it is
a tell command; otherwise, it's an old client sending a CLI command.

Signed-off-by: Sage Weil <sage@redhat.com>


          mon/MonClient: interpret numeric mon target name as rank

62f33d5

This allows us to behave when a rank is passed to mon_command(..., target=),
which will call rados_mon_command_target() -> MonClient::start_mon_command
with a (string) target name.

We could make an integer variant of rados_mon_command_target, and do the
int vs string differentiation in python, but this is much easier.

Signed-off-by: Sage Weil <sage@redhat.com>


          src/.gitignore: ignore .tox

cc8bd47

Signed-off-by: Sage Weil <sage@redhat.com>


          mon/MonClient: fix sending mon command to a specific rank

dbb53e6

The target_name check was triggering when target_name was empty (and
target_rank >= 0).

Fixes 4ef0502

Signed-off-by: Sage Weil <sage@redhat.com>


          crimson: common/admin_socket kludge so that it builds

a17e889

Signed-off-by: Sage Weil <sage@redhat.com>

liewegas force-pushed the wip-asok-tell branch from 0d2107a to a17e889 Compare

October 4, 2019 14:07

liewegas dismissed batrick’s stale review

October 4, 2019 14:07

addressed

Member Author

liewegas commented Oct 5, 2019

retest this please

liewegas added a commit that referenced this pull request


          Merge PR #30217 into master

bbc7bb5

* refs/pull/30217/head:
	crimson: common/admin_socket kludge so that it builds
	mon/MonClient: fix sending mon command to a specific rank
	src/.gitignore: ignore .tox
	mon/MonClient: interpret numeric mon target name as rank
	mgr,mgr/MgrClient: use fsid to signal mon-mgr vs cli MCommands
	qa/workunits/cephtool: fix errpr checks for 'ceph daemon' commands
	common/ceph_context: make 'config unset' idempotent
	qa/tasks/dump_stuck: mon.a, not mon.0
	qa/suites/rados/singleton/all/admin-socket: fix test
	common/config: EPERM setting config option after startup
	qa/workunits/cephtool/test.sh: fix tell output error check
	common/admin_socket: pass Formatter from generic infrastructure
	common/admin_socket: pass ostream to call() for error output
	os/bluestore: fix asok hook return value
	rgw: fix asok return value
	common/ceph_context: return error code from asok commands
	test/pybind/test_rados: fix accidental mon tell test
	mon: print entity_name along with caps to debug log
	PendingReleaseNotes: notes about asok changes
	mgr/MgrClient: empty target string for 'tell' means active mgr
	common/admin_socket: report error code as part of output string
	osd: change trigger_[deep_]scrub tommands to a pg tell command
	osd: remove old command workqueue, threadpool
	osd: drop MMonCommand handling
	osdc/Objecter: resend OSD tell commands on EAGAIN
	osd: route tell commands to asok; migrate commands
	osd: use unique_ptr<Formatter> for asok_command
	common/ceph_context: add generic asok 'injectargs'
	common/admin_socket: allow dup prefixes
	common/admin_socket: refactor with sync and async execute_command variants
	common/admin_socket: pass input bufferlist
	osd: transition to call_async() for asok
	common/admin_socket: support alternative call_async()
	mon/MonClient: send tell commands out of band via MCommand
	mon: accept tell commands via MCommand and send them to asok handler
	common/admin_socket: return int from hook call()
	mgr/DaemonServer: route MCommand (for octopus+) to asok commands
	do not use 'ceph tell mgr'
	pybind/ceph_argparse: disambiguate mgr tell and CLI commands
	ceph: make 'ceph tell mgr.*' send to the active mgr
	ceph: send 'ceph tell mgr.X' to the right mgr
	librados: add rados_mgr_command_target
	mgr/MgrClient: add start_command variant that takes a target
	common/admin_socket: drop unregister_command(); use per-hook variant
	common/admin_socket: drop explicit prefix arg to register_command
	common/admin_socket: simplify command routing
	common/admin_socket: add ability to process MCommand via asok queue
	common/admin_socket: pass cmdvec to execute_command
	common/admin_socket: use pipe for general wakeup
	include/compat: add flags arg to pipe_cloexec
	common/admin_socket: drop unused args

Reviewed-by: Neha Ojha <nojha@redhat.com>

liewegas merged commit a17e889 into ceph:master

liewegas deleted the wip-asok-tell branch

October 6, 2019 14:09

Contributor

tchaikov commented Oct 21, 2019

i suspect that this change breaks the test of ceph_test_admin_socket_output . see https://tracker.ceph.com/issues/42387

Member Author

liewegas commented Oct 21, 2019

i suspect that this change breaks the test of ceph_test_admin_socket_output . see https://tracker.ceph.com/issues/42387

It does; see #30859

phlogistonjohn mentioned this pull request

The MonCommandTarget & MonCommandTargetWithInputBuffer functions behave differently on ceph octopus ceph/go-ceph#291

Closed

tchaikov mentioned this pull request

crimson/osd: fix pg prefix error #46627

Merged

14 tasks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core