mgr/cephadm: Manage /etc/ceph/ceph.conf#35576
Conversation
Force-pushed from 094c4e4 to dd53952
mgfritch left a comment
Couple thoughts/questions:
- seems like this would break any host with a multi-cluster setup?
- what about `ceph.client.admin.keyring` or `ceph.pub`?
- if the action fails, should it be retried on the next host check?
- what if the `ceph.conf` file was changed manually? detect this?
- add an action specific to sync'ing `ceph.conf` on demand?
```python
self.prometheus_alerts_path = ''
self.migration_current = None
self.config_dashboard = True
self.manage_etc_ceph_ceph_conf = True
```
```suggestion
self.manage_etc_ceph_ceph_conf = False
```
change this to match the default value in the MODULE_OPTION ?
that kills the multi-cluster support. don't know if we want that.
```python
out, err, code = remoto.process.check(
    conn,
    ['mkdir', '-p', '/etc/ceph'])
```
what if a different output dir was used during bootstrap (e.g. vstart)?
$ cephadm bootstrap ... --output-dir ~/ceph/build/
Under no circumstances would I automatically enable `manage_etc_ceph_ceph_conf` if output-dir is something different than `/etc/ceph`.
definitely.
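That guard could be as simple as a path comparison. A hypothetical sketch (the function name is illustrative, not cephadm code):

```python
import os

DEFAULT_CONF_DIR = '/etc/ceph'

def may_auto_enable(output_dir):
    """Only consider auto-enabling manage_etc_ceph_ceph_conf when the
    bootstrap output dir is the default /etc/ceph."""
    return os.path.normpath(os.path.expanduser(output_dir)) == DEFAULT_CONF_DIR
```

A vstart-style `--output-dir ~/ceph/build/` would then never flip the flag on.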
deploying the admin keyring? hm. maybe. might be an idea, but on which hosts? Placement spec? @ricardoasmarques + @fmount wdyt?
I think we need this in any case. Maybe I'd make this a manual step. I really don't want to head into the realm of automatically deploying non-containerized stuff.
I'd simply overwrite it.
Speaking about the OpenStack context with Director deployed Ceph cluster, the mons/mgrs are 3 and they are collocated into the Controller nodes.
Right now we have this role [2] that is able to get the relevant data from the first controller (or mon) and sync them to the other existing monitors, but we still have the problem of updating
Agree with that
Manual operations should be avoided and we need a way to make the config consistent, but there are useful options [3] that can be configured on a new deployment (e.g. HCI environments).

[1] https://github.com/fmount/tripleo-ceph/blob/master/roles/tripleo_cluster_set_container_cli/tasks/set_container_cli.yaml#L13
Interesting. Onto which hosts should the admin keyring be distributed? At least I'd think this might be independent of the MONs.
Actually you can't properly manage the ceph.conf at all, simply because you're not getting notified when a new MON enters the cluster. Which files are you interested in, other than the ceph.conf?
OK, things get interesting now. If you add a new host to the cluster, cephadm will:
Is this sufficient and ok for a client machine? Otherwise you'll need to distribute the ceph.conf yourself.
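Concretely, such a client ceph.conf can be very small: just enough to locate the MONs. A sketch with illustrative values:

```ini
# minimal client ceph.conf: just the cluster fsid and the MON addresses
[global]
    fsid = 00000000-1111-2222-3333-444444444444
    mon_host = [v2:10.0.0.1:3300/0,v1:10.0.0.1:6789/0] [v2:10.0.0.2:3300/0,v1:10.0.0.2:6789/0]
```

Everything else can live in the centralized config database once the client can reach a MON.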
By this point, ceph.conf should be nearly empty and should contain only the information to access the other MONs. If you need anything, please use
Force-pushed from fd9426e to 46b9dfb
Right now, this is IMO safe to merge, as it is disabled by default behind a feature flag. That enables us to improve it later on (e.g. enable it by default, change the behaviour, etc.).
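Such a feature flag is declared in the module's option table. A hedged sketch of the entry, using the dict-style `MODULE_OPTIONS` declaration mgr modules use (the exact entry in this PR may differ):

```python
# Sketch of a mgr module option entry; the default of False is the
# feature flag that keeps the behaviour off until explicitly enabled.
MODULE_OPTIONS = [
    {
        'name': 'manage_etc_ceph_ceph_conf',
        'type': 'bool',
        'default': False,
        'desc': 'manage and own /etc/ceph/ceph.conf on all hosts',
    },
]
```

With `default: False`, existing multi-cluster hosts are untouched unless an operator opts in.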
In general OpenStack collocates monitors and mgrs in controller nodes where the ctlplane is found, so at least monitors should have all the relevant data to help operators interact w/ Ceph cluster. The undercloud isn't able to reach the StorageNetwork on the overcloud, so monitors are the first entrypoint for Director deployed clusters.
Not sure right now. ceph.conf and keys are required when a new mon is scheduled, but I need to understand what the gap is compared to
For a client node != mons it's ok; I'd like to copy and sync that stuff myself, since it's extra logic for cephadm that cannot cover all the potential scenarios. What I'm not sure about right now is why you deploy node-exporter and ceph-crash by default on a client node (it can be external to the Ceph cluster).
Ack, I'll try to push config using this approach
[1] https://github.com/fmount/tripleo-ceph/blob/master/roles/tripleo_cluster_mon_config/tasks/set_monitor_public_network.yaml
sounds reasonable.
ok. I'm going to merge this as it is. I think we can consider adding new things to distribute later on.
that can be configured by changing the placement specification of the services. Deploying them on all known hosts is the default right now.
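For example, a service spec could restrict node-exporter to labelled hosts instead of the all-hosts default (the label name here is illustrative):

```yaml
# apply with: ceph orch apply -i node-exporter.yaml
service_type: node-exporter
placement:
  label: monitoring   # only hosts carrying this label get the daemon
```

A client node without the label would then get no node-exporter or ceph-crash daemons.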
jenkins retest this

jenkins test make check

jenkins test docs

jenkins test make check
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
reason is, we want to use this hook to schedule a ceph.conf update for all hosts.
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
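The hook idea can be sketched as a mgr module reacting to cluster map changes: `notify()` mirrors the ceph-mgr `MgrModule.notify()` interface, while the class and helper below are simplified stand-ins, not the actual cephadm code:

```python
class OrchestratorSketch:
    """Simplified stand-in for the cephadm mgr module."""

    def __init__(self):
        self.conf_refresh_queued = False

    def notify(self, notify_type, notify_id):
        # ceph-mgr calls notify() on cluster events; a changed mon map
        # means every distributed /etc/ceph/ceph.conf is now stale.
        if notify_type == 'mon_map':
            self._queue_etc_ceph_conf_refresh()

    def _queue_etc_ceph_conf_refresh(self):
        # real code would mark hosts dirty and let the serve() loop push
        # the new conf; here we only record the intent
        self.conf_refresh_queued = True
```

This is what manual syncing cannot do: an external script never learns that a new MON joined, while the mgr is told via exactly this hook.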
Force-pushed from 46b9dfb to c18ad7c
mgfritch left a comment
this still somewhat abuses the host refresh, when I think a dedicated `orch host update` .. or similar command might be better ...
otherwise, lgtm
TODO

See ceph/qa/tasks/cephadm.py, lines 961 to 964 in b2de27b.
Considerations

- `error_ok=True`, as it detours the normal python error handling, making it necessary to add special code for it.

To enable the management of ceph.conf:

```
ceph config set mgr mgr/cephadm/manage_etc_ceph_ceph_conf true
```

Checklist
Show available Jenkins commands

- jenkins retest this please
- jenkins test classic perf
- jenkins test crimson perf
- jenkins test signed
- jenkins test make check
- jenkins test make check arm64
- jenkins test submodules
- jenkins test dashboard
- jenkins test dashboard backend
- jenkins test docs
- jenkins render docs
- jenkins test ceph-volume all
- jenkins test ceph-volume tox