mgr/orchestrator: Fix ceph orch ls in Rook (#39612)
sebastian-philipp merged 1 commit into ceph:master
Conversation
@varshar16 I know that you looked into adding a Teuthology test a while ago. Did you make any progress, or did it stall?
    'prometheus': 'prometheus',
    'node-exporter': 'node-exporter',
    'crash': 'crash',
    'crashcollector': 'crashcollector',  # Specific Rook Daemon
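For context, the mapping being reviewed could be sketched as below. This is a minimal sketch: only the four entries come from the diff, while the dict name `ROOK_DAEMON_TYPES` and the helper function are assumed placeholders, not the actual mgr/rook code.

```python
# Hypothetical sketch of the daemon-type mapping under review.
# Only the four entries are from the diff; the names here are placeholders.
ROOK_DAEMON_TYPES = {
    'prometheus': 'prometheus',
    'node-exporter': 'node-exporter',
    'crash': 'crash',
    'crashcollector': 'crashcollector',  # specific Rook daemon
}

def daemon_type_for(rook_app: str) -> str:
    """Return the orchestrator daemon type for a Rook app label.

    Unknown labels pass through unchanged.
    """
    return ROOK_DAEMON_TYPES.get(rook_app, rook_app)
```

The point of contention below is the last entry: whether `crashcollector` should map to itself (exposing Rook's name) or to `crash` (the orchestrator's name for the same service).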
Hm, maybe we should change mgr/rook to return crash as the daemon type for crashcollector? wdyt?
Rook's crashcollector app takes the role of the crash service type of the orchestrator interface. As both service types fill the same role, my thinking was to keep them aligned at the orchestrator level.
This amounts to "masking" reality. Two bad consequences of that:
- We lose (or make more difficult) the ability to customize/change different things for the "crashcollector" pod and the "crash" daemon.
- The k8s user is going to wonder what happened to the crashcollector pod, so we will need to explain this in the documentation.
I cannot see a big improvement or advantage in masking the "crashcollector" pod.
Have a look at
https://github.com/ceph/ceph/blob/master/src/python-common/ceph/deployment/service_spec.py#L390-L392
Rook is deploying the very same crash daemon as cephadm does. Having two crash service types in that list is a bit awkward.
Yes, you are right: we are doing a conversion from crashCollector to crash ...
ceph/src/pybind/mgr/rook/module.py
Lines 298 to 307 in 8e09ee2
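The conversion referenced above (reporting Rook's crashcollector pods under the orchestrator's crash service type) could look roughly like the sketch below. The function and variable names are assumptions for illustration, not the actual module.py code at those lines.

```python
# Hypothetical sketch of the crashcollector -> crash normalization
# discussed in this thread; names are placeholders, not real mgr/rook code.
def normalize_service_type(rook_app_label: str) -> str:
    """Map a Rook app label to the orchestrator service type.

    Rook labels its crash-dump collector pods "crashcollector", while
    the orchestrator interface calls the same service "crash"; all
    other labels pass through unchanged.
    """
    if rook_app_label == 'crashcollector':
        return 'crash'
    return rook_app_label
```

With a conversion like this, a pod that k8s shows as rook-ceph-crashcollector-* is listed by `ceph orch ls` under the "crash" service, which is exactly the mismatch described next.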
This means that we have the normal output from the orchestrator:
[root@rook-ceph-tools-78cdfd976c-dfs5c /]# ceph orch ls
NAME RUNNING REFRESHED AGE PLACEMENT IMAGE NAME IMAGE ID
crash 3/3 0s ago 71m * jolmomar/ceph:rook 9a420a7fb11e
mgr 1/1 0s ago 65m count:1 jolmomar/ceph:rook 9a420a7fb11e
mon 3/3 0s ago 71m count:3 jolmomar/ceph:rook 9a420a7fb11e
But the user must guess that the crashcollector pods visible in k8s are the crash daemons shown in the ceph orch command output:
[jolmomar@juanmipc ceph]$ kubectl -n rook-ceph get pods
NAME READY STATUS RESTARTS AGE
...
rook-ceph-crashcollector-ku-master-00.jm-5fc96cdf46-wq7lq 1/1 Running 0 84m
rook-ceph-crashcollector-ku-worker-00.jm-87449fc54-jf44p 1/1 Running 0 83m
rook-ceph-crashcollector-ku-worker-01.jm-dd855f587-t6rmr 1/1 Running 0 89m
rook-ceph-mgr-a-5c8554f57b-lrwk9 1/1 Running 0 83m
rook-ceph-mon-a-579657c4f-7dchd 1/1 Running 0 89m
rook-ceph-mon-b-65946c794d-9xh8j 1/1 Running 0 84m
rook-ceph-mon-c-6446cf77dd-wdvx2 1/1 Running 0 84m
rook-ceph-operator-559b6fcf59-x54tl 1/1 Running 0 91m
rook-ceph-osd-0-c7b6956df-fgnv2 1/1 Running 0 83m
rook-ceph-osd-1-6755647479-zlbwz 1/1 Running 0 83m
rook-ceph-osd-2-b48759bb9-2zqls 1/1 Running 0 83m
...
It was stalled as I was fixing mgr/rook bugs. Since most of them are fixed now, I will start looking into testing again.
jenkins test make check
Force-pushed 355acc3 to 767d8c9
Fixes: https://tracker.ceph.com/issues/49411
Signed-off-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Force-pushed 767d8c9 to d070cae
Fixes: https://tracker.ceph.com/issues/49411
Signed-off-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Checklist
Available Jenkins commands:
- jenkins retest this please
- jenkins test classic perf
- jenkins test crimson perf
- jenkins test signed
- jenkins test make check
- jenkins test make check arm64
- jenkins test submodules
- jenkins test dashboard
- jenkins test api
- jenkins test docs
- jenkins render docs
- jenkins test ceph-volume all
- jenkins test ceph-volume tox