mgr/dashboard: provide the service events when showing a service in the UI #40328
epuertat merged 1 commit into ceph:master from …
Conversation
Force-pushed 40e7db3 to 09a0b5c
@aaSharma14 sorry for aborting your "make check" test; it would fail anyway, as I am testing the "make check" run on the focal test node.
No issues @tchaikov. Thanks for informing me.
jenkins test make check
src/pybind/mgr/dashboard/frontend/src/app/shared/pipes/newline.pipe.ts (Outdated; comment resolved)
Force-pushed 09a0b5c to c5f89f3
@pcuzner we'd love to hear your thoughts about this one!
Great to see this! A couple of questions.
Force-pushed fa7f7ee to eed7e16
No specific reason for this. I just thought that "Logs" might be a more familiar term for the user. However, I've changed it to "Events" now.
One cell would contain all the entries; each new event starts on a new line. As in this screenshot, there are two daemon events for one daemon; the same would apply to multiple service events.
I have updated the screenshot for Daemon events with the content. It is basically the events that we are getting from
jenkins test make check
@aaSharma14 thanks!
</cd-table>
</div>

<ng-template #elseBlock>
IMHO "elseBlock" is an undescriptive name for the template, maybe sth like "serviceDetailsTpl" would be better.
Force-pushed 6eaf461 to 18d33e4
epuertat left a comment:
The screenshot looks great (I know there's a lot of work here, so thank you @aaSharma14!), but I think the code generating it still needs some refinement.
I added my comments on the performance. It'd be great if you could provide the ng.profiler.timeChangeDetection({record: true}) stats before & after this change, so we can see how much worse this gets.
I still think that splitting the log lines in the back-end is preferable (especially when it's already done there), but exploring the OnPush strategy is a must, as is trying a template cell instead of the current Pipe approach.
...rontend/src/app/ceph/cluster/services/service-daemon-list/service-daemon-list.component.html (Outdated; comment resolved)
...rontend/src/app/ceph/cluster/services/service-daemon-list/service-daemon-list.component.html (Outdated; comment resolved)
osdspec_affinity: Optional[str] = None,
last_deployed: Optional[datetime.datetime] = None,
events: Optional[List['OrchestratorEvent']] = None,
event_dict: Optional[List[dict]] = None,
@avanthakkar, the idea of something like to_json(True) returning anything other than a JSON string makes me shiver... We shouldn't bury ad-hoc functionality inside the existing methods... It costs nothing to extract the common code and to come up with 2 separate methods (to_json() and to_dict()), but we should dispel the idea of reusing functions for doing many unrelated things.
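The separation described here can be sketched as follows (a minimal illustration; the class and fields are trimmed-down stand-ins, not the real orchestrator code):

```python
import json


class OrchestratorEvent:
    """Trimmed-down illustration of the two-method split."""

    def __init__(self, subject: str, message: str) -> None:
        self.subject = subject
        self.message = message

    def to_dict(self) -> dict:
        # Structured data, for callers that post-process the event.
        return {'subject': self.subject, 'message': self.message}

    def to_json(self) -> str:
        # A to_json() worthy of the name returns text, built on to_dict().
        return json.dumps(self.to_dict())
```

This keeps to_json(True)-style flags out of the API: each method does exactly one thing, and the common code lives in to_dict().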
src/pybind/mgr/dashboard/frontend/src/app/shared/pipes/text-to-label.pipe.ts (Outdated; comment resolved)
.../frontend/src/app/ceph/cluster/services/service-daemon-list/service-daemon-list.component.ts (Outdated; comment resolved)
src/pybind/mgr/dashboard/frontend/src/app/shared/pipes/list-to-html.pipe.spec.ts (Outdated; comment resolved)
.../frontend/src/app/ceph/cluster/services/service-daemon-list/service-daemon-list.component.ts (Outdated; comment resolved)
Force-pushed 18d33e4 to 8f97669
Force-pushed 8f97669 to 2dda611
epuertat left a comment:
Looks really improved now (IMHO)! What do you think, @aaSharma14? I just left a few extra improvements.
...rontend/src/app/ceph/cluster/services/service-daemon-list/service-daemon-list.component.html (Outdated; comment resolved)
...rontend/src/app/ceph/cluster/services/service-daemon-list/service-daemon-list.component.scss (Outdated; comment resolved)
status = {
    'container_image_id': self.container_image_id,
    'container_image_name': self.container_image_name,
    'rados_config_location': self.rados_config_location,
    'service_url': self.service_url,
    'size': self.size,
    'running': self.running,
    'last_refresh': self.last_refresh,
    'created': self.created,
    'virtual_ip': self.virtual_ip,
    'ports': self.ports if self.ports else None,
}
for k in ['last_refresh', 'created']:
    if getattr(self, k):
        status[k] = datetime_to_str(getattr(self, k))
status = {k: v for (k, v) in status.items() if v is not None}
Maybe it's just me but I prefer to have clear declarative data struct definitions rather than ones that are dynamically stretched:
Suggested change — from:

status = {
    'container_image_id': self.container_image_id,
    'container_image_name': self.container_image_name,
    'rados_config_location': self.rados_config_location,
    'service_url': self.service_url,
    'size': self.size,
    'running': self.running,
    'last_refresh': self.last_refresh,
    'created': self.created,
    'virtual_ip': self.virtual_ip,
    'ports': self.ports if self.ports else None,
}
for k in ['last_refresh', 'created']:
    if getattr(self, k):
        status[k] = datetime_to_str(getattr(self, k))
status = {k: v for (k, v) in status.items() if v is not None}

to:

status = {
    'container_image_id': self.container_image_id,
    'container_image_name': self.container_image_name,
    'rados_config_location': self.rados_config_location,
    'service_url': self.service_url,
    'size': self.size,
    'running': self.running,
    'last_refresh': datetime_to_str(self.last_refresh),
    'created': datetime_to_str(self.created),
    'virtual_ip': self.virtual_ip,
    'ports': self.ports if self.ports else None,
}
# REMOVE the following?
# status = {k: v for (k, v) in status.items() if v is not None}
Why do we remove the keys that are None? For data sharing it's better to have a stable/fixed data structure than one that randomly shows some properties...
@sebastian-philipp @jmolmo what do you think?
Same here. This is new code. We should deliver clean new code, and not follow suboptimal code.
@epuertat your suggestion is also favoring explicit over implicit, which is something we should follow :)
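To make the trade-off concrete, here is a small runnable sketch (the datetime_to_str helper is a simplified stand-in for ceph's real one, and the field set is hypothetical): the mutate-then-filter style produces a schema that changes with the data, while the declarative style keeps every key present.

```python
import datetime
from typing import Optional


def datetime_to_str(dt: Optional[datetime.datetime]) -> Optional[str]:
    # Simplified stand-in for the real orchestrator helper.
    return dt.strftime('%Y-%m-%dT%H:%M:%S.%fZ') if dt else None


def status_dynamic(size=None, last_refresh=None) -> dict:
    # Mutate-then-filter: keys silently vanish when values are None.
    status = {'size': size, 'last_refresh': last_refresh}
    for k in ['last_refresh']:
        if status[k]:
            status[k] = datetime_to_str(status[k])
    return {k: v for k, v in status.items() if v is not None}


def status_declarative(size=None, last_refresh=None) -> dict:
    # Fixed schema: consumers always see the same keys.
    return {
        'size': size,
        'last_refresh': datetime_to_str(last_refresh),
    }


print(status_dynamic())      # {}
print(status_declarative())  # {'size': None, 'last_refresh': None}
```

A consumer of status_dynamic() has to defend against missing keys; a consumer of status_declarative() only has to handle None values, which a schema can express explicitly.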
Force-pushed 2dda611 to 45d7611
Force-pushed 9115c28 to 22da72f
epuertat left a comment:
More suggestions going there! :D
<div *ngIf="value.length != 0 && value != undefined"
     class="ul-margin">
  <ul *ngFor="let event of value">
Doesn't this work without the previous div? I'd assume that if value.length is 0 or undefined or null, the ngFor won't be displayed either:
Suggested change — from:

<div *ngIf="value.length != 0 && value != undefined"
     class="ul-margin">
  <ul *ngFor="let event of value">

to:

<ul *ngFor="let event of events" class="ul-margin">
Additionally, ngFors can be computationally expensive (on every change-detection cycle, Angular will probably re-generate the whole loop). To avoid that, ngFor has an extra option, trackBy, which is a function (perhaps an inline one works) returning a unique field per item so Angular can tell whether it changed (in our case, do we have a uuid for each event? perhaps the timestamp?).
BTW, I'd also suggest renaming value to events, as it makes clearer what the variable is about.
Thanks @epuertat. I have updated this with a trackBy function and changed value to events. Regarding removing the previous div: I tried removing it, but then it shows both the event details and 'No data available', so I decided to keep it as is.

def to_dict(self) -> dict:
    out: Dict[str, Any] = OrderedDict()
    out['daemon_type'] = self.daemon_type
    out['daemon_id'] = self.daemon_id
    out['hostname'] = self.hostname
    out['container_id'] = self.container_id
    out['container_image_id'] = self.container_image_id
    out['container_image_name'] = self.container_image_name
    out['container_image_digests'] = self.container_image_digests
    out['memory_usage'] = self.memory_usage
    out['memory_request'] = self.memory_request
    out['memory_limit'] = self.memory_limit
    out['version'] = self.version
    out['status'] = self.status.value if self.status is not None else None
    out['status_desc'] = self.status_desc
    if self.daemon_type == 'osd':
        out['osdspec_affinity'] = self.osdspec_affinity
    out['is_active'] = self.is_active
    out['ports'] = self.ports
    out['ip'] = self.ip

    for k in ['last_refresh', 'created', 'started', 'last_deployed',
              'last_configured']:
        if getattr(self, k):
            out[k] = datetime_to_str(getattr(self, k))

    if self.events:
        out['events'] = [e.to_dict() for e in self.events]

    empty = [k for k, v in out.items() if v is None]
    for e in empty:
        del out[e]
    return out
And same for this, sorry!

return {
    'daemon_type': ...

If we drop support for Python 3.6 (I think CentOS 8 supports 3.8), we could use dataclasses.asdict().
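If the minimum Python version does move past 3.6, the dataclasses approach could look like this (a sketch with a trimmed-down, hypothetical field set, not the real DaemonDescription; dataclasses requires Python 3.7+):

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class DaemonDescription:
    # Illustrative subset of fields only.
    daemon_type: str
    daemon_id: str
    hostname: Optional[str] = None


d = DaemonDescription(daemon_type='osd', daemon_id='3', hostname='node1')
print(dataclasses.asdict(d))
# {'daemon_type': 'osd', 'daemon_id': '3', 'hostname': 'node1'}
```

dataclasses.asdict() walks nested dataclasses too, so an events field holding dataclass instances would be serialized recursively without hand-written loops.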
This refactoring actually requires a lot of code change in the cephadm code. There are a number of tests that are impacted by this. It would be better if we take this as an improvement in a separate PR as those changes would be unrelated to the purpose that this PR is serving.
You mean the dataclasses? Because a refactoring like the one suggested shouldn't impact any existing tests, given this is a new method and I don't see any unit test covering it.
BTW, no need to use an OrderedDict here, as dicts in Py3.6 are ordered by default:
Suggested change — from:

def to_dict(self) -> dict:
    out: Dict[str, Any] = OrderedDict()
    out['daemon_type'] = self.daemon_type
    out['daemon_id'] = self.daemon_id
    out['hostname'] = self.hostname
    out['container_id'] = self.container_id
    out['container_image_id'] = self.container_image_id
    out['container_image_name'] = self.container_image_name
    out['container_image_digests'] = self.container_image_digests
    out['memory_usage'] = self.memory_usage
    out['memory_request'] = self.memory_request
    out['memory_limit'] = self.memory_limit
    out['version'] = self.version
    out['status'] = self.status.value if self.status is not None else None
    out['status_desc'] = self.status_desc
    if self.daemon_type == 'osd':
        out['osdspec_affinity'] = self.osdspec_affinity
    out['is_active'] = self.is_active
    out['ports'] = self.ports
    out['ip'] = self.ip
    for k in ['last_refresh', 'created', 'started', 'last_deployed',
              'last_configured']:
        if getattr(self, k):
            out[k] = datetime_to_str(getattr(self, k))
    if self.events:
        out['events'] = [e.to_dict() for e in self.events]
    empty = [k for k, v in out.items() if v is None]
    for e in empty:
        del out[e]
    return out

to:

def to_dict(self) -> dict:
    return {
        'daemon_type': self.daemon_type,
        'daemon_id': self.daemon_id,
        'hostname': self.hostname,
        ...
    }
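The point about OrderedDict is easy to verify: since CPython 3.6 (as an implementation detail) and Python 3.7 (guaranteed by the language spec), plain dicts preserve insertion order, so a plain dict keeps the same key order the OrderedDict version produced.

```python
# Plain dict: keys come back in the order they were inserted.
out = {}
out['daemon_type'] = 'osd'
out['daemon_id'] = '3'
out['hostname'] = 'node1'

print(list(out))  # ['daemon_type', 'daemon_id', 'hostname']
```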
There are tests in cephadm and orchestrator which are affected: since in the approach you are suggesting we are not removing the attributes that are None, we would need to change the expected schemas in all these tests. In fact, in each case there are 5-10 values that are expected to be None.
Ok, I now see where it fails. My bad, sorry @aaSharma14. I don't think it's good practice to have a dynamic schema "because some values are None" (in that case, the schema should define that a property can either have a value or be None).
In any case, let's go with it. We have enough code to clean up in dashboard without having to deal with orchestrator too.
Force-pushed 22da72f to f0615ee
Force-pushed a6ca029 to 7e26f7f
…he UI

When service deployment failures occur, events are generated and associated with the service. This PR intends to add these events as Daemon Logs and Service Logs to the UI for troubleshooting and doing diagnostics for the same.

Fixes: https://tracker.ceph.com/issues/49262
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
Force-pushed 7e26f7f to 00ce7c9
epuertat left a comment:
Awesome work, @aaSharma14! 🥳
def to_dict(self) -> dict:
    out: Dict[str, Any] = OrderedDict()
    out['daemon_type'] = self.daemon_type
    out['daemon_id'] = self.daemon_id
    out['hostname'] = self.hostname
    out['container_id'] = self.container_id
    out['container_image_id'] = self.container_image_id
    out['container_image_name'] = self.container_image_name
    out['container_image_digests'] = self.container_image_digests
    out['memory_usage'] = self.memory_usage
    out['memory_request'] = self.memory_request
Was it a wrongly resolved conflict, or what? Is it just a matter of adding the missing fields?
@aaSharma14 could you have a look at this, please? Thanks!
@epuertat, no, it's not a wrongly resolved conflict. At the time we introduced a new to_dict method; over time, new fields were added to the to_json method but not to to_dict. So now the two methods are out of sync, with a different number of fields in each. A quick solution would be to add those new fields to the to_dict method as well. If we want to refactor the code instead, I will need to find a better solution for this.
> @epuertat, no, it's not a wrongly resolved conflict. At the time we introduced a new to_dict method; over time, new fields were added to the to_json method.

Right. That's how it is. If you want to avoid having to maintain this method every now and then, you have to unify those two again.
Thanks @aaSharma14! I remember I already asked this, but isn't there a way to avoid this by reusing the same method? I find it quite unusual to have two methods that do almost the same thing (I'd remove the to_json() one, as it doesn't return JSON (text) but a dict with some subfields in JSON format...).
The reason I avoided calling those methods to_dict is that they sometimes don't actually return a dict:
➜ src git:(_check_for_moved_osds) ✗ find * -name '*.py' | grep -v -e rook_client -e tox -e test | xargs egrep 'def.*to_json'
ceph-volume/venv/lib/python3.9/site-packages/pip/_internal/models/direct_url.py: def to_json(self):
pybind/mgr/cephadm/services/osd.py: def to_json(self) -> dict:
pybind/mgr/cephadm/configchecks.py: def to_json(self) -> Dict[str, Any]:
pybind/mgr/cephadm/configchecks.py: def to_json(self) -> List[Dict[str, str]]:
pybind/mgr/cephadm/upgrade.py: def to_json(self) -> dict:
pybind/mgr/cephadm/inventory.py: def to_json(self) -> Dict[str, Any]:
pybind/mgr/orchestrator/module.py: def to_json_1(obj: Any) -> Any:
pybind/mgr/orchestrator/module.py: def to_json_n(objs: List) -> List:
pybind/mgr/orchestrator/_interface.py: def to_json(self) -> dict:
pybind/mgr/orchestrator/_interface.py: def to_json(self) -> OrderedDict:
pybind/mgr/orchestrator/_interface.py: def to_json(self) -> dict:
pybind/mgr/orchestrator/_interface.py: def to_json(self) -> str:
pybind/mgr/progress/module.py: def to_json(self):
pybind/mgr/progress/module.py: def to_json(self):
pybind/mgr/rbd_support/schedule.py: def to_json(self) -> str:
pybind/mgr/rbd_support/task.py: def to_json(self) -> str:
pybind/mgr/volumes/fs/volume.py:def name_to_json(names):
python-common/ceph/deployment/inventory.py: def to_json(self):
python-common/ceph/deployment/inventory.py: def to_json(self):
python-common/ceph/deployment/hostspec.py: def to_json(self) -> Dict[str, Any]:
python-common/ceph/deployment/drive_group.py: def to_json(self):
python-common/ceph/deployment/service_spec.py: def to_json(self) -> str:
python-common/ceph/deployment/service_spec.py: def to_json(self) -> dict:
python-common/ceph/deployment/service_spec.py: def to_json(self):



When service deployment failures occur, events are generated and associated with the service. This PR intends to add these events as Daemon Events and Service Events to the UI, for troubleshooting and diagnostics.
Fixes: https://tracker.ceph.com/issues/49262
Signed-off-by: Aashish Sharma aasharma@redhat.com
Service Events: [screenshot]
Daemon Events: [screenshot]
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox