mgr/progress: Global Recovery Event in ceph -s #37327

Merged
tchaikov merged 1 commit into ceph:master from kamoltat:wip-mgr-progress-global-event
Oct 31, 2020

Conversation

@kamoltat
Member

Show only the Global Progress event in `ceph -s`.
To see all the sub-events, you now have to run `ceph progress`.

Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

@kamoltat
Member Author

This is a new PR continuing from #29467

@kamoltat kamoltat requested a review from jdurgin September 23, 2020 06:49
@kamoltat kamoltat added the mgr label Sep 23, 2020
@neha-ojha neha-ojha added the core label Sep 23, 2020
Member

@jdurgin jdurgin left a comment

This looks good! I think we'll want to follow up with further PRs to improve the efficiency here. For large clusters, looking through all the pg states for this event (and others) by serializing to json with every update is pretty expensive.

The PGMap keeps track of num_pg_by_state internally, so if we expose python accessors to that we don't need to iterate over all pgs to get the number of active+clean pgs.
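To illustrate the efficiency point, here is a minimal Python sketch contrasting the two approaches. It is not the mgr module's code: the function names, the list of state strings, and the shape of the aggregated dictionary are assumptions made for illustration, and the real PGMap accessors may look different.

```python
from collections import Counter
from typing import Dict, Iterable


def is_active_clean(state: str) -> bool:
    """Treat a PG as active+clean if both flags appear in its state string."""
    flags = set(state.split("+"))
    return {"active", "clean"} <= flags


def count_by_iteration(pg_states: Iterable[str]) -> int:
    """Cost grows with the number of PGs: every PG state is inspected."""
    return sum(1 for state in pg_states if is_active_clean(state))


def count_from_summary(num_pg_by_state: Dict[str, int]) -> int:
    """Cost grows with the number of distinct states: uses pre-aggregated counts."""
    return sum(n for state, n in num_pg_by_state.items() if is_active_clean(state))


# Hypothetical sample data; a real cluster would have many thousands of PGs.
pg_states = ["active+clean", "active+clean+scrubbing", "active+recovering", "peering"]
summary = Counter(pg_states)  # stands in for an aggregated per-state counter

print(count_by_iteration(pg_states), count_from_summary(summary))
```

Both functions return the same count, but the second one never touches per-PG data, which is the saving being suggested for large clusters.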

@tchaikov
Contributor

tchaikov commented Oct 5, 2020

@kamoltat looks great! could you squash these changes into a single commit?

@kamoltat kamoltat force-pushed the wip-mgr-progress-global-event branch 2 times, most recently from bd36140 to 3e8b69d Compare October 6, 2020 13:22
@kamoltat
Member Author

kamoltat commented Oct 7, 2020

> @kamoltat looks great! could you squash these changes into a single commit?

Done!

Contributor

@tchaikov tchaikov left a comment

2020-10-15T09:06:17.364 INFO:tasks.cephfs_test_runner:test_osd_cannot_recover (tasks.mgr.test_progress.TestProgress) ... FAIL
2020-10-15T09:06:17.365 INFO:tasks.cephfs_test_runner:
2020-10-15T09:06:17.365 INFO:tasks.cephfs_test_runner:======================================================================
2020-10-15T09:06:17.366 INFO:tasks.cephfs_test_runner:FAIL: test_osd_cannot_recover (tasks.mgr.test_progress.TestProgress)
2020-10-15T09:06:17.366 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2020-10-15T09:06:17.366 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2020-10-15T09:06:17.366 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-kefu-testing-2020-10-14-1541/qa/tasks/mgr/test_progress.py", line 264, in test_osd_cannot_recover
2020-10-15T09:06:17.366 INFO:tasks.cephfs_test_runner:    self.assertEqual(len(self._all_events()), osd_count - pool_size)
2020-10-15T09:06:17.367 INFO:tasks.cephfs_test_runner:AssertionError: 2 != 1

https://pulpito.ceph.com/kchai-2020-10-15_08:32:19-rados-wip-kefu-testing-2020-10-14-1541-distro-basic-smithi/5526866/

@kamoltat
Member Author

@tchaikov Thanks, just updated the test suite

@kamoltat kamoltat force-pushed the wip-mgr-progress-global-event branch 2 times, most recently from d8d079b to f2d50cd Compare October 19, 2020 14:44
@kamoltat
Member Author

jenkins retest this please

@tchaikov
Contributor

@kamoltat no need to run 400+ tests for exercising this change.

--suite rados:mgr --filter 'tasks/progress' --limit 1

would suffice.

@kamoltat kamoltat force-pushed the wip-mgr-progress-global-event branch from c5d51f3 to 1c61434 Compare October 21, 2020 11:07
@kamoltat kamoltat force-pushed the wip-mgr-progress-global-event branch from 1c61434 to 7275611 Compare October 22, 2020 07:43
@kamoltat
Member Author

https://pulpito.ceph.com/ksirivad-2020-10-22_10:35:40-rados:mgr-wip-mgr-progress-global-event-distro-basic-smithi/
I made some changes and ran some tests today, and they passed. Hope this saves you some time on testing.
@tchaikov

@tchaikov tchaikov dismissed their stale review October 22, 2020 11:27

dismissed.

@kamoltat kamoltat force-pushed the wip-mgr-progress-global-event branch from 7275611 to 0118ef5 Compare October 22, 2020 14:12
@kamoltat
Member Author

jenkins retest this please

@kamoltat kamoltat force-pushed the wip-mgr-progress-global-event branch from 0118ef5 to 7396ce2 Compare October 22, 2020 16:42
Modified the progress module and BaseMgrModule to
support the Global Recovery Event, adding more arguments
to update_progress_event and ceph_update_progress_event
so that `ceph -s` shows only the global recovery event progress.
All sub-events have been moved to `ceph progress`.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
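
The split described in the commit message can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the code in this PR: the `ProgressEvent` class, the `show_in_status` flag, and the rendering helpers are hypothetical names, and the actual extra arguments added to update_progress_event may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProgressEvent:
    message: str
    progress: float       # fraction complete, 0.0 to 1.0
    show_in_status: bool  # hypothetical flag: include this event in the status summary


def render_bar(ev: ProgressEvent, width: int = 20) -> str:
    """Render one event as a fixed-width text progress bar."""
    filled = int(ev.progress * width)
    return f"{ev.message} [{'=' * filled}{'.' * (width - filled)}]"


def status_lines(events: List[ProgressEvent]) -> List[str]:
    """What a status summary (like `ceph -s`) would show: flagged events only."""
    return [render_bar(ev) for ev in events if ev.show_in_status]


def progress_lines(events: List[ProgressEvent]) -> List[str]:
    """What a detailed listing (like `ceph progress`) would show: every event."""
    return [render_bar(ev) for ev in events]


# Hypothetical events: one global event plus one sub-event.
events = [
    ProgressEvent("Global Recovery Event", 0.5, True),
    ProgressEvent("Rebalancing after osd.1 marked out", 0.25, False),
]

for line in status_lines(events):
    print(line)
```

With these sample events the status view prints a single line for the global event, while `progress_lines(events)` would return both.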
@kamoltat kamoltat force-pushed the wip-mgr-progress-global-event branch from 7396ce2 to 2af2afa Compare October 22, 2020 16:45
@kamoltat
Member Author

https://pulpito.ceph.com/ksirivad-2020-10-23_02:26:16-rados:mgr-wip-mgr-progress-global-event-distro-basic-smithi/

@tchaikov tchaikov merged commit 038750c into ceph:master Oct 31, 2020