qa/suites/rados/mgr/tasks/dashboard: add MDS_ALL_DOWN to whitelist #21549

Merged
tchaikov merged 1 commit into ceph:master from rjfd:wip-dashboard-fix-qa-whitelist
Apr 21, 2018

rjfd commented Apr 20, 2018

Signed-off-by: Ricardo Dias <rdias@suse.com>

rjfd commented Apr 20, 2018

The dashboard QA suite has been failing with a health check error caused by MDS_ALL_DOWN. The MDS logs show that the MDS daemons are killed and then restarted successfully, so I think this health check error should be ignored.

@batrick @jcsp is my hypothesis correct, or am I missing something?

@rjfd rjfd requested review from batrick and tchaikov April 20, 2018 08:02
tchaikov commented Apr 20, 2018

@rjfd I think this health report is expected; it was introduced by a recent change in 0a59560.

Checking the log of /a/rdias-2018-04-19_15:33:17-rados:mgr-wip-rdias-testing-distro-basic-smithi/2415948 shows:

2018-04-19T15:50:50.594 INFO:tasks.ceph.mds.a:Restarting daemon
2018-04-19T15:50:50.595 INFO:teuthology.orchestra.run.smithi135:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mds -f --cluster ceph -i a'
2018-04-19T15:50:50.595 INFO:tasks.ceph.mds.c:Restarting daemon
2018-04-19T15:50:50.595 INFO:teuthology.orchestra.run.smithi135:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mds -f --cluster ceph -i c'
2018-04-19T15:50:50.596 INFO:tasks.ceph.mds.b:Restarting daemon
2018-04-19T15:50:50.596 INFO:teuthology.orchestra.run.smithi186:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mds -f --cluster ceph -i b'
2018-04-19T15:50:50.598 INFO:tasks.ceph.mds.c:Started
2018-04-19T15:50:50.598 DEBUG:teuthology.parallel:result is None
2018-04-19T15:50:50.600 INFO:tasks.ceph.mds.a:Started
2018-04-19T15:50:50.601 DEBUG:teuthology.parallel:result is None
2018-04-19T15:50:50.601 INFO:tasks.ceph.mds.b:Started
2018-04-19T15:50:50.602 DEBUG:teuthology.parallel:result is None
...
2018-04-19T15:55:21.360 INFO:tasks.ceph.mds.a:Restarting daemon
2018-04-19T15:55:21.360 INFO:teuthology.orchestra.run.smithi135:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mds -f --cluster ceph -i a'
2018-04-19T15:55:21.361 INFO:tasks.ceph.mds.c:Restarting daemon
2018-04-19T15:55:21.361 INFO:teuthology.orchestra.run.smithi135:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mds -f --cluster ceph -i c'
2018-04-19T15:55:21.361 INFO:tasks.ceph.mds.b:Restarting daemon
2018-04-19T15:55:21.361 INFO:teuthology.orchestra.run.smithi186:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mds -f --cluster ceph -i b'
2018-04-19T15:55:21.374 INFO:tasks.ceph.mds.c:Started
2018-04-19T15:55:21.374 INFO:tasks.ceph.mds.a:Started
2018-04-19T15:55:21.375 DEBUG:teuthology.parallel:result is None
2018-04-19T15:55:21.375 DEBUG:teuthology.parallel:result is None
2018-04-19T15:55:21.375 INFO:tasks.ceph.mds.b:Started
2018-04-19T15:55:21.376 DEBUG:teuthology.parallel:result is None

So I think we are indeed restarting the MDS daemons on purpose.
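The restart pattern in the log excerpt above can be checked mechanically. The following is a minimal sketch, assuming the teuthology line format shown in the excerpt (`<timestamp> INFO:tasks.ceph.mds.<id>:<message>`); the function name `unmatched_restarts` is hypothetical and is not part of teuthology.

```python
import re

# Matches lines such as:
#   2018-04-19T15:50:50.594 INFO:tasks.ceph.mds.a:Restarting daemon
# The format is assumed from the log excerpt in this thread.
MDS_EVENT = re.compile(
    r"^\S+ INFO:tasks\.ceph\.mds\.(?P<id>\w+):(?P<msg>Restarting daemon|Started)$"
)


def unmatched_restarts(lines):
    """Return the set of MDS ids that logged a restart but no later 'Started'."""
    pending = set()
    for line in lines:
        m = MDS_EVENT.match(line.strip())
        if not m:
            continue  # ignore unrelated log lines
        if m.group("msg") == "Restarting daemon":
            pending.add(m.group("id"))
        else:
            pending.discard(m.group("id"))
    return pending


sample = [
    "2018-04-19T15:50:50.594 INFO:tasks.ceph.mds.a:Restarting daemon",
    "2018-04-19T15:50:50.595 INFO:tasks.ceph.mds.c:Restarting daemon",
    "2018-04-19T15:50:50.596 INFO:tasks.ceph.mds.b:Restarting daemon",
    "2018-04-19T15:50:50.598 INFO:tasks.ceph.mds.c:Started",
    "2018-04-19T15:50:50.600 INFO:tasks.ceph.mds.a:Started",
    "2018-04-19T15:50:50.601 INFO:tasks.ceph.mds.b:Started",
]
print(unmatched_restarts(sample))  # prints set(): every restart has a matching start
```

Note that "Started" here only means the task launched the daemon process again; it says nothing about when the MDS becomes active, which is why a short window with all MDS daemons down is plausible.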

@tchaikov tchaikov added this to the mimic milestone Apr 20, 2018
tchaikov commented Apr 20, 2018

It seems we need to whitelist MDS_ALL_DOWN in more tests; see http://pulpito.ceph.com/kchai-2018-04-20_14:33:42-rados-wip-kefu-testing-2018-04-20-1141-distro-basic-mira/

@rjfd rjfd force-pushed the wip-dashboard-fix-qa-whitelist branch from 7b8b342 to 717bfa1 Compare April 20, 2018 15:51
rjfd commented Apr 20, 2018

@tchaikov I added another commit that adds MDS_UP_LESS_THAN_MAX to the whitelist as well.
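For readers unfamiliar with the mechanism: a log whitelist is a list of regular expressions, and warning or error lines in the cluster log that match one of them are not counted as test failures. The following is a rough sketch of that idea, not teuthology's actual implementation; the function `unexpected_lines`, the sample log lines, and the exact escaped patterns are illustrative assumptions.

```python
import re

# Illustrative patterns (assumption): health check codes appear in parentheses
# in the cluster log, so the parentheses are escaped in the regular expressions.
WHITELIST = [
    r"\(MDS_ALL_DOWN\)",
    r"\(MDS_UP_LESS_THAN_MAX\)",
]


def unexpected_lines(log_lines, whitelist):
    """Return WRN/ERR lines that match none of the whitelist patterns."""
    patterns = [re.compile(p) for p in whitelist]
    flagged = []
    for line in log_lines:
        if "[WRN]" not in line and "[ERR]" not in line:
            continue  # only warnings and errors are of interest
        if any(p.search(line) for p in patterns):
            continue  # expected while the MDS daemons are being restarted
        flagged.append(line)
    return flagged


# Hypothetical log lines, made up for this example.
log = [
    "cluster [WRN] Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)",
    "cluster [INF] Health check cleared: MDS_ALL_DOWN",
    "cluster [WRN] Health check failed: 1 osds down (OSD_DOWN)",
]
print(unexpected_lines(log, WHITELIST))
```

With the whitelist above, only the `OSD_DOWN` line is reported; with an empty whitelist, the `MDS_ALL_DOWN` warning would be reported as well.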

@tchaikov

retest this please

@tchaikov

@rjfd could you squash these two commits into a single one?

…AX) to whitelist

Signed-off-by: Ricardo Dias <rdias@suse.com>
@rjfd rjfd force-pushed the wip-dashboard-fix-qa-whitelist branch from 717bfa1 to 7309be1 Compare April 21, 2018 14:58
rjfd commented Apr 21, 2018

@tchaikov done

@tchaikov tchaikov merged commit 4f73c60 into ceph:master Apr 21, 2018
tchaikov commented Apr 21, 2018

@rjfd @batrick @jcsp I am adding MDS_ALL_DOWN to all tests in ceph/teuthology#1168.
