qa/upgrade: fix checks to make sure upgrade is still in progress#58605
Without checking both for the upgrade being in progress and that the status isn't reporting an error, we can end up in a scenario where the test is just waiting for an upgrade that has already been marked failed and will never complete. This same sort of change was already done in the orch suite upgrade tests and has helped with jobs timing out there.
Fixes: https://tracker.ceph.com/issues/65546
Signed-off-by: Adam King <adking@redhat.com>
This test was trying to partially upgrade the mons and OSDs by kicking off an upgrade and then checking every 2 seconds if enough had been upgraded. Since staggered upgrade parameters were present in the initial reef release (not true for quincy), it makes sense to use them instead in order to do this in a more controlled manner.
Signed-off-by: Adam King <adking@redhat.com>
@batrick this should help with some of the upgrade test timeouts you brought up in the CLT call
yeah, I'm just going to include it in a build with a bunch of PRs for an orch run and then also run the upgrade suite
jenkins test dashboard cephadm
Adding this to my batch too for fun. Don't wait on me.
This PR is under test in https://tracker.ceph.com/issues/67214.
jenkins test dashboard cephadm
dashboard cephadm e2e started breaking recently after #56331.
jenkins test dashboard cephadm
Sigh. https://pulpito.ceph.com/?suite=upgrade @adk3798 if you're satisfied this hasn't obviously broken anything and will fix the tracker ticket, you have my blessing to merge.
Alright, I'm pretty sure the main issues with the suite are the |
It's possible the thrashosds task issue is https://tracker.ceph.com/issues/66698, but I haven't been able to confirm yet. Going to rerun one of the stress-split jobs.
Without checking both for the upgrade being in progress and that
the status isn't reporting an error, we can end up in a scenario
where the test is just waiting for an upgrade that has already
been marked failed and will never complete. This same sort of
change was already done in the orch suite upgrade tests and
has helped with jobs timing out there.
Fixes: https://tracker.ceph.com/issues/65546
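
As a rough illustration of the combined check described above, here is a minimal Python sketch of the wait condition. It assumes the upgrade status is available as JSON with an `in_progress` boolean and a `message` string (as `ceph orch upgrade status` is understood to report); the function name `upgrade_still_running` is hypothetical and is not taken from the test suite.

```python
import json


def upgrade_still_running(status_json: str) -> bool:
    """Return True only while it still makes sense to keep waiting.

    Assumption: the status document has an "in_progress" flag and a
    "message" field, and a failed upgrade puts "Error" in the message.
    """
    status = json.loads(status_json)
    in_progress = bool(status.get("in_progress"))
    message = status.get("message") or ""
    # Keep waiting only if the upgrade is in progress AND no error is reported.
    return in_progress and "Error" not in message


if __name__ == "__main__":
    healthy = '{"in_progress": true, "message": ""}'
    failed = '{"in_progress": true, "message": "Error: UPGRADE_FAILED_PULL"}'
    print(upgrade_still_running(healthy))
    print(upgrade_still_running(failed))
```

Checking only the first flag would keep the loop spinning in the `failed` case; adding the message check lets the test stop waiting as soon as the upgrade is marked failed.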
This also updates the reef-x stress-split test to make use of staggered
upgrade parameters since we can be sure any given reef image has
access to them.
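
For context, a staggered upgrade restricts which daemons get upgraded rather than polling a full upgrade and interrupting it. The sketch below only assembles a command line, assuming the `--daemon-types` and `--limit` options of `ceph orch upgrade start` as described in the cephadm upgrade documentation; the helper name `staggered_upgrade_cmd` and the image name are hypothetical, and this is not the exact invocation used by the reef-x stress-split test.

```python
from typing import List, Optional, Sequence


def staggered_upgrade_cmd(
    image: str,
    daemon_types: Sequence[str],
    limit: Optional[int] = None,
) -> List[str]:
    """Build (but do not run) a staggered upgrade command.

    Assumption: --daemon-types takes a comma-separated list and --limit
    caps how many daemons are upgraded in this pass.
    """
    if not daemon_types:
        raise ValueError("at least one daemon type is required")
    cmd = [
        "ceph", "orch", "upgrade", "start",
        "--image", image,
        "--daemon-types", ",".join(daemon_types),
    ]
    if limit is not None:
        if limit < 1:
            raise ValueError("limit must be a positive integer")
        cmd += ["--limit", str(limit)]
    return cmd


if __name__ == "__main__":
    # Hypothetical image name, for illustration only.
    print(" ".join(staggered_upgrade_cmd("example.invalid/ceph:target", ["osd"], limit=2)))
```

Bounding the upgrade up front means the test no longer depends on catching the cluster at the right moment between 2-second polls.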