
qa/distros: add ubuntu 22 as supported distro #49443

Merged — cbodley merged 4 commits into ceph:main from cbodley:wip-qa-supported-distros on Mar 17, 2023

Conversation

@cbodley (Contributor) commented Dec 14, 2022

we're doing builds for these distros. if we're going to support them in reef, we need to start the testing
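For context, teuthology picks a job's distro from small yaml fragments under qa/distros/. A supported-distro entry for Ubuntu 22.04 would look roughly like the sketch below; the exact filename is not shown on this page, so treat the path as an assumption:

```yaml
# hypothetical qa/distros/supported/ubuntu_22.04.yaml
os_type: ubuntu
os_version: "22.04"
```

Adding such a fragment to the "supported" set is what pulls the distro into the rotation for the qa suites discussed below.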


@cbodley cbodley added the tests label Dec 14, 2022
@cbodley cbodley force-pushed the wip-qa-supported-distros branch from c5b1662 to 7efe384 Compare December 14, 2022 21:45
@cbodley cbodley force-pushed the wip-qa-supported-distros branch from 7efe384 to 061e0cb Compare January 30, 2023 17:18
@cbodley cbodley force-pushed the wip-qa-supported-distros branch from 061e0cb to c422012 Compare February 22, 2023 18:26
@cbodley cbodley force-pushed the wip-qa-supported-distros branch from c422012 to a1d9efc Compare March 3, 2023 19:42
@yuriw (Contributor) commented Mar 8, 2023

jenkins test make check

@cbodley (Contributor, Author) commented Mar 8, 2023

centos9 testing seems to be blocked on #47501 at the moment; would it help if i split the centos changes out to another PR so we can try to test/merge the ubuntu changes?

cbodley added 3 commits March 8, 2023 10:28
Signed-off-by: Casey Bodley <cbodley@redhat.com>
…yaml

Signed-off-by: Casey Bodley <cbodley@redhat.com>
…_20.04.yaml

Signed-off-by: Casey Bodley <cbodley@redhat.com>
@cbodley cbodley force-pushed the wip-qa-supported-distros branch from a1d9efc to 62e520c Compare March 8, 2023 15:38
@cbodley cbodley changed the title qa/distros: add centos stream 9 and ubuntu 22 as supported distros qa/distros: add ubuntu 22 as supported distro Mar 8, 2023
@cbodley (Contributor, Author) commented Mar 8, 2023

the smoke suite was mostly successful on ubuntu22: https://pulpito.ceph.com/cbodley-2023-03-08_18:24:26-smoke-main-distro-default-smithi/

the only failure was one cluster log warning: [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)

@cbodley (Contributor, Author) commented Mar 10, 2023

jenkins test api

@yuriw (Contributor) commented Mar 11, 2023

@ljflores (Member) commented:

@cbodley @neha-ojha here is the rados suite review:

https://pulpito.ceph.com/?sha1=63c2dce869c8f63c396d3b6505a21c44088ff500

Failures, unrelated:
1. https://tracker.ceph.com/issues/58969
2. https://tracker.ceph.com/issues/58585
3. https://tracker.ceph.com/issues/57755
4. https://tracker.ceph.com/issues/49287
5. https://tracker.ceph.com/issues/58560
6. https://tracker.ceph.com/issues/49727

Details:
1. test_full_health: _ValError: In input['fs_map']['filesystems'][0]['mdsmap']: missing keys: {'max_xattr_size'} - Ceph - Mgr - Dashboard
2. rook: failed to pull kubelet image - Ceph - Orchestrator
3. task/test_orch_cli: test_cephfs_mirror times out - Ceph - Orchestrator
4. SELinux Denials during cephadm/workunits/test_cephadm - Ceph - Orchestrator
5. test_envlibrados_for_rocksdb.sh failed to subscribe to repo - Infrastructure
6. lazy_omap_stats_test: "ceph osd deep-scrub all" hangs - Ceph - RADOS

The only new failure that caught my attention was "1678520463.0682712 osd.2 (osd.2) 4 : cluster [WRN] WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event" in cluster log, which appeared twice on 22.04 tests:

/a/yuriw-2023-03-10_22:46:37-rados-reef-distro-default-smithi/7203358

2023-03-11T10:28:47.074 DEBUG:teuthology.orchestra.run.smithi149:> sudo egrep '\[ERR\]|\[WRN\]|\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(POOL_APP_NOT_ENABLED\)' | egrep -v '\(OSDMAP_FLAGS\)' | egrep -v '\(OSD_' | egrep -v '\(OBJECT_' | egrep -v '\(PG_' | egrep -v '\(SLOW_OPS\)' | egrep -v 'overall HEALTH' | egrep -v 'slow request' | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | head -n 1
2023-03-11T10:28:47.263 INFO:teuthology.orchestra.run.smithi149.stdout:1678530338.9736485 osd.3 (osd.3) 104 : cluster [WRN] WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event
2023-03-11T10:28:47.263 WARNING:tasks.ceph:Found errors (ERR|WRN|SEC) in cluster log
2023-03-11T10:28:47.264 DEBUG:teuthology.orchestra.run.smithi149:> sudo egrep '\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(POOL_APP_NOT_ENABLED\)' | egrep -v '\(OSDMAP_FLAGS\)' | egrep -v '\(OSD_' | egrep -v '\(OBJECT_' | egrep -v '\(PG_' | egrep -v '\(SLOW_OPS\)' | egrep -v 'overall HEALTH' | egrep -v 'slow request' | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | head -n 1
2023-03-11T10:28:47.280 DEBUG:teuthology.orchestra.run.smithi149:> sudo egrep '\[ERR\]' /var/log/ceph/ceph.log | egrep -v '\(POOL_APP_NOT_ENABLED\)' | egrep -v '\(OSDMAP_FLAGS\)' | egrep -v '\(OSD_' | egrep -v '\(OBJECT_' | egrep -v '\(PG_' | egrep -v '\(SLOW_OPS\)' | egrep -v 'overall HEALTH' | egrep -v 'slow request' | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | head -n 1
2023-03-11T10:28:47.338 DEBUG:teuthology.orchestra.run.smithi149:> sudo egrep '\[WRN\]' /var/log/ceph/ceph.log | egrep -v '\(POOL_APP_NOT_ENABLED\)' | egrep -v '\(OSDMAP_FLAGS\)' | egrep -v '\(OSD_' | egrep -v '\(OBJECT_' | egrep -v '\(PG_' | egrep -v '\(SLOW_OPS\)' | egrep -v 'overall HEALTH' | egrep -v 'slow request' | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | head -n 1
2023-03-11T10:28:47.392 INFO:teuthology.orchestra.run.smithi149.stdout:1678530338.9736485 osd.3 (osd.3) 104 : cluster [WRN] WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event

This doesn't seem like it would be caused by a change in distro, but @cbodley please take a look. Otherwise, I did not see anything related.
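The cluster-log scan in the excerpt above can be sketched as a standalone script. This is a minimal approximation of what teuthology runs, not its actual code: the sample log lines and the ignorelist are trimmed to two entries for illustration, and `grep -E`/`grep -Ev` stand in for the deprecated `egrep`:

```shell
#!/bin/sh
# Minimal sketch of the teuthology cluster-log scan shown above.
# Writes a tiny sample ceph.log, then prints the first ERR/WRN/SEC
# entry that is not covered by the ignorelist.
log=./ceph.log
cat > "$log" <<'EOF'
1678530338 osd.3 (osd.3) 104 : cluster [WRN] WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event
1678530339 mon.a (mon.0) 12 : cluster [WRN] Health check failed: 1 pool(s) do not have an application enabled (POOL_APP_NOT_ENABLED)
EOF
# grep -E matches any ERR/WRN/SEC entry; each grep -Ev drops one
# whitelisted health warning (the real run chains many more exclusions).
grep -E '\[ERR\]|\[WRN\]|\[SEC\]' "$log" \
  | grep -Ev '\(POOL_APP_NOT_ENABLED\)' \
  | grep -Ev '\(SLOW_OPS\)' \
  | head -n 1
```

With this sample input, only the WaitReplicas line survives the filters, which is why the run above was flagged with "Found errors (ERR|WRN|SEC) in cluster log".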

@vshankar (Contributor) commented:
Hmmm... fs suite has failures which I haven't seen before, so those could be related. I'll have a look.

@vshankar (Contributor) commented:

@yuriw - were you using a custom test suite (--suite-repo, --suite-branch) by any chance?

FWIW, could we run the fs suite with the latest main (and changes in this PR)?

@ljflores (Member) commented:
> @yuriw - were you using a custom test suite (--suite-repo, --suite-branch) by any chance?
>
> FWIW, could we run the fs suite with the latest main (and changes in this PR)?

@vshankar yes, a custom --suite-branch was used. (cbodley:wip-qa-supported-distros)

@vshankar (Contributor) commented:
> > @yuriw - were you using a custom test suite (--suite-repo, --suite-branch) by any chance?
> > FWIW, could we run the fs suite with the latest main (and changes in this PR)?
>
> @vshankar yes, a custom --suite-branch was used. (cbodley:wip-qa-supported-distros)

OK. That might explain some of the failures, but not all. I'll rerun the failed jobs in the fs suite and see how it looks.

@ljflores (Member) commented:
> @cbodley @neha-ojha here is the rados suite review: […]

I reran the jobs that failed from this bug, and they passed on a second round. So, I don't believe this is directly related to ubuntu 22.04.

I created a tracker for the bug here, for future investigation: https://tracker.ceph.com/issues/59049

@cbodley rados approved

Same as in commit 2de2146 ("qa/workunits/rbd: use bionic version
of qemu-iotests for focal").

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
@idryomov idryomov requested a review from a team as a code owner March 16, 2023 12:39
@github-actions github-actions bot added the rbd label Mar 16, 2023
@idryomov (Contributor) commented:
One of the RBD failures was real (caused by this PR). I have pushed a fix to Casey's branch.

A rerun succeeded: https://pulpito.ceph.com/dis-2023-03-15_16:39:35-rbd-reef-distro-default-smithi/

@vshankar (Contributor) commented:
Apologies for the delay in looking into fs suite failures - I'm on it today.

@vshankar (Contributor) commented:
I reran the fs suite failures here: https://pulpito.ceph.com/vshankar-2023-03-17_02:45:20-fs-reef-testing-default-smithi/ (10 jobs, vs 25 failed/dead jobs from https://pulpito.ceph.com/yuriw-2023-03-10_22:52:36-fs-reef-distro-default-smithi/).

Of the 10 jobs, most are known issues. The remaining failure: the qa suite has test code, but the ceph-mds binary does not have the required functionality. Looks like the custom suite branch was forked from main? Did the missing 15 jobs also fail to get scheduled for the same reason? @cbodley

@cbodley (Contributor, Author) commented Mar 17, 2023

@vshankar you're right that this branch targets main

@vshankar (Contributor) commented:
> @vshankar you're right that this branch targets main

In that case, I'm good with merging this change. I'll keep an eye on the nightly runs for reef for any new failures in fs suite.

@cbodley (Contributor, Author) commented Mar 17, 2023

jenkins test make check

@cbodley (Contributor, Author) commented Mar 17, 2023

jenkins test api

@idryomov (Contributor) commented:

> @vshankar you're right that this branch targets main

But the intent is to cherry pick this to reef, right?

@vshankar (Contributor) commented:
> > @vshankar you're right that this branch targets main
>
> But the intent is to cherry pick this to reef, right?

Right. The qa suite that was run was the main branch qa suite plus this change, while the ceph binaries were from the reef branch. Ideally, reef binaries + reef qa suite would have been better; not sure why that wasn't done.

@cbodley (Contributor, Author) commented Mar 17, 2023

> Ideally, reef binaries + reef qa suite would have been better. Not sure why that wasn't done.

sorry. i've only been testing this against the rgw suite on main, and Yuri has only been running baselines for reef. that was close enough for most suites

if you'd like to do more reef testing in the meantime, i pushed a reef-based version of this branch to https://github.com/cbodley/ceph/commits/wip-qa-reef-ubuntu22

@cbodley (Contributor, Author) commented Mar 17, 2023

jenkins test api

@cbodley cbodley merged commit ee1ae6c into ceph:main Mar 17, 2023
@cbodley cbodley deleted the wip-qa-supported-distros branch March 18, 2023 13:44