QA Run #73749

closed

wip-lflores-testing-4-2025-12-01-1527 (old wip-rocky10-branch-of-the-day-2025-11-05-1762369819)

Added by Yaarit Hatuka 4 months ago. Updated about 1 month ago.

Status:
QA Closed
Priority:
Normal
Assignee:

Description

Testing the fix on latest "rocky10 branch of the day"

Note: this rocky10 branch is not yet stable, so I only scheduled three tests that reproduced the neorados failure for which we are testing the fix. The purpose of these three tests is just to get quick initial feedback on whether the fix works. On plain main (see below), we will schedule a full rados suite.

PRs included:
https://github.com/ceph/ceph/pull/66055 – adds rocky 10 to distro matrix
https://github.com/ceph/ceph/pull/66240 – fixes for ceph-mgr on python 3.12
https://github.com/ceph/ceph/pull/66155 – fix for egrep change on rocky 10
https://github.com/ceph/ceph/pull/66069 – support r10 as a container base
https://github.com/ceph/ceph/pull/66171 – support distro-suffix to specify r10 containers for cephadm
https://github.com/ceph/ceph/pull/66244 – run modules in main interpreter by default
https://github.com/ceph/ceph/pull/66223 – fix python3-cmd2 dependency
https://github.com/ceph/ceph/pull/66232 – add Rocky Linux support to librados tests
https://github.com/ceph/ceph/pull/66396 – neorados: specify alignments for aligned_storage

Git branch:
https://github.com/ceph/ceph-ci/commits/wip-rocky10-branch-of-the-day-2025-11-26-1764177936/

Builds:
https://shaman.ceph.com/builds/ceph/wip-rocky10-branch-of-the-day-2025-11-26-1764177936/

QA runs:
https://pulpito.ceph.com/lflores-2025-12-01_21:19:40-rados:basic-wip-rocky10-branch-of-the-day-2025-11-26-1764177936-distro-default-smithi/

Testing the fix on plain main

PRs included:
https://github.com/ceph/ceph/pull/66396 – neorados: specify alignments for aligned_storage

Git branch:
https://github.com/ceph/ceph-ci/commits/wip-lflores-testing-4-2025-12-01-1527/

Builds:
https://shaman.ceph.com/builds/ceph/wip-lflores-testing-4-2025-12-01-1527/

QA runs:
https://pulpito.ceph.com/lflores-2025-12-02_17:29:40-rados-wip-lflores-testing-4-2025-12-01-1527-distro-default-smithi
https://pulpito.ceph.com/lflores-2025-12-03_16:56:55-rados-wip-lflores-testing-4-2025-12-01-1527-distro-default-smithi

Original test results

Git branch:
https://github.com/ceph/ceph-ci/commits/wip-rocky10-branch-of-the-day-2025-11-05-1762369819/

Builds:
https://shaman.ceph.com/builds/ceph/wip-rocky10-branch-of-the-day-2025-11-05-1762369819/

QA runs:
https://pulpito.ceph.com/yaarit-2025-11-06_20:06:52-rados:basic-wip-rocky10-branch-of-the-day-2025-11-05-1762369819-distro-default-smithi/
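
The pulpito run names in the QA runs links above follow a visible pattern, so they can be split into fields when comparing runs. The following is a minimal Python sketch, assuming the layout is user, scheduling timestamp, suite, branch, and then three trailing fields. The helper name `parse_run_name` and the labels `kernel`, `flavor`, and `machine_type` are hypothetical and inferred only from the names in this ticket; only `smithi` is confirmed as a machine type by the `-m smithi` flag used in note #12.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the run names in this ticket:
#   <user>-<YYYY-MM-DD_HH:MM:SS>-<suite>-<branch>-<kernel>-<flavor>-<machine_type>
# The branch itself contains dashes, so it is matched greedily and the
# three trailing fields are anchored at the end of the name.
RUN_NAME = re.compile(
    r"^(?P<user>[^-]+)-"
    r"(?P<scheduled>\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2})-"
    r"(?P<suite>[^-]+)-"
    r"(?P<branch>.+)-"
    r"(?P<kernel>[^-]+)-(?P<flavor>[^-]+)-(?P<machine_type>[^-]+)$"
)


def parse_run_name(name):
    """Split a run name into a dict of fields (hypothetical helper)."""
    match = RUN_NAME.match(name.strip("/"))
    if match is None:
        raise ValueError(f"unrecognized run name: {name!r}")
    fields = match.groupdict()
    # Turn the scheduling timestamp into a datetime for sorting/comparison.
    fields["scheduled"] = datetime.strptime(
        fields["scheduled"], "%Y-%m-%d_%H:%M:%S"
    )
    return fields


run = parse_run_name(
    "yaarit-2025-11-06_20:06:52-rados:basic-"
    "wip-rocky10-branch-of-the-day-2025-11-05-1762369819-"
    "distro-default-smithi/"
)
print(run["user"], run["suite"], run["branch"])
```

Matching the branch greedily keeps the sketch short, but it would misparse a name whose trailing fields themselves contain dashes, so treat it as a convenience for the names listed here rather than a general parser.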


Related issues 1 (1 open, 0 closed)

Related to RADOS - Bug #73750: rados/basic: Segmentation fault during neorados tests (Status: Fix Under Review; Assignee: Adam Emerson)

Actions #1

Updated by Yaarit Hatuka 4 months ago

  • Shaman Build changed from https://shaman.ceph.com/builds/ceph/wip-rocky10-branch-of-the-day-2025-11-05-1762369819/ to wip-rocky10-branch-of-the-day-2025-11-05-1762369819/
  • QA Runs changed from https://pulpito.ceph.com/yaarit-2025-11-06_20:06:52-rados:basic-wip-rocky10-branch-of-the-day-2025-11-05-1762369819-distro-default-smithi/ to yaarit-2025-11-06_20:06:52-rados:basic-wip-rocky10-branch-of-the-day-2025-11-05-1762369819-distro-default-smithi/
  • Git Branch changed from https://github.com/ceph/ceph-ci/commits/wip-rocky10-branch-of-the-day-2025-11-05-1762369819/ to wip-rocky10-branch-of-the-day-2025-11-05-1762369819/
Actions #2

Updated by Yaarit Hatuka 4 months ago

  • Git Branch changed from wip-rocky10-branch-of-the-day-2025-11-05-1762369819/ to ceph/ceph-ci/commits/wip-rocky10-branch-of-the-day-2025-11-05-1762369819/
Actions #3

Updated by Yaarit Hatuka 4 months ago

  • QA Runs changed from yaarit-2025-11-06_20:06:52-rados:basic-wip-rocky10-branch-of-the-day-2025-11-05-1762369819-distro-default-smithi/ to wip-rocky10-branch-of-the-day-2025-11-05-1762369819
Actions #4

Updated by Laura Flores 4 months ago

  • Status changed from QA Testing to QA Needs Approval
  • Assignee set to Laura Flores
Actions #5

Updated by Laura Flores 4 months ago · Edited

The two failures in the test run are tracked here:

1. https://tracker.ceph.com/issues/62235 - Assert failure: test_ceph_osd_pool_create_utf8 - (RADOS)
2. https://tracker.ceph.com/issues/73750 - rados/basic: Segmentation fault during neorados tests - (RADOS)

1 is a known issue, but it hasn't been seen since Pacific, so we should probably analyze it further to see if it's somehow rocky 10-specific.

2 is a new issue. At a glance, I don't see how it would be related to rocky 10, but some coredumps were generated, so I'll try to get more debug info so we can tell for sure.

Actions #6

Updated by Laura Flores 4 months ago · Edited

The first error actually occurred on a test that used ubuntu 22.04 packages, so that can't possibly be related to rocky 10. I think the second issue is the one to focus on, as it used rocky 10 packages.

Actions #7

Updated by Laura Flores 4 months ago · Edited

The rocky test failure repeated 3x (see same pulpito link above).

I scheduled the same type of test on main so it runs on centos and ubuntu packages to verify whether the issue is exclusive to rocky:
http://pulpito.ceph.com/lflores-2025-11-10_16:32:42-rados:basic-main-distro-default-smithi/

Actions #8

Updated by Laura Flores 4 months ago

Laura Flores wrote in #note-7:

The rocky test failure repeated 3x (see same pulpito link above).

I scheduled the same type of test on main so it runs on centos and ubuntu packages to verify whether the issue is exclusive to rocky:
http://pulpito.ceph.com/lflores-2025-11-10_16:32:42-rados:basic-main-distro-default-smithi/

Tests from main came back clean, which indicates that this is specific to rocky10.

Latest updates are in https://tracker.ceph.com/issues/73750#note-5.

Actions #9

Updated by Laura Flores 4 months ago

Adam Emerson is looking into the rocky10 issue and seeing if it's possible to reproduce locally.

Actions #10

Updated by Laura Flores 4 months ago

Laura Flores wrote in #note-9:

Adam Emerson is looking into the rocky10 issue and seeing if it's possible to reproduce locally.

Status is still the same as of now.

Actions #11

Updated by Laura Flores 4 months ago

Latest update: https://tracker.ceph.com/issues/73750#note-11

Adam may have found a local reproducer and is working on it.

Actions #12

Updated by Laura Flores 4 months ago · Edited

Ran some tests to verify the fix on the latest rocky10 branch:
https://pulpito.ceph.com/lflores-2025-12-01_21:19:40-rados:basic-wip-rocky10-branch-of-the-day-2025-11-26-1764177936-distro-default-smithi/

./teuthology/virtualenv/bin/teuthology-suite -v -m smithi -c wip-rocky10-branch-of-the-day-2025-11-26-1764177936 -r yaarit-2025-11-06_20:06:52-rados:basic-wip-rocky10-branch-of-the-day-2025-11-05-1762369819-distro-default-smithi -p 60 -N 3 --filter-all "rados_api_tests" 

We are also testing this on plain main (details in description).
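
For anyone repeating this rerun against a newer branch, the command above can be assembled programmatically. This is a minimal Python sketch; the helper name `build_rerun_command` is hypothetical, and the flag descriptions in the comments reflect my reading of the `teuthology-suite` help text, so they should be checked against the installed version before relying on them. The values are the ones used in this note.

```python
import shlex


def build_rerun_command(
    branch,
    rerun,
    priority=60,
    num=3,
    filter_all="rados_api_tests",
    machine_type="smithi",
    binary="./teuthology/virtualenv/bin/teuthology-suite",
):
    """Return the argv list for a filtered rerun (hypothetical helper)."""
    return [
        binary,
        "-v",                        # verbose output
        "-m", machine_type,          # machine type to schedule on
        "-c", branch,                # ceph branch to test
        "-r", rerun,                 # earlier run whose jobs are rescheduled
        "-p", str(priority),         # job priority
        "-N", str(num),              # number of times to queue each job
        "--filter-all", filter_all,  # keep jobs matching every keyword
    ]


argv = build_rerun_command(
    branch="wip-rocky10-branch-of-the-day-2025-11-26-1764177936",
    rerun=(
        "yaarit-2025-11-06_20:06:52-rados:basic-"
        "wip-rocky10-branch-of-the-day-2025-11-05-1762369819-"
        "distro-default-smithi"
    ),
)
# Print a shell-safe command line for review; nothing is executed here.
print(shlex.join(argv))
```

Building the command as a list and only printing it keeps the sketch side-effect free; passing the list to `subprocess.run` would be the next step once the output has been reviewed.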

Actions #13

Updated by Laura Flores 4 months ago

  • Subject changed from wip-rocky10-branch-of-the-day-2025-11-05-1762369819 to wip-lflores-testing-4-2025-12-01-1527 (old wip-rocky10-branch-of-the-day-2025-11-05-1762369819)
  • Description updated (diff)
  • Status changed from QA Needs Approval to QA Building
  • Shaman Build changed from wip-rocky10-branch-of-the-day-2025-11-05-1762369819/ to wip-lflores-testing-4-2025-12-01-1527
  • QA Runs deleted (wip-rocky10-branch-of-the-day-2025-11-05-1762369819)
  • Git Branch changed from ceph/ceph-ci/commits/wip-rocky10-branch-of-the-day-2025-11-05-1762369819/ to ceph/ceph-ci/commits/wip-lflores-testing-4-2025-12-01-1527/
Actions #14

Updated by Laura Flores 4 months ago

  • Related to Bug #73750: rados/basic: Segmentation fault during neorados tests added
Actions #15

Updated by Laura Flores 4 months ago

  • Description updated (diff)
Actions #16

Updated by Yaarit Hatuka 4 months ago

@Laura Flores fyi https://github.com/ceph/ceph/pull/66244 is still in draft mode and there are many related failures in the rados suite as a result.

Actions #17

Updated by Laura Flores 4 months ago

  • Description updated (diff)
Actions #18

Updated by Laura Flores 4 months ago

  • Status changed from QA Building to QA Testing
  • QA Runs set to wip-lflores-testing-4-2025-12-01-1527
Actions #19

Updated by Nitzan Mordechai 4 months ago

  • Git Branch changed from ceph/ceph-ci/commits/wip-lflores-testing-4-2025-12-01-1527/ to /ceph/ceph-ci/commits/wip-lflores-testing-4-2025-12-01-1527/
Actions #20

Updated by Laura Flores 4 months ago

  • Status changed from QA Testing to QA Needs Approval
Actions #21

Updated by Laura Flores 4 months ago

  • Description updated (diff)
Actions #23

Updated by Laura Flores 4 months ago

If the results of the tests that @Nitzan Mordechai scheduled on the latest rocky10 "branch of the day" look good, then we can consider the PR approved from all angles: https://tracker.ceph.com/issues/74070

Actions #24

Updated by Nitzan Mordechai 4 months ago

@Laura Flores, there are 3 different core dumps in ceph-mgr, and we need to check which of the PRs is causing them. The list is at https://tracker.ceph.com/issues/74070, and I'm still looking into them.

Actions #25

Updated by Laura Flores 3 months ago

Nitzan Mordechai wrote in #note-24:

@Laura Flores, there are 3 different core dumps in ceph-mgr, and we need to check which of the PRs is causing them. The list is at https://tracker.ceph.com/issues/74070, and I'm still looking into them.

ACK

Actions #26

Updated by Laura Flores about 2 months ago

  • Tags changed from core to core, rocky10
Actions #27

Updated by Laura Flores about 1 month ago

  • Status changed from QA Needs Approval to QA Closed

I think we can close this since this was from our efforts before the lab migration, and we have since run new tests.
