QA Run #72299

closed

wip-bharath9-testing-2025-08-05-0506(wip-bharath9-testing-2025-07-26-0628)

Added by Srinivasa Bharath Kanta 8 months ago. Updated 7 months ago.

Status: QA Closed
Priority: Normal
Tags:

Description

Excluded PR #64501 from the PR list.

New PR List
-----------
https://github.com/ceph/ceph/pull/63938 - osd: recalculate coll_t::_str during decode() to fix stale values

Old PR List
-----------
The following PRs were included for testing:
https://github.com/ceph/ceph/pull/63938 - osd: recalculate coll_t::_str during decode() to fix stale values
https://github.com/ceph/ceph/pull/64501 - Optimized Erasure Coding - Fixpack 2

Actions #1

Updated by Srinivasa Bharath Kanta 8 months ago

  • Status changed from QA Testing to QA Needs Approval
  • Assignee changed from Srinivasa Bharath Kanta to Laura Flores
Actions #2

Updated by Laura Flores 8 months ago

  • Status changed from QA Needs Approval to QA Needs Rerun/Rebuilt
  • Assignee changed from Laura Flores to Srinivasa Bharath Kanta

@Srinivasa Bharath Kanta I found a possible issue with https://github.com/ceph/ceph/pull/64501. Can you drop it from the batch and rebuild/reschedule with just the one PR left?

I left a note for the author here: https://github.com/ceph/ceph/pull/64501#pullrequestreview-3080617467

Actions #3

Updated by Alex Ainscow 8 months ago

This looks like a test issue to me. No tests were changed as part of this PR.

One pool was created and destroyed during the test:

teuthology.log|2025-07-26T08:59:16.374 DEBUG:teuthology.orchestra.run.smithi001:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph osd pool create unique_pool_0 16 16 erasure teuthologyprofile
teuthology.log|2025-07-26T08:59:18.392 DEBUG:teuthology.orchestra.run.smithi001:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph osd pool application enable unique_pool_0 rados --yes-i-really-mean-it || true
teuthology.log|2025-07-26T09:09:22.930 DEBUG:teuthology.orchestra.run.smithi001:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph osd pool rm unique_pool_0 unique_pool_0 --yes-i-really-really-mean-it

The thrasher, however, still seems to be running and gets upset about there being no PGs:

teuthology.log|2025-07-26T09:09:22.593 INFO:tasks.thrashosds.thrasher:pool unique_pool_0 min_size is 2
teuthology.log|2025-07-26T09:09:23.284 INFO:tasks.thrashosds.thrasher:pool unique_pool_0 local_k=2 local_m=1
teuthology.log|2025-07-26T09:09:23.284 INFO:tasks.thrashosds.thrasher:setting k=2 from previous 0
teuthology.log|2025-07-26T09:09:23.284 INFO:tasks.thrashosds.thrasher:setting m=1 from previous 99
teuthology.log|2025-07-26T09:09:23.284 INFO:tasks.thrashosds.thrasher:using k=2, m=1
teuthology.log|2025-07-26T09:09:23.284 INFO:tasks.thrashosds.thrasher:dead_osds=[], live_osds=[0, 8, 12, 5, 9, 13, 2, 6, 10, 3, 11, 15, 4, 7, 14, 1]
teuthology.log|2025-07-26T09:09:23.284 INFO:tasks.thrashosds.thrasher:chose to kill 1 OSDs
teuthology.log|2025-07-26T09:09:23.627 INFO:tasks.thrashosds.thrasher:No pgs; trying again
teuthology.log|2025-07-26T09:09:28.990 INFO:tasks.thrashosds.thrasher:No pgs; trying again
teuthology.log|2025-07-26T09:09:34.349 INFO:tasks.thrashosds.thrasher:No pgs; trying again
teuthology.log|2025-07-26T09:09:34.351 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
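To make the ordering in the two excerpts above easier to see, here is a minimal Python sketch that merges a few of the quoted timestamps into one timeline. The timestamps and messages are copied from the log lines above; the `events` list, the `parse` helper, and the "test:"/"thrasher:" labels are only illustrative. Note that the `pool rm` timestamp is when teuthology issued the command, not necessarily when the pool was actually gone.

```python
from datetime import datetime, timedelta

# Timestamps copied from the teuthology.log excerpts quoted above.
# The "test:" / "thrasher:" prefixes are labels added for this sketch.
events = [
    ("2025-07-26T09:09:23.627", "thrasher: No pgs; trying again"),
    ("2025-07-26T09:09:22.930", "test: ceph osd pool rm unique_pool_0 (command issued)"),
    ("2025-07-26T09:09:22.593", "thrasher: pool unique_pool_0 min_size is 2"),
    ("2025-07-26T09:09:23.284", "thrasher: chose to kill 1 OSDs"),
]


def parse(ts):
    """Parse a teuthology-style timestamp with millisecond precision."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f")


# Sort by time so the interleaving of the two log streams is visible.
timeline = sorted(events, key=lambda e: parse(e[0]))
for ts, what in timeline:
    print(ts, what)

# Gap between the thrasher last seeing the pool and the rm being issued.
pool_seen = parse("2025-07-26T09:09:22.593")
rm_issued = parse("2025-07-26T09:09:22.930")
first_no_pgs = parse("2025-07-26T09:09:23.627")
gap = rm_issued - pool_seen
```

In this ordering the `pool rm` command falls between the thrasher reading the pool's `min_size` and its first "No pgs" message.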

So I think this is an unfortunate timing window:

1. Thrasher detects that a pool exists.
2. Pool is deleted.
3. Thrasher attempts to detect PGs in the now-deleted pool, which it still remembers.
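
The three steps above can be sketched as a check-then-act race, together with one possible way a thrasher-like loop could tolerate it. This is not the teuthology thrasher code: `FakeCluster`, `PoolGone`, `pgs_for_pool`, and `thrash_once` are hypothetical names invented for this illustration, and the retry count is an assumption.

```python
class PoolGone(Exception):
    """Raised when the pool disappears while we are looking for its PGs."""


class FakeCluster:
    """Hypothetical in-memory stand-in for a cluster; not a teuthology class."""

    def __init__(self):
        self.pools = {"unique_pool_0": ["1.0", "1.1"]}

    def pool_exists(self, name):
        return name in self.pools

    def list_pgs(self, name):
        return list(self.pools.get(name, []))

    def remove_pool(self, name):
        self.pools.pop(name, None)


def pgs_for_pool(cluster, pool, retries=3):
    """Return the pool's PGs, re-checking on every retry that it still exists."""
    for _ in range(retries):
        if not cluster.pool_exists(pool):
            # Step 2 happened after step 1: the remembered pool is stale.
            raise PoolGone(pool)
        pgs = cluster.list_pgs(pool)
        if pgs:
            return pgs
    raise RuntimeError(f"no PGs found for existing pool {pool}")


def thrash_once(cluster, pool):
    """Pick a PG to act on, or skip the round if the pool was deleted."""
    try:
        pgs = pgs_for_pool(cluster, pool)
    except PoolGone:
        return None  # skip this round instead of failing the run
    return pgs[0]


cluster = FakeCluster()
before = thrash_once(cluster, "unique_pool_0")  # pool still present
cluster.remove_pool("unique_pool_0")            # test tears the pool down
after = thrash_once(cluster, "unique_pool_0")   # stale pool name is tolerated
```

Re-checking existence only narrows the window; a deletion between `pool_exists` and `list_pgs` is still possible, so treating "pool gone" as a non-fatal outcome is the part that matters in this sketch.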
Actions #4

Updated by Srinivasa Bharath Kanta 8 months ago

  • Status changed from QA Needs Rerun/Rebuilt to QA Testing
Actions #5

Updated by Srinivasa Bharath Kanta 8 months ago

  • Subject changed from wip-bharath9-testing-2025-07-26-0628 to wip-bharath9-testing-2025-08-05-0506(wip-bharath9-testing-2025-07-26-0628)
  • Description updated (diff)
  • Shaman Build changed from wip-bharath9-testing-2025-07-26-0628 to wip-bharath9-testing-2025-08-05-0506
  • QA Runs changed from wip-bharath9-testing-2025-07-26-0628 to wip-bharath9-testing-2025-08-05-0506
  • Git Branch changed from SrinivasaBharath/ceph/commits/wip-bharath9-testing-2025-07-26-0628 to SrinivasaBharath/ceph/commits/wip-bharath9-testing-2025-08-05-0506
Actions #6

Updated by Srinivasa Bharath Kanta 8 months ago

  • Status changed from QA Testing to QA Needs Approval
  • Assignee changed from Srinivasa Bharath Kanta to Laura Flores
Actions #7

Updated by Laura Flores 7 months ago

  • Status changed from QA Needs Approval to QA Approved
  • Assignee changed from Laura Flores to Srinivasa Bharath Kanta
Actions #8

Updated by Srinivasa Bharath Kanta 7 months ago

  • Status changed from QA Approved to QA Closed

PR is merged.
