Bug #44510

closed

osd/osd-recovery-space.sh TEST_recovery_test_simple failure

Added by Sage Weil about 6 years ago. Updated 5 months ago.

Status:
Resolved
Priority:
Normal
Category:
-
Target version:
-
% Done:

100%

Source:
Backport:
quincy,reef,squid
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Tags (freeform):
backport_processed
Fixed In:
v19.3.0-3841-g1c8bea0cbb
Released In:
v20.2.0~2350
Upkeep Timestamp:
2025-11-01T01:34:45+00:00

Description

2020-03-08T23:19:15.259 INFO:tasks.workunit.client.0.smithi192.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:144: TEST_recovery_test_simple:  ceph status --format=json-pretty
2020-03-08T23:19:15.703 INFO:tasks.workunit.client.0.smithi192.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:146: TEST_recovery_test_simple:  jq .health.checks.PG_RECOVERY_FULL.severity td/osd-recovery-space/stat.json
2020-03-08T23:19:15.705 INFO:tasks.workunit.client.0.smithi192.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:146: TEST_recovery_test_simple:  eval SEV=null
2020-03-08T23:19:15.705 INFO:tasks.workunit.client.0.smithi192.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:146: TEST_recovery_test_simple:  SEV=null
2020-03-08T23:19:15.706 INFO:tasks.workunit.client.0.smithi192.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:147: TEST_recovery_test_simple:  '[' null '!=' HEALTH_ERR ']'
2020-03-08T23:19:15.706 INFO:tasks.workunit.client.0.smithi192.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:148: TEST_recovery_test_simple:  echo 'PG_RECOVERY_FULL severity null not HEALTH_ERR'
2020-03-08T23:19:15.706 INFO:tasks.workunit.client.0.smithi192.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:149: TEST_recovery_test_simple:  expr 1 + 1
2020-03-08T23:19:15.706 INFO:tasks.workunit.client.0.smithi192.stdout:PG_RECOVERY_FULL severity null not HEALTH_ERR
2020-03-08T23:19:15.707 INFO:tasks.workunit.client.0.smithi192.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:149: TEST_recovery_test_simple:  ERRORS=2
2020-03-08T23:19:15.707 INFO:tasks.workunit.client.0.smithi192.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:151: TEST_recovery_test_simple:  jq .health.checks.PG_RECOVERY_FULL.summary.message td/osd-recovery-space/stat.json
2020-03-08T23:19:15.709 INFO:tasks.workunit.client.0.smithi192.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:151: TEST_recovery_test_simple:  eval MSG=null
2020-03-08T23:19:15.709 INFO:tasks.workunit.client.0.smithi192.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:151: TEST_recovery_test_simple:  MSG=null
2020-03-08T23:19:15.709 INFO:tasks.workunit.client.0.smithi192.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:152: TEST_recovery_test_simple:  '[' null '!=' 'Full OSDs blocking recovery: 1 pg recovery_toofull' ']'
2020-03-08T23:19:15.709 INFO:tasks.workunit.client.0.smithi192.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:153: TEST_recovery_test_simple:  echo 'PG_RECOVERY_FULL message '\''null'\'' mismatched'
2020-03-08T23:19:15.710 INFO:tasks.workunit.client.0.smithi192.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:154: TEST_recovery_test_simple:  expr 2 + 1
2020-03-08T23:19:15.710 INFO:tasks.workunit.client.0.smithi192.stdout:PG_RECOVERY_FULL message 'null' mismatched
2020-03-08T23:19:15.711 INFO:tasks.workunit.client.0.smithi192.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:154: TEST_recovery_test_simple:  ERRORS=3
2020-03-08T23:19:15.712 INFO:tasks.workunit.client.0.smithi192.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:156: TEST_recovery_test_simple:  rm -f td/osd-recovery-space/stat.json
2020-03-08T23:19:15.712 INFO:tasks.workunit.client.0.smithi192.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:158: TEST_recovery_test_simple:  '[' 3 '!=' 0 ']'
2020-03-08T23:19:15.712 INFO:tasks.workunit.client.0.smithi192.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:160: TEST_recovery_test_simple:  return 1
2020-03-08T23:19:15.712 INFO:tasks.workunit.client.0.smithi192.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:35: run:  return 1

/a/sage-2020-03-08_21:18:15-rados:standalone-wip-sage2-testing-2020-03-08-1456-distro-basic-smithi/4838360

Related issues 3 (0 open, 3 closed)

Copied to RADOS - Backport #67349: squid: osd/osd-recovery-space.sh TEST_recovery_test_simple failure (Resolved, Nitzan Mordechai)
Copied to RADOS - Backport #67350: quincy: osd/osd-recovery-space.sh TEST_recovery_test_simple failure (Resolved, Nitzan Mordechai)
Copied to RADOS - Backport #67351: reef: osd/osd-recovery-space.sh TEST_recovery_test_simple failure (Resolved, Nitzan Mordechai)
Actions #2

Updated by Neha Ojha over 5 years ago

  • Priority changed from Urgent to High
Actions #3

Updated by Neha Ojha over 5 years ago

  • Priority changed from High to Normal
Actions #4

Updated by Laura Flores over 2 years ago

  • Tags set to test-failure

/a/yuriw-2023-11-01_21:37:41-rados-wip-yuri6-testing-2023-11-01-0745-reef-distro-default-smithi/7443892

Actions #5

Updated by Radoslaw Zarzynski over 2 years ago

The test is basically querying ceph status for error flags, so the symptom is pretty generic and likely there are many paths leading to it. It could be something new.
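
To illustrate why SEV and MSG come out as null in the trace in the description: jq prints null when the requested path is absent, so a status report with no PG_RECOVERY_FULL entry fails both comparisons. The following is a minimal sketch, assuming jq is installed. The helper name check_recovery_full and the sample JSON are made up for illustration and are not taken from the test script; jq -r is used here to drop the JSON quotes, whereas the script appears to do that through eval.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from osd-recovery-space.sh): repeats the two
# PG_RECOVERY_FULL checks seen in the trace against a saved
# `ceph status --format=json-pretty` file and prints the number of failures.
check_recovery_full() {
    local stat_file=$1
    local errors=0 sev msg

    # jq -r prints the literal string "null" when the path does not exist.
    sev=$(jq -r '.health.checks.PG_RECOVERY_FULL.severity' "$stat_file")
    if [ "$sev" != "HEALTH_ERR" ]; then
        echo "PG_RECOVERY_FULL severity $sev not HEALTH_ERR" >&2
        errors=$((errors + 1))
    fi

    msg=$(jq -r '.health.checks.PG_RECOVERY_FULL.summary.message' "$stat_file")
    if [ "$msg" != "Full OSDs blocking recovery: 1 pg recovery_toofull" ]; then
        echo "PG_RECOVERY_FULL message '$msg' mismatched" >&2
        errors=$((errors + 1))
    fi

    echo "$errors"
}

# Sample input, invented for illustration: no health checks are raised.
tmp=$(mktemp)
echo '{"health": {"status": "HEALTH_OK", "checks": {}}}' > "$tmp"
check_recovery_full "$tmp"
rm -f "$tmp"
```

With the sample input both checks fail, which matches the generic symptom described above: any path that leaves PG_RECOVERY_FULL unset produces the same two messages.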

Actions #6

Updated by Matan Breizman almost 2 years ago

/a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659542

Actions #7

Updated by Radoslaw Zarzynski almost 2 years ago

  • Assignee set to Nitzan Mordechai

Hi Nitzan, would you mind taking a look?

Actions #8

Updated by Nitzan Mordechai almost 2 years ago

  • Status changed from New to In Progress

From /a/yuriw-2024-04-16_23:25:35-rados-wip-yuriw-testing-20240416.150233-distro-default-smithi/7659542
we can see that the too-full flag is not set (yet?):

2024-04-17T03:56:53.852 INFO:tasks.workunit.client.0.smithi138.stdout:PG_STAT  OBJECTS  MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES    OMAP_BYTES*  OMAP_KEYS*  LOG  LOG_DUPS  DISK_LOG  STATE         STATE_STAMP                      VERSION  REPORTED  UP     UP_PRIMARY  ACTING  ACTING_PRIMARY  LAST_SCRUB  SCRUB_STAMP                      LAST_DEEP_SCRUB  DEEP_SCRUB_STAMP                 SNAPTRIMQ_LEN  LAST_SCRUB_DURATION  SCRUB_SCHEDULING                                            OBJECTS_SCRUBBED  OBJECTS_TRIMMED
2024-04-17T03:56:53.853 INFO:tasks.workunit.client.0.smithi138.stdout:1.0          600                   0         0          0        0  3072000            0           0  600         0       600  active+clean  2024-04-17T03:56:49.179149+0000   23'600   32:1842  [1,0]           1   [1,0]               1         0'0  2024-04-17T03:55:16.961244+0000              0'0  2024-04-17T03:55:16.961244+0000              0                    0  periodic scrub scheduled @ 2024-04-18T11:04:26.442350+0000                 0                0
2024-04-17T03:56:53.853 INFO:tasks.workunit.client.0.smithi138.stdout:
2024-04-17T03:56:53.853 INFO:tasks.workunit.client.0.smithi138.stdout:* NOTE: Omap statistics are gathered during deep scrub and may be inaccurate soon afterwards depending on utilization. See http://docs.ceph.com/en/latest/dev/placement-group/#omap-statistics for further details.
2024-04-17T03:56:53.853 INFO:tasks.workunit.client.0.smithi138.stderr:dumped pgs
2024-04-17T03:56:53.863 INFO:tasks.workunit.client.0.smithi138.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:67: wait_for_state:  return 1
2024-04-17T03:56:53.863 INFO:tasks.workunit.client.0.smithi138.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:136: TEST_recovery_test_simple:  ERRORS=0
2024-04-17T03:56:53.864 INFO:tasks.workunit.client.0.smithi138.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:137: TEST_recovery_test_simple:  ceph pg dump pgs
2024-04-17T03:56:53.864 INFO:tasks.workunit.client.0.smithi138.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:137: TEST_recovery_test_simple:  grep +recovery_toofull
2024-04-17T03:56:53.865 INFO:tasks.workunit.client.0.smithi138.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-recovery-space.sh:137: TEST_recovery_test_simple:  wc -l

osd-recovery-space.sh waits for the too-full state at this point:
# If this times out, we'll detected errors below
wait_for_recovery_toofull 30

But no 'too-full' flag ever appeared. The 600 objects had not been completely written to the OSDs, which is why the flag was never set.
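
The wait discussed above is a poll-with-timeout on the PG dump. Below is a minimal sketch of that pattern, not the script's actual implementation: wait_for_pg_state and the PG_DUMP_CMD override are hypothetical names, introduced only so the loop can be exercised without a cluster. On a real cluster PG_DUMP_CMD would be left at its default, ceph pg dump pgs.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: poll the PG dump until at least one PG line contains
# the given state, or give up after `timeout` seconds.
# PG_DUMP_CMD is an assumption of this sketch; it defaults to the real CLI.
PG_DUMP_CMD=${PG_DUMP_CMD:-"ceph pg dump pgs"}

wait_for_pg_state() {
    local state=$1 timeout=$2
    local elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # grep -c counts matching lines; -F matches the "+" literally.
        if [ "$($PG_DUMP_CMD 2>/dev/null | grep -cF -- "$state")" -gt 0 ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Demo without a cluster: a stand-in dump whose only PG is active+clean,
# as in the failing run, so the wait times out.
fake_dump() { echo "1.0 600 active+clean"; }
PG_DUMP_CMD=fake_dump
if wait_for_pg_state "+recovery_toofull" 2; then
    echo "recovery_toofull seen"
else
    echo "timed out waiting for recovery_toofull"
fi
```

A loop like this only reports whether the state showed up within the window. It cannot tell a state that is late from one that will never appear, which is why the cause had to be found in the PG dump itself.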

Actions #9

Updated by Nitzan Mordechai almost 2 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 57193
Actions #11

Updated by Radoslaw Zarzynski over 1 year ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to quincy,reef,squid
Actions #12

Updated by Upkeep Bot over 1 year ago

  • Copied to Backport #67349: squid: osd/osd-recovery-space.sh TEST_recovery_test_simple failure added
Actions #13

Updated by Upkeep Bot over 1 year ago

  • Copied to Backport #67350: quincy: osd/osd-recovery-space.sh TEST_recovery_test_simple failure added
Actions #14

Updated by Upkeep Bot over 1 year ago

  • Copied to Backport #67351: reef: osd/osd-recovery-space.sh TEST_recovery_test_simple failure added
Actions #15

Updated by Upkeep Bot over 1 year ago

  • Tags (freeform) set to backport_processed
Actions #16

Updated by Konstantin Shalygin over 1 year ago

  • Status changed from Pending Backport to Resolved
  • % Done changed from 0 to 100
Actions #17

Updated by Upkeep Bot 8 months ago

  • Merge Commit set to 1c8bea0cbb5b6e1ad587ccd0c168ada32602b641
  • Fixed In set to v19.3.0-3841-g1c8bea0cbb
  • Upkeep Timestamp set to 2025-07-15T01:19:13+00:00
Actions #18

Updated by Upkeep Bot 5 months ago

  • Released In set to v20.2.0~2350
  • Upkeep Timestamp changed from 2025-07-15T01:19:13+00:00 to 2025-11-01T01:34:45+00:00