Bug #72000

closed

rgw: restore bugs

Added by J. Eric Ivancich 10 months ago. Updated about 1 month ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
% Done: 0%
Source:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Tags (freeform): backport_processed
Fixed In: v20.2.0-331-gaf76f477e6
Released In: v20.2.1~141
Upkeep Timestamp: 2026-04-03T01:41:33+00:00

Description

Failures in these tests:

test_restore_object_temporary
test_restore_object_permanent
test_read_through
test_restore_noncur_obj

From here: https://qa-proxy.ceph.com/teuthology/anuchaithra-2025-07-04_16:46:39-rgw-wip-anrao2-testing-2025-07-04-1031-distro-default-smithi/8369939/teuthology.log


Related issues 1 (0 open, 1 closed)

Copied to rgw - Backport #74205: tentacle: rgw: restore bugs (Status: Duplicate, Assignee: Soumya Koduri)
Actions #1

Updated by Soumya Koduri 10 months ago

Summing up our discussion on Slack: the failures seem to be specific to this particular branch. We can inspect the new commits in that branch to find the root cause. Otherwise, we may need to bisect to find the changes causing transition/restore to fail.

Actions #2

Updated by Soumya Koduri 9 months ago

  • Status changed from New to Closed

These failures were unrelated to the restore code changes that got merged, and they are addressed now.

Actions #3

Updated by Adam Emerson 8 months ago

  • Status changed from Closed to In Progress
Actions #4

Updated by Casey Bodley 8 months ago

From https://qa-proxy.ceph.com/teuthology/cbodley-2025-09-16_20:27:49-rgw-wip-63323-distro-default-gibba/8504100/teuthology.log :

2025-09-16T20:55:12.747 INFO:teuthology.orchestra.run.gibba033.stdout:FAILED s3tests_boto3/functional/test_s3.py::test_restore_object_permanent - A...
2025-09-16T20:55:12.747 INFO:teuthology.orchestra.run.gibba033.stdout:FAILED s3tests_boto3/functional/test_s3.py::test_restore_noncur_obj - assert ...
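
For anyone triaging similar runs, a minimal sketch of pulling the failed test names out of log lines like the two above. It assumes only the `FAILED <file>::<test>` shape visible in this excerpt; the helper name `failed_tests` is hypothetical and not part of teuthology or s3-tests.

```python
import re

# Matches pytest summary lines of the shape seen in the excerpt:
# "...stdout:FAILED path/to/test_s3.py::test_name - <truncated reason>"
FAILED_RE = re.compile(r"FAILED\s+(\S+?)::(\S+)")


def failed_tests(lines):
    """Return the unique failed test names, in first-seen order."""
    seen = []
    for line in lines:
        match = FAILED_RE.search(line)
        if match and match.group(2) not in seen:
            seen.append(match.group(2))
    return seen


sample = [
    "2025-09-16T20:55:12.747 INFO:teuthology.orchestra.run.gibba033.stdout:"
    "FAILED s3tests_boto3/functional/test_s3.py::test_restore_object_permanent - A...",
    "2025-09-16T20:55:12.747 INFO:teuthology.orchestra.run.gibba033.stdout:"
    "FAILED s3tests_boto3/functional/test_s3.py::test_restore_noncur_obj - assert ...",
]

print(failed_tests(sample))
```

Feeding it the lines of a downloaded `teuthology.log` gives a quick list to compare against the tests named in the description.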
Actions #5

Updated by Soumya Koduri 8 months ago

While working on https://github.com/ceph/s3-tests/pull/686 , I found an issue with the current `qa/tasks/rgw_cloudtier.py`: not all of the RGW services were being restarted after the CLOUDTIER storage class was configured. As a result, the second RGW server (client.1) was skipping or failing transition/restore requests, which may have caused the spurious delayed transitions or errors. I am fixing it as part of https://github.com/ceph/ceph/pull/64933 . Hopefully the tests will stabilize now.
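
To illustrate the idea only (this is not the actual `qa/tasks/rgw_cloudtier.py` code or the change in the PR), here is a rough sketch of restarting every RGW daemon after a config change instead of just one. `FakeDaemon` and `restart_all_rgw` are hypothetical stand-ins; the real task uses teuthology's own daemon handles.

```python
from dataclasses import dataclass


@dataclass
class FakeDaemon:
    """Hypothetical stand-in for an RGW daemon handle."""
    client: str
    restarts: int = 0

    def restart(self) -> None:
        # A real handle would stop and start the radosgw process here.
        self.restarts += 1


def restart_all_rgw(daemons):
    """Restart every daemon so each one picks up the new storage class."""
    restarted = []
    for daemon in daemons:
        daemon.restart()
        restarted.append(daemon.client)
    return restarted


daemons = [FakeDaemon("client.0"), FakeDaemon("client.1")]
restarted = restart_all_rgw(daemons)
print(restarted)
```

The point is simply that the loop covers client.1 as well as client.0, so no RGW instance keeps serving requests with the old zonegroup placement configuration.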

We have also opened https://tracker.ceph.com/issues/72877 to further improve the checks in the restore tests (if needed).

Actions #6

Updated by Soumya Koduri 8 months ago

https://github.com/ceph/s3-tests/pull/686 has been merged, and the cloud-restore tests are now passing consistently. I’ll still keep the tracker open for a while longer to further confirm.

Actions #7

Updated by J. Eric Ivancich 8 months ago

Thank you, @Soumya Koduri !

Actions #8

Updated by Soumya Koduri 7 months ago

One more intermittent failure was observed with the "test_read_through" test case. PRs https://github.com/ceph/ceph/pull/65926 & https://github.com/ceph/s3-tests/pull/701 fix it.

Actions #10

Updated by Soumya Koduri 6 months ago · Edited

Hi Eric,

This bug is now fixed in main with the last PRs merged (mentioned in comment #8). However, they still need to be backported to tentacle, which is why the QE run above on the tentacle branch failed.

I raised https://github.com/ceph/ceph/pull/65830#issuecomment-3381527681 for this, but QE needs to pick up my s3-tests branch linked to that PR: https://github.com/soumyakoduri/s3-tests/commits/wip-skoduri-tentacle/

I already ran the restore tests and updated the PR: https://github.com/ceph/ceph/pull/65830#issuecomment-3432162852 . If no other failures are seen, I think this PR can be merged. I can then cherry-pick the s3-tests commits to the tentacle branch as well.

Actions #11

Updated by Adam Emerson 5 months ago

Is this in progress? I see the failure on tentacle, and if there's a fix for it I'd like to cherry-pick it.

Actions #12

Updated by Adam Emerson 5 months ago

  • Pull request ID set to 65830
Actions #13

Updated by Adam Emerson 5 months ago

  • Status changed from In Progress to Pending Backport
Actions #14

Updated by Adam Emerson 5 months ago

I see the PR, I'll cherry-pick.

Actions #15

Updated by Upkeep Bot 5 months ago

Actions #16

Updated by Upkeep Bot 5 months ago

  • Tags (freeform) set to backport_processed
Actions #17

Updated by Adam Emerson 5 months ago

Never mind, I got confused by the tag. The tentacle I was testing just hadn't incorporated all of the fixes, it looks like.

Actions #18

Updated by Adam Emerson 5 months ago

  • Backport deleted (tentacle)

Updating: this /is/ the port to tentacle; there is no backport.

Actions #19

Updated by Adam Emerson 5 months ago

  • Status changed from Pending Backport to Resolved
Actions #20

Updated by Upkeep Bot 5 months ago

  • Merge Commit set to af76f477e671f44b7fbb0c818011b950ab544931
  • Fixed In set to v20.2.0-331-gaf76f477e6
  • Upkeep Timestamp set to 2025-12-13T00:52:27+00:00
Actions #21

Updated by Upkeep Bot about 1 month ago

  • Released In set to v20.2.1~141
  • Upkeep Timestamp changed from 2025-12-13T00:52:27+00:00 to 2026-04-03T01:41:33+00:00