Bug #20471

closed

Can't repair corrupt object info due to bad oid on all replicas

Added by David Zafman over 8 years ago. Updated about 1 year ago.

Status:
Resolved
Priority:
Urgent
Assignee:
David Zafman
Category:
Scrub/Repair
Target version:
-
% Done:

0%

Source:
Development
Backport:
jewel
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Tags (freeform):
Merge Commit:
Fixed In:
Released In:
Upkeep Timestamp:

Description

We detect a kind of corruption where the oid in the object info doesn't match the oid of the object. This was added at the request of http://tracker.ceph.com/issues/18409 in commit 9614ab556ca8e4e5daec1e71d9b6032633ba21a0.

I didn't anticipate the case where all versions of the object info have the same error. This would make it unrepairable.

According to Sam: "This came up on the sepia cluster due (presumably) to an xfs bug in the murky past." Assuming that this is caused by a bug in older xfs code, we could add code to fix the object info and move on.
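To illustrate why this case is unrepairable by ordinary means, here is a minimal sketch in Python rather than the actual OSD code. The `Replica` class and both helper functions are hypothetical names made up for this example. It assumes that repair normally needs at least one replica whose object info is intact, and that the proposed fix simply rewrites the stored oid to match the object's own oid.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    """Hypothetical model of one replica of an object (not a Ceph type)."""
    osd: int
    object_oid: str  # oid the object is actually stored under
    info_oid: str    # oid recorded inside the object info


def pick_authoritative(replicas: List[Replica]) -> Optional[Replica]:
    """Return a replica whose object info oid matches, or None if all are bad."""
    for r in replicas:
        if r.info_oid == r.object_oid:
            return r
    return None


def fix_object_info_oid(replicas: List[Replica]) -> List[int]:
    """Rewrite the oid in the object info on every mismatching replica.

    Returns the OSD ids that were fixed. This mirrors the idea of
    "fix the object info and move on"; it is an assumption, not the real patch.
    """
    fixed = []
    for r in replicas:
        if r.info_oid != r.object_oid:
            r.info_oid = r.object_oid
            fixed.append(r.osd)
    return fixed


# All three replicas carry the same wrong oid, so no good copy exists.
replicas = [Replica(osd, "1:602f83fe:::foo:head", "1:602f83fe:::boo:head")
            for osd in (0, 1, 2)]
no_good_copy = pick_authoritative(replicas) is None
fixed_osds = fix_object_info_oid(replicas)
```

In this sketch `pick_authoritative` finds nothing to copy from, which is the situation described above, while `fix_object_info_oid` corrects each replica in place without needing a good copy.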


Related issues 1 (0 open, 1 closed)

Copied to RADOS - Backport #23181: jewel: Can't repair corrupt object info due to bad oid on all replicas (Resolved, David Zafman)
Actions #1

Updated by David Zafman over 8 years ago

  • Status changed from 12 to In Progress
2017-06-30 15:55:46.845656 osd.1 [INF] 1.6 scrub starts
2017-06-30 15:55:46.847427 osd.0 [ERR] osd.0 found object info error on pg 1.6 oid 1:602f83fe:::foo:head oid in object info: 1:602f83fe:::boo:head...repaired
2017-06-30 15:55:46.847427 osd.1 [ERR] osd.1 found object info error on pg 1.6 oid 1:602f83fe:::foo:head oid in object info: 1:602f83fe:::boo:head...repaired
2017-06-30 15:55:46.847433 osd.2 [ERR] osd.2 found object info error on pg 1.6 oid 1:602f83fe:::foo:head oid in object info: 1:602f83fe:::boo:head...repaired
2017-06-30 15:55:46.849443 osd.1 [INF] 1.6 scrub ok
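For anyone checking cluster logs for this repair, the `[ERR]` lines above can be picked apart with a short script. This is a sketch only: the regular expression is inferred from the three sample lines shown here, not from a documented log format, and `parse_repair_line` is a hypothetical helper.

```python
import re
from typing import Optional

# Pattern inferred from the sample log lines; treat it as an assumption.
REPAIR_RE = re.compile(
    r"(osd\.\d+) found object info error on pg (\S+) "
    r"oid (\S+) oid in object info: (\S+?)\.\.\.repaired"
)


def parse_repair_line(line: str) -> Optional[dict]:
    """Extract the OSD, pg, real oid and bad oid from one repair message."""
    m = REPAIR_RE.search(line)
    if m is None:
        return None
    osd, pg, oid, info_oid = m.groups()
    return {"osd": osd, "pg": pg, "oid": oid, "info_oid": info_oid}


sample = ("2017-06-30 15:55:46.847427 osd.0 [ERR] osd.0 found object info "
          "error on pg 1.6 oid 1:602f83fe:::foo:head oid in object info: "
          "1:602f83fe:::boo:head...repaired")
parsed = parse_repair_line(sample)
```

Lines that do not match, such as the `scrub starts` and `scrub ok` messages, return `None`.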
Actions #2

Updated by David Zafman over 8 years ago

  • Status changed from In Progress to Fix Under Review
Actions #3

Updated by Sage Weil over 8 years ago

  • Status changed from Fix Under Review to 7
Actions #4

Updated by Sage Weil over 8 years ago

  • Status changed from 7 to Resolved
Actions #5

Updated by Nathan Cutler about 8 years ago

  • Status changed from Resolved to Pending Backport
  • Backport set to jewel
Actions #6

Updated by Nathan Cutler about 8 years ago

  • Copied to Backport #23181: jewel: Can't repair corrupt object info due to bad oid on all replicas added
Actions #7

Updated by Nathan Cutler almost 8 years ago

  • Status changed from Pending Backport to Resolved
Actions #8

Updated by Ronen Friedman about 1 year ago

I will be removing this fix in Tentacle (2025).
The problem fixed was very specific and unique, and the limited solution
provided does not justify having it run forever in every scrub.
