Bug #62925

closed

cephfs-journal-tool: Add preventive measures in the tool to avoid corrupting a ceph file system

Added by Prashant D over 2 years ago. Updated 5 months ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: Code Hygiene
Target version:
% Done: 0%
Source: Community (dev)
Backport: reef,quincy,squid
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): tools
Labels (FS):
Pull request ID:
Tags (freeform):
Fixed In: v19.3.0-2449-g75bcfd1bbf
Released In: v20.2.0~2808
Upkeep Timestamp: 2025-11-01T01:12:04+00:00

Description

The cephfs-journal-tool should be used only by experts who know CephFS internals. Although the documentation at https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#recovery-from-missing-metadata-objects clearly warns against using cephfs-journal-tool to reset the journal without the CephFS team's advice, some users still try the tool without much thought, which can result in an MDS crash, as observed in https://tracker.ceph.com/issues/58878.

sh-4.4$ cephfs-journal-tool --rank ocs-storagecluster-cephfilesystem:0 event recover_dentries summary
Events by type:
  RESETJOURNAL: 1
Errors: 0
sh-4.4$ cephfs-journal-tool --rank ocs-storagecluster-cephfilesystem:0 journal reset
old journal was 8388608~48
new journal start will be 12582912 (4194256 bytes past old end)
writing journal head
writing EResetJournal entry
done
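
The numbers in this transcript can be checked with a few lines of Python. This is only a sketch under two assumptions that the ticket does not state: that "8388608~48" means offset~length, and that the new start is rounded up to the next 4 MiB boundary (the constant PERIOD below is assumed, not read from the cluster).

```python
# Assumed journal layout period (4 MiB); not taken from the ticket.
PERIOD = 4 * 1024 * 1024


def parse_extent(text):
    """Split an 'offset~length' string into two integers."""
    start, length = text.split("~")
    return int(start), int(length)


def next_boundary(offset, period=PERIOD):
    """Round offset up to the next multiple of period."""
    return -(-offset // period) * period


old_start, old_len = parse_extent("8388608~48")
old_end = old_start + old_len              # 8388656
new_start = next_boundary(old_end + 1)     # first boundary past the old end
print(new_start, new_start - old_end)      # 12582912 4194256
```

Under those assumptions the result matches the "new journal start will be 12582912 (4194256 bytes past old end)" line above.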

The tool should print a warning and prompt for confirmation before it resets the journal. In addition, cephfs-journal-tool should either refuse to run while the CephFS file system is online or print a clear warning when a user attempts to run it against a live file system, especially for the "event recover_dentries summary" command, which writes any inodes/dentries recoverable from the journal to the RADOS store.
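
The requested behaviour could look roughly like the following. This is a minimal sketch, not the change that was eventually merged: the function name may_reset_journal, its parameters, and the messages are all hypothetical, and how the tool would detect an online file system is deliberately left out.

```python
def may_reset_journal(fs_is_online, ask=input):
    """Return True only if a journal reset may proceed.

    fs_is_online: hypothetical flag supplied by the caller.
    ask: callable used to prompt the operator (defaults to input).
    """
    if fs_is_online:
        # Never touch the journal of a live file system.
        print("Refusing to reset the journal: the file system is online.")
        return False

    reply = ask(
        "Resetting the journal may lose metadata and corrupt the "
        "file system. Continue? [y/N] "
    )
    # Anything other than an explicit yes is treated as a refusal.
    return reply.strip().lower() in ("y", "yes")
```

Passing the prompt function as a parameter keeps the check testable, for example may_reset_journal(False, ask=lambda _: "yes") proceeds while an empty reply does not.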


Related issues 3 (0 open, 3 closed)

Copied to CephFS - Backport #66277: squid: cephfs-journal-tool: Add preventive measures in the tool to avoid corrupting a ceph file system (Resolved, Jos Collin)
Copied to CephFS - Backport #66278: reef: cephfs-journal-tool: Add preventive measures in the tool to avoid corrupting a ceph file system (Resolved, Jos Collin)
Copied to CephFS - Backport #66280: quincy: cephfs-journal-tool: Add preventive measures in the tool to avoid corrupting a ceph file system (Rejected, Jos Collin)
Actions #1

Updated by Prashant D over 2 years ago

  • Description updated (diff)
Actions #2

Updated by Venky Shankar over 2 years ago

  • Category set to Code Hygiene
  • Assignee set to Jos Collin
  • Target version set to v19.0.0
  • Backport set to reef,quincy
  • Component(FS) tools added

Prashant D wrote:

The cephfs-journal-tool should be used only by experts who know CephFS internals. Although the documentation at https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#recovery-from-missing-metadata-objects clearly warns against using cephfs-journal-tool to reset the journal without the CephFS team's advice, some users still try the tool without much thought, which can result in an MDS crash, as observed in https://tracker.ceph.com/issues/58878.

[...]

The tool should print a warning and prompt for confirmation before it resets the journal. In addition, cephfs-journal-tool should either refuse to run while the CephFS file system is online or print a clear warning when a user attempts to run it against a live file system, especially for the "event recover_dentries summary" command, which writes any inodes/dentries recoverable from the journal to the RADOS store.

Fair point.

Jos, please take this one.

Actions #3

Updated by Venky Shankar over 2 years ago

  • Priority changed from Normal to Urgent
Actions #4

Updated by Venky Shankar over 2 years ago

Bumping the priority since it's essential that we have this functionality as soon as possible.

Actions #5

Updated by Jos Collin about 2 years ago

  • Status changed from New to In Progress
Actions #6

Updated by Jos Collin about 2 years ago

  • Tracker changed from Feature to Bug
  • Source set to Community (dev)
  • Regression set to No
  • Severity set to 3 - minor
  • Pull request ID set to 55758
Actions #7

Updated by Jos Collin about 2 years ago

  • Status changed from In Progress to Fix Under Review
Actions #8

Updated by Venky Shankar about 2 years ago

  • Backport changed from reef,quincy to reef,quincy,squid
Actions #9

Updated by Venky Shankar almost 2 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Target version changed from v19.0.0 to v20.0.0
Actions #10

Updated by Upkeep Bot almost 2 years ago

  • Copied to Backport #66277: squid: cephfs-journal-tool: Add preventive measures in the tool to avoid corrupting a ceph file system added
Actions #11

Updated by Upkeep Bot almost 2 years ago

  • Copied to Backport #66278: reef: cephfs-journal-tool: Add preventive measures in the tool to avoid corrupting a ceph file system added
Actions #12

Updated by Upkeep Bot almost 2 years ago

  • Copied to Backport #66280: quincy: cephfs-journal-tool: Add preventive measures in the tool to avoid corrupting a ceph file system added
Actions #14

Updated by Jos Collin about 1 year ago

  • Status changed from Pending Backport to Resolved
Actions #15

Updated by Upkeep Bot 9 months ago

  • Merge Commit set to 75bcfd1bbf6cb3d32ce3b90ccdca1c03f835650e
  • Fixed In set to v19.3.0-2449-g75bcfd1bbf6
  • Upkeep Timestamp set to 2025-07-02T03:46:24+00:00
Actions #16

Updated by Upkeep Bot 8 months ago

  • Fixed In changed from v19.3.0-2449-g75bcfd1bbf6 to v19.3.0-2449-g75bcfd1bbf
  • Upkeep Timestamp changed from 2025-07-02T03:46:24+00:00 to 2025-07-14T16:45:49+00:00
Actions #17

Updated by Upkeep Bot 5 months ago

  • Released In set to v20.2.0~2808
  • Upkeep Timestamp changed from 2025-07-14T16:45:49+00:00 to 2025-11-01T01:12:04+00:00