Bug #62925
cephfs-journal-tool: Add preventive measures in the tool to avoid corrupting a Ceph file system
Description
The cephfs-journal-tool should be used only by experts who know CephFS internals. Although the documentation at https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#recovery-from-missing-metadata-objects clearly warns against using cephfs-journal-tool to reset the journal without the CephFS team's advice, some users still try the tool without much thought, which can result in an MDS crash, as observed in https://tracker.ceph.com/issues/58878.
sh-4.4$ cephfs-journal-tool --rank ocs-storagecluster-cephfilesystem:0 event recover_dentries summary
Events by type:
  RESETJOURNAL: 1
Errors: 0
sh-4.4$ cephfs-journal-tool --rank ocs-storagecluster-cephfilesystem:0 journal reset
old journal was 8388608~48
new journal start will be 12582912 (4194256 bytes past old end)
writing journal head
writing EResetJournal entry
done
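The offsets printed by the reset are consistent with the new journal start being the old journal end rounded up to the next object boundary. The following is a minimal sketch of that arithmetic, assuming a 4 MiB (4194304-byte) object size, which the output above does not state; the function name is illustrative and is not part of the tool.

```python
# Assumed object size (4 MiB); the report does not state it.
OBJECT_SIZE = 4 * 1024 * 1024


def next_journal_start(old_start, old_len, object_size=OBJECT_SIZE):
    """Return (new_start, gap).

    new_start is the first object boundary strictly past the old journal end,
    and gap is the number of bytes between the old end and new_start.
    """
    old_end = old_start + old_len
    new_start = (old_end // object_size + 1) * object_size
    return new_start, new_start - old_end


# "old journal was 8388608~48" means start offset 8388608, length 48.
new_start, gap = next_journal_start(8388608, 48)
print(new_start, gap)  # 12582912 4194256
```

Under that assumption, the result matches the reported "new journal start will be 12582912 (4194256 bytes past old end)".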
The tool should print a warning and prompt for confirmation before it resets the journal. In addition, cephfs-journal-tool should either refuse to run while the file system is online or print a clear warning when a user runs it against a live CephFS, particularly for the "event recover_dentries summary" command, which writes any inodes/dentries recoverable from the journal to the RADOS store.
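The requested behavior could be sketched roughly as follows. This is only an illustration of the idea, not the change that was merged: the function names, the warning text, and the shape of the `mdsmap` argument (a dict whose "up" entry maps ranks to running daemons) are all assumptions made for this example.

```python
def filesystem_is_online(mdsmap):
    """Treat the file system as online if any MDS rank is listed as up.

    Assumes a hypothetical input shape: a dict with an "up" mapping.
    """
    return bool(mdsmap.get("up"))


def confirm(prompt, read_line=input):
    """Ask a yes/no question; anything other than y/yes counts as no."""
    answer = read_line(prompt + " [y/N]: ").strip().lower()
    return answer in ("y", "yes")


def guard_journal_command(args, mdsmap, read_line=input, warn=print):
    """Return True only if the given subcommand may proceed."""
    is_reset = args[:2] == ["journal", "reset"]
    writes_dentries = args[:2] == ["event", "recover_dentries"]
    if not (is_reset or writes_dentries):
        return True  # other subcommands are not guarded in this sketch

    if filesystem_is_online(mdsmap):
        warn("WARNING: the file system appears to be online.")
        if not confirm("Run against a live file system anyway?", read_line):
            return False

    if is_reset:
        warn("WARNING: resetting the journal can lead to an MDS crash.")
        return confirm("Really reset the journal?", read_line)
    return True
```

For example, `guard_journal_command(["journal", "reset"], {"up": {}}, read_line=lambda _: "n")` returns `False`, so a wrapper built on this sketch would stop before the reset. Passing `read_line` and `warn` as parameters keeps the check easy to exercise without a terminal.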
Updated by Venky Shankar over 2 years ago
- Category set to Code Hygiene
- Assignee set to Jos Collin
- Target version set to v19.0.0
- Backport set to reef,quincy
- Component(FS) tools added
Prashant D wrote:
The cephfs-journal-tool should be used only by experts who know CephFS internals. Although the documentation at https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#recovery-from-missing-metadata-objects clearly warns against using cephfs-journal-tool to reset the journal without the CephFS team's advice, some users still try the tool without much thought, which can result in an MDS crash, as observed in https://tracker.ceph.com/issues/58878.
[...]
The tool should print a warning and prompt for confirmation before it resets the journal. In addition, cephfs-journal-tool should either refuse to run while the file system is online or print a clear warning when a user runs it against a live CephFS, particularly for the "event recover_dentries summary" command, which writes any inodes/dentries recoverable from the journal to the RADOS store.
Fair point.
Jos, please take this one.
Updated by Venky Shankar over 2 years ago
- Priority changed from Normal to Urgent
Updated by Venky Shankar over 2 years ago
Bumping priority since it's essential that we have this functionality ASAP.
Updated by Jos Collin about 2 years ago
- Tracker changed from Feature to Bug
- Source set to Community (dev)
- Regression set to No
- Severity set to 3 - minor
- Pull request ID set to 55758
Updated by Jos Collin about 2 years ago
- Status changed from In Progress to Fix Under Review
Updated by Venky Shankar about 2 years ago
- Backport changed from reef,quincy to reef,quincy,squid
Updated by Venky Shankar almost 2 years ago
- Status changed from Fix Under Review to Pending Backport
- Target version changed from v19.0.0 to v20.0.0
Updated by Upkeep Bot almost 2 years ago
- Copied to Backport #66277: squid: cephfs-journal-tool: Add preventive measures in the tool to avoid corrupting a Ceph file system added
Updated by Upkeep Bot almost 2 years ago
- Copied to Backport #66278: reef: cephfs-journal-tool: Add preventive measures in the tool to avoid corrupting a Ceph file system added
Updated by Upkeep Bot almost 2 years ago
- Copied to Backport #66280: quincy: cephfs-journal-tool: Add preventive measures in the tool to avoid corrupting a Ceph file system added
Updated by Jos Collin about 1 year ago
- Status changed from Pending Backport to Resolved
Updated by Upkeep Bot 9 months ago
- Merge Commit set to 75bcfd1bbf6cb3d32ce3b90ccdca1c03f835650e
- Fixed In set to v19.3.0-2449-g75bcfd1bbf6
- Upkeep Timestamp set to 2025-07-02T03:46:24+00:00
Updated by Upkeep Bot 8 months ago
- Fixed In changed from v19.3.0-2449-g75bcfd1bbf6 to v19.3.0-2449-g75bcfd1bbf
- Upkeep Timestamp changed from 2025-07-02T03:46:24+00:00 to 2025-07-14T16:45:49+00:00
Updated by Upkeep Bot 5 months ago
- Released In set to v20.2.0~2808
- Upkeep Timestamp changed from 2025-07-14T16:45:49+00:00 to 2025-11-01T01:12:04+00:00