tools/cephfs: recover alternate_name of dentries from journal#55792
tools/cephfs: recover alternate_name of dentries from journal#55792
Conversation
|
Obviously, I think tests need written for this but unfortunately that's not immediately easy to do because |
Good catch! |
@vshankar @lxbsz is the |
| encode('I', dentry_bl); | ||
| encode('i', dentry_bl); | ||
| ENCODE_START(2, 1, dentry_bl); | ||
| encode(fb.alternate_name, dentry_bl); |
There was a problem hiding this comment.
Is this really the encode order, where Inode has alternate_name at the front and Link has it at the back?
There was a problem hiding this comment.
In both cases alternate_name is a piece of dentry metadata. As a primary link, ino/d_type are inferred from the inode that's embedded in the dentry. It doesn't really matter in any case (as inode encoding was also made versioned [1] where before it was not [2]) but made sense to everyone at the time.
[1] https://github.com/ceph/ceph/pull/37297/files#diff-c0b626494569135ea1406cfad7ab9929a4df0a1f7d880405b5d2e734db812a03R2207-R2241
[2] https://github.com/ceph/ceph/pull/37297/files#diff-c0b626494569135ea1406cfad7ab9929a4df0a1f7d880405b5d2e734db812a03L2285-L2309
I think asok could work in this case. How much encoding validation do we currently do? Per this here [1] , this field is currently used for supporting long file names on clients who don't know about fscrypt. I proposed a custom client to validate security posture in [2]. Perhaps we add additional tests to this custom client, to further test when that becomes available. |
I'd really rather not create an asok command just for synthetic manipulation of alternate_name of a dentry, if possible.
No actually |
@batrick Sorry for late reply and I think missed this. Yeah it is. The #54725 is the last piece of the puzzle. But won't affect this tests IMO. |
5c49e33 to
5aa7484
Compare
69a8de6 to
4d9eb97
Compare
|
Example failure with the new test: /teuthology/pdonnell-2024-05-16_22:05:38-fs:fscrypt-wip-pdonnell-testing-20240515.023933-debug-distro-default-smithi/7709888/teuthology.log I'll polish this and cleanup the commits. |
816c2ae to
7cc6bdc
Compare
|
|
||
| self.fs.fail() | ||
|
|
||
| self.fs.journal_tool(['event', 'recover_dentries', 'list'], 0) |
There was a problem hiding this comment.
I'm running this patch on my local dev machine. It gets to this line and fails during decode(). Have you run this on teuthology? Can you share a log for me to look at?
There was a problem hiding this comment.
I have the example without the fix here:
but I don't have a link on hand with this fix.
Are you running an old version of the journal-tool perhaps?
There was a problem hiding this comment.
I've ran it via vstart_runner within branch: i64602.
Also ran the command manually:
./bin/cephfs-journal-tool --debug-mds=20 --debug-ms=1 --debug-objecter=1 --rank cephfs:0 event recover_dentries list
There was a problem hiding this comment.
I'll run this through QA and see what we find.
There was a problem hiding this comment.
So yes, there is a decode error: https://pulpito.ceph.com/pdonnell-2024-05-21_20:54:38-fs:fscrypt-wip-pdonnell-testing-20240521.172753-debug-distro-default-smithi/
It doesn't make much sense to me. I'll reproduce locally and gdb it.
There was a problem hiding this comment.
This is fixed now @chrisphoffman . See:
The InodeStore needed encoding wrapping as well.
|
This PR is under test in https://tracker.ceph.com/issues/66402. |
|
jenkins test make check arm64 |
* refs/pull/55792/head: tools/cephfs: recover alternate_name of dentries from journal qa: add test to verify recovery of alternate_name from journal tools/cephfs/JournalTool: amend comment to be consistent with code tools/cephfs/JournalTool: add some more debugging mds: remove extraneous 0x in debug output mds: dump alternate_name to formatter mds: add warning about encoding new fields
|
jenkins test make check |
|
I think this PR is ready for general QA. The added test now suceeds: except the job still fails due to a known issue https://tracker.ceph.com/issues/65136 |
chrisphoffman
left a comment
There was a problem hiding this comment.
LGTM
nits:
ceph/src/tools/cephfs/JournalTool.cc
Line 902 in 95df0bc
update comment, below bare=False
ceph/src/tools/cephfs/JournalTool.cc
Line 964 in 95df0bc
update comment, change case to i|l and bare=false
It's never a good idea to put new fields in the middle of other encoded values unless you're willing to break all backwards-compatibility. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Looked like:
recover_dentries: updating root 0x0x1
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Test without the fix:
2024-05-16T22:34:21.781 DEBUG:teuthology.orchestra.run.smithi044:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 300 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok --format=json dump tree /dir 0
...
[
{
"accounted_rstat": {
"rbytes": 4096,
"rctime": "2024-05-16T22:34:15.251864+0000",
"rfiles": 1,
"rsnaps": 0,
"rsubdirs": 1,
"version": 0
},
"atime": "2024-05-16T22:34:14.498880+0000",
"auth_pins": 0,
"auth_state": {
"replicas": {}
},
"authlock": {},
"backtrace_version": 14,
"btime": "2024-05-16T22:34:14.498880+0000",
"change_attr": 3,
"client_caps": [
{
"client_id": 5184,
"issued": "pAsLsXsFsx",
"last_sent": 4,
"pending": "pAsLsXsFsx",
"wanted": "pAsLsXsFsxcral"
}
],
"client_ranges": [],
"ctime": "2024-05-16T22:34:15.249864+0000",
"damage_flags": 0,
"dir_layout": {
"dir_hash": 2,
"unused1": 0,
"unused2": 0,
"unused3": 0
},
"dirfrags": [
{
"auth_pins": 0,
"auth_state": {
"replicas": {}
},
"committed_version": "0",
"committing_version": "0",
"dentries": [
{
"alternate_name": "bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5/YV7DtpeNPZnorcTqxPM/hExtWHSS4P+S+Dpwj62hMyh/77sGhiW1Filvv1gQjV+sN/GozPNwHgfleadkUs1OkRkYtgWrCjbKP0MayRtiOLrVTRuYyOp/Qt3+XCIyiS87B9bUcOFjWratF+yR0kpJ0RYriix7NKVkBJ0kGWYSCY+PYjiLeMYJBMQcCxW/nwfVku+m6fgFJvb6pjEFxIk9zT5cunSImsjr",
"auth_pins": 0,
"auth_state": {
"replicas": {}
},
"inode": 1099511628283,
"is_auth": true,
"is_freezing": false,
"is_frozen": false,
"is_new": false,
"is_null": false,
"is_primary": true,
"is_remote": false,
"lock": {},
"nref": 2,
"path": "dir/bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5,YV7DtpeNPZnorcTqxPM,hExtWHSS4P+S+Dpwj62hMyh,77sGhiW1Filvv1gQjV+sN,GozPNwHgfleadkUuZ+PMLCaKQXhuid9WvmHanxJnaabYDLj4VEz+EX2WsG",
...
# fail + journal recovery
2024-05-16T22:35:31.077 DEBUG:teuthology.orchestra.run.smithi044:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 300 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok --format=json dump tree /dir 0
...
[
{
"accounted_rstat": {
"rbytes": 4096,
"rctime": "2024-05-16T22:34:15.251864+0000",
"rfiles": 1,
"rsnaps": 0,
"rsubdirs": 1,
"version": 0
},
"atime": "2024-05-16T22:34:14.498880+0000",
"auth_pins": 0,
"auth_state": {
"replicas": {}
},
"authlock": {},
"backtrace_version": 14,
"btime": "2024-05-16T22:34:14.498880+0000",
"change_attr": 3,
"client_caps": [],
"client_ranges": [],
"ctime": "2024-05-16T22:34:15.249864+0000",
"damage_flags": 0,
"dir_layout": {
"dir_hash": 2,
"unused1": 0,
"unused2": 0,
"unused3": 0
},
"dirfrags": [
{
"auth_pins": 0,
"auth_state": {
"replicas": {}
},
"committed_version": "5",
"committing_version": "5",
"dentries": [
{
"alternate_name": "",
"auth_pins": 0,
"auth_state": {
"replicas": {}
},
"inode": 1099511628283,
"is_auth": true,
"is_freezing": false,
"is_frozen": false,
"is_new": false,
"is_null": false,
"is_primary": true,
"is_remote": false,
"lock": {},
"nref": 2,
"path": "dir/bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5,YV7DtpeNPZnorcTqxPM,hExtWHSS4P+S+Dpwj62hMyh,77sGhiW1Filvv1gQjV+sN,GozPNwHgfleadkUuZ+PMLCaKQXhuid9WvmHanxJnaabYDLj4VEz+EX2WsG",
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Right now the cephfs-journal-tool always uses the legacy encoding for dentries which will drop the alternate_name if ever set. Fixes: 39f3440 Fixes: https://tracker.ceph.com/issues/64602 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Thanks for pointing that out, fixed! |
|
jenkins test api |
|
jenkins test make check arm64 |
|
This PR is under test in https://tracker.ceph.com/issues/66433. |
|
This PR is under test in https://tracker.ceph.com/issues/66462. |
* refs/pull/55792/head: tools/cephfs: recover alternate_name of dentries from journal qa: add test to verify recovery of alternate_name from journal tools/cephfs/JournalTool: add some more debugging tools/cephfs/JournalTool: remove extraneous 0x in debug output mds: dump alternate_name to formatter mds: add warning about encoding new fields
|
|
Right now the cephfs-journal-tool always uses the legacy encoding for dentries which will drop the alternate_name if ever set.
Fixes: https://tracker.ceph.com/issues/64602
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windowsjenkins test rook e2e