Skip to content

tools/cephfs: recover alternate_name of dentries from journal#55792

Merged
batrick merged 6 commits intoceph:mainfrom
batrick:i64602
Jun 19, 2024
Merged

tools/cephfs: recover alternate_name of dentries from journal#55792
batrick merged 6 commits intoceph:mainfrom
batrick:i64602

Conversation

@batrick
Copy link
Member

@batrick batrick commented Feb 27, 2024

Right now the cephfs-journal-tool always uses the legacy encoding for dentries which will drop the alternate_name if ever set.

Fixes: https://tracker.ceph.com/issues/64602

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@batrick
Copy link
Member Author

batrick commented Feb 27, 2024

Obviously, I think tests need written for this but unfortunately that's not immediately easy to do because alternate_name is only easy to add within Client tests (src/test/client/) and not in a broader test where disaster recovery is also performed. One option is to add libcephfs/python bindings for a script but that's not trivial. Or a test-specific asok command. Any other thoughts?

@vshankar vshankar requested a review from a team February 28, 2024 13:47
@vshankar
Copy link
Contributor

Right now the cephfs-journal-tool always uses the legacy encoding for dentries which will drop the alternate_name if ever set.

Good catch!

@batrick
Copy link
Member Author

batrick commented Feb 28, 2024

Obviously, I think tests need written for this but unfortunately that's not immediately easy to do because alternate_name is only easy to add within Client tests (src/test/client/) and not in a broader test where disaster recovery is also performed. One option is to add libcephfs/python bindings for a script but that's not trivial. Or a test-specific asok command. Any other thoughts?

@vshankar @lxbsz is the fs:fscrypt suite stable enough for adding a test?

encode('I', dentry_bl);
encode('i', dentry_bl);
ENCODE_START(2, 1, dentry_bl);
encode(fb.alternate_name, dentry_bl);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really the encode order, where Inode has alternate_name at the front and Link has it at the back?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In both cases alternate_name is a piece of dentry metadata. As a primary link, ino/d_type are inferred from the inode that's embedded in the dentry. It doesn't really matter in any case (as inode encoding was also made versioned [1] where before it was not [2]) but made sense to everyone at the time.

[1] https://github.com/ceph/ceph/pull/37297/files#diff-c0b626494569135ea1406cfad7ab9929a4df0a1f7d880405b5d2e734db812a03R2207-R2241
[2] https://github.com/ceph/ceph/pull/37297/files#diff-c0b626494569135ea1406cfad7ab9929a4df0a1f7d880405b5d2e734db812a03L2285-L2309

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct.

@chrisphoffman
Copy link
Contributor

Obviously, I think tests need written for this but unfortunately that's not immediately easy to do because alternate_name is only easy to add within Client tests (src/test/client/) and not in a broader test where disaster recovery is also performed. One option is to add libcephfs/python bindings for a script but that's not trivial. Or a test-specific asok command. Any other thoughts?

I think asok could work in this case. How much encoding validation do we currently do?

Per this here [1] , this field is currently used for supporting long file names on clients who don't know about fscrypt. I proposed a custom client to validate security posture in [2]. Perhaps we add additional tests to this custom client, to further test when that becomes available.

1: https://lwn.net/Articles/843302/
2: #54725 (comment)

@batrick
Copy link
Member Author

batrick commented Mar 15, 2024

Obviously, I think tests need written for this but unfortunately that's not immediately easy to do because alternate_name is only easy to add within Client tests (src/test/client/) and not in a broader test where disaster recovery is also performed. One option is to add libcephfs/python bindings for a script but that's not trivial. Or a test-specific asok command. Any other thoughts?

I think asok could work in this case. How much encoding validation do we currently do?

I'd really rather not create an asok command just for synthetic manipulation of alternate_name of a dentry, if possible.

Per this here [1] , this field is currently used for supporting long file names on clients who don't know about fscrypt. I proposed a custom client to validate security posture in [2]. Perhaps we add additional tests to this custom client, to further test when that becomes available.

1: https://lwn.net/Articles/843302/ 2: #54725 (comment)

No actually alternate_name is only used for the encrypted file name if it's too long for a regular file name. It has a very narrow use-case (atm, samba is thinking about using it for caps-insensitive file names).

@lxbsz
Copy link
Member

lxbsz commented Mar 18, 2024

Obviously, I think tests need written for this but unfortunately that's not immediately easy to do because alternate_name is only easy to add within Client tests (src/test/client/) and not in a broader test where disaster recovery is also performed. One option is to add libcephfs/python bindings for a script but that's not trivial. Or a test-specific asok command. Any other thoughts?

@vshankar @lxbsz is the fs:fscrypt suite stable enough for adding a test?

@batrick Sorry for late reply and I think missed this.

Yeah it is. The #54725 is the last piece of the puzzle. But won't affect this tests IMO.

@batrick batrick force-pushed the i64602 branch 2 times, most recently from 5c49e33 to 5aa7484 Compare May 15, 2024 02:39
@github-actions github-actions bot added the tests label May 15, 2024
@batrick batrick force-pushed the i64602 branch 7 times, most recently from 69a8de6 to 4d9eb97 Compare May 16, 2024 22:05
@batrick
Copy link
Member Author

batrick commented May 17, 2024

Example failure with the new test:

2024-05-16T22:34:21.781 DEBUG:teuthology.orchestra.run.smithi044:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 300 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok --format=json dump tree /dir 0
...
[
  {
    "accounted_rstat": {
      "rbytes": 4096,
      "rctime": "2024-05-16T22:34:15.251864+0000",
      "rfiles": 1,
      "rsnaps": 0,
      "rsubdirs": 1,
      "version": 0
    },
    "atime": "2024-05-16T22:34:14.498880+0000",
    "auth_pins": 0,
    "auth_state": {
      "replicas": {}
    },
    "authlock": {},
    "backtrace_version": 14,
    "btime": "2024-05-16T22:34:14.498880+0000",
    "change_attr": 3,
    "client_caps": [
      {
        "client_id": 5184,
        "issued": "pAsLsXsFsx",
        "last_sent": 4,
        "pending": "pAsLsXsFsx",
        "wanted": "pAsLsXsFsxcral"
      }
    ],
    "client_ranges": [],
    "ctime": "2024-05-16T22:34:15.249864+0000",
    "damage_flags": 0,
    "dir_layout": {
      "dir_hash": 2,
      "unused1": 0,
      "unused2": 0,
      "unused3": 0
    },
    "dirfrags": [
      {
        "auth_pins": 0,
        "auth_state": {
          "replicas": {}
        },
        "committed_version": "0",
        "committing_version": "0",
        "dentries": [
          {
            "alternate_name": "bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5/YV7DtpeNPZnorcTqxPM/hExtWHSS4P+S+Dpwj62hMyh/77sGhiW1Filvv1gQjV+sN/GozPNwHgfleadkUs1OkRkYtgWrCjbKP0MayRtiOLrVTRuYyOp/Qt3+XCIyiS87B9bUcOFjWratF+yR0kpJ0RYriix7NKVkBJ0kGWYSCY+PYjiLeMYJBMQcCxW/nwfVku+m6fgFJvb6pjEFxIk9zT5cunSImsjr",
            "auth_pins": 0,
            "auth_state": {
              "replicas": {}
            },
            "inode": 1099511628283,
            "is_auth": true,
            "is_freezing": false,
            "is_frozen": false,
            "is_new": false,
            "is_null": false,
            "is_primary": true,
            "is_remote": false,
            "lock": {},
            "nref": 2,
            "path": "dir/bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5,YV7DtpeNPZnorcTqxPM,hExtWHSS4P+S+Dpwj62hMyh,77sGhiW1Filvv1gQjV+sN,GozPNwHgfleadkUuZ+PMLCaKQXhuid9WvmHanxJnaabYDLj4VEz+EX2WsG",
...
# fail + journal recovery
2024-05-16T22:35:31.077 DEBUG:teuthology.orchestra.run.smithi044:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 300 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok --format=json dump tree /dir 0
...
[
  {
    "accounted_rstat": {
      "rbytes": 4096,
      "rctime": "2024-05-16T22:34:15.251864+0000",
      "rfiles": 1,
      "rsnaps": 0,
      "rsubdirs": 1,
      "version": 0
    },
    "atime": "2024-05-16T22:34:14.498880+0000",
    "auth_pins": 0,
    "auth_state": {
      "replicas": {}
    },
    "authlock": {},
    "backtrace_version": 14,
    "btime": "2024-05-16T22:34:14.498880+0000",
    "change_attr": 3,
    "client_caps": [],
    "client_ranges": [],
    "ctime": "2024-05-16T22:34:15.249864+0000",
    "damage_flags": 0,
    "dir_layout": {
      "dir_hash": 2,
      "unused1": 0,
      "unused2": 0,
      "unused3": 0
    },
    "dirfrags": [
      {
        "auth_pins": 0,
        "auth_state": {
          "replicas": {}
        },
        "committed_version": "5",
        "committing_version": "5",
        "dentries": [
          {
            "alternate_name": "",
            "auth_pins": 0,
            "auth_state": {
              "replicas": {}
            },
            "inode": 1099511628283,
            "is_auth": true,
            "is_freezing": false,
            "is_frozen": false,
            "is_new": false,
            "is_null": false,
            "is_primary": true,
            "is_remote": false,
            "lock": {},
            "nref": 2,
            "path": "dir/bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5,YV7DtpeNPZnorcTqxPM,hExtWHSS4P+S+Dpwj62hMyh,77sGhiW1Filvv1gQjV+sN,GozPNwHgfleadkUuZ+PMLCaKQXhuid9WvmHanxJnaabYDLj4VEz+EX2WsG",

/teuthology/pdonnell-2024-05-16_22:05:38-fs:fscrypt-wip-pdonnell-testing-20240515.023933-debug-distro-default-smithi/7709888/teuthology.log

I'll polish this and cleanup the commits.

@batrick batrick force-pushed the i64602 branch 2 times, most recently from 816c2ae to 7cc6bdc Compare May 17, 2024 17:00
@batrick batrick requested a review from vshankar May 17, 2024 17:01
@batrick
Copy link
Member Author

batrick commented May 17, 2024

@lxbsz @vshankar PTAL

@batrick
Copy link
Member Author

batrick commented May 17, 2024


self.fs.fail()

self.fs.journal_tool(['event', 'recover_dentries', 'list'], 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm running this patch on my local dev machine. It gets to this line and fails during decode(). Have you run this on teuthology? Can you share a log for me to look at?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the example without the fix here:

#55792 (comment)

but I don't have a link on hand with this fix.

Are you running an old version of the journal-tool perhaps?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've ran it via vstart_runner within branch: i64602.

Also ran the command manually:
./bin/cephfs-journal-tool --debug-mds=20 --debug-ms=1 --debug-objecter=1 --rank cephfs:0 event recover_dentries list

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll run this through QA and see what we find.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So yes, there is a decode error: https://pulpito.ceph.com/pdonnell-2024-05-21_20:54:38-fs:fscrypt-wip-pdonnell-testing-20240521.172753-debug-distro-default-smithi/

It doesn't make much sense to me. I'll reproduce locally and gdb it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@batrick
Copy link
Member Author

batrick commented Jun 8, 2024

This PR is under test in https://tracker.ceph.com/issues/66402.

@batrick
Copy link
Member Author

batrick commented Jun 9, 2024

jenkins test make check arm64

batrick added a commit to batrick/ceph that referenced this pull request Jun 9, 2024
* refs/pull/55792/head:
	tools/cephfs: recover alternate_name of dentries from journal
	qa: add test to verify recovery of alternate_name from journal
	tools/cephfs/JournalTool: amend comment to be consistent with code
	tools/cephfs/JournalTool: add some more debugging
	mds: remove extraneous 0x in debug output
	mds: dump alternate_name to formatter
	mds: add warning about encoding new fields
@batrick
Copy link
Member Author

batrick commented Jun 9, 2024

jenkins test make check

@batrick
Copy link
Member Author

batrick commented Jun 10, 2024

I think this PR is ready for general QA. The added test now suceeds:

https://pulpito.ceph.com/pdonnell-2024-06-10_00:07:27-fs:fscrypt-wip-pdonnell-testing-20240608.193933-debug-distro-default-smithi/

except the job still fails due to a known issue https://tracker.ceph.com/issues/65136

Copy link
Contributor

@chrisphoffman chrisphoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

nits:

// Compose: Dentry format is dnfirst, [i|l], InodeStore(bare=true)

update comment, below bare=False

// Compose: Dentry format is dnfirst, [I|L], InodeStore(bare=true)

update comment, change case to i|l and bare=false

batrick added 6 commits June 10, 2024 12:52
It's never a good idea to put new fields in the middle of other encoded values
unless you're willing to break all backwards-compatibility.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Looked like:

    recover_dentries: updating root 0x0x1

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Test without the fix:

    2024-05-16T22:34:21.781 DEBUG:teuthology.orchestra.run.smithi044:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 300 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok --format=json dump tree /dir 0
    ...
    [
      {
        "accounted_rstat": {
          "rbytes": 4096,
          "rctime": "2024-05-16T22:34:15.251864+0000",
          "rfiles": 1,
          "rsnaps": 0,
          "rsubdirs": 1,
          "version": 0
        },
        "atime": "2024-05-16T22:34:14.498880+0000",
        "auth_pins": 0,
        "auth_state": {
          "replicas": {}
        },
        "authlock": {},
        "backtrace_version": 14,
        "btime": "2024-05-16T22:34:14.498880+0000",
        "change_attr": 3,
        "client_caps": [
          {
            "client_id": 5184,
            "issued": "pAsLsXsFsx",
            "last_sent": 4,
            "pending": "pAsLsXsFsx",
            "wanted": "pAsLsXsFsxcral"
          }
        ],
        "client_ranges": [],
        "ctime": "2024-05-16T22:34:15.249864+0000",
        "damage_flags": 0,
        "dir_layout": {
          "dir_hash": 2,
          "unused1": 0,
          "unused2": 0,
          "unused3": 0
        },
        "dirfrags": [
          {
            "auth_pins": 0,
            "auth_state": {
              "replicas": {}
            },
            "committed_version": "0",
            "committing_version": "0",
            "dentries": [
              {
                "alternate_name": "bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5/YV7DtpeNPZnorcTqxPM/hExtWHSS4P+S+Dpwj62hMyh/77sGhiW1Filvv1gQjV+sN/GozPNwHgfleadkUs1OkRkYtgWrCjbKP0MayRtiOLrVTRuYyOp/Qt3+XCIyiS87B9bUcOFjWratF+yR0kpJ0RYriix7NKVkBJ0kGWYSCY+PYjiLeMYJBMQcCxW/nwfVku+m6fgFJvb6pjEFxIk9zT5cunSImsjr",
                "auth_pins": 0,
                "auth_state": {
                  "replicas": {}
                },
                "inode": 1099511628283,
                "is_auth": true,
                "is_freezing": false,
                "is_frozen": false,
                "is_new": false,
                "is_null": false,
                "is_primary": true,
                "is_remote": false,
                "lock": {},
                "nref": 2,
                "path": "dir/bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5,YV7DtpeNPZnorcTqxPM,hExtWHSS4P+S+Dpwj62hMyh,77sGhiW1Filvv1gQjV+sN,GozPNwHgfleadkUuZ+PMLCaKQXhuid9WvmHanxJnaabYDLj4VEz+EX2WsG",
    ...
    # fail + journal recovery
    2024-05-16T22:35:31.077 DEBUG:teuthology.orchestra.run.smithi044:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 300 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok --format=json dump tree /dir 0
    ...
    [
      {
        "accounted_rstat": {
          "rbytes": 4096,
          "rctime": "2024-05-16T22:34:15.251864+0000",
          "rfiles": 1,
          "rsnaps": 0,
          "rsubdirs": 1,
          "version": 0
        },
        "atime": "2024-05-16T22:34:14.498880+0000",
        "auth_pins": 0,
        "auth_state": {
          "replicas": {}
        },
        "authlock": {},
        "backtrace_version": 14,
        "btime": "2024-05-16T22:34:14.498880+0000",
        "change_attr": 3,
        "client_caps": [],
        "client_ranges": [],
        "ctime": "2024-05-16T22:34:15.249864+0000",
        "damage_flags": 0,
        "dir_layout": {
          "dir_hash": 2,
          "unused1": 0,
          "unused2": 0,
          "unused3": 0
        },
        "dirfrags": [
          {
            "auth_pins": 0,
            "auth_state": {
              "replicas": {}
            },
            "committed_version": "5",
            "committing_version": "5",
            "dentries": [
              {
                "alternate_name": "",
                "auth_pins": 0,
                "auth_state": {
                  "replicas": {}
                },
                "inode": 1099511628283,
                "is_auth": true,
                "is_freezing": false,
                "is_frozen": false,
                "is_new": false,
                "is_null": false,
                "is_primary": true,
                "is_remote": false,
                "lock": {},
                "nref": 2,
                "path": "dir/bUHUwH9E8uiVaf8xZ+zONcB1CToj53x5aUUnKdnNj5U37zbh28l1AaWwHhbOT3HyzqKjmSKKW1o4odQJc7nF9xrKIB8D3b4qqb2Cs6s7t2106hHhQk5,YV7DtpeNPZnorcTqxPM,hExtWHSS4P+S+Dpwj62hMyh,77sGhiW1Filvv1gQjV+sN,GozPNwHgfleadkUuZ+PMLCaKQXhuid9WvmHanxJnaabYDLj4VEz+EX2WsG",

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Right now the cephfs-journal-tool always uses the legacy encoding for dentries
which will drop the alternate_name if ever set.

Fixes: 39f3440
Fixes: https://tracker.ceph.com/issues/64602
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
@batrick
Copy link
Member Author

batrick commented Jun 10, 2024

LGTM

nits:

// Compose: Dentry format is dnfirst, [i|l], InodeStore(bare=true)

update comment, below bare=False

// Compose: Dentry format is dnfirst, [I|L], InodeStore(bare=true)

update comment, change case to i|l and bare=false

Thanks for pointing that out, fixed!

@batrick
Copy link
Member Author

batrick commented Jun 10, 2024

jenkins test api

@batrick
Copy link
Member Author

batrick commented Jun 11, 2024

jenkins test make check arm64

@batrick
Copy link
Member Author

batrick commented Jun 11, 2024

This PR is under test in https://tracker.ceph.com/issues/66433.

@batrick
Copy link
Member Author

batrick commented Jun 13, 2024

This PR is under test in https://tracker.ceph.com/issues/66462.

batrick added a commit to batrick/ceph that referenced this pull request Jun 13, 2024
* refs/pull/55792/head:
	tools/cephfs: recover alternate_name of dentries from journal
	qa: add test to verify recovery of alternate_name from journal
	tools/cephfs/JournalTool: add some more debugging
	tools/cephfs/JournalTool: remove extraneous 0x in debug output
	mds: dump alternate_name to formatter
	mds: add warning about encoding new fields
@batrick
Copy link
Member Author

batrick commented Jun 19, 2024

2024-06-13T14:43:04.368 INFO:tasks.cephfs_test_runner:That alternate_name can be recovered from the journal. ... ok

from http://qa-proxy.ceph.com/teuthology/pdonnell-2024-06-13_12:32:20-fs-wip-pdonnell-testing-20240613.014923-debug-distro-default-smithi/7753792/teuthology.log

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants