client: handle snapped inode reply in add_updated_inode()#61838
client: handle snapped inode reply in add_updated_inode()#61838
Conversation
Calling
ceph_ll_lookup(cmount, root, "/", ...)
cause the MDS to receive client request as
handle_client_request client_request(client.44210:2 lookup #0x1// ...)
The MDS interprets this as a lookup on a snapped inode (snapdir in this case)
and replies back to the client with snap based inode metadata. Using this inode
to perform further lookups causes the client to segfault since the snapdir
component (metadata) isn't available.
Fixes: https://tracker.ceph.com/issues/69624
Signed-off-by: Venky Shankar <vshankar@redhat.com>
|
|
||
| /* treated as a request under a snapdir */ | ||
| ret = ceph_ll_lookup(cmount, slash, dir_name, &diri, &stx, CEPH_STATX_INO, 0, perms); | ||
| ASSERT_EQ(ret, -ENOENT); |
There was a problem hiding this comment.
FYI, note that using the parent inode (slash), this is treated as a request under a snap since // is interpreted as a snap request by the MDS.
|
@vshankar locally it's failing with coredump in mds: eats up the standby: culprit looks to be |
ha! I saw that failure initially when I used the reproducer and mentioned is here: https://tracker.ceph.com/issues/69624#note-4, but then I used the latest HEAD and didn't see it again, which made me believe that some my local changes were contributing to it. |
|
@dparmar18 can you track down the LoC where the MDS is crashing if you are able to reproduce this with latest HEAD? |
hmm, let me rebase and try with latest head, im using old repo |
| sprintf(dir_name, "dir_%d", mypid); | ||
| sprintf(dir_path, "/%s", dir_name); | ||
|
|
||
| ASSERT_EQ(ceph_mkdir(cmount, dir_path, 0777), 0); |
There was a problem hiding this comment.
do we need to create this dir? we are making use of dir_name instead of dir_path in the last lookup (because we want ENOENT), i guess we can just omit this.
There was a problem hiding this comment.
err, actually I wanted to create the file and check if we really get back an ENOENT.
There was a problem hiding this comment.
FWIW, I could no longer reproduce the crash with the reproducer from tracker using a custom built libcephfs including the suggested fix.
# ./lookup foo cephvol
root inode = 3
first lookup at / completed
Segmentation fault (core dumped)
# LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib ./lookup foo cephvol
root inode = 3
first lookup at / completed
ceph_ll_lookup failed: No such file or directory
dparmar18
left a comment
There was a problem hiding this comment.
LGTM, tested it locally and works fine.
|
Can we move forward with the next steps to get the changes merged soon? |
Need to run this through tests. Will be doing it this week. |
|
This PR is under test in https://tracker.ceph.com/issues/70201. |
|
Tests did not run over the weekend. Trigerred it again. |
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
I will check the path walk/traverse changes in that PR and see if this change still needs to be taken in. |
|
This is interesting. With the changes from #60746 the second lookup from the reproducer is successfully completed whereas it returned With just the changes from the current PR(excluding #60746): With the current code base including #60746: |
To confirm, I ran the test case from the PR against the current code base and it failed: |
|
@anoopcs9 Thx for the updates. I will get to this tomorrow (finishing up another priority bug). |
Yeah - that path_walk changes in the case-insensitivity PR is treating looking up "/" as success and the I think the request isn't even forwarded to the MDS. Question is, what's the correct/intended behaviour? |
|
The thing is if its possible to sneak in a request to the MDS with path as |
|
So, here what required from this change: The path_walk changes from the case-insentivity changes have transparently worked around the issue that this change was aiming to fix. The question about the intended behaviour to lookup What can be done in this change is to disallow operations if the |
Here's my take:
I'm not sure it is needed. |
|
@vshankar Should we proceed with this PR further? I don't think so. Also what to do with the underlying tracker #69624? |
I think this can be close this the issue is fixed by the path_walk changes for case-insentivity. |
|
I will fixup the tracker. |
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windowsjenkins test rook e2e