
mds: session in the importing state cannot be cleared if an export subtree task is interrupted while the state of importer is acking (#51766)

Merged
vshankar merged 6 commits into ceph:main from zhsgao:mds_clear_session_failed on Nov 14, 2024

Conversation


@zhsgao zhsgao commented May 26, 2023

When the importer's state is acking, the related sessions on the importer are in the importing state (Session::is_importing returns true). If the exporter restarts at this point, Migrator::import_reverse, called from MDCache::handle_resolve, should reverse the import and clear the importing state, but because of a bug it does not. As a result, these sessions cannot be cleared when the client is unmounted (evicted or timed out) until the MDS is restarted.

Fixes: https://tracker.ceph.com/issues/61459
Signed-off-by: Zhansong Gao zhsgao@hotmail.com

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Available Jenkins commands:
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@github-actions github-actions bot added the cephfs (Ceph File System) and tests labels May 26, 2023
@zhsgao zhsgao force-pushed the mds_clear_session_failed branch from 9620f7b to d323023 on May 26, 2023 06:38
@dparmar18 dparmar18 requested a review from a team May 29, 2023 11:02

case IMPORT_ACKING:
ceph_assert(dir);
// hrm. make this an ambiguous import, and wait for exporter recovery to disambiguate
Contributor:

this line says that the export will do the disambiguation. You mentioned "should reverse the process to clear the importing state if the exporter restarts at this time, but it doesn't do that actually". So an exporter once restarted, doesn't perform recovery and thus no disambiguation, is it?

Contributor Author:

> this line says that the export will do the disambiguation. You mentioned "should reverse the process to clear the importing state if the exporter restarts at this time, but it doesn't do that actually". So an exporter once restarted, doesn't perform recovery and thus no disambiguation, is it?

Yes, it is. import_reverse can reverse the process when the state is IMPORT_ACKING. But import_reverse has a bug and we need to fix it.

Contributor:

If that is the case then I feel rather than changing the way we handle the cases, we should address the bug in import_reverse.

Contributor Author:

> If that is the case then I feel rather than changing the way we handle the cases, we should address the bug in import_reverse.

The bug in import_reverse is that it contains code to handle the IMPORT_ACKING state, but that code can never run because the state is changed to IMPORT_ABORTING at the very beginning of the function. Moving stat.state = IMPORT_ABORTING to the end of import_reverse lets it handle the IMPORT_ACKING state. We then call import_reverse when the state is IMPORT_ACKING and the process needs to be reversed.
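The dead-code pattern being described can be sketched with a simplified, hypothetical model (illustrative names only; the real code is C++ in src/mds/Migrator.cc):

```python
# Hypothetical model of the ordering bug in import_reverse.
# Not the actual Ceph Migrator code; names mirror the discussion.

IMPORT_ACKING = "acking"
IMPORT_ABORTING = "aborting"

class ImportState:
    def __init__(self, state):
        self.state = state
        self.importing_cleared = False

def import_reverse_buggy(stat):
    stat.state = IMPORT_ABORTING          # state changed up front...
    if stat.state == IMPORT_ACKING:       # ...so this branch is dead code
        stat.importing_cleared = True

def import_reverse_fixed(stat):
    if stat.state == IMPORT_ACKING:       # inspect the state first
        stat.importing_cleared = True     # clear per-session importing flags
    stat.state = IMPORT_ABORTING          # only then mark the abort

buggy = ImportState(IMPORT_ACKING)
import_reverse_buggy(buggy)
print(buggy.importing_cleared)   # False: sessions stay "importing" forever

fixed = ImportState(IMPORT_ACKING)
import_reverse_fixed(fixed)
print(fixed.importing_cleared)   # True: importing state is cleared
```

The fix in the PR is exactly this reordering: the assignment to IMPORT_ABORTING moves below the IMPORT_ACKING handling.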

Member:

I do not believe this change is correct. The comment is correct. It depends if EExport is logged by the exporter after failover to mark the importer as auth. If that event is not logged, then the exporter retains auth. This is cleaned up in:

ceph/src/mds/MDCache.cc

Lines 3270 to 3272 in d146d2b

if (claimed_by_sender) {
dout(7) << "ambiguous import failed on " << *dir << dendl;
migrator->import_reverse(dir);

Note that the import may be finished otherwise! So yes, the subtree needs to be marked ambiguous here.

Your other change to fix import_reverse is correct though!

Contributor Author:

> I do not believe this change is correct. The comment is correct. It depends if EExport is logged by the exporter after failover to mark the importer as auth. If that event is not logged, then the exporter retains auth. This is cleaned up in:
>
> ceph/src/mds/MDCache.cc
>
> Lines 3270 to 3272 in d146d2b
>
> if (claimed_by_sender) {
> dout(7) << "ambiguous import failed on " << *dir << dendl;
> migrator->import_reverse(dir);
>
> Note that the import may be finished otherwise! So yes, the subtree needs to be marked ambiguous here.
>
> Your other change to fix import_reverse is correct though!

Yes, you are right. I'm sorry I didn't read this part of the code carefully before. And we only need to modify import_reverse to fix this bug, so I have removed this part of the change.

Member @batrick left a comment:

Few recommended changes:

  • Test case should be a parent commit (i.e. before) the fix. This makes testing the fix before and after easier.
  • The commit "Add importing_count for session dump." should be "mds: add importing_count to session dump". This can go before the qa: commit so the test works.

def test_clear_session(self):
source = 0
target = 1
kill = 9
Member:

comment this magic number

Member:

Importantly to confirm this fix: also test kill = 11.

Contributor Author:

> comment this magic number

I used more human readable variable names.

Contributor:

I think Patrick meant to test both 9 and 11? I see it's now changed to 11.

Contributor Author:

> Importantly to confirm this fix: also test kill = 11.

I added a new test case for it, but it cannot trigger this bug.

Member:

ceph_assert(g_conf()->mds_kill_export_at != 10);

The event will not be persisted yet by the time this assert is hit (the assert really should be before the journal entry is submitted, otherwise it's racy). Since EExport will not be logged by the exporter, the importer should be forced to reverse the import during recovery.

Contributor Author:

> ceph_assert(g_conf()->mds_kill_export_at != 10);
>
> The event will not be persisted yet by the time this assert is hit (the assert really should be before the journal entry is submitted, otherwise it's racy). Since EExport will not be logged by the exporter, the importer should be forced to reverse the import during recovery.

I thought so too, until I tested it in my test environment. I found the replay log of EExport in the exporter's log after the test was done, which means that EExport had been logged. I tested it multiple times and got the same result every time. After I added sleep code in MDLog::_submit_thread and tested again, the EExport replay log no longer appeared in the exporter's log.

Member:

Right, that's what I meant by it being racy. The assert should really be before the journal entry submit.
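The race being discussed can be modeled deterministically: if the kill point fires only after the event has been handed to the journal, EExport may already be durable; checking before submission guarantees it is not. This is an illustrative Python sketch with hypothetical names, not the actual MDLog/Migrator code:

```python
# Illustrative model of a kill point placed before vs. after journal
# submission. Hypothetical names; the real code is C++ (MDLog/Migrator).

class Journal:
    def __init__(self):
        self.persisted = []

    def submit(self, event):
        # In the real MDS the write happens on a separate _submit_thread,
        # so it may or may not complete before the process dies (the race).
        self.persisted.append(event)

class KillPoint(Exception):
    pass

def log_export_racy(journal, kill):
    journal.submit("EExport")       # entry already handed to the journal...
    if kill:
        raise KillPoint()           # ...so EExport may be durable after all

def log_export_safe(journal, kill):
    if kill:
        raise KillPoint()           # die strictly before the entry is queued
    journal.submit("EExport")

j = Journal()
try:
    log_export_safe(j, kill=True)
except KillPoint:
    pass
print(j.persisted)    # []: with the check first, EExport is never logged

j2 = Journal()
try:
    log_export_racy(j2, kill=True)
except KillPoint:
    pass
print(j2.persisted)   # ['EExport']: entry was queued before the kill
```

Moving the assert before the submit, as the PR does, makes the kill-point behavior deterministic in the same way.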

Member:

(You should fix that in this PR!)

Contributor Author:

> (You should fix that in this PR!)

OK, I've added a new commit to fix it.



@zhsgao zhsgao force-pushed the mds_clear_session_failed branch from d323023 to 27008e3 on June 6, 2023 12:44
@dparmar18 (Contributor):

Your third commit is empty; the code to fix import_reverse was added in the qa commit. You'd also need to change the third commit message (the explanation that stat.state = IMPORT_ABORTING; at the beginning would be troublesome for handling case IMPORT_ACKING).

@zhsgao (Contributor Author) commented Jun 6, 2023:

> Few recommended changes:
>
>   • Test case should be a parent commit (i.e. before) the fix. This makes testing the fix before and after easier.
>   • Add importing_count for session dump. should be mds: add importing_count to session dump. This can go before the qa: commit so the test works.

OK, I've modified them.

@zhsgao zhsgao force-pushed the mds_clear_session_failed branch 2 times, most recently from c396967 to aa84a75 on June 6, 2023 13:33
@zhsgao (Contributor Author) commented Jun 6, 2023:

> Your third commit is empty, the code to fix import_reverse is added in qa commit, you'd also need to change the third commit message(explanation about stat.state = IMPORT_ABORTING; in the beginning would be troublesome for handling case IMPORT_ACKING)

I have fixed them, thanks!

@dparmar18 (Contributor):

Minor nit: the actual fix commit is after the testcase, mind shuffling it?

@zhsgao (Contributor Author) commented Jun 6, 2023:

> Minor nit: the actual fix commit is after the testcase, mind shuffling it?

I adjusted the order because of #51766 (review), or do I have a wrong understanding of this?

@zhsgao zhsgao force-pushed the mds_clear_session_failed branch 2 times, most recently from 67c67f7 to 84f15a1 on June 16, 2023 04:43
self.fs.rank_asok(['export', 'dir', path, str(target)], rank=source, status=self.status)

def _wait_laggy(self, rank):
self.wait_until_true(lambda: "laggy_since" in self.fs.get_rank(rank=rank), timeout=self.fs.beacon_timeout)
Member:

Suggested change:
- self.wait_until_true(lambda: "laggy_since" in self.fs.get_rank(rank=rank), timeout=self.fs.beacon_timeout)
+ self.wait_until_true(lambda: self.fs.status().hadfailover(self.status), timeout=self.fs.beacon_timeout)

Member:

If the rank is replaced then it won't be laggy.

Contributor Author:

Yes, it's a better way and I will use it.
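For context, a polling wait like the suggested hadfailover check boils down to a helper of this shape (a minimal generic sketch that assumes nothing about the real CephFS test harness API):

```python
import time

def wait_until_true(predicate, timeout, interval=0.1):
    """Poll predicate() until it returns True or timeout seconds elapse."""
    elapsed = 0.0
    while elapsed < timeout:
        if predicate():
            return
        time.sleep(interval)
        elapsed += interval
    raise TimeoutError(f"condition not met after {timeout}s")

# e.g. waiting for a failover flag to flip (hypothetical stand-in for
# self.fs.status().hadfailover(...)):
state = {"hadfailover": False}

def simulate_failover():
    state["hadfailover"] = True

simulate_failover()
wait_until_true(lambda: state["hadfailover"], timeout=1)
print("failover observed")
```

The reviewer's point is about what to poll for: a replaced rank never shows up as laggy, so polling the failover status is the robust condition.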

client_id = self.mount_a.get_global_id()
self.mount_a.remount()

# timeout if buggy
Member:

Suggested change:
- # timeout if buggy
+ # timeout if buggy; s/evicted/removed/

Member:

the session is not actually evicted, just removed from the importer's session list.

self.mount_a.remount()

# timeout if buggy
self.wait_until_evicted(client_id, importer_rank)
Member:

In my own tests, I get:

2023-06-20 15:41:32,451.451 INFO:__main__:======================================================================                                                                              
2023-06-20 15:41:32,451.451 INFO:__main__:ERROR: test_clear_session (tasks.cephfs.test_exports.TestKillExports)                                                                               
2023-06-20 15:41:32,451.451 INFO:__main__:----------------------------------------------------------------------                                                                              
2023-06-20 15:41:32,451.451 INFO:__main__:Traceback (most recent call last):
2023-06-20 15:41:32,452.452 INFO:__main__:  File "/home/pdonnell/scm/ceph/qa/tasks/cephfs/test_exports.py", line 643, in test_clear_session                                                   
2023-06-20 15:41:32,452.452 INFO:__main__:    self.wait_until_evicted(client_id, importer_rank)                                                                                               
2023-06-20 15:41:32,452.452 INFO:__main__:  File "/home/pdonnell/scm/ceph/qa/tasks/cephfs/cephfs_test_case.py", line 270, in wait_until_evicted                                               
2023-06-20 15:41:32,452.452 INFO:__main__:    self.wait_until_true(is_client_evicted, timeout)
2023-06-20 15:41:32,452.452 INFO:__main__:  File "/home/pdonnell/scm/ceph/qa/tasks/ceph_test_case.py", line 223, in wait_until_true                                                           
2023-06-20 15:41:32,452.452 INFO:__main__:    raise TestTimeoutError("Timed out after {0}s and {1} retries".format(elapsed, retry_count))                                                     
2023-06-20 15:41:32,452.452 INFO:__main__:tasks.ceph_test_case.TestTimeoutError: Timed out after 30s and 0 retries  

Importer MDS shows:

2023-06-20T15:40:34.962-0400 7fbcf4389700  7 mds.1.mig import_reverse [dir 0x10000000000 /test/ [2,head] auth{0=1} v=3 cv=0/0 dir_auth=1,0 state=1610876931|complete|frozentree|importing f(v0 m2023-06-20T15:40:12.691087-0400 1=0+1) n(v0 rc2023-06-20T15:40:12.691087-0400 1=0+1) hs=1+0,ss=0+0 dirty=1 | ptrwaiter=0 child=1 frozen=1 subtree=1 importing=1 replicated=1 dirty=1 0x55a9aa3b1b00]
2023-06-20T15:40:34.962-0400 7fbcf4389700  7 mds.1.cache adjust_subtree_auth 1,0 -> 0,-2 on [dir 0x10000000000 /test/ [2,head] auth{0=1} v=3 cv=0/0 dir_auth=1,0 state=1610874883|complete|frozentree f(v0 m2023-06-20T15:40:12.691087-0400 1=0+1) n(v0 rc2023-06-20T15:40:12.691087-0400 1=0+1) hs=1+0,ss=0+0 dirty=1 | ptrwaiter=0 child=1 frozen=1 subtree=1 importing=0 replicated=1 dirty=1 0x55a9aa3b1b00]
2023-06-20T15:40:34.962-0400 7fbcf4389700 15 mds.1.cache show_subtrees
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache |____ 1    auth [dir 0x101 ~mds1/ [2,head] auth v=1 cv=1/1 dir_auth=1 state=1073741825|complete f(v0 10=0+10) n(v0 rc2023-06-20T15:40:08.896877-0400 10=0+10) hs=10+0,ss=0+0 | child=1 subtree=1 dirty=0 authpin=0 0x55a9a5526d80]
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache |_.__ 0     rep [dir 0x1 / [2,head] rep@0.1 REP dir_auth=0 state=0 f(v0 m2023-06-20T15:40:12.645693-0400 1=0+1) n(v0 rc2023-06-20T15:40:12.691087-0400 2=0+2)/n(v0 rc2023-06-20T15:40:12.645693-0400 1=0+1) hs=1+0,ss=0+0 | child=1 subtree=1 0x55a9aa3b1200]
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache   |__ 1, 0 auth [dir 0x10000000000 /test/ [2,head] auth{0=1} v=3 cv=0/0 dir_auth=1,0 state=1610874883|complete|frozentree f(v0 m2023-06-20T15:40:12.691087-0400 1=0+1) n(v0 rc2023-06-20T15:40:12.691087-0400 1=0+1) hs=1+0,ss=0+0 dirty=1 | ptrwaiter=0 child=1 frozen=1 subtree=1 importing=0 replicated=1 dirty=1 0x55a9aa3b1b00]
2023-06-20T15:40:34.962-0400 7fbcf4389700  7 mds.1.cache  current root is [dir 0x10000000000 /test/ [2,head] auth{0=1} v=3 cv=0/0 dir_auth=1,0 state=1610874883|complete|frozentree f(v0 m2023-06-20T15:40:12.691087-0400 1=0+1) n(v0 rc2023-06-20T15:40:12.691087-0400 1=0+1) hs=1+0,ss=0+0 dirty=1 | ptrwaiter=0 child=1 frozen=1 subtree=1 importing=0 replicated=1 dirty=1 0x55a9aa3b1b00]
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache.dir(0x10000000000) setting dir_auth=0,-2 from 1,0 on [dir 0x10000000000 /test/ [2,head] auth{0=1} v=3 cv=0/0 dir_auth=1,0 state=1610874883|complete|frozentree f(v0 m2023-06-20T15:40:12.691087-0400 1=0+1) n(v0 rc2023-06-20T15:40:12.691087-0400 1=0+1) hs=1+0,ss=0+0 dirty=1 | ptrwaiter=0 child=1 frozen=1 subtree=1 importing=0 replicated=1 dirty=1 0x55a9aa3b1b00]
2023-06-20T15:40:34.962-0400 7fbcf4389700 15 mds.1.cache show_subtrees
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache |____ 1    auth [dir 0x101 ~mds1/ [2,head] auth v=1 cv=1/1 dir_auth=1 state=1073741825|complete f(v0 10=0+10) n(v0 rc2023-06-20T15:40:08.896877-0400 10=0+10) hs=10+0,ss=0+0 | child=1 subtree=1 dirty=0 authpin=0 0x55a9a5526d80]
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache |_.__ 0     rep [dir 0x1 / [2,head] rep@0.1 REP dir_auth=0 state=0 f(v0 m2023-06-20T15:40:12.645693-0400 1=0+1) n(v0 rc2023-06-20T15:40:12.691087-0400 2=0+2)/n(v0 rc2023-06-20T15:40:12.645693-0400 1=0+1) hs=1+0,ss=0+0 | child=1 subtree=1 0x55a9aa3b1200]
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache   |__ 0    auth [dir 0x10000000000 /test/ [2,head] auth{0=1} v=3 cv=0/0 dir_auth=0 state=1610874883|complete|frozentree f(v0 m2023-06-20T15:40:12.691087-0400 1=0+1) n(v0 rc2023-06-20T15:40:12.691087-0400 1=0+1) hs=1+0,ss=0+0 dirty=1 | ptrwaiter=0 child=1 frozen=1 subtree=1 importing=0 replicated=1 dirty=1 0x55a9aa3b1b00]
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache.dir(0x10000000000) mark_clean [dir 0x10000000000 /test/ [2,head] rep@0.1 dir_auth=0 state=537133059|complete|frozentree f(v0 m2023-06-20T15:40:12.691087-0400 1=0+1) n(v0 rc2023-06-20T15:40:12.691087-0400 1=0+1) hs=1+0,ss=0+0 dirty=1 | ptrwaiter=0 child=1 frozen=1 subtree=1 importing=0 replicated=0 dirty=1 0x55a9aa3b1b00] version 3
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache.den(0x10000000000 export) mark_clean [dentry #0x1/test/export [2,head] rep@0.1 (dversion lock) v=2 ino=0x10000000001 state=536870912 | inodepin=1 replicated=0 dirty=1 0x55a9aa386500]
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache.ino(0x10000000001) mark_clean [inode 0x10000000001 [...2,head] /test/export/ rep@0.1 v2 DIRTYPARENT f() n(v0 1=0+1) (iauth excl) (inest lock) (ifile excl) (ixattr excl) (iversion lock) caps={5264=-/-@0},l=5264 | importingcaps=1 dirfrag=1 caps=1 dirtyparent=1 replicated=0 dirty=1 0x55a9aa3ebb80]
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache.ino(0x10000000001) clear_dirty_parent
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache.dir(0x10000000001) mark_clean [dir 0x10000000001 /test/export/ [2,head] rep@0.1 state=536870913|complete f() n() hs=0+0,ss=0+0 | replicated=0 dirty=1 0x55a9aa3b2400] version 1
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache.ino(0x10000000001) remove_client_cap last cap, leaving realm snaprealm(0x1 seq 1 lc 0 cr 0 cps 1 snaps={} last_modified 2023-06-20T15:39:57.664552-0400 change_attr 0 0x55a9aa281440)
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.openfiles remove_dirfrag [dir 0x10000000000 /test/ [2,head] rep@0.1 dir_auth=0 state=262147|complete|frozentree f(v0 m2023-06-20T15:40:12.691087-0400 1=0+1) n(v0 rc2023-06-20T15:40:12.691087-0400 1=0+1) hs=1+0,ss=0+0 | ptrwaiter=0 child=1 frozen=1 subtree=1 importing=0 replicated=0 dirty=0 waiter=1 0x55a9aa3b1b00]
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache create_subtree_map 3 subtrees, 1 fullauth
2023-06-20T15:40:34.962-0400 7fbcf4389700 15 mds.1.cache show_subtrees
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache |____ 1    auth [dir 0x101 ~mds1/ [2,head] auth v=1 cv=1/1 dir_auth=1 state=1073741825|complete f(v0 10=0+10) n(v0 rc2023-06-20T15:40:08.896877-0400 10=0+10) hs=10+0,ss=0+0 | child=1 subtree=1 dirty=0 authpin=0 0x55a9a5526d80]
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache |_.__ 0     rep [dir 0x1 / [2,head] rep@0.1 REP dir_auth=0 state=0 f(v0 m2023-06-20T15:40:12.645693-0400 1=0+1) n(v0 rc2023-06-20T15:40:12.691087-0400 2=0+2)/n(v0 rc2023-06-20T15:40:12.645693-0400 1=0+1) hs=1+0,ss=0+0 | child=1 subtree=1 0x55a9aa3b1200]
2023-06-20T15:40:34.962-0400 7fbcf4389700 10 mds.1.cache   |__ 0     rep [dir 0x10000000000 /test/ [2,head] rep@0.1 dir_auth=0 state=3|complete|frozentree f(v0 m2023-06-20T15:40:12.691087-0400 1=0+1) n(v0 rc2023-06-20T15:40:12.691087-0400 1=0+1) hs=1+0,ss=0+0 | ptrwaiter=0 child=1 frozen=1 subtree=1 importing=0 replicated=0 dirty=0 waiter=1 0x55a9aa3b1b00]
2023-06-20T15:40:34.962-0400 7fbcf4389700 15 mds.1.cache  auth subtree [dir 0x101 ~mds1/ [2,head] auth v=1 cv=1/1 dir_auth=1 state=1073741825|complete f(v0 10=0+10) n(v0 rc2023-06-20T15:40:08.896877-0400 10=0+10) hs=10+0,ss=0+0 | child=1 subtree=1 dirty=0 authpin=0 0x55a9a5526d80]
2023-06-20T15:40:34.962-0400 7fbcf4389700 20 mds.1.journal EMetaBlob::add_dir_context final:
2023-06-20T15:40:34.962-0400 7fbcf4389700 15 mds.1.cache  subtrees {0x101=[]}
2023-06-20T15:40:34.962-0400 7fbcf4389700 15 mds.1.cache  ambiguous_subtrees
2023-06-20T15:40:34.962-0400 7fbcf4389700  7 mds.1.cache trim bytes_used=58kB limit=4GB reservation=0.05% count=1
2023-06-20T15:40:34.962-0400 7fbcf4389700  7 mds.1.cache trim_lru trimming 1 items from LRU size=12 mid=0 pintail=0 pinned=11
2023-06-20T15:40:34.962-0400 7fbcf4389700 12 mds.1.cache trim_dentry [dentry #0x1/test/export [2,head] rep@0.1 (dversion lock) v=2 ino=0x10000000001 state=0 0x55a9aa386500]
2023-06-20T15:40:34.962-0400 7fbcf4389700 12 mds.1.cache  in container [dir 0x10000000000 /test/ [2,head] rep@0.1 dir_auth=0 state=3|complete|frozentree f(v0 m2023-06-20T15:40:12.691087-0400 1=0+1) n(v0 rc2023-06-20T15:40:12.691087-0400 1=0+1) hs=1+0,ss=0+0 | ptrwaiter=0 child=1 frozen=1 subtree=1 importing=0 replicated=0 dirty=0 waiter=1 0x55a9aa3b1b00]
2023-06-20T15:40:34.962-0400 7fbcf4389700 15 mds.1.cache trim_inode [inode 0x10000000001 [...2,head] /test/export/ rep@0.1 v2 f() n(v0 1=0+1) (iauth excl) (inest lock) (ifile excl) (ixattr excl) (iversion lock) 0x55a9aa3ebb80]
2023-06-20T15:40:34.962-0400 7fbcf4389700  7 mds.1.locker rdlock_try on (idft sync) on [inode 0x10000000001 [...2,head] /test/export/ rep@0.1 v2 f() n(v0 1=0+1) (iauth excl) (inest lock) (ifile excl) (ixattr excl) (iversion lock) 0x55a9aa3ebb80]
2023-06-20T15:40:34.962-0400 7fbcf4389700 15 mds.1.cache trim_dirfrag [dir 0x10000000001 /test/export/ [2,head] rep@0.1 state=1|complete f() n() hs=0+0,ss=0+0 0x55a9aa3b2400]
2023-06-20T15:40:34.962-0400 7fbcf4389700 12 mds.1.cache   sending expire to mds.0 on   [dir 0x10000000001 /test/export/ [2,head] rep@0.1 state=1|complete f() n() hs=0+0,ss=0+0 0x55a9aa3b2400]
2023-06-20T15:40:34.962-0400 7fbcf4389700 14 mds.1.cache.ino(0x10000000001) close_dirfrag *
2023-06-20T15:40:34.962-0400 7fbcf4389700 12 mds.1.cache.dir(0x10000000001) remove_null_dentries [dir 0x10000000001 /test/export/ [2,head] rep@0.1 state=1|complete f() n() hs=0+0,ss=0+0 0x55a9aa3b2400]
2023-06-20T15:40:34.962-0400 7fbcf4389700 12 mds.1.cache   sending expire to mds.0 on [inode 0x10000000001 [...2,head] /test/export/ rep@0.1 v2 f() n(v0 1=0+1) (iauth excl) (inest lock) (ifile excl) (ixattr excl) (iversion lock) 0x55a9aa3ebb80]
2023-06-20T15:40:34.962-0400 7fbcf4389700 12 mds.1.cache.dir(0x10000000000) unlink_inode [dentry #0x1/test/export [2,head] rep@0.1 (dversion lock) v=2 ino=0x10000000001 state=0 0x55a9aa386500] [inode 0x10000000001 [...2,head] /test/export/ rep@0.1 v2 f() n(v0 1=0+1) (iauth excl) (inest lock) (ifile excl) (ixattr excl) (iversion lock) 0x55a9aa3ebb80]
2023-06-20T15:40:34.962-0400 7fbcf4389700 14 mds.1.cache remove_inode [inode 0x10000000001 [...2,head] #10000000001/ rep@-2.1 v2 f() n(v0 1=0+1) (iauth excl) (inest lock) (ifile excl) (ixattr excl) (iversion lock) 0x55a9aa3ebb80]
2023-06-20T15:40:34.962-0400 7fbcedb7c700  5 mds.1.log _submit_thread 4199216~41 : EImportFinish 0x10000000000 failed

So your patch worked but the session is not removed. Why did this work for you?

Contributor Author:

I only tested with the kernel client and forgot about the ceph-fuse client; the failure was caused by another bug.
The importer force-opens a session provided by the exporter, but the client does not know about the new session until the exporter notifies it, and that notification cannot be sent if the exporter is interrupted. Since the client does not know about the new session on the importer, it does not send a message to the importer to close the session during unmounting. The client will be evicted by the importer after session_autoclose seconds (300 seconds by default) even if it is not unmounted.
I think the sessions force-opened by the importer should be closed in import_reverse, and I'm trying to fix the bug in this way.
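The fix being described (closing force-opened sessions when reversing the import) can be sketched with a hypothetical model; real session handling lives in the MDS SessionMap/Migrator code:

```python
# Hypothetical sketch: why force-opened sessions must be closed on reverse.

class Importer:
    def __init__(self):
        self.sessions = {}          # client_id -> session state

    def force_open(self, client_id):
        # Opened on the exporter's behalf. The client is never told about
        # this session if the exporter dies before sending the notification,
        # so the client will neither renew nor close it.
        self.sessions[client_id] = "importing"

    def import_reverse(self, close_forced=True):
        if close_forced:
            # the fix: drop sessions the client does not know about
            self.sessions = {c: s for c, s in self.sessions.items()
                             if s != "importing"}
        # buggy behavior (close_forced=False): the session lingers until
        # the importer evicts the client after session_autoclose seconds

imp = Importer()
imp.force_open(5264)                     # client id 5264 is illustrative
imp.import_reverse(close_forced=False)
print(5264 in imp.sessions)   # True: stale session lingers (the bug)

imp.import_reverse(close_forced=True)
print(5264 in imp.sessions)   # False: forced-open session closed (the fix)
```

This mirrors the log above, where the unknown session survives import_reverse and the test then times out waiting for it to disappear.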

@zhsgao zhsgao force-pushed the mds_clear_session_failed branch from 84f15a1 to b50d16b on July 3, 2023 09:41
@github-actions bot:

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Oct 17, 2023
@github-actions bot:

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@github-actions github-actions bot closed this Nov 16, 2023
@batrick (Member) commented May 30, 2024:

@zhsgao sorry this fell by the wayside. Are you willing to reopen?

@zhsgao (Contributor Author) commented May 31, 2024:

> @zhsgao sorry this fell by the wayside. Are you willing to reopen?

Yes, I'm willing to reopen it.
But I don't know how to reopen it, I can't find the reopen button.

@batrick batrick reopened this Jul 22, 2024
@batrick (Member) commented Jul 22, 2024:

> @zhsgao sorry this fell by the wayside. Are you willing to reopen?
>
> Yes, I'm willing to reopen it. But I don't know how to reopen it, I can't find the reopen button.

done!

@github-actions github-actions bot removed the stale label Jul 22, 2024

@github-actions github-actions bot added the stale label Sep 20, 2024
@batrick batrick removed the stale label Sep 20, 2024
@zhsgao zhsgao force-pushed the mds_clear_session_failed branch from b50d16b to 3a02bb0 on September 29, 2024 02:27
@zhsgao (Contributor Author) commented Sep 29, 2024:

rebase

@zhsgao zhsgao force-pushed the mds_clear_session_failed branch from 3a02bb0 to 6ed6174 on September 29, 2024 13:44
@zhsgao zhsgao requested a review from batrick October 8, 2024 13:52
@vshankar vshankar requested a review from a team October 16, 2024 07:56
@vshankar (Contributor):

> rebase

Appreciate you working on this PR, @zhsgao; unfortunately it didn't get attention after a round of review from @batrick and @dparmar18. Will be revisiting this soon.

Contributor @vshankar left a comment:

Otherwise LGTM

def tearDown(self):
super().tearDown()

def _kill_export(self, rank, kill):
Contributor:

nit: suggest naming this as _kill_export_as (_kill_export could be a bit misleading).

Contributor Author:

> nit: suggest naming this as _kill_export_as (_kill_export could be a bit misleading).

I have modified it, thanks!

@vshankar (Contributor):

This PR is under test in https://tracker.ceph.com/issues/68744.

@github-actions bot:

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

Contributor @vshankar left a comment.

@vshankar (Contributor):

@zhsgao This is ready for merging. Can you rebase the change please?

Signed-off-by: Zhansong Gao <zhsgao@hotmail.com>
…t's racy

Signed-off-by: Zhansong Gao <zhsgao@hotmail.com>
…ask is interrupted

Signed-off-by: Zhansong Gao <zhsgao@hotmail.com>
…btree task is interrupted while the state of importer is acking

The related sessions in the importer are in the importing state(`Session::is_importing` return true) when the state of importer is `acking`,
`Migrator::import_reverse` called by `MDCache::handle_resolve` should reverse the process to clear the importing state if the exporter restarts
at this time, but it doesn't do that actually because of its bug. And it will cause these sessions to not be cleared when the client is
unmounted(evicted or timeout) until the mds is restarted.

The bug in `import_reverse` is that it contains the code to handle state `IMPORT_ACKING` but it will never be executed because
the state is modified to `IMPORT_ABORTING` at the beginning. Move `stat.state = IMPORT_ABORTING` to the end of import_reverse
so that it can handle the state `IMPORT_ACKING`.

Fixes: https://tracker.ceph.com/issues/61459
Signed-off-by: Zhansong Gao <zhsgao@hotmail.com>
…t subtree task is interrupted

Signed-off-by: Zhansong Gao <zhsgao@hotmail.com>
The importer will force open some sessions provided by the exporter, but the client does not know about the new sessions until the exporter notifies it, and the notifications cannot be sent if the exporter is interrupted. Since the client does not regularly renew sessions it does not know about, it will be evicted by the importer after `session_autoclose` seconds (300 seconds by default).

The sessions that are forced opened in the importer need to be closed when the import process is reversed.

Signed-off-by: Zhansong Gao <zhsgao@hotmail.com>
@zhsgao zhsgao force-pushed the mds_clear_session_failed branch from d578a1d to 00b0711 on November 13, 2024 09:47
@zhsgao (Contributor Author) commented Nov 13, 2024:

> @zhsgao This is ready for merging. Can you rebase the change please?

OK, it's done.
