msg/async: allow connection reaping to be tuned; fix cephfs test #41422

Merged
batrick merged 2 commits into ceph:master from liewegas:fix-50223
Jun 14, 2021

@liewegas

Fixes: https://tracker.ceph.com/issues/50622

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

It is helpful to set ms_async_reap_threshold to 1 for tests.

Signed-off-by: Sage Weil <sage@newdream.net>
@liewegas liewegas requested a review from batrick May 19, 2021 19:54

batrick commented May 20, 2021

Thanks for fixing this @liewegas !

@batrick batrick left a comment

test_tell_conn_close needs to be updated too.

See: /ceph/teuthology-archive/pdonnell-2021-06-03_03:40:33-fs-wip-pdonnell-testing-20210603.020013-distro-basic-smithi/6148218/teuthology.log

Also, I found that reaping does not happen immediately. I had to make the change suggested below to get it right.

diff --git a/qa/tasks/cephfs/test_sessionmap.py b/qa/tasks/cephfs/test_sessionmap.py
index c600811bccc..ad6fd1d609c 100644
--- a/qa/tasks/cephfs/test_sessionmap.py
+++ b/qa/tasks/cephfs/test_sessionmap.py
@@ -41,15 +41,18 @@ class TestSessionMap(CephFSTestCase):
         the conn count goes back to where it started (i.e. we aren't
         leaving connections open)
         """
+        self.config_set('mds', 'ms_async_reap_threshold', '1')
+
         self.mount_a.umount_wait()
         self.mount_b.umount_wait()
 
         status = self.fs.status()
         s = self._get_connection_count(status=status)
         self.fs.rank_tell(["session", "ls"], status=status)
-        e = self._get_connection_count(status=status)
-
-        self.assertEqual(s, e)
+        self.wait_until_true(
+            lambda: self._get_connection_count(status=status) == s,
+            timeout=30
+        )
 
     def test_mount_conn_close(self):
         """
@@ -66,9 +69,10 @@ class TestSessionMap(CephFSTestCase):
         self.mount_a.mount_wait()
         self.assertGreater(self._get_connection_count(status=status), s)
         self.mount_a.umount_wait()
-        e = self._get_connection_count(status=status)
-
-        self.assertEqual(s, e)
+        self.wait_until_true(
+            lambda: self._get_connection_count(status=status) == s,
+            timeout=30
+        )
 
     def test_version_splitting(self):
         """

We have to reap connections promptly for this test to work.

This test was broken indirectly by d51d80b,
which moved the counter decrement to reap time instead of mark_down/stop
time.
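
To illustrate why the count can lag after a close, here is a toy model in Python. It is only a sketch under stated assumptions, not Ceph's messenger code: the class and method names are hypothetical, and it assumes that closed connections are queued and reaped once the queue length reaches a threshold.

```python
class ToyMessenger:
    """Toy model only; names and batching behaviour are assumptions."""

    def __init__(self, reap_threshold):
        self.reap_threshold = reap_threshold
        self.open_count = 0      # what a connection-count query would report
        self.pending_reap = []   # closed connections that are not yet reaped

    def connect(self, name):
        self.open_count += 1
        return name

    def mark_down(self, conn):
        # Closing only queues the connection; the counter is untouched here.
        self.pending_reap.append(conn)
        if len(self.pending_reap) >= self.reap_threshold:
            self.reap()

    def reap(self):
        # The counter is decremented at reap time, not at mark_down time.
        self.open_count -= len(self.pending_reap)
        self.pending_reap.clear()


def count_after_one_close(threshold):
    """Open one connection, close it, and report the visible count."""
    messenger = ToyMessenger(threshold)
    conn = messenger.connect("client")
    messenger.mark_down(conn)
    return messenger.open_count
```

In this model, count_after_one_close(5) returns 1 because the single closed connection is still waiting to be reaped, while count_after_one_close(1) returns 0. The model reaps synchronously for simplicity, which the real messenger does not.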

The reaping is asynchronous, so allow for a delay in the count change.
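
Allowing for the delay amounts to polling instead of asserting once. Below is a minimal standalone sketch of that pattern. The helper name poll_until and its period parameter are hypothetical, and this is not the test framework's own wait_until_true implementation.

```python
import time


def poll_until(condition, timeout, period=0.1):
    """Call `condition` repeatedly until it returns True.

    Raises TimeoutError if `timeout` seconds pass without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(period)


# Simulated readings of a connection count that only returns to its
# baseline on the third poll, standing in for a counter that is
# decremented at reap time.
baseline = 2
readings = iter([3, 3, 2])
poll_until(lambda: next(readings) == baseline, timeout=5, period=0.01)
```

A single assertEqual taken right after the close would see the first reading and fail, whereas the polling version succeeds as soon as the count settles and still fails if it never does.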

Fixes: https://tracker.ceph.com/issues/50622
Signed-off-by: Sage Weil <sage@newdream.net>

liewegas commented Jun 4, 2021

Thanks, updated!

@liewegas liewegas requested a review from batrick June 8, 2021 12:21

batrick commented Jun 14, 2021

@batrick batrick merged commit 6a09565 into ceph:master Jun 14, 2021