squid: qa: Fix test_with_health_warn_with_2_active_MDSs#65798
Closed
joscollin wants to merge 3 commits into ceph:squid
MDS_CACHE_OVERSIZE warning. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit 5a7834b)
that generates MDS_CACHE_OVERSIZE warning. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit d53be13)
The test was intended to validate that the 'mds fail' cmd fails on any active MDS when one of them has a warning. Commit 2217002 (PR 61554) changed this behavior and allows 'mds fail' on an MDS without the warning, so the test should always have failed after that commit. It did not fail until it was tested extensively, because the test mostly generated warnings on both active MDSs. Occasionally, the test generated a warning on a single MDS and failed. So it's a race.

This patch fixes it with the following changes:
a. Change mds_cache_memory_limit from '1K' to '50K', because '1K' was too low and generated the warning on both MDSs.
b. Create a directory, pin it to a single MDS, and open 400 files in the backend to create cache pressure on one MDS.

Also, there are two tests named 'test_with_health_warn_with_2_active_MDSs', in different classes. The test is renamed to 'test_with_health_warn_on_1_mds_with_2_active_MDSs' to avoid confusion and to indicate what the test actually does.

Fixes: https://tracker.ceph.com/issues/71915
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit f990e7d)
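The race described in the commit message can be illustrated with a small standalone model. This is only a sketch: it is not the code in qa/tasks/cephfs/test_admin.py, and the function names and the set-based representation of warnings are hypothetical. It assumes the two rules exactly as the commit message states them.

```python
# Hypothetical model of the 'mds fail' check; none of these names exist in Ceph.

def rejected_before_2217002(target, warned):
    """Old rule: 'mds fail' is rejected if any active MDS has a warning."""
    return len(warned) > 0


def rejected_after_2217002(target, warned):
    """New rule: 'mds fail' is rejected only if the target MDS has a warning."""
    return target in warned


def old_test_passes(active, warned, rejected):
    """The old test expected 'mds fail' to be rejected for every active MDS."""
    return all(rejected(mds, warned) for mds in active)


active = ["a", "b"]

# Warning on both MDSs: the old expectation still holds under the new rule,
# which is why the test usually passed.
both_warned = old_test_passes(active, {"a", "b"}, rejected_after_2217002)

# Warning on only one MDS: 'mds fail' on the other MDS is now allowed,
# so the old expectation breaks.
one_warned = old_test_passes(active, {"a"}, rejected_after_2217002)

print(both_warned, one_warned)
```

Under these assumptions the old expectation holds only when both MDSs carry the warning, which matches the intermittent failures the commit message reports.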
This PR is under test in https://tracker.ceph.com/issues/73450.
rishabh-d-dave requested changes on Oct 14, 2025
  return count, mds_id

- def test_with_health_warn_with_2_active_MDSs(self):
+ def test_with_health_warn_on_1_mds_with_2_active_MDSs(self):
This test failed in QA:
2025-10-10T22:12:59.591 INFO:tasks.cephfs_test_runner:======================================================================
2025-10-10T22:12:59.591 INFO:tasks.cephfs_test_runner:ERROR: test_with_health_warn_on_1_mds_with_2_active_MDSs (tasks.cephfs.test_admin.TestMDSFail)
2025-10-10T22:12:59.591 INFO:tasks.cephfs_test_runner:Test when a CephFS has 2 active MDSs and one of them have either
2025-10-10T22:12:59.591 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2025-10-10T22:12:59.592 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2025-10-10T22:12:59.592 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/github.com_ceph_ceph-c_104ea276496eeef39137386037044178d6f67f79/qa/tasks/cephfs/test_admin.py", line 2645, in test_with_health_warn_on_1_mds_with_2_active_MDSs
2025-10-10T22:12:59.592 INFO:tasks.cephfs_test_runner: self.run_ceph_cmd(f'mds fail {mds1_id}')
2025-10-10T22:12:59.592 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/github.com_ceph_ceph-c_104ea276496eeef39137386037044178d6f67f79/qa/tasks/ceph_test_case.py", line 30, in run_ceph_cmd
2025-10-10T22:12:59.592 INFO:tasks.cephfs_test_runner: return self.mon_manager.run_cluster_cmd(**kwargs)
2025-10-10T22:12:59.592 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/github.com_ceph_ceph-c_104ea276496eeef39137386037044178d6f67f79/qa/tasks/ceph_manager.py", line 1635, in run_cluster_cmd
2025-10-10T22:12:59.592 INFO:tasks.cephfs_test_runner: return self.controller.run(**kwargs)
2025-10-10T22:12:59.592 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/git.ceph.com_teuthology_78c036dc9ad59cb33807dc7f21fda50de2f348d2/teuthology/orchestra/remote.py", line 575, in run
2025-10-10T22:12:59.592 INFO:tasks.cephfs_test_runner: r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
2025-10-10T22:12:59.592 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/git.ceph.com_teuthology_78c036dc9ad59cb33807dc7f21fda50de2f348d2/teuthology/orchestra/run.py", line 461, in run
2025-10-10T22:12:59.592 INFO:tasks.cephfs_test_runner: r.wait()
2025-10-10T22:12:59.592 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/git.ceph.com_teuthology_78c036dc9ad59cb33807dc7f21fda50de2f348d2/teuthology/orchestra/run.py", line 161, in wait
2025-10-10T22:12:59.592 INFO:tasks.cephfs_test_runner: self._raise_for_status()
2025-10-10T22:12:59.593 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/git.ceph.com_teuthology_78c036dc9ad59cb33807dc7f21fda50de2f348d2/teuthology/orchestra/run.py", line 181, in _raise_for_status
2025-10-10T22:12:59.593 INFO:tasks.cephfs_test_runner: raise CommandFailedError(
2025-10-10T22:12:59.593 INFO:tasks.cephfs_test_runner:teuthology.exceptions.CommandFailedError: Command failed on smithi119 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph mds fail c'
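When triaging logs like the one above, it can help to pull the host, exit status, and failed command out of the final line. The following is a minimal sketch, assuming the message format matches the single CommandFailedError line shown here. The pattern and the helper name parse_failed_command are hypothetical and are not part of teuthology.

```python
import re

# Pattern inferred from the one log line above; other teuthology versions
# may format this message differently.
FAILED_CMD = re.compile(
    r"CommandFailedError: Command failed on (?P<host>\S+) "
    r"with status (?P<status>\d+): '(?P<cmd>.*)'"
)


def parse_failed_command(line):
    """Return (host, status, command), or None if the line does not match."""
    m = FAILED_CMD.search(line)
    if m is None:
        return None
    return m.group("host"), int(m.group("status")), m.group("cmd")


line = (
    "2025-10-10T22:12:59.593 INFO:tasks.cephfs_test_runner:"
    "teuthology.exceptions.CommandFailedError: Command failed on smithi119 "
    "with status 1: 'sudo adjust-ulimits ceph-coverage "
    "/home/ubuntu/cephtest/archive/coverage timeout 120 "
    "ceph --cluster ceph mds fail c'"
)

host, status, cmd = parse_failed_command(line)
# The MDS id is whatever follows 'mds fail ' in the command string.
failed_mds = cmd.rsplit("mds fail ", 1)[1]
print(host, status, failed_mds)  # prints: smithi119 1 c
```

For this log the rejected command is 'mds fail c', issued at line 2645 of test_admin.py according to the traceback.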
joscollin commented on Oct 16, 2025
QA failed:
https://tracker.ceph.com/issues/73566
backport tracker: https://tracker.ceph.com/issues/72280
backport of #64297
parent tracker: https://tracker.ceph.com/issues/71915
NOTE:
The PR pulls in the following dependent qa commits:
5a7834b
d53be13
this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh