Skip to content

qa: Honor cephfs config from yaml across test cases#63902

Closed
kotreshhr wants to merge 1 commit intoceph:mainfrom
kotreshhr:qa-fix-cephfs_config
Closed

qa: Honor cephfs config from yaml across test cases#63902
kotreshhr wants to merge 1 commit intoceph:mainfrom
kotreshhr:qa-fix-cephfs_config

Conversation

@kotreshhr
Copy link
Contributor

@kotreshhr kotreshhr commented Jun 12, 2025

The cephfs_test_runner wasn't honoring the cephfs config passed via yaml. For each test case the filesystem is newly created and destroyed as part of setup and teardown. The filesystem creation is taken care by the class 'Filesystem' in which the 'fs_config' isn't initialised from yaml. Hence the filesystem is being created with default configs without honoring the configs passed from yaml. This patch fixes the same.

Fixes: https://tracker.ceph.com/issues/71649

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands

@kotreshhr kotreshhr requested a review from a team as a code owner June 12, 2025 08:58
@github-actions github-actions bot added cephadm cephfs Ceph File System tests labels Jun 12, 2025
@kotreshhr kotreshhr requested a review from a team June 12, 2025 08:58
@kotreshhr
Copy link
Contributor Author

kotreshhr commented Jun 12, 2025

@kotreshhr
Copy link
Contributor Author

jenkins test make check

@kotreshhr
Copy link
Contributor Author

kotreshhr commented Jun 13, 2025

Test to verify - QA run with fix - https://pulpito.ceph.com/khiremat-2025-06-12_09:30:24-fs:functional-main-distro-default-smithi/

Please see below that the fs_config is properly picked up for each test run.

[khiremat@vossi04 8324931]$ pwd
/teuthology/khiremat-2025-06-12_09:30:24-fs:functional-main-distro-default-smithi/8324931
[khiremat@vossi04 8324931]$ egrep "test_runner:Starting test|fs_config:" ./teuthology.log
2025-06-12T09:48:40.878 DEBUG:tasks.cephfs.filesystem:fs_config: {'ec_profile': ['m=2', 'k=2', 'crush-failure-domain=osd', 'disabled']}
2025-06-12T09:49:02.285 INFO:tasks.cephfs_test_runner:Starting test: test_barrier (tasks.cephfs.test_full.TestClusterFull)
2025-06-12T09:49:29.229 DEBUG:tasks.cephfs.filesystem:fs_config: {'ec_profile': ['m=2', 'k=2', 'crush-failure-domain=osd', 'disabled']}
2025-06-12T09:49:47.490 INFO:tasks.cephfs_test_runner:Starting test: test_full_different_file (tasks.cephfs.test_full.TestClusterFull)
2025-06-12T09:49:57.550 DEBUG:tasks.cephfs.filesystem:fs_config: {'ec_profile': ['m=2', 'k=2', 'crush-failure-domain=osd', 'disabled']}
2025-06-12T09:50:33.750 INFO:tasks.cephfs_test_runner:Starting test: test_full_fclose (tasks.cephfs.test_full.TestClusterFull)
2025-06-12T09:50:43.819 DEBUG:tasks.cephfs.filesystem:fs_config: {'ec_profile': ['m=2', 'k=2', 'crush-failure-domain=osd', 'disabled']}
2025-06-12T09:52:16.340 INFO:tasks.cephfs_test_runner:Starting test: test_full_fsync (tasks.cephfs.test_full.TestClusterFull)
2025-06-12T09:52:26.396 DEBUG:tasks.cephfs.filesystem:fs_config: {'ec_profile': ['m=2', 'k=2', 'crush-failure-domain=osd', 'disabled']}
2025-06-12T09:53:19.543 INFO:tasks.cephfs_test_runner:Starting test: test_full_same_file (tasks.cephfs.test_full.TestClusterFull)
2025-06-12T09:53:29.601 DEBUG:tasks.cephfs.filesystem:fs_config: {'ec_profile': ['m=2', 'k=2', 'crush-failure-domain=osd', 'disabled']}
2025-06-12T09:54:00.311 INFO:tasks.cephfs_test_runner:Starting test: test_barrier (tasks.cephfs.test_full.TestQuotaFull)
2025-06-12T09:54:10.397 DEBUG:tasks.cephfs.filesystem:fs_config: {'ec_profile': ['m=2', 'k=2', 'crush-failure-domain=osd', 'disabled']}
2025-06-12T09:54:33.658 INFO:tasks.cephfs_test_runner:Starting test: test_full_different_file (tasks.cephfs.test_full.TestQuotaFull)
2025-06-12T09:54:43.740 DEBUG:tasks.cephfs.filesystem:fs_config: {'ec_profile': ['m=2', 'k=2', 'crush-failure-domain=osd', 'disabled']}
2025-06-12T09:55:23.811 INFO:tasks.cephfs_test_runner:Starting test: test_full_fclose (tasks.cephfs.test_full.TestQuotaFull)
2025-06-12T09:55:33.907 DEBUG:tasks.cephfs.filesystem:fs_config: {'ec_profile': ['m=2', 'k=2', 'crush-failure-domain=osd', 'disabled']}
2025-06-12T09:57:50.181 INFO:tasks.cephfs_test_runner:Starting test: test_full_fsync (tasks.cephfs.test_full.TestQuotaFull)
2025-06-12T09:58:00.257 DEBUG:tasks.cephfs.filesystem:fs_config: {'ec_profile': ['m=2', 'k=2', 'crush-failure-domain=osd', 'disabled']}
2025-06-12T09:59:16.597 INFO:tasks.cephfs_test_runner:Starting test: test_full_same_file (tasks.cephfs.test_full.TestQuotaFull)
2025-06-12T09:59:26.663 DEBUG:tasks.cephfs.filesystem:fs_config: {'ec_profile': ['m=2', 'k=2', 'crush-failure-domain=osd', 'disabled']}

@kotreshhr
Copy link
Contributor Author

QA Run without the fix using main branch:

https://pulpito.ceph.com/khiremat-2025-06-12_10:24:03-fs:functional-main-distro-default-smithi/

Please see below the fs_config is not picked up from yaml. It's all empty.

[khiremat@vossi04 8324958]$ pwd
/teuthology/khiremat-2025-06-12_10:24:03-fs:functional-main-distro-default-smithi/8324958
[khiremat@vossi04 8324958]$ egrep "test_runner:Starting test|fs_config:" ./teuthology.log   
2025-06-12T10:46:20.562 DEBUG:tasks.cephfs.filesystem:fs_config: {'ec_profile': ['m=2', 'k=2', 'crush-failure-domain=osd', 'disabled']}
2025-06-12T10:46:43.773 INFO:tasks.cephfs_test_runner:Starting test: test_barrier (tasks.cephfs.test_full.TestClusterFull)
2025-06-12T10:47:06.849 DEBUG:tasks.cephfs.filesystem:fs_config: {}
2025-06-12T10:47:28.466 INFO:tasks.cephfs_test_runner:Starting test: test_full_different_file (tasks.cephfs.test_full.TestClusterFull)
2025-06-12T10:47:39.560 DEBUG:tasks.cephfs.filesystem:fs_config: {}
2025-06-12T10:48:16.230 INFO:tasks.cephfs_test_runner:Starting test: test_full_fclose (tasks.cephfs.test_full.TestClusterFull)
2025-06-12T10:48:33.334 DEBUG:tasks.cephfs.filesystem:fs_config: {}
2025-06-12T10:49:56.606 INFO:tasks.cephfs_test_runner:Starting test: test_full_fsync (tasks.cephfs.test_full.TestClusterFull)
2025-06-12T10:50:08.687 DEBUG:tasks.cephfs.filesystem:fs_config: {}
2025-06-12T10:51:08.925 INFO:tasks.cephfs_test_runner:Starting test: test_full_same_file (tasks.cephfs.test_full.TestClusterFull)
2025-06-12T10:51:20.016 DEBUG:tasks.cephfs.filesystem:fs_config: {}
2025-06-12T10:52:02.438 INFO:tasks.cephfs_test_runner:Starting test: test_barrier (tasks.cephfs.test_full.TestQuotaFull)
2025-06-12T10:52:19.537 DEBUG:tasks.cephfs.filesystem:fs_config: {}
2025-06-12T10:52:41.150 INFO:tasks.cephfs_test_runner:Starting test: test_full_different_file (tasks.cephfs.test_full.TestQuotaFull)
2025-06-12T10:52:52.231 DEBUG:tasks.cephfs.filesystem:fs_config: {}
2025-06-12T10:53:37.801 INFO:tasks.cephfs_test_runner:Starting test: test_full_fclose (tasks.cephfs.test_full.TestQuotaFull)
2025-06-12T10:53:54.920 DEBUG:tasks.cephfs.filesystem:fs_config: {}
2025-06-12T10:56:20.524 INFO:tasks.cephfs_test_runner:Starting test: test_full_fsync (tasks.cephfs.test_full.TestQuotaFull)
2025-06-12T10:56:31.617 DEBUG:tasks.cephfs.filesystem:fs_config: {}
2025-06-12T10:57:53.846 INFO:tasks.cephfs_test_runner:Starting test: test_full_same_file (tasks.cephfs.test_full.TestQuotaFull)
2025-06-12T10:58:10.968 DEBUG:tasks.cephfs.filesystem:fs_config: {}

@kotreshhr
Copy link
Contributor Author

jenkins test make check

@kotreshhr kotreshhr force-pushed the qa-fix-cephfs_config branch from f4b97ee to c2d12a6 Compare June 14, 2025 05:57
@kotreshhr
Copy link
Contributor Author

jenkins test dashboard cephadm

@kotreshhr kotreshhr requested a review from vshankar June 16, 2025 07:29
@kotreshhr kotreshhr force-pushed the qa-fix-cephfs_config branch 2 times, most recently from 20bec86 to d70dd07 Compare June 20, 2025 04:06
@kotreshhr
Copy link
Contributor Author

jenkins test make check

@kotreshhr
Copy link
Contributor Author

jenkins test dashboard cephadm

@kotreshhr
Copy link
Contributor Author

jenkins test make check

@kotreshhr kotreshhr requested a review from a team July 9, 2025 16:59
Copy link
Contributor

@mchangir mchangir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@vshankar vshankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to do the right thing. Tagging @batrick for additional eye on the changes.

Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the stated example:

------
ceph:
  cephfs:
    allow_referent_inodes: true
    max_mds: 2
    fs:
      - name: a 
        max_mds: 3
      - name: b 
------

many tests are not tolerant of a multiple rank fs. Nor would they be if there are multiple fs.

Otherwise, the idea of this fix is fine with me. Suggest removing those example configs from teh commit message, PR, and issue.

The cephfs_test_runner wasn't honoring the cephfs
config passed via yaml. For each test case the
filesystem is newly created and destroyed as part
of setup and teardown. The filesystem creation is
taken care by the class 'Filesystem' in which the
'fs_config' isn't initialised from yaml. Hence the
filesystem is being created with default configs
without honoring the configs passed from yaml.
This patch fixes the same.

Fixes: https://tracker.ceph.com/issues/71649
Signed-off-by: Kotresh HR <khiremat@redhat.com>
@kotreshhr kotreshhr force-pushed the qa-fix-cephfs_config branch from d70dd07 to 38c0e44 Compare August 5, 2025 06:28
@kotreshhr
Copy link
Contributor Author

kotreshhr commented Aug 5, 2025

For the stated example:

------
ceph:
  cephfs:
    allow_referent_inodes: true
    max_mds: 2
    fs:
      - name: a 
        max_mds: 3
      - name: b 
------

many tests are not tolerant of a multiple rank fs. Nor would they be if there are multiple fs.

Yeah. I agree. But the fix isn't only for multiple rank fs. Even with single fs, the configs passed from yaml weren't picked up by the testcases run by cephfs_test_runner

Otherwise, the idea of this fix is fine with me. Suggest removing those example configs from teh commit message, PR, and issue.

Done. @batrick ptal.

@vshankar
Copy link
Contributor

vshankar commented Aug 5, 2025

This PR is under test in https://tracker.ceph.com/issues/72416.

@kotreshhr
Copy link
Contributor Author

jenkins test make check arm64

@kotreshhr
Copy link
Contributor Author

jenkins test dashboard cephadm

1 similar comment
@kotreshhr
Copy link
Contributor Author

jenkins test dashboard cephadm

@vshankar
Copy link
Contributor

I'm seeing failures is fs suite which haven't been seen before and could be a side-effect of this change. See: https://pulpito.ceph.com/vshankar-2025-08-08_06:01:33-fs-wip-vshankar-testing-20250807.165858-debug-testing-default-smithi/

@kotreshhr I would require your assistance to verify the above.

@vshankar
Copy link
Contributor

I'm seeing failures is fs suite which haven't been seen before and could be a side-effect of this change. See: https://pulpito.ceph.com/vshankar-2025-08-08_06:01:33-fs-wip-vshankar-testing-20250807.165858-debug-testing-default-smithi/

@kotreshhr I would require your assistance to verify the above.

This is a fs suite run without this change: https://pulpito.ceph.com/vshankar-2025-08-12_11:31:54-fs-wip-vshankar-testing-20250812.045652-debug-testing-default-smithi/

The failures seen in the original run are not seen in this run, so, somehow this change is causing spurious failures. @kotreshhr please investigate.

@kotreshhr
Copy link
Contributor Author

I'm seeing failures is fs suite which haven't been seen before and could be a side-effect of this change. See: https://pulpito.ceph.com/vshankar-2025-08-08_06:01:33-fs-wip-vshankar-testing-20250807.165858-debug-testing-default-smithi/
@kotreshhr I would require your assistance to verify the above.

This is a fs suite run without this change: https://pulpito.ceph.com/vshankar-2025-08-12_11:31:54-fs-wip-vshankar-testing-20250812.045652-debug-testing-default-smithi/

The failures seen in the original run are not seen in this run, so, somehow this change is causing spurious failures. @kotreshhr please investigate.

Oh, this needs sometime to go through. I will update after going through the results

@batrick batrick removed their assignment Aug 13, 2025
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Oct 12, 2025
@github-actions
Copy link

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@github-actions github-actions bot closed this Nov 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants