qa: test fio with multiple versions of NFS#62031
Conversation
|
@kotreshhr @vshankar do we agree on this? |
I will (obviously) be biased since I proposed this (temporary) workaround so that we get nfs-gaesha coverage :) What do others think? |
I think it's a good idea for now to get the coverage. |
qa/tasks/cephfs/test_nfs.py
Outdated
| # TODO: Reference: https://tracker.ceph.com/issues/70203 | ||
| self.ctx.cluster.run( | ||
| args=['sudo', 'mount', '-t', 'nfs', '-o', f'port={port}', | ||
| args=['sudo', 'mount', '-t', 'nfs', '-o', f'port={port},vers=4.1', |
There was a problem hiding this comment.
I fear that once we set this to 4.1, we will never restore 4.2. Can you please reference the actual discussion in https://tracker.ceph.com/issues/69347
A better way to structure this would be to have a set of yamls like:
diff --git a/qa/suites/fs/nfs/nfs-version/v4.1.yaml b/qa/suites/fs/nfs/nfs-version/v4.1.yaml
new file mode 100644
index 00000000000..729a8533d6d
--- /dev/null
+++ b/qa/suites/fs/nfs/nfs-version/v4.1.yaml
@@ -0,0 +1,2 @@
+nfs:
+ vers: 4.1
diff --git a/qa/suites/fs/nfs/nfs-version/v4.2.yaml b/qa/suites/fs/nfs/nfs-version/v4.2.yaml
new file mode 100644
index 00000000000..850afbd6187
--- /dev/null
+++ b/qa/suites/fs/nfs/nfs-version/v4.2.yaml
@@ -0,0 +1,2 @@
+nfs:
+ vers: 4.2
diff --git a/qa/tasks/cephfs/test_nfs.py b/qa/tasks/cephfs/test_nfs.py
index 4f0e258683e..26ec4f7b1af 100644
--- a/qa/tasks/cephfs/test_nfs.py
+++ b/qa/tasks/cephfs/test_nfs.py
@@ -328,10 +328,12 @@ class TestNFS(MgrTestCase):
:param check: It denotes if i/o testing needs to be done
'''
tries = 3
+ config_nfs = self.ctx['config'].get('nfs', {})
+ config_nfs_vers = config_nfs.get('vers', '4.2')
while True:
try:
self.ctx.cluster.run(
- args=['sudo', 'mount', '-t', 'nfs', '-o', f'port={port}',
+ args=['sudo', 'mount', '-t', 'nfs', '-o', f'port={port},vers={config_nfs_vers}',
f'{ip}:{pseudo_path}', '/mnt'])
break
except CommandFailedError as e:
^ untested
Then you can filter out the 4.2 dimension conditionally using a postmerge script.
There was a problem hiding this comment.
I fear that once we set this to 4.1, we will never restore 4.2. Can you please reference the actual discussion in https://tracker.ceph.com/issues/69347
A better way to structure this would be to have a set of yamls like:
diff --git a/qa/suites/fs/nfs/nfs-version/v4.1.yaml b/qa/suites/fs/nfs/nfs-version/v4.1.yaml new file mode 100644 index 00000000000..729a8533d6d --- /dev/null +++ b/qa/suites/fs/nfs/nfs-version/v4.1.yaml @@ -0,0 +1,2 @@ +nfs: + vers: 4.1 diff --git a/qa/suites/fs/nfs/nfs-version/v4.2.yaml b/qa/suites/fs/nfs/nfs-version/v4.2.yaml new file mode 100644 index 00000000000..850afbd6187 --- /dev/null +++ b/qa/suites/fs/nfs/nfs-version/v4.2.yaml @@ -0,0 +1,2 @@ +nfs: + vers: 4.2 diff --git a/qa/tasks/cephfs/test_nfs.py b/qa/tasks/cephfs/test_nfs.py index 4f0e258683e..26ec4f7b1af 100644 --- a/qa/tasks/cephfs/test_nfs.py +++ b/qa/tasks/cephfs/test_nfs.py @@ -328,10 +328,12 @@ class TestNFS(MgrTestCase): :param check: It denotes if i/o testing needs to be done ''' tries = 3 + config_nfs = self.ctx['config'].get('nfs', {}) + config_nfs_vers = config_nfs.get('vers', '4.2') while True: try: self.ctx.cluster.run( - args=['sudo', 'mount', '-t', 'nfs', '-o', f'port={port}', + args=['sudo', 'mount', '-t', 'nfs', '-o', f'port={port},vers={config_nfs_vers}', f'{ip}:{pseudo_path}', '/mnt']) break except CommandFailedError as e:^ untested
Then you can filter out the 4.2 dimension conditionally using a postmerge script.
If we can run tests with both v4.1 and v4.2, then why do we need filter out the 4.2 dimension? I'd let that job fail, so that we never loose track of it.
There was a problem hiding this comment.
the mount option had no version mentioned before, 4.2 is latest with the distro, when in case 4.3 comes we'd another manual job to take care of i.e. to change this to 4.2 and 4.3, i think we can have yamls named old-nfs.yaml and latest-nfs.yaml and have something like:
config_nfs = self.ctx['config'].get('nfs', {})
config_nfs_vers = config_nfs.get('vers')
if config_nfs_vers <= 4.2: # 4.2 being the latest at the time of writing this code
args=['sudo', 'mount', '-t', 'nfs', '-o', f'port={port},vers={config_nfs_vers}',
f'{ip}:{pseudo_path}', '/mnt'])
else:
args=['sudo', 'mount', '-t', 'nfs', '-o', f'port={port}, f'{ip}:{pseudo_path}', '/mnt'])
There was a problem hiding this comment.
this is actually a good thing to have since we already have similar case with ceph client where we test with old client and stuff. a nice code coverage.
There was a problem hiding this comment.
Right. So, we can do this via yaml fragments for 4.1/4.2, so we know early on when things start failing.
There was a problem hiding this comment.
We can but my point is to also have a yaml that doesnt mention the version. let the code make use of the latest version shipped with distro (which is what this func did before I added vers= option)
b27b836 to
d0f7cd3
Compare
85f12f5 to
2cbcf1d
Compare
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
| @@ -0,0 +1,2 @@ | |||
| nfs: | |||
| vers: 4.1 | |||
| @@ -0,0 +1,2 @@ | |||
| nfs: | |||
| vers: latest | |||
There was a problem hiding this comment.
I like the idea of testing with vers= option unset, even if it's the same as 4.2. That may change someday or the nfs linux driver may soon "add" things beyond 4.2 without adding an explicit version bump.
IOW, please add a 4.2.yaml and a unset.yaml.
This is coming from https://tracker.ceph.com/issues/69347, basically fio fails in teuthology with v4.2 but passes with older versions (like v4.1). v4.2 on local/QE machines has worked fine. This needs RCA, looking at libcephfs logs, it doesn't look like it's CephFS. A thorough probe is needed, hence adding YAMLs to test fio with different versions which also helps testing ceph client with different version of NFS to help trace any regression. unset.yaml is basically 4.2 for now but That may change someday or the nfs linux driver may soon "add" things beyond 4.2 without adding an explicit version bump. Fixes: https://tracker.ceph.com/issues/70203 Signed-off-by: Dhairya Parmar <dparmar@redhat.com> Co-authored-by: Patrick Donnelly <pdonnell@ibm.com>
|
@batrick changes incorporated, runs pinged above, look fine |
|
jenkins test make check arm64 |
last 4.1 run was green #62031 (comment), this looks to be intermittent failure. i'll rerun for confirmation. |
|
jenkins restest this please |
I will merge this because it's an improvement that works but you need to file a ticket for the intermittent failure because it will reoccur in upstream QA. |
|
This is coming from https://tracker.ceph.com/issues/69347, basically fio fails in
teuthology with v4.2 but passes with older versions (like v4.1). v4.2 on local/QE
machines has worked fine. This needs RCA, looking at libcephfs logs, it doesn't
look like it's CephFS. A thorough probe is needed, hence adding YAMLs to test
fio with different versions which also helps testing ceph client with different
version of NFS to help trace any regression.
unset.yaml is basically 4.2 for now but That may change someday or the nfs linux
driver may soon "add" things beyond 4.2 without adding an explicit version bump.
Fixes: https://tracker.ceph.com/issues/70203
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windowsjenkins test rook e2e