
mgr/nfs,qa: enhance logging cephfs path failure and align two test cases #51005

Merged
vshankar merged 5 commits into ceph:main from
dparmar18:fix-exception-logging-49460
May 2, 2023

@dparmar18
Contributor

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)

@dparmar18 dparmar18 force-pushed the fix-exception-logging-49460 branch 4 times, most recently from 8afd75c to cd279c7 Compare April 12, 2023 05:51
@dparmar18 dparmar18 force-pushed the fix-exception-logging-49460 branch 2 times, most recently from 89b13dc to 1b596e5 Compare April 12, 2023 07:52
@dparmar18 dparmar18 marked this pull request as ready for review April 12, 2023 07:52
@dparmar18 dparmar18 requested a review from a team April 12, 2023 07:52
@dparmar18 dparmar18 changed the title mgr/nfs: remove redundancy while logging cephfs path failure mgr/nfs,qa: enhance logging cephfs path failure and fix a test case Apr 12, 2023
@dparmar18 dparmar18 force-pushed the fix-exception-logging-49460 branch 2 times, most recently from 4eecb6d to 748bd6c Compare April 12, 2023 10:11
@dparmar18
Contributor Author

@vshankar all the requested changes have been pushed. PTAL

Contributor

@vshankar vshankar left a comment

Much better. Minor comments.

@dparmar18 dparmar18 force-pushed the fix-exception-logging-49460 branch 3 times, most recently from 0be694b to 8ceb61f Compare April 12, 2023 13:40
@dparmar18
Contributor Author

@vshankar pushed all the requested changes. PTAL

Contributor

@vshankar vshankar left a comment

@dparmar18 test_nfs_export_creation_at_symlink needs a fix: it should check for ENOTDIR rather than ENOENT.

@dparmar18 dparmar18 force-pushed the fix-exception-logging-49460 branch from 8ceb61f to 3cc0996 Compare April 13, 2023 09:44
@dparmar18
Contributor Author

@vshankar PTAL

@dparmar18 dparmar18 changed the title mgr/nfs,qa: enhance logging cephfs path failure and fix a test case mgr/nfs,qa: enhance logging cephfs path failure and align two test cases Apr 13, 2023
@dparmar18
Contributor Author

[  PASSED  ] 1325 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] TestLibRBD.TestPendingAio

 1 FAILED TEST
  YOU HAVE 2 DISABLED TESTS


281/283 Test #145: check-generated.sh ........................   Passed  1918.60 sec
282/283 Test #146: readable.sh ...............................   Passed  1982.61 sec
283/283 Test #244: unittest-transaction-manager ..............   Passed  2251.53 sec

99% tests passed, 1 tests failed out of 283

Total Test time (real) = 2525.37 sec

The following tests FAILED:
	 32 - run-rbd-unit-tests-61.sh (Failed)
Errors while running CTest

https://jenkins.ceph.com/job/ceph-pull-requests/113753/consoleText

@dparmar18
Contributor Author

jenkins test make check

@dparmar18
Contributor Author

@vshankar Good to go?

@dparmar18 dparmar18 requested review from aaSharma14 and avanthakkar and removed request for a team April 26, 2023 10:19
- Renamed to cephfs_path_is_dir

- Removed exception handling to prevent redundant log statements like:
   "No such file or directory error in stat: b'/mnt/testdir_symlink': No such file or directory [Errno 2]"

  Handling the exceptions in the caller eliminates this redundancy

- Set modifier flag AT_SYMLINK_NOFOLLOW

- Removed string "{path} is not a dir" when raising NotADirectoryError
  Rationale: will be handled in export.py

- Changed mock to cephfs_path_is_dir

Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
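
The behaviour described in the commit message above can be illustrated with a minimal local-filesystem sketch. This is not the code from the PR: `path_is_dir` is a hypothetical stand-in for `cephfs_path_is_dir`, and `os.stat(..., follow_symlinks=False)` is assumed here to play the role of a CephFS stat call made with AT_SYMLINK_NOFOLLOW.

```python
import os
import stat


def path_is_dir(path: str) -> None:
    """Raise NotADirectoryError unless `path` itself is a directory.

    Hypothetical local stand-in: a real implementation would query CephFS
    rather than the local filesystem.
    """
    # No try/except here: FileNotFoundError and other OSErrors propagate
    # to the caller, so the failure is logged only once.
    st = os.stat(path, follow_symlinks=False)  # do not follow symlinks
    if not stat.S_ISDIR(st.st_mode):
        # No message attached; the caller decides how to describe the failure.
        raise NotADirectoryError()
```

Because the symlink is not followed, a symlink that points at a directory is still rejected, which is why the symlink test is expected to see ENOTDIR.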
in create_cephfs_export()

Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
The test never actually exercised the invalid path. It still ended
with ENOENT (which is expected when the path is invalid) because the
test did not create a fs, so it failed with "FS nfs-cephfs not found",
which also raises ENOENT. As a result, the test always passed.

Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
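
The false positive described in the commit message above can be reproduced with a small sketch. Everything here is hypothetical: `create_export`, `KNOWN_FILESYSTEMS`, and the path message are illustrative stand-ins, not the mgr/nfs API. Only the "FS nfs-cephfs not found" text comes from the commit message.

```python
import errno

# The test never created a filesystem, so this set is empty.
KNOWN_FILESYSTEMS = set()


def create_export(fs_name: str, path: str):
    """Hypothetical stand-in for export creation; returns (retcode, message)."""
    if fs_name not in KNOWN_FILESYSTEMS:
        return -errno.ENOENT, f"FS {fs_name} not found"
    if path != "/":  # pretend that only "/" exists in the filesystem
        return -errno.ENOENT, f"path {path} does not exist"
    return 0, ""


rc, msg = create_export("nfs-cephfs", "/invalid/path")
# The return code alone matches what the test expected...
assert rc == -errno.ENOENT
# ...but the message shows that the path check was never reached.
assert "FS nfs-cephfs not found" in msg
```

Creating the filesystem first, or also checking the error message, would make such a test fail for the intended reason.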
- test_nfs_export_creation_at_filepath:
ENOTDIR is now raised instead of EINVAL, which better
matches the nature of the failure.

- test_nfs_export_creation_at_symlink:
ENOTDIR is now raised instead of ENOENT, since the code
can now detect that the path is a symlink and will not
follow it.

Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
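
A rough sketch of the return codes these two tests now expect, using the local filesystem as a stand-in for CephFS. `create_export` is a hypothetical caller, not the function in export.py, and the message strings are assumptions made for illustration.

```python
import errno
import os
import stat


def create_export(path: str):
    """Hypothetical caller: map path failures to (retcode, message)."""
    try:
        # Stat the path itself; a symlink is never followed.
        st = os.stat(path, follow_symlinks=False)
        if not stat.S_ISDIR(st.st_mode):
            raise NotADirectoryError()
    except NotADirectoryError:
        # Regular files and symlinks both end up here.
        return -errno.ENOTDIR, f"{path} is not a dir"
    except FileNotFoundError:
        return -errno.ENOENT, f"{path} does not exist"
    return 0, ""
```

With this mapping, a regular file and a symlink both yield -ENOTDIR, while only a path that does not exist at all yields -ENOENT.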
@dparmar18 dparmar18 force-pushed the fix-exception-logging-49460 branch from e8d7b32 to f116776 Compare April 26, 2023 10:20
@vshankar
Contributor

@dparmar18 Let me know when this is ready for review and testing (I've tagged this to include in my next test run).

@dparmar18
Contributor Author

> @dparmar18 Let me know when this is ready for review and testing (I've tagged this to include in my next test run).

@vshankar I had made a mistake in the test case. I have corrected it and ran your build against the qa changes in this PR, and it's all green. I will clean up the last commit and then it's good to go. Here's the run link: http://pulpito.front.sepia.ceph.com/dparmar-2023-04-26_10:21:46-orch:cephadm-wip-vshankar-testing-20230420.132447-distro-default-smithi/

Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
@dparmar18 dparmar18 force-pushed the fix-exception-logging-49460 branch from f116776 to 7a6ab31 Compare April 26, 2023 11:30
@dparmar18
Contributor Author

@vshankar done

@vshankar
Contributor

jenkins make check arm64

@vshankar
Contributor

https://pulpito.ceph.com/vshankar-2023-04-27_06:36:49-orch-wip-vshankar-testing-20230427.052001-testing-default-smithi/

failures are:

Command failed on smithi097 with status 1: "grep '^nvme_loop' /proc/modules || sudo modprobe nvme_loop && sudo mkdir -p /sys/kernel/config/nvmet/hosts/hostnqn && sudo mkdir -p /sys/kernel/config/nvmet/ports/1 && echo loop | sudo tee /sys/kernel/config/nvmet/ports/1/addr_trtype"

1 test that actually ran passed.

@vshankar
Contributor

jenkins test make check arm64

@dparmar18
Contributor Author

jenkins test make check arm64

@vshankar
Contributor

https://pulpito.ceph.com/vshankar-2023-04-27_06:36:49-orch-wip-vshankar-testing-20230427.052001-testing-default-smithi/

failures are:

Command failed on smithi097 with status 1: "grep '^nvme_loop' /proc/modules || sudo modprobe nvme_loop && sudo mkdir -p /sys/kernel/config/nvmet/hosts/hostnqn && sudo mkdir -p /sys/kernel/config/nvmet/ports/1 && echo loop | sudo tee /sys/kernel/config/nvmet/ports/1/addr_trtype"

1 test that actually ran passed.

@adk3798 Do you see similar failures in the orch suite tests, and is there a tracker?

@vshankar
Contributor

@adk3798 - In the mgr upgrade test run https://pulpito.ceph.com/vshankar-2023-04-27_06:36:49-orch-wip-vshankar-testing-20230427.052001-testing-default-smithi/7255603, one of the ceph-mgr daemons fails to upgrade. Traces from the ceph-mgr log:

2023-04-27T08:09:42.332+0000 7f8624a88700  0 [cephadm DEBUG cephadm.serve] out: {
    "ceph_version": "ceph version 18.0.0-3621-gf75b0b14 (f75b0b14711df4ae694b28b412ed7df527622c79) reef (dev)",
    "image_id": "876a724fee704502433acce5d0987b3a2d09e0e59557dd594b0c593af8901065",
    "repo_digests": [
        "quay.ceph.io/ceph-ci/ceph@sha256:c789c3341f1f07edbb6f193c8600e46053c879904b72a0d19fdd0d37490db91b"
    ]
}
2023-04-27T08:09:42.332+0000 7f8624a88700  0 [cephadm DEBUG cephadm.serve] err: Pulling container image quay.ceph.io/ceph-ci/ceph:f75b0b14711df4ae694b28b412ed7df527622c79...
2023-04-27T08:09:42.332+0000 7f8624a88700  0 [cephadm DEBUG cephadm.serve] image quay.ceph.io/ceph-ci/ceph:f75b0b14711df4ae694b28b412ed7df527622c79 -> ContainerInspectInfo(image_id='876a724fee704502433acce5d0987b3a2d09e0e59557dd594b0c593af8901065', ceph_version='ceph version 18.0.0-3621-gf75b0b14 (f75b0b14711df4ae694b28b412ed7df527622c79) reef (dev)', repo_digests=['quay.ceph.io/ceph-ci/ceph@sha256:c789c3341f1f07edbb6f193c8600e46053c879904b72a0d19fdd0d37490db91b'])

Then this is seen:

2023-04-27T08:09:42.338+0000 7f8624a88700  0 [cephadm INFO cephadm.upgrade] Upgrade: Target is version 18.0.0-3621-gf75b0b14 (unknown)
2023-04-27T08:09:42.338+0000 7f86317de700 20 mgr Gil Switched to new thread state 0x561883c5f200
2023-04-27T08:09:42.338+0000 7f86317de700 20 mgr ~Gil Destroying new thread state 0x561883c5f200
2023-04-27T08:09:42.338+0000 7f8624a88700  0 log_channel(cephadm) log [INF] : Upgrade: Target is version 18.0.0-3621-gf75b0b14 (unknown)

followed by:

2023-04-27T08:09:42.699+0000 7f8624a88700  0 [cephadm DEBUG cephadm.serve] err: Non-zero exit code 125 from /bin/podman inspect --format {{.ID}},{{.RepoDigests}} quay.ceph.io/ceph-ci/ceph@sha256:c789c3341f1f07edbb6f193c8600e46053c879904b72a0d19fdd0d37490db91b
/bin/podman: stderr Error: inspecting object: no such object: "quay.ceph.io/ceph-ci/ceph@sha256:c789c3341f1f07edbb6f193c8600e46053c879904b72a0d19fdd0d37490db91b"
Traceback (most recent call last):
  File "/var/lib/ceph/bd681f4c-e4d1-11ed-9b00-001a4aab830c/cephadm.f77d9d71514a634758d4ad41ab6eef36d25386c99d8b365310ad41f9b74d5ce6", line 7924, in <module>
    main()
  File "/var/lib/ceph/bd681f4c-e4d1-11ed-9b00-001a4aab830c/cephadm.f77d9d71514a634758d4ad41ab6eef36d25386c99d8b365310ad41f9b74d5ce6", line 7912, in main
    r = ctx.func(ctx)
  File "/var/lib/ceph/bd681f4c-e4d1-11ed-9b00-001a4aab830c/cephadm.f77d9d71514a634758d4ad41ab6eef36d25386c99d8b365310ad41f9b74d5ce6", line 1695, in _infer_image
    return func(ctx)
  File "/var/lib/ceph/bd681f4c-e4d1-11ed-9b00-001a4aab830c/cephadm.f77d9d71514a634758d4ad41ab6eef36d25386c99d8b365310ad41f9b74d5ce6", line 3237, in command_inspect_image
    ctx.image])
  File "/var/lib/ceph/bd681f4c-e4d1-11ed-9b00-001a4aab830c/cephadm.f77d9d71514a634758d4ad41ab6eef36d25386c99d8b365310ad41f9b74d5ce6", line 1411, in call_throws
    raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: /bin/podman inspect --format {{.ID}},{{.RepoDigests}} quay.ceph.io/ceph-ci/ceph@sha256:c789c3341f1f07edbb6f193c8600e46053c879904b72a0d19fdd0d37490db91b
2023-04-27T08:09:42.699+0000 7f8624a88700  0 [cephadm INFO cephadm.upgrade] Upgrade: Pulling quay.ceph.io/ceph-ci/ceph@sha256:c789c3341f1f07edbb6f193c8600e46053c879904b72a0d19fdd0d37490db91b on smithi153

Is this something you are aware of?

@vshankar
Contributor

vshankar commented May 2, 2023

@dparmar18
Contributor Author
