Skip to content

mgr/dashboard: pin xmlsec to 1.3.13#56974

Merged
nizamial09 merged 2 commits intoceph:mainfrom
rhcs-dashboard:fix-xmlsec-issue
Apr 18, 2024
Merged

mgr/dashboard: pin xmlsec to 1.3.13#56974
nizamial09 merged 2 commits intoceph:mainfrom
rhcs-dashboard:fix-xmlsec-issue

Conversation

@nizamial09
Copy link
Member

@nizamial09 nizamial09 commented Apr 18, 2024

Also has: qa/vstart_runner: increase timeout for vstart.sh command which fixes Fixes: https://tracker.ceph.com/issues/65565

xmlsec is an inner dependency used by python3-saml. A newer release of it broke the import.
xmlsec/python-xmlsec#314

Fixes the run-tox-mgr-dashboard-py3 failure in make check

__init__.py:60: in <module>
    from .module import Module, StandbyModule  # noqa: F401
module.py:36: in <module>
    from .services.sso import SSO_COMMANDS, handle_sso_command
services/sso.py:20: in <module>
    from onelogin.saml2.settings import OneLogin_Saml2_Settings as Saml2Settings
.tox/py3/lib/python3.10/site-packages/onelogin/saml2/settings.py:18: in <module>
    from onelogin.saml2.metadata import OneLogin_Saml2_Metadata
.tox/py3/lib/python3.10/site-packages/onelogin/saml2/metadata.py:15: in <module>
    from onelogin.saml2.utils import OneLogin_Saml2_Utils
.tox/py3/lib/python3.10/site-packages/onelogin/saml2/utils.py:23: in <module>
    import xmlsec
E   xmlsec.Error: (100, 'lxml & xmlsec libxml2 library version mismatch')

Fixes: https://tracker.ceph.com/issues/65571

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@nizamial09 nizamial09 requested a review from a team as a code owner April 18, 2024 05:49
@nizamial09 nizamial09 requested review from Pegonzal and ivoalmeida and removed request for a team April 18, 2024 05:49
@nizamial09
Copy link
Member Author

NOTE: I am intending this as a temporary fix. When a proper fix from xmlsec has been released, I might revert this change since I don't want to pin an inner dependency in our project.

Copy link
Contributor

@tchaikov tchaikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.

@nizamial09
Copy link
Member Author

jenkins test dashboard

@nizamial09
Copy link
Member Author

jenkins test dashboard cephadm

@nizamial09
Copy link
Member Author

jenkins test api

@nizamial09
Copy link
Member Author

jenkins test windows

@rishabh-d-dave
Copy link
Contributor

rishabh-d-dave commented Apr 18, 2024

@nizamial09 Is the fix on this PR related to the failure here - https://jenkins.ceph.com/job/ceph-pull-requests/133510/consoleFull#-1417082282e2f9bd1b-f570-4eac-a6c0-0080730e3135? I see this issue on other PRs too.

@rishabh-d-dave
Copy link
Contributor

rishabh-d-dave commented Apr 18, 2024

@nizamial09 The Ceph API test failure (https://jenkins.ceph.com/job/ceph-api/72608/consoleFull#1610380190c212b007-e891-4176-9ee7-2f60eca393b7) can be fixed with following patch -

diff --git a/qa/tasks/vstart_runner.py b/qa/tasks/vstart_runner.py
index 3ef987f4d7c..c2098f9abaa 100644
--- a/qa/tasks/vstart_runner.py
+++ b/qa/tasks/vstart_runner.py
@@ -1399,7 +1399,7 @@ def exec_test():
         log.info('\nrunning vstart.sh now...')
         # usually, i get vstart.sh running completely in less than 100
         # seconds.
-        remote.run(args=args, env=vstart_env, timeout=(5 * 60))
+        remote.run(args=args, env=vstart_env)
         log.info('\nvstart.sh finished running')
 
         # Wait for OSD to come up so that subsequent injectargs etc will

For CephFS, 180 seconds is enough. But I've set it to 300 seconds on latest main. Since it's not enough for Ceph API tests sometimes (some Ceph API tests are passing and some are failing), I recommend, instead of removing timeout, setting it to a more reasonable number.

EDIT: PR for above patch - #56977, feel free to cherry-pick the commit here.

EDIT 2: Updated above #56977.

@nizamial09
Copy link
Member Author

@nizamial09 Is the fix on this PR related to the failure here - https://jenkins.ceph.com/job/ceph-pull-requests/133510/consoleFull#-1417082282e2f9bd1b-f570-4eac-a6c0-0080730e3135? I see this issue on other PRs too.

yup, this address that failure.

 40/293 Test   #7: run-tox-mgr-dashboard-py3 .................   Passed  298.19 sec

its passing in the PR but make check is failing on unrelated issue.

etting it to a more reasonable number.

EDIT: PR for above patch - #56977, feel free to cherry-pick the commit here.

@rishabh-d-dave Thank you, I'll cherry-pick that here so both gets fixed in one go

@cbodley
Copy link
Contributor

cbodley commented Apr 18, 2024

thanks @nizamial09, i'm seeing these failures on squid PRs too. can you please make sure there's a tracker issue for backport(s)?

nizamial09 and others added 2 commits April 18, 2024 18:54
xmlsec is an inner dependency used by python3-saml. A newer release of
it broke the import.
xmlsec/python-xmlsec#314

Fixes: https://tracker.ceph.com/issues/65571
Signed-off-by: Nizamudeen A <nia@redhat.com>
Since the timeout bug was fixed (https://tracker.ceph.com/issues/65533)
"Ceph API tests" sometimes fails because vstart.sh command had to be
aborted due to timeout.

Currently, "timeout" is set to 300 seconds which sometimes is not enough
for vstart.sh to run successfully for "Ceph API tests" CI job. 180
seconds usually suffices for vstart.sh to run successfully when used for
CephFS.

Increase value of "timeout" to avoid such failures on "Ceph API tests" CI.

Fixes: https://tracker.ceph.com/issues/65565
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit f779b42)
@nizamial09
Copy link
Member Author

thanks @nizamial09, i'm seeing these failures on squid PRs too. can you please make sure there's a tracker issue for backport(s)?

opened the tracker and attached to the commit

@nizamial09
Copy link
Member Author

unrelated failure but its the second time i am seeing this

293/293 Test #261: unittest-seastore .........................***Timeout 7200.34 sec
WARNING: debug mode. Not for benchmarking or production

@nizamial09
Copy link
Member Author

jenkins test make check

@nizamial09
Copy link
Member Author

dashboard cephadm e2e can be ignored. Its unrelated cypress failure.

@rishabh-d-dave
Copy link
Contributor

@nizamial09 Let's merge this PR?

@nizamial09 nizamial09 merged commit c46a99d into ceph:main Apr 18, 2024
@nizamial09 nizamial09 deleted the fix-xmlsec-issue branch April 18, 2024 18:21
@nizamial09
Copy link
Member Author

@rishabh-d-dave does your fix need squid backport?

@rishabh-d-dave
Copy link
Contributor

@rishabh-d-dave does your fix need squid backport?

@cbodley mentioned that we need Squid backport here but the patch that brought this issue to surface wasn't backported to Squid - https://github.com/ceph/ceph/blob/squid/qa/tasks/vstart_runner.py#L1396. Without that patch, timeout parameter has no effect. So Squid backports hitting same issue is weird.

Is vstart_runner.py from main being used for Squid backport's CI jobs? In this case we don't need backporting.

@rishabh-d-dave
Copy link
Contributor

@rishabh-d-dave does your fix need squid backport?

@cbodley mentioned that we need Squid backport here but the patch that brought this issue to surface wasn't backported to Squid - https://github.com/ceph/ceph/blob/squid/qa/tasks/vstart_runner.py#L1396. Without that patch, timeout parameter has no effect. So Squid backports hitting same issue is weird.

Is vstart_runner.py from main being used for Squid backport's CI jobs? In this case we don't need backporting.

@nizamial09 ping

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

4 participants