qa/tests: skip rgw site tests in api #63126

Merged
epuertat merged 1 commit into ceph:main from rhcs-dashboard:rgw-site-test-skip
May 8, 2025

@nizamial09
Member

Failure in https://jenkins.ceph.com/job/ceph-api/95185/

dashboard.rest_client.RequestException: RGW REST API GET timed out after 45 seconds (url=http://172.21.5.32:8000/admin/realm?list).

@nizamial09 nizamial09 requested a review from a team as a code owner May 6, 2025 10:27
@nizamial09 nizamial09 requested review from Achintk1491 and Pegonzal and removed request for a team May 6, 2025 10:27
@nizamial09 nizamial09 force-pushed the rgw-site-test-skip branch 3 times, most recently from 7e23c4d to d61e3ee Compare May 6, 2025 13:23
return self._get('/api/rgw/user/{}?stats={}'.format(uid, stats))


@unittest.skipUnless(os.environ.get('ghprbPullTitle', '').startswith('mgr/dashboard:'))
Member

If we need to use this in more places (and we surely will), maybe we can create a couple of aliases:

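# helper.py
import os
import unittest

# skipUnless requires a reason string as its second argument.
skipUnlessDashboardPR = unittest.skipUnless(
    os.environ.get('ghprbPullTitle', '').startswith('mgr/dashboard:'),
    'PR title does not start with mgr/dashboard:')

skipUnlessPRDescription = lambda keyword="run_all_api_tests": unittest.skipUnless(
    keyword in os.environ.get('ghprbPullLongDescription', ''),
    'PR description does not contain ' + keyword)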

Member Author

For now I've added the alias for skipping unless it's a dashboard PR. I didn't add the other one.


@unittest.skipUnless(
    os.environ.get('ghprbPullTitle', '').startswith('mgr/dashboard:'),
    'Skipping because PR title does not start with mgr/dashboard')
Member

Great addition!
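
As a rough illustration of the alias discussed in this thread, the sketch below builds the skip decorator from an environment mapping and applies it to a throwaway test class. The names `skip_unless_dashboard_pr`, `run_with_title`, and `RgwSiteExample` are hypothetical and not taken from the PR; only `ghprbPullTitle` and the reason string come from the conversation. Passing the mapping explicitly is an assumption made here so that the behaviour can be checked without touching the real Jenkins environment.

```python
import os
import unittest

REASON = 'Skipping because PR title does not start with mgr/dashboard'


def skip_unless_dashboard_pr(environ=os.environ):
    """Build the skip decorator from a mapping (os.environ by default)."""
    title = environ.get('ghprbPullTitle', '')
    return unittest.skipUnless(title.startswith('mgr/dashboard:'), REASON)


def run_with_title(title):
    """Decorate a throwaway test class and return (tests run, tests skipped)."""
    @skip_unless_dashboard_pr({'ghprbPullTitle': title})
    class RgwSiteExample(unittest.TestCase):  # hypothetical test class
        def test_placeholder(self):
            self.assertTrue(True)

    result = unittest.TestResult()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RgwSiteExample)
    suite.run(result)
    return result.testsRun, len(result.skipped)
```

With a title such as this PR's own (`qa/tests: ...`) the test is reported as skipped, while a title starting with `mgr/dashboard:` lets it run.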

@epuertat
Member

epuertat commented May 6, 2025

@nizamial09 for this specific one, I'm not sure we should disable it. It looks like the timeout is set to 45 seconds and this request is taking 47 seconds to complete...

2025-05-05T17:25:30.719+0000 7f978e797640  1 ====== req done req=0x7f9766726248 op=list_realms bucket= status=0 http_status=200 latency=47.165985107s request_id=tx000000a9baaf4b97ad4b4-006818f45b-4556-default ======
2025-05-05T17:25:30.719+0000 7f978e797640  1 beast: 0x7f9766726248: 172.21.5.32 - admin [05/May/2025:17:24:43.553 +0000] "GET /admin/realm?list HTTP/1.1" 200 0 - "python-requests/2.25.1" - latency=47.165985107s

Let's wait for @cbodley's feedback on the potential impact of the recently merged #62398, and also for guidance on a reasonable timeout for RGW realm listing (the current one is 45 seconds, and the failures report 46-47 seconds).
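
To make the comparison above concrete, here is a small sketch that pulls the `latency=` value out of an RGW log line like the ones quoted and checks it against the 45-second timeout. The function names and the regular expression are assumptions for illustration; they are not part of the dashboard or RGW code.

```python
import re

# Matches the trailing "latency=<seconds>s" field seen in the quoted log lines.
LATENCY_RE = re.compile(r'latency=([0-9.]+)s')


def parse_latency(log_line):
    """Return the request latency in seconds, or None if the field is absent."""
    match = LATENCY_RE.search(log_line)
    return float(match.group(1)) if match else None


def exceeds_timeout(log_line, timeout_seconds=45.0):
    """Return True if the logged latency is above the client timeout."""
    latency = parse_latency(log_line)
    return latency is not None and latency > timeout_seconds


line = ('beast: 0x7f9766726248: 172.21.5.32 - admin '
        '[05/May/2025:17:24:43.553 +0000] "GET /admin/realm?list HTTP/1.1" '
        '200 0 - "python-requests/2.25.1" - latency=47.165985107s')
print(parse_latency(line), exceeds_timeout(line))
```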

@nizamial09
Member Author

@epuertat that sounds good to me. I'll wait for their confirmation. I've seen some tests go beyond 50s in the logs as well, and in other cases the latency didn't get logged because the tests were stopped and the clients were destroyed before the response was fetched. If I can find those logs, I can share them later.

@epuertat
Member

epuertat commented May 7, 2025

@nizamial09 we might need to merge this PR while the RGW team troubleshoots the issue, as this failure is affecting other PRs, and re-enable the tests after that.

@nizamial09
Member Author

@nizamial09 we might need to merge this PR while the RGW team troubleshoots the issue, as this failure is affecting other PRs, and re-enable the tests after that.

So should I just go with the current one then, instead of increasing the timeout?

@nizamial09
Member Author

btw, make check has started failing :/ There's an ongoing discussion in a Slack thread.

@cbodley
Contributor

cbodley commented May 7, 2025

opened https://tracker.ceph.com/issues/71239 to track the regression

Failure in https://jenkins.ceph.com/job/ceph-api/95185/

```
dashboard.rest_client.RequestException: RGW REST API GET timed out after 45 seconds (url=http://172.21.5.32:8000/admin/realm?list).
```

Signed-off-by: Nizamudeen A <nia@redhat.com>
@nizamial09 nizamial09 force-pushed the rgw-site-test-skip branch from d61e3ee to 1479e11 Compare May 7, 2025 14:40
@nizamial09 nizamial09 requested a review from epuertat May 7, 2025 14:41
@cbodley
Contributor

cbodley commented May 7, 2025

https://jenkins.ceph.com/job/ceph-pull-requests-arm64/73706/

The following tests FAILED:
10 - run-tox-mgr-dashboard-lint (Failed)

@cbodley
Contributor

cbodley commented May 7, 2025

jenkins test make check arm64

@cbodley
Contributor

cbodley commented May 7, 2025

https://jenkins.ceph.com/job/ceph-pull-requests-arm64/73729/

The following tests FAILED:
10 - run-tox-mgr-dashboard-lint (Failed)
34 - run-rbd-unit-tests-127.sh (Failed)

@cbodley
Contributor

cbodley commented May 7, 2025

jenkins test make check arm64

@github-project-automation github-project-automation bot moved this from New to Reviewer approved in Ceph-Dashboard May 8, 2025
@nizamial09
Member Author

jenkins test make check arm64

@cbodley
Contributor

cbodley commented May 8, 2025

out of curiosity, have we seen this timeout failure in teuthology?

@epuertat epuertat merged commit a47177e into ceph:main May 8, 2025
14 checks passed
@github-project-automation github-project-automation bot moved this from Reviewer approved to Done in Ceph-Dashboard May 8, 2025
@epuertat epuertat deleted the rgw-site-test-skip branch May 8, 2025 20:48
@cbodley
Contributor

cbodley commented May 8, 2025

i shared some updates in the #sepia slack discussion, but i doubt there's a fix for this coming soon

is it possible to enable retries on the http client you're using in these tests? i would expect that to help with their stability under vstart
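
One way the retry idea could look, as a minimal standard-library sketch: wrap the request in a helper that re-issues it a bounded number of times when it raises. `call_with_retries` and its parameters are hypothetical; the thread does not show the dashboard's REST client, so nothing here reflects its actual API, and a real change would more likely configure retries on the HTTP library itself.

```python
import time


def call_with_retries(func, attempts=3, delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func() up to `attempts` times, sleeping `delay` seconds between tries.

    Only exceptions listed in `retry_on` trigger a retry; the last one is
    re-raised if every attempt fails.
    """
    if attempts < 1:
        raise ValueError('attempts must be at least 1')
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on as error:
            last_error = error
            if attempt < attempts:
                sleep(delay)
    raise last_error
```

A test could then wrap the slow call, for example `call_with_retries(lambda: client.get_realms(), retry_on=(TimeoutError,))`, where `client.get_realms` stands in for whatever request is timing out. Retrying is only safe for idempotent requests such as the realm listing `GET` discussed here.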
