monitoring/ceph-mixin: Cleanup of variables, queries and tests (to fix showMultiCluster=True) by frittentheke · Pull Request #55495 · ceph/ceph

frittentheke · 2024-02-08T13:53:38Z

Rendering the dashboards with showMultiCluster=True allows for them to work with multiple clusters storing their metrics in a single Prometheus instance. This works via the (configurable) cluster label and that functionality already existed.

This commit simply fixes some inconsistencies in applying the label filters which I found after rendering and using the dashboards with a Prometheus instance holding metrics for multiple Ceph clusters.

There are also issues with the tests. I started working on them as well, but would like some feedback on how to best test with either showMultiCluster set to True or False. This would then ensure the support for multi cluster doesn't break on future changes and additions to the dashboards.

Fixes: https://tracker.ceph.com/issues/64321
Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>

frittentheke · 2024-02-09T15:11:51Z

I admit the PR got a little bigger than just "fixing" the queries, but I believe I somewhat stayed in context.
See commit msg for some of my reasoning.

Before any more cleanup, I suggest first converting to https://github.com/grafana/grafonnet to ensure compatibility with more recent Grafana releases.

frittentheke · 2024-02-09T16:28:30Z

@Javlopez @nizamial09 PTAL.

cloudbehl · 2024-02-12T05:44:16Z

@frittentheke Thanks for the PR and for fixing the queries.

Just wanted to understand: rather than having a flag to enable the cluster variable, why don't we have the dashboards enable the cluster variable by default? Then users don't need to build them to support multi-cluster. Thoughts?

cloudbehl · 2024-02-12T05:44:46Z

Also, can you attach a small recording that shows the fixes you have made as part of the PR?

github-actions · 2024-02-13T07:07:38Z

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

frittentheke · 2024-02-14T14:26:39Z

> @frittentheke Thanks for the PR and for fixing the queries.
>
> Just wanted to understand: rather than having a flag to enable the cluster variable, why don't we have the dashboards enable the cluster variable by default? Then users don't need to build them to support multi-cluster. Thoughts?

No objections on my part. We ourselves do use a single Prometheus instance (could also be Grafana Mimir / Thanos / Cortex) holding metrics for multiple Ceph clusters and therefore make use of the templating via the cluster label.

frittentheke · 2024-02-14T14:43:16Z

> Also, can you attach a small recording that shows the fixes you have made as part of the PR?

@cloudbehl you mean like a screen recording of me clicking through the various Grafana dashboards?

cloudbehl · 2024-02-15T05:55:54Z

> > @frittentheke Thanks for the PR and for fixing the queries.
> > Just wanted to understand: rather than having a flag to enable the cluster variable, why don't we have the dashboards enable the cluster variable by default? Then users don't need to build them to support multi-cluster. Thoughts?
>
> No objections on my part. We ourselves do use a single Prometheus instance (could also be Grafana Mimir / Thanos / Cortex) holding metrics for multiple Ceph clusters and therefore make use of the templating via the cluster label.

That's my understanding as well: if a user has multiple Ceph clusters pointing to a single Prometheus, this would work out of the box for them, and they wouldn't need to build the Grafana dashboards just for that. Even with a single Ceph cluster's data in Prometheus, showing a cluster ID on top would do no harm.

So let's do it and have this flag enabled by default for all dashboards. I can help test this.

Also, new dashboards were added to Grafana recently; can you help make sure those also have the cluster variable and that we are using the same queries to fill the values of those variables?

cloudbehl · 2024-02-15T05:56:45Z

I have never tried it, so I just want to see how it looks on one or two dashboards, if you can show that. That would be great.

frittentheke · 2024-02-16T09:05:41Z

> So let's do it and have this flag enabled by default for all dashboards. I can help test this.

@cloudbehl It might not be as easy. Even a single cluster would need to have the cluster label on its metrics then.
With showMultiCluster enabled, all/most queries will filter on this label to distinguish between clusters.
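
To make the label filtering described above concrete, here is a minimal sketch in Python of how such a matcher could be spliced into a query selector. It is not the mixin's actual jsonnet code: the helper names `cluster_matcher` and `selector` are hypothetical, the metric names are only illustrative, and `$cluster` is assumed to be the Grafana template variable that holds the selected cluster.

```python
def cluster_matcher(show_multi_cluster: bool, cluster_label: str = "cluster") -> str:
    """Return the label matcher to splice into a PromQL selector.

    Hypothetical helper: yields an empty string when multi-cluster
    support is off, so single-cluster queries stay unfiltered.
    """
    if not show_multi_cluster:
        return ""
    # $cluster is assumed to be a Grafana template variable, resolved at query time.
    return cluster_label + '=~"$cluster"'


def selector(metric: str, *matchers: str) -> str:
    """Join the non-empty matchers into `metric{a, b}`; omit braces if none remain."""
    active = [m for m in matchers if m]
    if not active:
        return metric
    return metric + "{" + ", ".join(active) + "}"


if __name__ == "__main__":
    print(selector("ceph_health_status", cluster_matcher(True)))
    print(selector("ceph_health_status", cluster_matcher(False)))
```

With the flag on, every selector carries the cluster matcher, which is why the metrics themselves would need to carry that label.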

And most likely folks will not have Prometheus add this cluster label in their scraping config. The only thing the docs at https://docs.ceph.com/en/latest/mgr/prometheus/#honor-labels mention is overriding the instance with a fixed value due to the responding mgr host changing. But there is no mention of a cluster variable.

I am wondering if there is any way to make this work opportunistically ... but I doubt it.
There might have been a reason to have this as a configurable option after all ...

> Also, new dashboards were added to Grafana recently; can you help make sure those also have the cluster variable and that we are using the same queries to fill the values of those variables?

I'll check again if all boards and queries are instrumented.

frittentheke · 2024-02-16T14:07:45Z

@cloudbehl

I seem to have fallen into a rabbit hole, just trying to fix "a few" issues with multicluster ...

I pushed a new revision now which also updates the recently added "RGW S3 Analytics" boards that came in via https://tracker.ceph.com/issues/64359. Unfortunately #55314 was also merged, which breaks the queries for <=Reef I suppose, so there is no simple backporting of this anymore (if accepted and merged).

In any case, this PR has gotten way bigger than I expected. I will gladly provide a little recording of me browsing through the dashboards. But honestly this is not enough as a review. Especially the labels instance and hostname are used weirdly (I raised https://tracker.ceph.com/issues/64288 a while back). Then there is lots of label_replace-ing happening.
While it's nice to make the queries work on the most cluttered instance labels (e.g. with port numbers), they could look much, much cleaner if the source was simply expected to provide them in a certain syntax (e.g. have Prometheus write clean instance labels). I tried to apply some cleaning and alignment, but this requires a good pair of eyes to review the changes. I don't want to break things for some folks, but only help fix the multicluster dashboards.

All in all I believe the mixins deserve a full refactoring at some point to ensure they are maintained:

a) Upgrade / Switch to https://github.com/grafana/grafonnet (from https://github.com/grafana/grafonnet-lib which is deprecated)
b) Upgrade to the latest Grafana panels, align all the boards even more in their style and naming, and replace all of the boilerplate or explicit config with sane defaults where possible.
c) Review whether certain patterns could be moved into little helpers, just like the matchers are now. My first idea would be the label_replace calls. Requiring so much post-processing of metrics and their labels and so much code in the generator makes any change too risky.
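
As an illustration of point c), a label_replace helper might look roughly like the following Python sketch. The function name `strip_port` and the regex are assumptions made for this example, not code from the mixin; only the argument order of PromQL's `label_replace(v, dst_label, replacement, src_label, regex)` is taken from Prometheus itself.

```python
def strip_port(expr: str, label: str = "instance") -> str:
    """Wrap a PromQL expression so that `host:9283` becomes `host` in `label`.

    Hypothetical helper. PromQL leaves a series unchanged when the regex
    does not match, so labels without a port pass through as they are.
    The regex is deliberately simple and does not handle bracketed IPv6
    addresses.
    """
    # Argument order: label_replace(v, dst_label, replacement, src_label, regex)
    return 'label_replace(%s, "%s", "$1", "%s", "([^:]+):.*")' % (expr, label, label)


if __name__ == "__main__":
    print(strip_port("ceph_osd_up"))
```

Keeping the wrapping in one place would mean a change to the regex touches a single function rather than every query that repeats it.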

cloudbehl · 2024-02-22T04:39:20Z

> > So let's do it and have this flag enabled by default for all dashboards. I can help test this.
>
> @cloudbehl It might not be as easy. Even a single cluster would need to have the cluster label on its metrics then. With showMultiCluster enabled, all/most queries will filter on this label to distinguish between clusters.
>
> And most likely folks will not have Prometheus add this cluster label in their scraping config. The only thing the docs at https://docs.ceph.com/en/latest/mgr/prometheus/#honor-labels mention is overriding the instance with a fixed value due to the responding mgr host changing. But there is no mention of a cluster variable.
>
> I am wondering if there is any way to make this work opportunistically ... but I doubt it. There might have been a reason to have this as a configurable option after all ...
>
> > Also, new dashboards were added to Grafana recently; can you help make sure those also have the cluster variable and that we are using the same queries to fill the values of those variables?
>
> I'll check again if all boards and queries are instrumented.

Thanks for looking into it.

This is something we will soon have in the main branch via PR #54964, so all new clusters will have the cluster label attached to their Prometheus metrics by default.

We have seen a lot of admins doing the same just to support multi-cluster, so it makes sense to add the cluster label to all the queries by default. Also, as we are progressing with multi-cluster management and monitoring for Ceph clusters, having this label should become standard so we don't need to rely on the instance label.

cloudbehl · 2024-02-22T04:57:39Z

> @cloudbehl
>
> I seem to have fallen into a rabbit hole, just trying to fix "a few" issues with multicluster ...

What's the major issue that you are seeing with it?

> I pushed a new revision now which also updates the recently added "RGW S3 Analytics" boards that came in via https://tracker.ceph.com/issues/64359. Unfortunately #55314 was also merged, which breaks the queries for <=Reef I suppose, so there is no simple backporting of this anymore (if accepted and merged).

Can we have a separate commit for the RGW-related dashboards in a different PR, so that this one could possibly be backported to fix potential issues?

> In any case, this PR has gotten way bigger than I expected. I will gladly provide a little recording of me browsing through the dashboards. But honestly this is not enough as a review. Especially the labels instance and hostname are used weirdly (I raised https://tracker.ceph.com/issues/64288 a while back). Then there is lots of label_replace-ing happening. While it's nice to make the queries work on the most cluttered instance labels (e.g. with port numbers), they could look much, much cleaner if the source was simply expected to provide them in a certain syntax (e.g. have Prometheus write clean instance labels). I tried to apply some cleaning and alignment, but this requires a good pair of eyes to review the changes. I don't want to break things for some folks, but only help fix the multicluster dashboards.
>
> All in all I believe the mixins deserve a full refactoring at some point to ensure they are maintained:
>
> a) Upgrade / Switch to https://github.com/grafana/grafonnet (from https://github.com/grafana/grafonnet-lib which is deprecated)

I agree we should migrate.

> b) Upgrade to the latest Grafana panels, align all the boards even more in their style and naming, and replace all of the boilerplate or explicit config with sane defaults where possible.

Agreed, this is much needed; it is something I have been discussing with the monitoring team for a while as well.

A few improvement areas that I see:

- All the dashboards need to be revisited to see how we can reduce their number. For example, we have two cluster dashboards and four RGW dashboards. I think some dashboards could be merged into one to create less confusion for admins.
- Too many variables in some dashboards. Some are not even working, and some work but don't filter properly.
- Adding proper helper text for all the graphs.
- All the graphs/tables need to be migrated to the newer graph/table panels.

frittentheke · 2024-02-22T13:33:38Z

> > I seem to have fallen into a rabbit hole, just trying to fix "a few" issues with multicluster ...
>
> What's the major issue that you are seeing with it?

If you look at the changes, I also cleaned up (and hopefully did not break) quite a few queries that were not suitable for Prometheus holding data for e.g. other hosts than those hosting Ceph (e.g. listing all "instance" values for a Ceph dashboard template).
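
To illustrate the kind of template query meant here, the following Python sketch builds a Grafana `label_values(...)` variable query that is scoped to a Ceph metric instead of listing every `instance` value known to Prometheus. The helper name `label_values_query` is hypothetical, and `ceph_osd_metadata` is only an assumed example metric, not necessarily the one used in the PR.

```python
def label_values_query(label: str, metric: str = "", matchers: str = "") -> str:
    """Build a Grafana `label_values(...)` template variable query.

    Hypothetical helper. Without a metric, Grafana returns the label's
    values across all series in the data source, including non-Ceph hosts.
    """
    if not metric:
        # Unscoped form: every value of `label` in the whole Prometheus.
        return "label_values(%s)" % label
    scoped = (metric + "{" + matchers + "}") if matchers else metric
    return "label_values(%s, %s)" % (scoped, label)


if __name__ == "__main__":
    # Unscoped: would also list hosts that have nothing to do with Ceph.
    print(label_values_query("instance"))
    # Scoped to a Ceph metric and the selected cluster.
    print(label_values_query("instance", "ceph_osd_metadata", 'cluster=~"$cluster"'))
```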

I still believe I did some good to all of them dashboards, even if there was a rewrite coming in, having a good base of working queries makes that process a lot easier. So I gladly invested the time.

frittentheke · 2024-02-22T13:45:03Z

> > I pushed a new revision now which also updates the recently added "RGW S3 Analytics" boards that came in via https://tracker.ceph.com/issues/64359. Unfortunately #55314 was also merged, which breaks the queries for <=Reef I suppose, so there is no simple backporting of this anymore (if accepted and merged).
>
> Can we have a separate commit for the RGW-related dashboards in a different PR, so that this one could possibly be backported to fix potential issues?

You sure can, I'll look into it.
I also renamed / aligned the target filenames to universally use the radosgw- prefix. Do you like that part or should I remove that altogether?

cloudbehl · 2024-02-22T14:37:33Z

> You sure can, I'll look into it. I also renamed / aligned the target filenames to universally use the radosgw- prefix. Do you like that part or should I remove that altogether?

I think we can do the renaming in a separate small PR after all is done, just for the squid and main branches.

aaSharma14 · 2024-05-07T12:22:06Z

@frittentheke, there are two related issues added to this tracker - https://tracker.ceph.com/issues/64321: one tracker is for the squid branch and the second one is for the reef branch. The steps to open the backport PRs are:

1. Check out the squid branch
2. Do `cd src/script`
3. Run `./ceph-backport.sh --setup`
4. The script will ask you to verify or enter some details like redmine key, github username, github token etc.
5. If the setup is okay, it should return - `ceph-backport.sh: setup is OK`
6. Now run `./ceph-backport.sh <squid_tracker_number>`, e.g. `./ceph-backport.sh 65838`
7. This will open the backport PR for squid

Similarly, you can do it for the reef backport as well.

frittentheke · 2024-05-13T08:05:30Z

> @frittentheke, there are two related issues added to this tracker - https://tracker.ceph.com/issues/64321: one tracker is for the squid branch and the second one is for the reef branch. The steps to open the backport PRs are:
> 1. Check out the squid branch
> 2. Do `cd src/script`
> 3. Run `./ceph-backport.sh --setup`
> 4. The script will ask you to verify or enter some details like redmine key, github username, github token etc.
> 5. If the setup is okay, it should return - `ceph-backport.sh: setup is OK`

Done.

> 6. Now run `./ceph-backport.sh <squid_tracker_number>`, e.g. `./ceph-backport.sh 65838`
> 7. This will open the backport PR for squid

@aaSharma14
I suppose I should actually cherry-pick and adjust the commit to be backported, right?
Or is that done automagically?

Also it refuses due to the issues being yours:

ceph-backport.sh: my Redmine username is crohmann (ID 12304)
ceph-backport.sh: ERROR: https://tracker.ceph.com/issues/65838 is assigned to someone else: Aashish Sharma (ID 11319)
ceph-backport.sh: (my ID is 12304)
ceph-backport.sh: Cowardly refusing to continue

aaSharma14 · 2024-05-14T07:06:48Z

> > @frittentheke, there are two related issues added to this tracker - https://tracker.ceph.com/issues/64321: one tracker is for the squid branch and the second one is for the reef branch. The steps to open the backport PRs are:
> > 1. Check out the squid branch
> > 2. Do `cd src/script`
> > 3. Run `./ceph-backport.sh --setup`
> > 4. The script will ask you to verify or enter some details like redmine key, github username, github token etc.
> > 5. If the setup is okay, it should return - `ceph-backport.sh: setup is OK`
>
> Done.
>
> > 6. Now run `./ceph-backport.sh <squid_tracker_number>`, e.g. `./ceph-backport.sh 65838`
> > 7. This will open the backport PR for squid
>
> @aaSharma14 I suppose I should actually cherry-pick and adjust the commit to be backported, right? Or is that done automagically?
>
> Also it refuses due to the issues being yours:
> ceph-backport.sh: my Redmine username is crohmann (ID 12304)
> ceph-backport.sh: ERROR: https://tracker.ceph.com/issues/65838 is assigned to someone else: Aashish Sharma (ID 11319)
> ceph-backport.sh: (my ID is 12304)
> ceph-backport.sh: Cowardly refusing to continue

@frittentheke, the cherry-pick is done automatically by this script. However, if there are any conflicts, you can just resolve them, do a `git add`, and then re-run the script, and that should be it. Also, I have changed the assignee to you, so you can try again. Thanks

aaSharma14 · 2024-05-15T07:17:47Z

Thank you for the backport @frittentheke , Can you please open the reef backport for this as well?

frittentheke · 2024-05-16T09:58:30Z

> Thank you for the backport @frittentheke , Can you please open the reef backport for this as well?

Yes, but this just needs some more attention, at least due to the renaming of rgw counters, see
#55495 (comment)

Following PR ceph#55495 fixing the dashboard in regards to multiple clusters storing their metrics in a single Prometheus instance, this PR addresses the issues for alerts. Fixes: https://tracker.ceph.com/issues/64321 Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>

Following PR ceph#55495 fixing the dashboard in regards to multiple clusters storing their metrics in a single Prometheus instance, this PR addresses the issues for alerts. Fixes: https://tracker.ceph.com/issues/64321 Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de> (cherry picked from commit 810c706)

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 7994fea) monitoring: add tests for 2 new nvmeof alerts Add tests for alerts NVMeoFMissingListener and NVMeoFZeroListenerSubsystem to test_alerts.yml. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit a878460) monitoring: Add alert NVMeoFTooManyNamespaces NVMeoFTooManyNamespaces alerts the user if the total number of namespaces across subsystems exceeds 1024. Change NVMeoFTooManySubsystems limit to 128 from 16. Fixes: ceph/ceph-nvmeof#948 Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 614e146) mon/NVMeofGwMap: add healthcheck warning NVMEOF_GATEWAY_DELETING Add a warning when NVMeoF gateways are in DELETING state. This happens when there are namespaces under the deleted gateway's ANA group ID. The gateways are removed completely after users manually move these namespaces to another load balancing group, or when a new gateway is deployed on that host. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 571dd53) src/common/options/mon.yaml.in: add mon_nvmeofgw_delete_grace This config sets the delay before triggering the NVMEOF_GATEWAY_DELETING healthcheck warning, which is raised when NVMeoF gateways are in DELETING state for too long (indicating a problem in namespace load-balancing). The default value for this config is 15 mins. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 7b33f77) mon/NVMeofGwMap: add delay to NVMEOF_GATEWAY_DELETING warning Instead of triggering immediately, have this healthcheck trigger after some time has elapsed. This delay can be configured by mon_nvmeofgw_delete_grace. Track the time when gateways go into DELETING state in a new member var (of NVMeofGwMon) 'gws_deleting_time'. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 56cf512) qa/workunits/nvmeof/basic_tests.sh: fix connect-all assert There seems to be change in 'nvme list' json output which caused failures in asserts after 'nvme connect-all' command. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 22f91cd) mon/nvmeofgw*:fix monitor database corruption upon add gw Signed-off-by: Leonid Chernin <leonidc@il.ibm.com> (cherry picked from commit 417c544) mon/nvmeofgw*: fix HA usecase when gateway has no listeners: behaves like no-subsystems Signed-off-by: Leonid Chernin <leonidc@il.ibm.com> (cherry picked from commit 47e7a24) mon/nvmeofgw*: monitors publish in nvme-gw show ana group responsible for namespace rebalance Signed-off-by: Leonid Chernin <leonidc@il.ibm.com> (cherry picked from commit c358483) nvmeofgw* : fix publishing rebalance index Signed-off-by: Leonid Chernin <leonidc@il.ibm.com> (cherry picked from commit ceb62c0) mgr/cephadm: change ceph-nvmeof gw image version to 1.4 Fixes https://tracker.ceph.com/issues/69099 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> mon/nvme: fix unused lambda capture warnings Signed-off-by: Ronen Friedman <rfriedma@redhat.com> (cherry picked from commit edb0321) Add multi-cluster support (showMultiCluster=True) to alerts Following PR ceph#55495 fixing the dashboard in regards to multiple clusters storing their metrics in a single Prometheus instance, this PR addresses the issues for alerts. Fixes: https://tracker.ceph.com/issues/64321 Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de> (cherry picked from commit 810c706) monitoring: Update nvmeof alert limits in config Update these in config.libsonnet: - NVMeoFMaxGatewaysPerGroup (4->8) - NVMeoFMaxGatewaysPerCluster (4->32) - NVMeoFMaxNamespaces (1024->2048) - NVMeoFHighClientCount (32->128) Also update prometheus_alerts.yml and test_alerts.yml accordingly. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit f3c1881) mon: do not show nvmeof in 'ceph versions' output NVMeoF gateway version is independent of ceph version so 'ceph version' shows wrong nvmeof version in output (i.e. instead of gateway version, it shows Ceph version). Hence, remove nvmeof in 'ceph versions' output. To check for gateway version, use 'gw info' command. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 73c935d) mgr/cephadm/nvmeof: Add verify_listener_ip field to NVMeOF configuration and remove obsolete enable_key_encryption Fixes https://tracker.ceph.com/issues/69731 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 744b04a) mgr/cephadm/nvmeof: Add max_hosts field to NVMeOF configuration and update default values Fixes https://tracker.ceph.com/issues/69759 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 0d8bd4d) mgr/cephadm/nvmeof: Add SPDK iobuf options field to NVMeOF configuration Fixes https://tracker.ceph.com/issues/69554 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 42bac97) monitoring: add NVMeoFMaxGatewayGroups Add config NVMeoFMaxGatewayGroups to config.libsonnet and set it to 4 (groups). Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit c5c4b10) monitoring: add alert NVMeoFMaxGatewayGroups Add alert NVMeoFMaxGatewayGroups to prometheus_alerts.yml and prometheus_alerts.libsonnet. This alerts is to indicate if max number of NVMeoF gateway groups have been reached in a cluster. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit ab4a1dd) monitoring: add tests for NVMeoFMaxGatewayGroups Add unit tests for alert NVMeoFMaxGatewayGroups in monitoring/ceph-mixin/tests_alerts/test_alerts.yml Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e5cb5db) qa/tasks/nvmeof: Add --refresh flag in do_checks() cmds This is to ensure latest state of the services are displayed. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 023c209) qa: Add qa/suites/nvmeof/thrash/gateway-initiator-setup/2-subsys-8-namespace.yaml This allows to run nvmeof thrasher test on smaller confgurations which finshes faster than 120subsys-8ns config. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit d7551f7) qa/tasks/nvmeof.py: Add stop_and_join method to thrasher Also add nvme-gw show command output in do_checks() and revive daemons with 'ceph orch daemon start' in revive_daemon() method. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 0b0f450) qa/workunits/nvmeof/fio_test.sh: fix fio filenames Filenames were provided to fio as nvme1n1:nvme1n2, it should be pull path (/dev/nvme1n1:/dev/nvme1n2). Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 06811a4) qa/tasks/nvmeof.py: Do not use 'systemctl start' in thrasher Instead use 'daemon start' in revive_daemon() to bring up gateways thrashed with 'systemctl stop'. This is because 'systemctl start' method seems to temporary issues. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit b5e6a0c) qa/tasks/nvmeof.py: make seperate calls in do_checks() When running 'nvme list-subsys <device>' command in do_checks(), instead of combining command for all devices with '&&', make seperate calls. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 5a58114) qa/tasks/nvmeof.py: Fix do_checks() method All checks currently run on initator node, now run all "ceph" commands on one of gateway hosts instead of initator nodes. And run "nvme list" and "nvme list-subsys" checks on initator node. Add retry (5 times) to do_checks if any command fails. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 7dfd3d3) qa/tasks/nvmeof.py: Ignore systemctl_stop thrashing method Do not use systemctl_stop method to thrash daemons, just use 'ceph orch daemon stop' and 'ceph orch daemon rm' methods to thrash nvmeof gateways. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit d4aec58) qa/tasks/nvmeof.py: Add teardown() method Add teardown method to remove nvmeof service before rest of the cluster tearsdown. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e8201d3) qa/suites/nvmeof: Remove watchdog from thrasher This commit does the following: 1. remove watchdog from thrasher 1. remove wait from fio_test 3. change thrasher switcher wait-time to 10 mins Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 76b4028) qa/suites/nvmeof: use SCALING_DELAYS: '120' Increase delays for qa/workunits/nvmeof/scalability_test.sh as namespace rebalancing takes more time. After upscaling, gateway initially could be 'CREATED', it is a valid state during gateway initialization, but then the state should progress to 'AVAILABLE' within couple of seconds. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 3b9b290) nvmeofgw*: change log level of critical nvmeof monitor events to 1 Signed-off-by: Leonid Chernin <leonidc@il.ibm.com> (cherry picked from commit 57c4e16) nvmeofgw*: 2 fixes - for duplicated optimized pathes and fix for GW startup 1. 
fix duplicated optimized host's pathes - trigger process_gw_down upon fast-gw reboot, removed old fast-reboot handlers 2. fix GW startup - trigger process_gw_down when expired WAIT_BLOCKLIST timer Signed-off-by: Leonid Chernin <leonidc@il.ibm.com> (cherry picked from commit 4397c02) qa/workunits/nvmeof/fio_test: Log cluster status if fio fails Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e450406) qa/suites/nvmeof: add more asserts to scalability_test Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 877c726) qa/suites/nvmeof: Run fio with scalability test Run fio in parallel with scalability test. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e2f3bed) qa/workunits/nvmeof/fio_test.sh: add more debug commands Add more commands to debug when fio fails: - nvme list-subsys /dev/nvme1n2 - nvme list from the initiator - nvme list | wc -l - nvme id-ns /dev/nvme1n2 Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit fd8fbea) monitoring: fix NVMeoFSubsystemNamespaceLimit Alert is not triggered as expected, change the query to fix that. 
BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2282348 Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 4a7866a) mgr/cephadm/nvmeof: Add QOS timeslice field to NVMeOF configuration Fixes https://tracker.ceph.com/issues/69952 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 7b4af1f) Merge pull request ceph#60871 from leonidc/leonidc-epoch-filter Epoch filtering Reviewed-by: Samuel Just <sjust@redhat.com> Reviewed-by: Aviv Caro <Aviv.Caro@ibm.com> Reviewed-by: Ronen Friedman <rfriedma@redhat.com> (cherry picked from commit 3cdf529) mon/nvmeofgw*: fix no-listeners FSM, fix detection of no-listeners condition Signed-off-by: Leonid Chernin <leonidc@il.ibm.com> (cherry picked from commit 66ca80e) restore proper no-listeners logic Signed-off-by: leonidc <leonidc@il.ibm.com>

Following PR ceph#55495, which fixed the dashboards for multiple clusters storing their metrics in a single Prometheus instance, this PR addresses the same issues for alerts. Fixes: https://tracker.ceph.com/issues/64321 Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de> (cherry picked from commit 810c706)
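
To illustrate what multi-cluster support can mean for an alert expression, here is a minimal sketch in Python. It is not taken from the PR: the helper `group_by_cluster` is hypothetical, the label name `cluster` is only an assumed default (the PR description says the label is configurable), and `ceph_osd_up == 0` is used purely as an example expression. The idea is that, with several clusters in one Prometheus instance, an aggregation has to keep the cluster label so that each cluster is evaluated separately.

```python
def group_by_cluster(agg: str, expr: str, multi_cluster: bool,
                     label: str = "cluster") -> str:
    """Wrap a PromQL expression in an aggregation.

    Hypothetical helper for illustration only. When multi_cluster is
    True, the aggregation keeps the (assumed) cluster label so results
    are computed per cluster instead of across all clusters.
    """
    if multi_cluster:
        return f"{agg} by ({label}) ({expr})"
    return f"{agg}({expr})"


# Single-cluster rendering: one global count.
single = group_by_cluster("count", "ceph_osd_up == 0", multi_cluster=False)

# Multi-cluster rendering: one count per value of the cluster label.
multi = group_by_cluster("count", "ceph_osd_up == 0", multi_cluster=True)
```

Without the `by (cluster)` clause, a count over metrics from several clusters would be merged into one number, so an alert could no longer say which cluster it applies to.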

======================================== Resolves: rhbz#2350962 qa/tasks/nvmeof.py: add nvmeof gw-group to deployment Groups was made a required parameter to be `ceph orch apply nvmeof <pool> <group>` in ceph/ceph#58860. That broke the `nvmeof` suite so this PR fixes that. Right now, all gateway are deployed in a single group. Later, this would be changed to have multi groups for a better test. Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit c9a6fed) qa: Expand nvmeof thrasher and add nvmeof_namespaces.yaml job 1. qa/tasks/nvmeof.py: add other methods to stop nvmeof daemons 2. add qa/workunits/rbd/nvmeof_namespace_test.sh which adds and deletes new namespaces. It is run in nvmeof_namespaces.yaml job where fio happens to other namespaces in background. Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit 58d8be9) qa/suites/nvmeof/basic: add nvmeof_scalability test Add test to upscale/downscale nvmeof gateways. Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit e5a9cda) qa: move nvmeof shell scripts to qa/workunits/nvmeof Move all scripts qa/workunits/rbd/nvmeof_*.sh to qa/workunits/nvmeof/*.sh Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit 2ed818e) qa/suites/nvmeof: increase hosts in cluster setup In "nvmeof" task, change "client" config to "installer" which allows to take inputs like "host.a". nvmeof/basic: change 2-gateway-2-initiator to 4-gateway-2-inititator cluster nvmeof/thrash: change 3-gateway-1-initiator to 4-gateway-1-inititaor cluster Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit 4d97b1a) qa/suites/nvmeof: wait for service "nvmeof.mypool.mygroup0" This is because nvmeof gateway group names are now part of service id. 
Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit da8e95c) labeler: add nvmeof labelers Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit d513cc5) qa/suites/nvmeof: use "latest" image of gateway and cli Change nvmeof gateway and cli image from 1.2 to "latest". Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit 0bab553) qa/workunits/nvmeof/setup_subsystem.sh: use --no-group-append In newer version of nvmeof cli, "subsystem add" needs this tag to ensure subsystem name is value of --subsystem. Otherwise, in newer cli version, the gateway group is appended at the end of the subsystem name. This fixes the teuthology nvmeof suite (currently all jobs fails because of this). Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit 303f18b) mon: add nvmeof healthchecks Add NVMeofGwMap::get_health_checks which raises NVMEOF_SINGLE_GATEWAY if any of the groups have 1 gateway. In NVMeofGwMon, call `encode_health` and `load_health` to register healthchecks. This will add nvmeof healthchecks to "ceph health" output. Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit 1cad040) mon: add warning NVMEOF_GATEWAY_DOWN In src/mon/NVMeofGwMap.cc, add warning NVMEOF_GATEWAY_DOWN when any gateway is in GW_UNAVAILABLE state. Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit 0006599) qa/suites/nvmeof: add mtls test Add qa/workunits/nvmeof/mtls_test.sh which enables mtls config and redeploy, then verify and disables mtls config. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit fdc93ad) monitoring: add 2 nvmeof alerts to prometheus_alerts.yaml - `NVMeoFMissingListener`: trigger if all listeners are not created for each gateway in a subsystem - `NVMeoFZeroListenerSubsystem`: trigger if a subsystem has no listeners Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit f02e312) monitoring: add 2 new nvmeof alerts Add NVMeoFMissingListener and NVMeoFZeroListenerSubsystem alerts to prometheus_alerts.libsonnet. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 7994fea) monitoring: add tests for 2 new nvmeof alerts Add test for alerts NVMeoFMissingListener and NVMeoFZeroListenerSubsystem to test_alerts.yml. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit a878460) qa/suites/nvmeof: add nvmeof warnings to log-ignorelist Add NVMEOF_SINGLE_GATEWAY and NVMEOF_GATEWAY_DOWN warnings to nvmeof:thrash job's log-ignorelist Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit 73d5c01) qa/suites/nvmeof: fix nvmeof_namespaces.yaml When basic_tests.sh is executed in parallel with namespace_test.sh, sometimes namespace_test.sh starts before fio_test.sh which would break the test. So this change ensures "fio_test.sh" is started before and executed in parallel with "namespace_test.sh". Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 6e15b5e) qa/suite/nvmeof: add asserts to scalability_test.sh Add assertions to 'status_checks()' function. Use "apply" and "redeploy", instead of "orch rm" and "apply" to upscale/downscale gateways. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 9393509) qa/suite/nvmeof/thrash: increase number of thrashing - Run fio for 15 mins (instead of 10min). 
- nvmeof.py: change daemon_max_thrash_times default from 3 to 5 - nvmeof.py: run nvme list in do_checks() Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 51743e6) qa/suites/nvmeof/basic: use default image in nvmeof_initiator.yaml Instead of using quay.io/ceph/nvmeof:latest, use default image in ceph build. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit f670916) qa/suites/nvmeof/thrash: Add "is unavailable" to log-ignorelist This commit also: - Remove --rbd_iostat from thrasher fio - Log iteration details before printing stats in nvmeof_tharsher Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit c0ca0eb) qa/suites/nvmeof/thrasher: use 120 subsystems and 8 ns each For tharsher test: 1. Run it on 120 subsystems with 8 namespaces each 2. Run FIO for 20 mins (instead of 15mins) 2. Run FIO for few randomly picked devices (using `--random_devices 200`) Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e1983c5) qa/tasks/nvmeof.py: Improve thrasher and rbd image creation Create rbd images in one command using ";" to queue them, instead of running "cephadm shell -- rbd create" again and again for each image. Improve the method to select to-be-thrashed daemons. Use randint() and sample(), instead of weights/skip. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 82118e1) qa/workunits/nvmeof/setup_subsystem.sh: add list_namespaces() func Add list_namespaces function which could be useful for debugging later. Remove extra call of list_subsystems so it's only logged once after subsystems are completely setup. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 2030411) qa/workunits/nvmeof/basic_tests.sh: Assert number of devices Check number of devices connected after connect-all. It should be equal to number of namespaces created. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 7ee4677) qa/suites/nvmeof/thrash: add 10-subsys-90-namespace-no_huge_pages.yaml Add test for no-huge-pages by using config "spdk_mem_size: 4096" in 10 subsystems and 90 namespaces each setup. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 09ade3d) monitoring: Add prometheus alert NVMeoFMultipleNamespacesOfRBDImage NVMeoFMultipleNamespacesOfRBDImage alerts the user if a RBD image is used for multiple namespaces. This is important alerts for cases where namespaces are created on same image for different gateway group. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 61b3289) mon/NVMeofGwMap: add healthcheck warning NVMEOF_GATEWAY_DELETING Add a warning when NVMeoF gateways are in DELETING state. This happens when there are namespaces under the deleted gateway's ANA group ID. The gateways are removed completely after users manually move these namespaces to another load balancing group. Or if a new gateway is deployed on that host. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 571dd53) src/common/options/mon.yaml.in: add mon_nvmeofgw_delete_grace This config allows to configure the delay in triggering NVMEOF_GATEWAY_DELETING healthcheck warning, which is triggered when NVMeoF gateways are in DELETEING state for too long (indicating a problem in namespace load-balacing). The default value for this config is 15 mins. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 7b33f77) mon/NVMeofGwMap: add delay to NVMEOF_GATEWAY_DELETING warning Instead of immediately triggering, have this healthcheck trigger after some time has elasped. This delay can be configured by mon_nvmeofgw_delete_grace. Track the time when gateways go into DELETING state in a new member var (of NVMeofGwMon) 'gws_deleting_time'. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 56cf512) qa/workunits/nvmeof/basic_tests.sh: fix connect-all assert There seems to be change in 'nvme list' json output which caused failures in asserts after 'nvme connect-all' command. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 22f91cd) qa/tasks/nvmeof: Add --refresh flag in do_checks() cmds This is to ensure latest state of the services are displayed. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 023c209) qa: Add qa/suites/nvmeof/thrash/gateway-initiator-setup/2-subsys-8-namespace.yaml This allows to run nvmeof thrasher test on smaller confgurations which finshes faster than 120subsys-8ns config. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit d7551f7) qa/tasks/nvmeof.py: Add stop_and_join method to thrasher Also add nvme-gw show command output in do_checks() and revive daemons with 'ceph orch daemon start' in revive_daemon() method. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 0b0f450) qa/workunits/nvmeof/fio_test.sh: fix fio filenames Filenames were provided to fio as nvme1n1:nvme1n2, it should be pull path (/dev/nvme1n1:/dev/nvme1n2). Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 06811a4) qa/tasks/nvmeof.py: Do not use 'systemctl start' in thrasher Instead use 'daemon start' in revive_daemon() to bring up gateways thrashed with 'systemctl stop'. This is because 'systemctl start' method seems to temporary issues. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit b5e6a0c) qa/tasks/nvmeof.py: make seperate calls in do_checks() When running 'nvme list-subsys <device>' command in do_checks(), instead of combining command for all devices with '&&', make seperate calls. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 5a58114) qa/tasks/nvmeof.py: Fix do_checks() method All checks currently run on initator node, now run all "ceph" commands on one of gateway hosts instead of initator nodes. And run "nvme list" and "nvme list-subsys" checks on initator node. Add retry (5 times) to do_checks if any command fails. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 7dfd3d3) qa/tasks/nvmeof.py: Ignore systemctl_stop thrashing method Do not use systemctl_stop method to thrash daemons, just use 'ceph orch daemon stop' and 'ceph orch daemon rm' methods to thrash nvmeof gateways. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit d4aec58) qa/tasks/nvmeof.py: Add teardown() method Add teardown method to remove nvmeof service before rest of the cluster tearsdown. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e8201d3) qa/suites/nvmeof: Remove watchdog from thrasher This commit does the following: 1. remove watchdog from thrasher 1. remove wait from fio_test 3. change thrasher switcher wait-time to 10 mins Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 76b4028) monitoring: add NVMeoFMaxGatewayGroups Add config NVMeoFMaxGatewayGroups to config.libsonnet and set it to 4 (groups). Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit c5c4b10) monitoring: add alert NVMeoFMaxGatewayGroups Add alert NVMeoFMaxGatewayGroups to prometheus_alerts.yml and prometheus_alerts.libsonnet. This alerts is to indicate if max number of NVMeoF gateway groups have been reached in a cluster. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit ab4a1dd) monitoring: add tests for NVMeoFMaxGatewayGroups Add unit tests for alert NVMeoFMaxGatewayGroups in monitoring/ceph-mixin/tests_alerts/test_alerts.yml Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e5cb5db) qa/suites/nvmeof: use SCALING_DELAYS: '120' Increase delays for qa/workunits/nvmeof/scalability_test.sh as namespace rebalancing takes more time. After upscaling, gateway initially could be 'CREATED', it is a valid state during gateway initialization, but then the state should progress to 'AVAILABLE' within couple of seconds. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 3b9b290) qa/workunits/nvmeof/fio_test: Log cluster status if fio fails Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e450406) qa/suites/nvmeof: add more asserts to scalability_test Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 877c726) qa/suites/nvmeof: Run fio with scalability test Run fio in parallel with scalability test. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e2f3bed) qa/workunits/nvmeof/fio_test.sh: add more debug commands Add more commands to debug when fio fails: - nvme list-subsys /dev/nvme1n2 - nvme list from the initiator - nvme list | wc -l - nvme id-ns /dev/nvme1n2 Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit fd8fbea) mon: Add nvmeof group/gateway name in "ceph -s" In "ceph status" command output, show gateway group names and gateway names. 
Before: ``` services: mon: 4 daemons, quorum ceph-nvme-vm8,ceph-nvme-vm1,ceph-nvme-vm7,ceph-nvme-vm6 (age 71m) mgr: ceph-nvme-vm8.tgytdq(active, since 73m), standbys: ceph-nvme-vm6.tequqo, ceph-nvme-vm1.pxrofr, ceph-nvme-vm7.lbxrea osd: 4 osds: 4 up (since 70m), 4 in (since 70m) nvmeof: 4 gateways active (4 hosts) ``` After: ``` services: mon: 4 daemons, quorum ceph-nvme-vm14,ceph-nvme-vm11,ceph-nvme-vm13,ceph-nvme-vm12 (age 17m) mgr: ceph-nvme-vm14.gjjgvq(active, since 19m), standbys: ceph-nvme-vm12.shbvpw, ceph-nvme-vm11.gucgiu, ceph-nvme-vm13.inzizw osd: 4 osds: 4 up (since 15m), 4 in (since 16m) nvmeof (mygroup1) : 2 gateways active (ceph-nvme-vm13.azfdpk, ceph-nvme-vm14.hdsoxl) nvmeof (mygroup2) : 2 gateways active (ceph-nvme-vm11.hnooxs, ceph-nvme-vm12.wcjcjs) ``` Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e3fab2a) mon: show count of active/total nvmeof gws in "ceph -s" Improve "ceph status" output for nvmeof service: 1. Group by service_id (<pool>.<group>) instead of just by gateway groups. 2. Show total gateway count from NVMeofGwMap, and count of active gateways. New output: ``` services: mon: 4 daemons, quorum ceph-nvme-vm31,ceph-nvme-vm28,ceph-nvme-vm30,ceph-nvme-vm29 (age 16m) mgr: ceph-nvme-vm31.wnfclf(active, since 18m), standbys: ceph-nvme-vm29.iuwqin, ceph-nvme-vm28.lnnyui, ceph-nvme-vm30.fitwnw osd: 4 osds: 4 up (since 14m), 4 in (since 15m) nvmeof (mypool.mygroup1): 2 gateways: 1 active (ceph-nvme-vm30.kkcfux) nvmeof (mypool.mygroup2): 2 gateways: 2 active (ceph-nvme-vm28.mfqucr, ceph-nvme-vm29.hrizzl) ``` Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 3065ffe) monitoring: fix NVMeoFSubsystemNamespaceLimit Alert is not triggered as expected, change the query to fix that. 
BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2282348 Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 4a7866a) mgr/cephadm: set service name for DaemonDescription object used during daemon removal What this is specifically fixing is that the nvmeof post_remove function needs the service spec of the daemon's service to get the pool and group tied to the nvmeof daemon. We have been using the DaemonDescription "service_name" property to get the service name in order to get the spec. This works in a regular deployment. However, it is possible to make a placement like placement: hosts: - vm-00=nvmeof.a - vm-01=nvmeof.b and one of the nvmeof CI tests was doing so, which is why we saw this. That will cause the nvmeof daemon names to be nvmeof.nvmeof.a and nvmeof.nvmeof.b and not include the service name at all. In this case, the service_name property on the DaemonDescription class will end up getting service names nvmeof.nvmeof.a and nvmeof.nvmeof.b respectively from the nvmeof daemons, which will cause us to fail to find the spec in post_remove. This change makes it so we manually set the service name for the DaemonDescription object that gets passed to post_remove based on the service name of the daemon object we get from the host cache, which will still have the correct service name even if the daemon has a custom name. Then the nvmeof post_remove function will get the correct service name and be able to find the spec. Additionally, we now take are technically taking the daemon type and id from the DaemonDescription in our HostCache as well, but this is mostly just for consistency and should have no real impact. 
Fixes: https://tracker.ceph.com/issues/68962
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit d8dae24)

Add multi-cluster support (showMultiCluster=True) to alerts

Following PR ceph/ceph#55495 fixing the dashboard in regards to multiple clusters storing their metrics in a single Prometheus instance, this PR addresses the issues for alerts.

Fixes: https://tracker.ceph.com/issues/64321
Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>
(cherry picked from commit 810c706)

mon/nvme: fix unused lambda capture warnings

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit edb0321)

src/nvmeof/NVMeofGwMonitorClient: remove MDS client, not needed

Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
(cherry picked from commit f806872)

cephadm/nvmeof: fix ports when default values are overridden

Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
(cherry picked from commit e717a92)

cephadm/nvmeof: support per-node gateway addresses

Added gateway and discovery address maps to the service specification. These maps store per-node service addresses. The address is first searched in the map, then in the spec address configuration. If neither is defined, the host IP is used as a fallback.
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
(cherry picked from commit 2f47f9d)

cephadm/nvmeof: support no huge pages for nvmeof spdk

depends on: ceph/ceph-nvmeof#898

Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
(cherry picked from commit 38513cb)

pybind/mgr/orchestrator/module.py: NvmeofServiceSpec service_id

- make service_id better aligned with default/empty group (ceph/ceph@f6d552d)
- fix service_id in nvmeof daemon add

Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
(cherry picked from commit e1612d0)

python-common/ceph/deployment: add SPDK log level to nvmeof configuration

Fixes https://tracker.ceph.com/issues/67258

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit d3cc237)

mgr/cephadm: add SPDK log level to nvmeof configuration

Fixes https://tracker.ceph.com/issues/67258

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit 19399de)

python-common/ceph/deployment: change SPDK RPC fields in nvmeof configuration

Fixes https://tracker.ceph.com/issues/67629

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit d18e6fb)

mgr/cephadm: change SPDK RPC fields in nvmeof configuration

Fixes https://tracker.ceph.com/issues/67629

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit d208242)

python-common/ceph/deployment: revert SPDK RPC fields in nvmeof configuration

Fixes https://tracker.ceph.com/issues/67844

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit cb28d39)

mgr/cephadm: revert SPDK RPC fields in nvmeof configuration

Fixes https://tracker.ceph.com/issues/67844

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit 11de53f)

python-common/ceph/deployment: Add namespace netmask parameters to nvmeof configuration

Fixes https://tracker.ceph.com/issues/68542

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit dd4b357)

mgr/cephadm: Add namespace netmask parameters to nvmeof configuration

Fixes https://tracker.ceph.com/issues/68542

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit 0dcc207)

python-common/ceph/deployment: Add resource limits to nvmeof configuration

Fixes https://tracker.ceph.com/issues/68967

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit 4269d7c)

mgr/cephadm: Add resource limits to nvmeof configuration

Fixes https://tracker.ceph.com/issues/68967

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit 1807a55)
Signed-off-by: Gil Bregman <gbregman@il.ibm.com>

mgr/cephadm/nvmeof: Add auto rebalance fields to NVMeOF configuration

Fixes https://tracker.ceph.com/issues/69176

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit bfc8fb6)

mgr/cephadm/nvmeof: Rewrite NVMEoF fields validation.

Fixes https://tracker.ceph.com/issues/69176

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit 31283c0)

mgr/cephadm/nvmeof: Add key verification field to NVMeOF configuration

Fixes https://tracker.ceph.com/issues/69413

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit 26a0f9a)
Signed-off-by: Gil Bregman <gbregman@il.ibm.com>

mgr/cephadm: change ceph-nvmeof gw image version to 1.4

Fixes https://tracker.ceph.com/issues/69099

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>

mgr/cephadm/nvmeof: Add verify_listener_ip field to NVMeOF configuration and remove obsolete enable_key_encryption

Fixes https://tracker.ceph.com/issues/69731

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit 744b04a)

mgr/cephadm/nvmeof: Add max_hosts field to NVMeOF configuration and update default values

Fixes https://tracker.ceph.com/issues/69759

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit 0d8bd4d)

mgr/cephadm/nvmeof: Add SPDK iobuf options field to NVMeOF configuration

Fixes https://tracker.ceph.com/issues/69554

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit 42bac97)

mgr/cephadm/nvmeof: Add QOS timeslice field to NVMeOF configuration

Fixes https://tracker.ceph.com/issues/69952

Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
(cherry picked from commit 7b4af1f)

mon/nvmeofgw*: fix HA use case when gateway has no listeners: behaves like no-subsystems

Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
(cherry picked from commit 47e7a24)

mon/nvmeofgw*: monitors publish in nvme-gw show ana group responsible for namespace rebalance

Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
(cherry picked from commit c358483)

nvmeofgw* : fix publishing rebalance index

Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
(cherry picked from commit ceb62c0)

nvmeofgw*: 2 fixes - for duplicated optimized paths and fix for GW startup

1. fix duplicated optimized host's paths - trigger process_gw_down upon fast-gw reboot, removed old fast-reboot handlers
2. fix GW startup - trigger process_gw_down when the WAIT_BLOCKLIST timer expires

Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
(cherry picked from commit 4397c02)

Merge pull request #60871 from leonidc/leonidc-epoch-filter

Epoch filtering

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Aviv Caro <Aviv.Caro@ibm.com>
Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 3cdf529)

mon/nvmeofgw*: fix no-listeners FSM, fix detection of no-listeners condition

Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
(cherry picked from commit 66ca80e)

restore proper no-listeners logic

Signed-off-by: leonidc <leonidc@il.ibm.com>
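
For readers skimming this log: the lookup order described in "cephadm/nvmeof: support per-node gateway addresses" (per-node map first, then the spec-wide address, then the host IP) can be sketched roughly as below. This is only an illustration under assumptions; the function and parameter names (`resolve_gateway_addr`, `addr_map`, `spec_addr`, `host_ip`) are hypothetical and are not taken from the cephadm code.

```python
from typing import Dict, Optional


def resolve_gateway_addr(
    hostname: str,
    host_ip: str,
    addr_map: Optional[Dict[str, str]] = None,
    spec_addr: Optional[str] = None,
) -> str:
    """Pick the address for a gateway on `hostname` (hypothetical helper)."""
    # 1. A per-node entry in the address map takes precedence.
    if addr_map and hostname in addr_map:
        return addr_map[hostname]
    # 2. Otherwise use the single address configured in the spec, if set.
    if spec_addr:
        return spec_addr
    # 3. If neither is defined, fall back to the host IP.
    return host_ip
```

The same order would apply to the discovery address map; only the map and spec field being consulted would differ.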


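As an illustration of the per-service summary line introduced by "mon: show count of active/total nvmeof gws in "ceph -s"", the following sketch builds a line in the same shape as the sample output in that commit message. It is not the monitor's implementation (which is not shown in this log); `format_nvmeof_status` and its input mapping are hypothetical, and the rendering for zero active gateways is an assumption.

```python
from typing import Dict


def format_nvmeof_status(service_id: str, gateways: Dict[str, bool]) -> str:
    """Build one summary line; `gateways` maps gateway name -> is active."""
    # Only active gateways are listed by name; the total counts all of them.
    active = sorted(name for name, is_active in gateways.items() if is_active)
    return "nvmeof ({}): {} gateways: {} active ({})".format(
        service_id, len(gateways), len(active), ", ".join(active)
    )


# Hypothetical gateway names, grouped under service_id <pool>.<group>.
print(format_nvmeof_status("mypool.mygroup1", {"gw-a": True, "gw-b": False}))
```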
========================================
Resolves: rhbz#2350962

qa/tasks/nvmeof.py: add nvmeof gw-group to deployment

The group was made a required parameter (`ceph orch apply nvmeof <pool> <group>`) in ceph/ceph#58860. That broke the `nvmeof` suite, so this PR fixes that.

Right now, all gateways are deployed in a single group. Later, this would be changed to have multiple groups for a better test.

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
(cherry picked from commit c9a6fed)

qa: Expand nvmeof thrasher and add nvmeof_namespaces.yaml job

1. qa/tasks/nvmeof.py: add other methods to stop nvmeof daemons
2. add qa/workunits/rbd/nvmeof_namespace_test.sh which adds and deletes new namespaces. It is run in the nvmeof_namespaces.yaml job, where fio runs against other namespaces in the background.

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
(cherry picked from commit 58d8be9)

qa/suites/nvmeof/basic: add nvmeof_scalability test

Add test to upscale/downscale nvmeof gateways.

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
(cherry picked from commit e5a9cda)

qa: move nvmeof shell scripts to qa/workunits/nvmeof

Move all scripts qa/workunits/rbd/nvmeof_*.sh to qa/workunits/nvmeof/*.sh

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
(cherry picked from commit 2ed818e)

qa/suites/nvmeof: increase hosts in cluster setup

In the "nvmeof" task, change the "client" config to "installer", which allows it to take inputs like "host.a".

nvmeof/basic: change 2-gateway-2-initiator to 4-gateway-2-initiator cluster
nvmeof/thrash: change 3-gateway-1-initiator to 4-gateway-1-initiator cluster

Signed-off-by: Vallari Agrawal <val.agl002@gmail.com>
(cherry picked from commit 4d97b1a)

qa/suites/nvmeof: wait for service "nvmeof.mypool.mygroup0"

This is because nvmeof gateway group names are now part of the service id.
Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit da8e95c) labeler: add nvmeof labelers Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit d513cc5) qa/suites/nvmeof: use "latest" image of gateway and cli Change nvmeof gateway and cli image from 1.2 to "latest". Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit 0bab553) qa/workunits/nvmeof/setup_subsystem.sh: use --no-group-append In newer version of nvmeof cli, "subsystem add" needs this tag to ensure subsystem name is value of --subsystem. Otherwise, in newer cli version, the gateway group is appended at the end of the subsystem name. This fixes the teuthology nvmeof suite (currently all jobs fails because of this). Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit 303f18b) mon: add nvmeof healthchecks Add NVMeofGwMap::get_health_checks which raises NVMEOF_SINGLE_GATEWAY if any of the groups have 1 gateway. In NVMeofGwMon, call `encode_health` and `load_health` to register healthchecks. This will add nvmeof healthchecks to "ceph health" output. Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit 1cad040) mon: add warning NVMEOF_GATEWAY_DOWN In src/mon/NVMeofGwMap.cc, add warning NVMEOF_GATEWAY_DOWN when any gateway is in GW_UNAVAILABLE state. Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit 0006599) qa/suites/nvmeof: add mtls test Add qa/workunits/nvmeof/mtls_test.sh which enables mtls config and redeploy, then verify and disables mtls config. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit fdc93ad) monitoring: add 2 nvmeof alerts to prometheus_alerts.yaml - `NVMeoFMissingListener`: trigger if all listeners are not created for each gateway in a subsystem - `NVMeoFZeroListenerSubsystem`: trigger if a subsystem has no listeners Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit f02e312) monitoring: add 2 new nvmeof alerts Add NVMeoFMissingListener and NVMeoFZeroListenerSubsystem alerts to prometheus_alerts.libsonnet. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 7994fea) monitoring: add tests for 2 new nvmeof alerts Add test for alerts NVMeoFMissingListener and NVMeoFZeroListenerSubsystem to test_alerts.yml. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit a878460) qa/suites/nvmeof: add nvmeof warnings to log-ignorelist Add NVMEOF_SINGLE_GATEWAY and NVMEOF_GATEWAY_DOWN warnings to nvmeof:thrash job's log-ignorelist Signed-off-by: Vallari Agrawal <val.agl002@gmail.com> (cherry picked from commit 73d5c01) qa/suites/nvmeof: fix nvmeof_namespaces.yaml When basic_tests.sh is executed in parallel with namespace_test.sh, sometimes namespace_test.sh starts before fio_test.sh which would break the test. So this change ensures "fio_test.sh" is started before and executed in parallel with "namespace_test.sh". Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 6e15b5e) qa/suite/nvmeof: add asserts to scalability_test.sh Add assertions to 'status_checks()' function. Use "apply" and "redeploy", instead of "orch rm" and "apply" to upscale/downscale gateways. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 9393509) qa/suite/nvmeof/thrash: increase number of thrashing - Run fio for 15 mins (instead of 10min). 
- nvmeof.py: change daemon_max_thrash_times default from 3 to 5 - nvmeof.py: run nvme list in do_checks() Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 51743e6) qa/suites/nvmeof/basic: use default image in nvmeof_initiator.yaml Instead of using quay.io/ceph/nvmeof:latest, use default image in ceph build. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit f670916) qa/suites/nvmeof/thrash: Add "is unavailable" to log-ignorelist This commit also: - Remove --rbd_iostat from thrasher fio - Log iteration details before printing stats in nvmeof_tharsher Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit c0ca0eb) qa/suites/nvmeof/thrasher: use 120 subsystems and 8 ns each For tharsher test: 1. Run it on 120 subsystems with 8 namespaces each 2. Run FIO for 20 mins (instead of 15mins) 2. Run FIO for few randomly picked devices (using `--random_devices 200`) Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e1983c5) qa/tasks/nvmeof.py: Improve thrasher and rbd image creation Create rbd images in one command using ";" to queue them, instead of running "cephadm shell -- rbd create" again and again for each image. Improve the method to select to-be-thrashed daemons. Use randint() and sample(), instead of weights/skip. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 82118e1) qa/workunits/nvmeof/setup_subsystem.sh: add list_namespaces() func Add list_namespaces function which could be useful for debugging later. Remove extra call of list_subsystems so it's only logged once after subsystems are completely setup. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 2030411) qa/workunits/nvmeof/basic_tests.sh: Assert number of devices Check number of devices connected after connect-all. It should be equal to number of namespaces created. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 7ee4677) qa/suites/nvmeof/thrash: add 10-subsys-90-namespace-no_huge_pages.yaml Add test for no-huge-pages by using config "spdk_mem_size: 4096" in 10 subsystems and 90 namespaces each setup. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 09ade3d) monitoring: Add prometheus alert NVMeoFMultipleNamespacesOfRBDImage NVMeoFMultipleNamespacesOfRBDImage alerts the user if a RBD image is used for multiple namespaces. This is important alerts for cases where namespaces are created on same image for different gateway group. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 61b3289) mon/NVMeofGwMap: add healthcheck warning NVMEOF_GATEWAY_DELETING Add a warning when NVMeoF gateways are in DELETING state. This happens when there are namespaces under the deleted gateway's ANA group ID. The gateways are removed completely after users manually move these namespaces to another load balancing group. Or if a new gateway is deployed on that host. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 571dd53) src/common/options/mon.yaml.in: add mon_nvmeofgw_delete_grace This config allows to configure the delay in triggering NVMEOF_GATEWAY_DELETING healthcheck warning, which is triggered when NVMeoF gateways are in DELETEING state for too long (indicating a problem in namespace load-balacing). The default value for this config is 15 mins. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 7b33f77) mon/NVMeofGwMap: add delay to NVMEOF_GATEWAY_DELETING warning Instead of immediately triggering, have this healthcheck trigger after some time has elasped. This delay can be configured by mon_nvmeofgw_delete_grace. Track the time when gateways go into DELETING state in a new member var (of NVMeofGwMon) 'gws_deleting_time'. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 56cf512) qa/workunits/nvmeof/basic_tests.sh: fix connect-all assert There seems to be change in 'nvme list' json output which caused failures in asserts after 'nvme connect-all' command. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 22f91cd) qa/tasks/nvmeof: Add --refresh flag in do_checks() cmds This is to ensure latest state of the services are displayed. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 023c209) qa: Add qa/suites/nvmeof/thrash/gateway-initiator-setup/2-subsys-8-namespace.yaml This allows to run nvmeof thrasher test on smaller confgurations which finshes faster than 120subsys-8ns config. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit d7551f7) qa/tasks/nvmeof.py: Add stop_and_join method to thrasher Also add nvme-gw show command output in do_checks() and revive daemons with 'ceph orch daemon start' in revive_daemon() method. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 0b0f450) qa/workunits/nvmeof/fio_test.sh: fix fio filenames Filenames were provided to fio as nvme1n1:nvme1n2, it should be pull path (/dev/nvme1n1:/dev/nvme1n2). Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 06811a4) qa/tasks/nvmeof.py: Do not use 'systemctl start' in thrasher Instead use 'daemon start' in revive_daemon() to bring up gateways thrashed with 'systemctl stop'. This is because 'systemctl start' method seems to temporary issues. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit b5e6a0c) qa/tasks/nvmeof.py: make seperate calls in do_checks() When running 'nvme list-subsys <device>' command in do_checks(), instead of combining command for all devices with '&&', make seperate calls. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 5a58114) qa/tasks/nvmeof.py: Fix do_checks() method All checks currently run on initator node, now run all "ceph" commands on one of gateway hosts instead of initator nodes. And run "nvme list" and "nvme list-subsys" checks on initator node. Add retry (5 times) to do_checks if any command fails. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 7dfd3d3) qa/tasks/nvmeof.py: Ignore systemctl_stop thrashing method Do not use systemctl_stop method to thrash daemons, just use 'ceph orch daemon stop' and 'ceph orch daemon rm' methods to thrash nvmeof gateways. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit d4aec58) qa/tasks/nvmeof.py: Add teardown() method Add teardown method to remove nvmeof service before rest of the cluster tearsdown. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e8201d3) qa/suites/nvmeof: Remove watchdog from thrasher This commit does the following: 1. remove watchdog from thrasher 1. remove wait from fio_test 3. change thrasher switcher wait-time to 10 mins Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 76b4028) monitoring: add NVMeoFMaxGatewayGroups Add config NVMeoFMaxGatewayGroups to config.libsonnet and set it to 4 (groups). Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit c5c4b10) monitoring: add alert NVMeoFMaxGatewayGroups Add alert NVMeoFMaxGatewayGroups to prometheus_alerts.yml and prometheus_alerts.libsonnet. This alerts is to indicate if max number of NVMeoF gateway groups have been reached in a cluster. 
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit ab4a1dd) monitoring: add tests for NVMeoFMaxGatewayGroups Add unit tests for alert NVMeoFMaxGatewayGroups in monitoring/ceph-mixin/tests_alerts/test_alerts.yml Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e5cb5db) qa/suites/nvmeof: use SCALING_DELAYS: '120' Increase delays for qa/workunits/nvmeof/scalability_test.sh as namespace rebalancing takes more time. After upscaling, gateway initially could be 'CREATED', it is a valid state during gateway initialization, but then the state should progress to 'AVAILABLE' within couple of seconds. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 3b9b290) qa/workunits/nvmeof/fio_test: Log cluster status if fio fails Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e450406) qa/suites/nvmeof: add more asserts to scalability_test Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 877c726) qa/suites/nvmeof: Run fio with scalability test Run fio in parallel with scalability test. Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e2f3bed) qa/workunits/nvmeof/fio_test.sh: add more debug commands Add more commands to debug when fio fails: - nvme list-subsys /dev/nvme1n2 - nvme list from the initiator - nvme list | wc -l - nvme id-ns /dev/nvme1n2 Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit fd8fbea) mon: Add nvmeof group/gateway name in "ceph -s" In "ceph status" command output, show gateway group names and gateway names. 
Before:
```
  services:
    mon: 4 daemons, quorum ceph-nvme-vm8,ceph-nvme-vm1,ceph-nvme-vm7,ceph-nvme-vm6 (age 71m)
    mgr: ceph-nvme-vm8.tgytdq(active, since 73m), standbys: ceph-nvme-vm6.tequqo, ceph-nvme-vm1.pxrofr, ceph-nvme-vm7.lbxrea
    osd: 4 osds: 4 up (since 70m), 4 in (since 70m)
    nvmeof: 4 gateways active (4 hosts)
```
After:
```
  services:
    mon: 4 daemons, quorum ceph-nvme-vm14,ceph-nvme-vm11,ceph-nvme-vm13,ceph-nvme-vm12 (age 17m)
    mgr: ceph-nvme-vm14.gjjgvq(active, since 19m), standbys: ceph-nvme-vm12.shbvpw, ceph-nvme-vm11.gucgiu, ceph-nvme-vm13.inzizw
    osd: 4 osds: 4 up (since 15m), 4 in (since 16m)
    nvmeof (mygroup1) : 2 gateways active (ceph-nvme-vm13.azfdpk, ceph-nvme-vm14.hdsoxl)
    nvmeof (mygroup2) : 2 gateways active (ceph-nvme-vm11.hnooxs, ceph-nvme-vm12.wcjcjs)
```
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit e3fab2a)

mon: show count of active/total nvmeof gws in "ceph -s"

Improve "ceph status" output for nvmeof service:
1. Group by service_id (<pool>.<group>) instead of just by gateway groups.
2. Show total gateway count from NVMeofGwMap, and count of active gateways.

New output:
```
  services:
    mon: 4 daemons, quorum ceph-nvme-vm31,ceph-nvme-vm28,ceph-nvme-vm30,ceph-nvme-vm29 (age 16m)
    mgr: ceph-nvme-vm31.wnfclf(active, since 18m), standbys: ceph-nvme-vm29.iuwqin, ceph-nvme-vm28.lnnyui, ceph-nvme-vm30.fitwnw
    osd: 4 osds: 4 up (since 14m), 4 in (since 15m)
    nvmeof (mypool.mygroup1): 2 gateways: 1 active (ceph-nvme-vm30.kkcfux)
    nvmeof (mypool.mygroup2): 2 gateways: 2 active (ceph-nvme-vm28.mfqucr, ceph-nvme-vm29.hrizzl)
```
Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 3065ffe)

monitoring: fix NVMeoFSubsystemNamespaceLimit Alert is not triggered as expected, change the query to fix that.
BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2282348 Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com> (cherry picked from commit 4a7866a) mgr/cephadm: set service name for DaemonDescription object used during daemon removal What this is specifically fixing is that the nvmeof post_remove function needs the service spec of the daemon's service to get the pool and group tied to the nvmeof daemon. We have been using the DaemonDescription "service_name" property to get the service name in order to get the spec. This works in a regular deployment. However, it is possible to make a placement like placement: hosts: - vm-00=nvmeof.a - vm-01=nvmeof.b and one of the nvmeof CI tests was doing so, which is why we saw this. That will cause the nvmeof daemon names to be nvmeof.nvmeof.a and nvmeof.nvmeof.b and not include the service name at all. In this case, the service_name property on the DaemonDescription class will end up getting service names nvmeof.nvmeof.a and nvmeof.nvmeof.b respectively from the nvmeof daemons, which will cause us to fail to find the spec in post_remove. This change makes it so we manually set the service name for the DaemonDescription object that gets passed to post_remove based on the service name of the daemon object we get from the host cache, which will still have the correct service name even if the daemon has a custom name. Then the nvmeof post_remove function will get the correct service name and be able to find the spec. Additionally, we now take are technically taking the daemon type and id from the DaemonDescription in our HostCache as well, but this is mostly just for consistency and should have no real impact. 
Fixes: https://tracker.ceph.com/issues/68962 Signed-off-by: Adam King <adking@redhat.com> (cherry picked from commit d8dae24) Add multi-cluster support (showMultiCluster=True) to alerts Following PR ceph/ceph#55495 fixing the dashboard in regards to multiple clusters storing their metrics in a single Prometheus instance, this PR addresses the issues for alerts. Fixes: https://tracker.ceph.com/issues/64321 Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de> (cherry picked from commit 810c706) mon/nvme: fix unused lambda capture warnings Signed-off-by: Ronen Friedman <rfriedma@redhat.com> (cherry picked from commit edb0321) src/nvmeof/NVMeofGwMonitorClient: remove MDS client, not needed Signed-off-by: Alexander Indenbaum <aindenba@redhat.com> (cherry picked from commit f806872) cephadm/nvmeof: fix ports when default values are overridden Signed-off-by: Alexander Indenbaum <aindenba@redhat.com> (cherry picked from commit e717a92) cephadm/nvmeof: support per-node gateway addresses Added gateway and discovery address maps to the service specification. These maps store per-node service addresses. The address is first searched in the map, then in the spec address configuration. If neither is defined, the host IP is used as a fallback. 
Signed-off-by: Alexander Indenbaum <aindenba@redhat.com> (cherry picked from commit 2f47f9d) cephadm/nvmeof: support no huge pages for nvmeof spdk depends on: ceph/ceph-nvmeof#898 Signed-off-by: Alexander Indenbaum <aindenba@redhat.com> (cherry picked from commit 38513cb) pybind/mgr/orchestrator/module.py: NvmeofServiceSpec service_id - make service_id better alligned with default/empty group (ceph/ceph@f6d552d) - fix service_id in nvmeof daemon add Signed-off-by: Alexander Indenbaum <aindenba@redhat.com> (cherry picked from commit e1612d0) python-common/ceph/deployment: add SPDK log level to nvmeof configuration Fixes https://tracker.ceph.com/issues/67258 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit d3cc237) mgr/cephadm: add SPDK log level to nvmeof configuration Fixes https://tracker.ceph.com/issues/67258 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 19399de) python-common/ceph/deployment: change SPDK RPC fields in nvmeof configuration Fixes https://tracker.ceph.com/issues/67629 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit d18e6fb) mgr/cephadm: change SPDK RPC fields in nvmeof configuration Fixes https://tracker.ceph.com/issues/67629 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit d208242) python-common/ceph/deployment: revert SPDK RPC fields in nvmeof configuration Fixes https://tracker.ceph.com/issues/67844 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit cb28d39) mgr/cephadm: revert SPDK RPC fields in nvmeof configuration Fixes https://tracker.ceph.com/issues/67844 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 11de53f) python-common/ceph/deployment: Add namespace netmask parameters to nvmeof configuration Fixes https://tracker.ceph.com/issues/68542 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit dd4b357) mgr/cephadm: Add namespace netmask parameters to nvmeof 
configuration Fixes https://tracker.ceph.com/issues/68542 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 0dcc207) python-common/ceph/deployment: Add resource limits to nvmeof configuration Fixes https://tracker.ceph.com/issues/68967 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 4269d7c) mgr/cephadm: Add resource limits to nvmeof configuration Fixes https://tracker.ceph.com/issues/68967 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 1807a55) Signed-off-by: Gil Bregman <gbregman@il.ibm.com> mgr/cephadm/nvmeof: Add auto rebalance fields to NVMeOF configuration Fixes https://tracker.ceph.com/issues/69176 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit bfc8fb6) mgr/cephadm/nvmeof: Rewrite NVMEoF fields validation. Fixes https://tracker.ceph.com/issues/69176 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 31283c0) mgr/cephadm/nvmeof: Add key verification field to NVMeOF configuration Fixes https://tracker.ceph.com/issues/69413 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 26a0f9a) Signed-off-by: Gil Bregman <gbregman@il.ibm.com> mgr/cephadm: change ceph-nvmeof gw image version to 1.4 Fixes https://tracker.ceph.com/issues/69099 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> mgr/cephadm/nvmeof: Add verify_listener_ip field to NVMeOF configuration and remove obsolete enable_key_encryption Fixes https://tracker.ceph.com/issues/69731 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 744b04a) mgr/cephadm/nvmeof: Add max_hosts field to NVMeOF configuration and update default values Fixes https://tracker.ceph.com/issues/69759 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 0d8bd4d) mgr/cephadm/nvmeof: Add SPDK iobuf options field to NVMeOF configuration Fixes https://tracker.ceph.com/issues/69554 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry 
picked from commit 42bac97) mgr/cephadm/nvmeof: Add QOS timeslice field to NVMeOF configuration Fixes https://tracker.ceph.com/issues/69952 Signed-off-by: Gil Bregman <gbregman@il.ibm.com> (cherry picked from commit 7b4af1f) mon/nvmeofgw*: fix HA usecase when gateway has no listeners: behaves like no-subsystems Signed-off-by: Leonid Chernin <leonidc@il.ibm.com> (cherry picked from commit 47e7a24) mon/nvmeofgw*: monitors publish in nvme-gw show ana group responsible for namespace rebalance Signed-off-by: Leonid Chernin <leonidc@il.ibm.com> (cherry picked from commit c358483) nvmeofgw* : fix publishing rebalance index Signed-off-by: Leonid Chernin <leonidc@il.ibm.com> (cherry picked from commit ceb62c0) nvmeofgw*: 2 fixes - for duplicated optimized pathes and fix for GW startup 1. fix duplicated optimized host's pathes - trigger process_gw_down upon fast-gw reboot, removed old fast-reboot handlers 2. fix GW startup - trigger process_gw_down when expired WAIT_BLOCKLIST timer Signed-off-by: Leonid Chernin <leonidc@il.ibm.com> (cherry picked from commit 4397c02) Merge pull request #60871 from leonidc/leonidc-epoch-filter Epoch filtering Reviewed-by: Samuel Just <sjust@redhat.com> Reviewed-by: Aviv Caro <Aviv.Caro@ibm.com> Reviewed-by: Ronen Friedman <rfriedma@redhat.com> (cherry picked from commit 3cdf529) mon/nvmeofgw*: fix no-listeners FSM, fix detection of no-listeners condition Signed-off-by: Leonid Chernin <leonidc@il.ibm.com> (cherry picked from commit 66ca80e) restore proper no-listeners logic Signed-off-by: leonidc <leonidc@il.ibm.com>
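
For readers skimming the commit log above, the idea behind showMultiCluster (applying a cluster label filter consistently to every query) can be illustrated with a small sketch. This is not the mixin's actual jsonnet code: the helper names `cluster_matcher` and `with_cluster_filter`, the default label name `cluster`, and the `$cluster` dashboard variable are assumptions made purely for illustration.

```python
from typing import List


def cluster_matcher(show_multi_cluster: bool, cluster_label: str = "cluster") -> str:
    """Return the extra label matcher, or '' when multi-cluster is off.

    Hypothetical helper: the label name and the "$cluster" variable are
    assumptions, not taken from the ceph-mixin sources.
    """
    if not show_multi_cluster:
        return ""
    return f'{cluster_label}=~"$cluster"'


def with_cluster_filter(metric: str, matchers: List[str],
                        show_multi_cluster: bool,
                        cluster_label: str = "cluster") -> str:
    """Build a PromQL selector, prepending the cluster matcher if enabled."""
    all_matchers = list(matchers)
    extra = cluster_matcher(show_multi_cluster, cluster_label)
    if extra:
        all_matchers.insert(0, extra)
    if not all_matchers:
        # No matchers at all: a bare metric name is a valid selector.
        return metric
    return metric + "{" + ", ".join(all_matchers) + "}"


if __name__ == "__main__":
    print(with_cluster_filter("ceph_osd_up", ['ceph_daemon="osd.0"'], True))
    print(with_cluster_filter("ceph_osd_up", ['ceph_daemon="osd.0"'], False))
```

Routing every selector through one helper like this is one way to avoid the kind of inconsistency the PR description mentions, where some queries carry the cluster filter and others do not.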

frittentheke requested a review from a team as a code owner February 8, 2024 13:53

frittentheke requested review from Pegonzal and ivoalmeida and removed request for a team February 8, 2024 13:53

github-actions bot added dashboard monitoring labels Feb 8, 2024

frittentheke force-pushed the issue_64321 branch 2 times, most recently from 711b4c0 to b127147 Compare February 9, 2024 15:11

frittentheke changed the title ~~monitoring/ceph-mixin: fix multicluster support in dashboards and their queries~~ monitoring/ceph-mixin: Cleanup of variables, queries and tests (to fix showMultiCluster=True) Feb 9, 2024

github-actions bot added the needs-rebase label Feb 13, 2024

frittentheke force-pushed the issue_64321 branch from b127147 to f591e73 Compare February 14, 2024 14:42

github-actions bot removed the needs-rebase label Feb 14, 2024

frittentheke force-pushed the issue_64321 branch 2 times, most recently from 339d9b9 to 08ea2eb Compare February 16, 2024 13:44

frittentheke mentioned this pull request May 14, 2024

squid: monitoring/ceph-mixin: Cleanup of variables, queries and tests (to fix showMultiCluster=True) #57461

Merged

frittentheke mentioned this pull request May 16, 2024

reef: monitoring/ceph-mixin: Cleanup of variables, queries and tests (to fix showMultiCluster=True) #57518

Closed

jmolmo mentioned this pull request May 30, 2024

Duplicates series on Prom alert Rules CephOSDFlapping and CephPGImbalance while monitoring several clusters rook/rook#13575

Closed

Conversation

frittentheke commented Feb 8, 2024

frittentheke commented Feb 9, 2024 • edited

frittentheke commented Feb 9, 2024

cloudbehl commented Feb 12, 2024

cloudbehl commented Feb 12, 2024

github-actions bot commented Feb 13, 2024

frittentheke commented Feb 14, 2024

frittentheke commented Feb 14, 2024

cloudbehl commented Feb 15, 2024

cloudbehl commented Feb 15, 2024

frittentheke commented Feb 16, 2024 • edited

frittentheke commented Feb 16, 2024

cloudbehl commented Feb 22, 2024

cloudbehl commented Feb 22, 2024

frittentheke commented Feb 22, 2024

frittentheke commented Feb 22, 2024

cloudbehl commented Feb 22, 2024 • edited

aaSharma14 commented May 7, 2024

frittentheke commented May 13, 2024

aaSharma14 commented May 14, 2024

aaSharma14 commented May 15, 2024

frittentheke commented May 16, 2024

4 participants
