
Revive nvme module #67641

Merged
Hezko merged 8 commits into ceph:main from Hezko:revive-nvme-module
Mar 11, 2026

Conversation

@Hezko
Contributor

@Hezko Hezko commented Mar 4, 2026

Reapply #67167 which was reverted.
fixes: https://tracker.ceph.com/issues/74702

Note: #67782 is also needed to complete the work on this PR


@github-actions

github-actions bot commented Mar 4, 2026

Config Diff Tool Output

! changed: mgr_initial_modules: old: iostat nfs (mgr.yaml.in)
! changed: mgr_initial_modules: new: iostat nfs nvmeof (mgr.yaml.in)

The above configuration changes are found in the PR. Please update the relevant release documentation if necessary.
Ignore this comment if docs are already updated. To make the "Check ceph config changes" CI check pass, please comment /config check ok and re-run the test.
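
The effect of this change can be read off by comparing the two values. A minimal sketch, assuming the option value is a plain space-separated list of module names; the helper name `added_modules` is hypothetical and is not part of the config diff tool:

```python
def added_modules(old: str, new: str) -> list[str]:
    """Return module names present in `new` but not in `old`, in order."""
    old_set = set(old.split())
    return [name for name in new.split() if name not in old_set]


old_value = "iostat nfs"          # old mgr_initial_modules from the diff
new_value = "iostat nfs nvmeof"   # new mgr_initial_modules from the diff

print(added_modules(old_value, new_value))  # ['nvmeof']
```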

@caroav
Contributor

caroav commented Mar 5, 2026

jenkins test make check arm64

@Hezko Hezko self-assigned this Mar 5, 2026
@VallariAg
Member

QA analysis

Build:
(only centos9 build): https://shaman.ceph.com/builds/ceph/wip-tomer-revive-nvme-module-centos9-only/7bb33b21d1a7608a6c664c6a1320bfa4de8381ec/
(retriggered to get ubuntu builds): https://shaman.ceph.com/builds/ceph/wip-nvmeof-submodule-4March/1860b51ea223c4b1bf3943dd87b97ed74cd29920/

nvmeof run passed:
https://pulpito.ceph.com/vallariag-2026-03-04_08:38:43-nvmeof-wip-tomer-revive-nvme-module-centos9-only-distro-default-trial/
https://pulpito.ceph.com/vallariag-2026-03-04_09:20:03-nvmeof-wip-tomer-revive-nvme-module-centos9-only-distro-default-trial/

  1. fio failures (also in main): https://tracker.ceph.com/issues/74660
  2. Some jobs fail because the cluster warning CEPHADM_APPLY_SPEC_FAIL was found in the logs. This also sometimes happens on the main branch. It is not a blocker: it looks like a self-healing problem, because the cluster warning goes away within a few seconds.
  3. [81579] A new problem I noticed in the logs of this job: sometimes the test removes 3 out of 4 gateways with "ceph orch daemon rm" in the same thrashing iteration, and those 3 removed daemons never come back up on their own. Looking at logs of older nvmeof runs, I found that one of Yuri's runs has the same problem, so it is unrelated to this PR. I opened a tracker for this: https://tracker.ceph.com/issues/75331

rados:mgr run:
centos jobs: https://pulpito.ceph.com/vallariag-2026-03-04_09:03:27-rados:mgr-wip-tomer-revive-nvme-module-centos9-only-distro-default-trial/
ubuntu jobs: https://pulpito.ceph.com/vallariag-2026-03-05_07:24:12-rados:mgr-wip-nvmeof-submodule-4March-distro-default-trial/
These runs have 2 unique failures; we found pre-existing tickets for both of them (72747 and 75101).

@Hezko Hezko force-pushed the revive-nvme-module branch from 6c7cefe to 27aaa12 Compare March 5, 2026 19:36
@Hezko
Contributor Author

Hezko commented Mar 5, 2026

/config check ok

@Hezko Hezko marked this pull request as ready for review March 5, 2026 19:39
@Hezko Hezko requested a review from a team as a code owner March 5, 2026 19:39
@Hezko Hezko requested review from batrick, bill-scales and Copilot March 5, 2026 19:39

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Tomer Haskalovitch added 6 commits March 10, 2026 21:50
Introduce a new NVMe-oF mgr module which creates the pool
used for storing NVMe-related metadata during the ceph orch apply nvmeof command.
This removes the need for users to manually create and configure the
metadata pool before using the NVMe-oF functionality, simplifying
setup and reducing the chance of misconfiguration.

Fixes: https://tracker.ceph.com/issues/74702

Signed-off-by: Tomer Haskalovitch <tomer.haska@ibm.com>
(cherry picked from commit 15fcbb5)
Fixes: https://tracker.ceph.com/issues/74702

Signed-off-by: Tomer Haskalovitch <tomer.haska@ibm.com>
(cherry picked from commit 901ec98)
Fixes: https://tracker.ceph.com/issues/74702

Signed-off-by: Tomer Haskalovitch <tomer.haska@ibm.com>
(cherry picked from commit eccffe5)
Added a call to create_pool_if_not_exists during the execution of ceph orch apply nvmeof command.

Fixes: https://tracker.ceph.com/issues/74702

Signed-off-by: Tomer Haskalovitch <tomer.haska@ibm.com>
(cherry picked from commit f5734cf)
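
The create-if-missing step named in this commit message is an idempotent operation: calling it twice must leave exactly one pool. A minimal sketch of that pattern, assuming a hypothetical in-memory `FakeCluster` stand-in rather than the real mgr or orchestrator API. Only the name `create_pool_if_not_exists` comes from the commit message; its real signature is not shown on this page, so the one below is an assumption:

```python
class FakeCluster:
    """Hypothetical in-memory stand-in for a cluster's pool registry."""

    def __init__(self) -> None:
        self.pools: set[str] = set()

    def pool_exists(self, name: str) -> bool:
        return name in self.pools

    def create_pool(self, name: str) -> None:
        # A plain create is not idempotent: it rejects duplicates.
        if name in self.pools:
            raise ValueError(f"pool {name!r} already exists")
        self.pools.add(name)


def create_pool_if_not_exists(cluster: FakeCluster, name: str) -> bool:
    """Create `name` only when it is missing; return True if it was created."""
    if cluster.pool_exists(name):
        return False
    cluster.create_pool(name)
    return True


cluster = FakeCluster()
first = create_pool_if_not_exists(cluster, ".nvmeof")
second = create_pool_if_not_exists(cluster, ".nvmeof")
print(first, second, sorted(cluster.pools))  # True False ['.nvmeof']
```

Returning a boolean lets the caller log whether the pool was newly created or already present, without treating the second case as an error.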
Fixes: https://tracker.ceph.com/issues/74702

Signed-off-by: Tomer Haskalovitch <tomer.haska@ibm.com>
(cherry picked from commit eecbff7)
Fixes: https://tracker.ceph.com/issues/74702

Signed-off-by: Tomer Haskalovitch <tomer.haska@ibm.com>
(cherry picked from commit 166fb04)
@Hezko Hezko force-pushed the revive-nvme-module branch from 27aaa12 to e4609ae Compare March 10, 2026 19:50
@batrick
Member

batrick commented Mar 11, 2026

jenkins test make check

Member

@batrick batrick left a comment


Please strip

    (cherry picked from commit 1860b51ea223c4b1bf3943dd87b97ed74cd29920)

and

    (cherry picked from commit 1860b51ea223c4b1bf3943dd87b97ed74cd29920)

from the commit messages for the last two commits as these commits do not belong anywhere else in the repo (i.e. in another branch).

Otherwise LGTM.

avanthakkar and others added 2 commits March 11, 2026 21:18
Fixed AttributeError: type object 'NVMeoF' has no attribute 'CLICommand'

Signed-off-by: Avan Thakkar <athakkar@redhat.com>
While deploying gateways with "ceph orch apply nvmeof",
--pool can be optional now. If not passed, a pool with
name ".nvmeof" would automatically be created.

In nvmeof task, "auto_pool_create: True" would skip --pool
in "ceph orch apply nvmeof".

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
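
The behaviour described in this commit message can be sketched as a small command builder. This is an illustration only: the helper `build_apply_cmd` and its parameters are hypothetical, and the argument layout beyond `--pool` (which the message names) is an assumption, not the actual teuthology task code:

```python
from typing import Optional


def build_apply_cmd(pool: Optional[str], auto_pool_create: bool) -> list[str]:
    """Build the argv for "ceph orch apply nvmeof".

    When auto_pool_create is True, --pool is left out so that the default
    pool is created automatically on the cluster side.
    """
    cmd = ["ceph", "orch", "apply", "nvmeof"]
    if not auto_pool_create:
        if pool is None:
            raise ValueError("pool is required unless auto_pool_create is set")
        cmd += ["--pool", pool]
    return cmd


# Explicit pool: --pool is passed through.
print(build_apply_cmd("mypool", auto_pool_create=False))
# Automatic pool creation: --pool is skipped.
print(build_apply_cmd(None, auto_pool_create=True))
```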
@Hezko Hezko force-pushed the revive-nvme-module branch from e4609ae to 97f4043 Compare March 11, 2026 19:19
@Hezko Hezko merged commit 4a8594f into ceph:main Mar 11, 2026
13 of 14 checks passed
@github-actions

This is an automated message by src/script/redmine-upkeep.py.

I found one or more Fixes: tags in the commit messages in

git log 4a8594ff9e223374b9a71d3582e504ccba6a872c^..4a8594ff9e223374b9a71d3582e504ccba6a872c

The referenced tickets are:

Those tickets do not reference this merged Pull Request. If this Pull Request merge resolves any of those tickets, please update the "Pull Request ID" field on each ticket. A future run of this script will appropriately update them.

Update Log: https://github.com/ceph/ceph/actions/runs/22976625018

@rkachach
Contributor

@Hezko @batrick Orch teuthology is failing because of this PR:

Mar 13 14:33:28 trial121 ceph-80c7f0da-1ee9-11f1-a2b8-d404e6e7d460-mgr-trial121-fejwjs[37284]: 2026-03-13T14:33:28.862+0000 7f161d164fc0 -1 mgr[py] Exception calling _register_commands on nvmeof
Mar 13 14:33:28 trial121 ceph-80c7f0da-1ee9-11f1-a2b8-d404e6e7d460-mgr-trial121-fejwjs[37284]: 2026-03-13T14:33:28.862+0000 7f161d164fc0 -1 mgr[py] Traceback (most recent call last):
Mar 13 14:33:28 trial121 ceph-80c7f0da-1ee9-11f1-a2b8-d404e6e7d460-mgr-trial121-fejwjs[37284]:   File "/usr/share/ceph/mgr/mgr_module.py", line 1107, in _register_commands
Mar 13 14:33:28 trial121 ceph-80c7f0da-1ee9-11f1-a2b8-d404e6e7d460-mgr-trial121-fejwjs[37284]:     cls.COMMANDS.extend(cls.CLICommand.dump_cmd_list())
Mar 13 14:33:28 trial121 ceph-80c7f0da-1ee9-11f1-a2b8-d404e6e7d460-mgr-trial121-fejwjs[37284]: AttributeError: type object 'NVMeoF' has no attribute 'CLICommand'
Mar 13 14:33:28 trial121 ceph-80c7f0da-1ee9-11f1-a2b8-d404e6e7d460-mgr-trial121-fejwjs[37284]: 
Mar 13 14:33:29 trial121 ceph-80c7f0da-1ee9-11f1-a2b8-d404e6e7d460-mgr-trial121-fejwjs[37284]: /lib64/python3.9/site-packages/scipy/__init__.py:73: UserWarning: NumPy was imported from a Python sub-interpre>

I opened this candidate fix PR: #67782
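
The traceback shows a classmethod reading `cls.CLICommand`, which fails when the concrete module class does not define that attribute. A minimal, self-contained reproduction of that failure mode, using hypothetical stand-in classes (`CommandRegistry`, `BaseModule`, `BrokenModule`, `FixedModule`) rather than the real `mgr_module.py` code. How the actual fix defines the attribute is not shown on this page, so `FixedModule` is only one plausible shape:

```python
class CommandRegistry:
    """Hypothetical registry that collects command descriptions."""

    def __init__(self) -> None:
        self._commands: list[dict] = []

    def add(self, prefix: str) -> None:
        self._commands.append({"cmd": prefix})

    def dump_cmd_list(self) -> list[dict]:
        return list(self._commands)


class BaseModule:
    """Stand-in for a base class that expects subclasses to set CLICommand."""

    @classmethod
    def _register_commands(cls) -> list[dict]:
        # Same attribute lookup as the failing line in the traceback: it is
        # resolved on the concrete class, so a subclass without it raises.
        return cls.CLICommand.dump_cmd_list()


class BrokenModule(BaseModule):
    pass  # no CLICommand attribute defined


class FixedModule(BaseModule):
    CLICommand = CommandRegistry()


FixedModule.CLICommand.add("nvmeof example")

try:
    BrokenModule._register_commands()
except AttributeError as exc:
    print("broken:", exc)

print("fixed:", FixedModule._register_commands())
```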

ljflores pushed a commit to ljflores/ceph that referenced this pull request Mar 19, 2026
Post the merge of this: ceph#67641

Fixes: https://tracker.ceph.com/issues/71631
Signed-off-by: Laura Flores <lflores@ibm.com>
ljflores pushed a commit to ljflores/ceph that referenced this pull request Mar 19, 2026
Post the merge of this: ceph#67641

Fixes: https://tracker.ceph.com/issues/71631
Signed-off-by: Laura Flores <lflores@ibm.com>
(cherry picked from commit 740de93)
NitzanMordhai pushed a commit to ceph/ceph-ci that referenced this pull request Mar 19, 2026
Post the merge of this: ceph/ceph#67641

Fixes: https://tracker.ceph.com/issues/71631
Signed-off-by: Laura Flores <lflores@ibm.com>
(cherry picked from commit 740de93)
7 participants