upgrade rules for NVMeofGW monitors and gateways#59240
Conversation
|
commit message is missing subdirectory/component and can be much more explicit: |
|
We probably want to include the feature bit in this PR as well. Please cherry-pick "include/ceph_features: remove stray available marker" and " include/ceph_features: add NVMEOFHA feature bit" from https://github.com/athanatos/ceph/tree/sjust/wip-nvmeof-upgrade-59198 ordered before your commit. Keep them as separate commits as the commit messages have information in them (let me know if you need help with this). Then, incorporate "include/ceph_features: remove stray available marker" into your commit (either cherry-pick and squash or just find/replace SQUID manually, there are only 4 usages). |
72e03c5 to
f37b0a4
Compare
|
jenkins test make check |
|
The current commit ordering does not really make sense. mon/NVMeofGw*: support upgrades from prior out-of-tree nvmeofha implementation (nvmeof-reef) is the first commit, and it uses SQUID feature bits. The next commit "mon/NVMEofHw*: use NVMEOFHA rather than SERVER_SQUID" does SERVER_SQUID -> NVMEOFHA. This should be squashed into "mon/NVMeofGw*: support upgrades from prior out-of-tree nvmeofha implementation (nvmeof-reef)" -- there is no reason for future readers to see a version of this code with the wrong feature bit. The commit after that "include/ceph_features: add NVMEOFHA feature bit" is where the NVMEOFHA feature bit actually gets added. Without this commit, the prior two do not compile. Please put this one before the feature bit is used. Finally, "mon/NVMeofGw*: code review fixes, limit number Gws in each pool&group to 16" should also be squashed into "mon/NVMeofGw*: support upgrades from prior out-of-tree nvmeofha implementation (nvmeof-reef)" Less importantly, this branch also lacks "include/ceph_features: remove stray available marker" I've repushed https://github.com/athanatos/ceph/tree/sjust/wip-nvmeof-upgrade-59198-commits-fixed with the above changes as well as fixes for the comments I make below -- please read over those fixes carefully. I also pushed https://github.com/athanatos/ceph/tree/sjust/wip-nvmeof-upgrade-59198-commits-fixed-squashed with the comment fixes squashed -- feel free to simply force-push https://github.com/athanatos/ceph/tree/sjust/wip-nvmeof-upgrade-59198-commits-fixed-squashed to this branch if you are ok with the changes. |
src/mon/NVMeofGwSerialize.h
Outdated
| for (auto &tm_itr:state.data) { | ||
| tmdata[tm_itr.first].timer_started = tm_itr.second.timer_started; | ||
| tmdata[tm_itr.first].timer_value = tm_itr.second.timer_value; | ||
| tmdata[tm_itr.first].end_time = tm_itr.second.end_time; |
There was a problem hiding this comment.
This is dangerous. It sort of works at the moment due to the guard in cfg_add_gw and the corresponding enforcement in nvmeof-reef, but during the evolution of this PR, MAX_SUPPORTED_ANA_GROUPS has generally been considered to be for backward compatibility rather than a permanent commitment. This encoder should really be robust to a future where more than 16 GWs are supported in a group, and it would be a very easy thing to miss while making such a change. Rather than populating a fixed size array, simply check state.data for each loop through the following for loop. The branch I mention above has such a fix.
There was a problem hiding this comment.
If you ever do feel you need such a fixed size array, please be certain to include ceph_assert guards to ensure that the indices are in bounds if they cannot be statically proven to be so within 10 lines or so of the usage. While crashing the mon is unacceptable, it's actually far preferable to the kind of undefined nonsense that can happen as a result of an out-of-bounds array write.
|
https://github.com/athanatos/ceph/tree/sjust/wip-nvmeof-upgrade-59198-commits-fixed-squashed is what https://github.com/athanatos/ceph/tree/sjust/wip-nvmeof-upgrade-59198-commits-fixed should look like with the fixes squashed. If you are happy with the changes, feel free to just force-push https://github.com/athanatos/ceph/tree/sjust/wip-nvmeof-upgrade-59198-commits-fixed-squashed to this branch. |
f37b0a4 to
e447024
Compare
|
Sorry, one more fix. @gregsfortytwo points out above that populating ENCODE_START as I suggested is dangerous because when you eventually want to bump version it will naturally bump the min_compat version as well (the second argument), which is unlikely to be desirable since presumably your future struct encoding changes will be simpler. I've pushed (once again) https://github.com/athanatos/ceph/tree/sjust/wip-nvmeof-upgrade-59198-commits-fixed with a fix for the above and https://github.com/athanatos/ceph/tree/sjust/wip-nvmeof-upgrade-59198-commits-fixed-squashed as a final branch with that extra commit squashed. 281ea85 is the specific change. |
e447024 to
8e3eee4
Compare
|
Pushed testing branch https://shaman.ceph.com/builds/ceph/wip-sjust-nvmeof-testing-2024-08-21 with #59240, #59385, #59388, and #59366 |
|
jenkins test make check |
Sam have bundled ceph/ceph#59388 with ceph/ceph#59240, ceph/ceph#59385, ceph/ceph#59388, and ceph/ceph#59366 in a testing branch https://shaman.ceph.com/builds/ceph/wip-sjust-nvmeof-testing-2024-08-21 Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
Should have been removed in caa9e7a. Signed-off-by: Samuel Just <sjust@redhat.com>
Normally, we'd just use the SERVER_SQUID or SERVER_T flags instead of using an extra feature bit. However, the nvmeof ha monitor paxos service has had a more complex development journey. There are users interested in using the nvmeof ha feature in squid, but it didn't make the cutoff for backporting it. There's an upstream nvmeof-squid branch in the ceph.git repository with the patches backported for anyone interested in building it. However, that means that users of our normal stable releases will see the feature added to the monitor one release after anyone who chooses to use the nvmeof-squid branch. We could disallow upgrades from nvmeof-squid to T, but by adding a feature bit here we make such a restriction unnecessary. Signed-off-by: Samuel Just <sjust@redhat.com>
…mentation (nvmeof-reef) This commit adds upgrade support for users running an experimental nvmeofha implementation which can be found in the nvmeof-reef branch in ceph.git. Signed-off-by: Leonid Chernin <leonidc@il.ibm.com>
8e3eee4 to
3c50ef6
Compare
|
@athanatos , All comments seems to be resolved, some of them resolved by your commit |
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windowsjenkins test rook e2e