mgr/dashboard: Refactor grpc proto files generation. #61967
mgr/dashboard: Refactor grpc proto files generation. #61967
Conversation
7d261c3 to
07259c1
Compare
|
jenkins retest this please |
07259c1 to
309f976
Compare
|
|
||
| # Set defaults for variables | ||
| set(GRPC_TOOLS_EXPERIMENTAL_FLAG "") | ||
| set(GRPC_TOOLS_NVME_PREPARE_SOURCE_CMD cp ${CEPH_NVME_GATEWAY_PROTO_FILE} ${CEPH_DASHBOARD_GATEWAY_PROTO_FILE}) |
There was a problem hiding this comment.
Why don't we set a symlink in the git tree instead?
There was a problem hiding this comment.
@baum had actually suggested that in the previous MR.
This time around I left it like this because I was either copying or seding the optional keyword out.
Check the comment below for the sed reasons.
| set(UBUNTU_22_04_GRPC_TOOLS_VERSION "3.5.1") | ||
| set(CENTOS_9_GRPC_TOOLS_VERSION "3.14.0") | ||
| set(FEDORA_40_GRPC_TOOLS_VERSION "3.19.6") |
There was a problem hiding this comment.
Rather than explicitly hard-coding distro-versions (this is going to complicate maintenance, and excludes distros-versions: F41, Ubuntu 24.04, ...). We should only distinguish by breaking changes libproto versions:
<3.12:optionalnot supported (removeoptionalfrom proto file).>=3.12: add--experimental_allow_proto3_optional(from3.14it's not required, but it's still accepted, so let's simplify and add it until some future3.*breaks if this is present).
There was a problem hiding this comment.
They are hardcoded in the variable name only, and you are right it was the bit I least liked about it but it helped me navigate the minefield.
If you follow the logic a bit, below you'll see the F41 and higher are supported but just marked as Unknown.
They can definitely be renamed to match the libproto version.
Regarding the optional keyword and <3.12 ...
There's a reason why I'm doing that, the current version of grpc_tools shipped by Ubuntu22.04 can't handle the current gateway.proto as is.
I'll simplify in line with the suggestions.
|
|
||
| elseif(LIBPROTOC_VERSION VERSION_LESS_EQUAL UBUNTU_22_04_GRPC_TOOLS_VERSION) | ||
| message(DEBUG "System likely to be Ubuntu 22.04 or earlier. libprotoc ${LIBPROTOC_VERSION}") | ||
| set(GRPC_TOOLS_NVME_PREPARE_SOURCE_CMD sed 's|optional ||g' ${CEPH_NVME_GATEWAY_PROTO_FILE} > ${CEPH_DASHBOARD_GATEWAY_PROTO_FILE}) |
There was a problem hiding this comment.
@epuertat it might cause comparability issues when mixing different versions of gateway and CLI.
There was a problem hiding this comment.
AFAIK, grpc APIs, as long as they are in the same protobuf version (v2 or v3), are compatible no matter the grpc/protobuf library version.
There was a problem hiding this comment.
I'm talking about compatibility between the gateway and the CLI, not the protocol layer.
There was a problem hiding this comment.
Is there a straightforward way to test that ?
There was a problem hiding this comment.
I'm talking about compatibility between the gateway and the CLI, not the protocol layer.
But if they're not colocated, they could (and probably will) run different versions of the gRPC libs (GW runs inside a ubi9 container, while Ceph Dashboard and CLI might run on Ubuntu 22/24,for those users running Deb packages).
da0ec73 to
6c84559
Compare
5dd3f87 to
516af47
Compare
|
Hi folks. The grpc build code from the previous MR that this current MR is trying to fix is breaking the If the only contention is regarding the Ubuntu and the removal of the optional keyword, one of the way outs could be to introduce those hardcoded files ( the old manually generated ones ) back into the repository. I could retrofit that logic in, if ubuntu22.04 copy over the hardcoded generated files from ubuntu22.04 and move on. In the meantime, I'm fighting the unintended consequences of |
516af47 to
59299be
Compare
|
I am also seeing that in the latest main, the grpc proto files are generated using grpc version which I don't remember seeing before IIRC. And that broke the nvmeof in main repo because our grpc version in centos is 1.45.x. Probably the @pecastro Shall we revert the original commit in the main branch for now and then bring it over here as this PRs base commit? That way people can build ceph main for now and test nvmeof stuffs as well. Meanwhile we can test the entire new changes here and merge it only when the supported distros (newer versions of it too?) are a pass. WDYT? |
I did mention this would happen in the original PR. #61089 (comment) I believe this existing PR will fix all the underlying issues. But given what has happen maybe caution is the way to go. @nizamial09 I don't mind if you revert those back to the original hardcoded files and mechanism whilst we work through this one properly this time, it might even be the better option other than the one on #61954 |
|
Thanks @pecastro , i agree with it. I opened the Revert PR here: #62103. I think reverting it would help us focus on the actual issue rather than getting diverted into other distro specific issues for now.
Ah yeah 😕 I should have looked past the dashboard error and understood the issue of the nvmeof controller itself. I missed it. Sorry for that. |
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
|
@pecastro the revert is merged in main now. |
Any thoughts on this ? |
|
On second though, looking into this bit https://github.com/ceph/ceph/blob/main/src/CMakeLists.txt#L907 ) which I'm not sure how I missed it in the original work but that's a different question ...) to try to factor in So my question is, for the dashboard generated protos, do we want to honour this variable or are we rebuilding the protos regardless ? And if so, do we want to to keep the new logic of removing the optional keyword or shall we reuse the originals just in the Ubuntu case. |
59299be to
09d6e15
Compare
to be honest here, I've only ever tested the nvmeof against the centos8/9 only. And I am almost sure that our APIs won't be there for ubuntu releases because ubuntu has a different grpcio version which is not supported with the one that we used to generate the proto files with. And if the core nvmeof is also doing the same thing because of the dependency issue, I am okay with it for the time being. If the monitor is required for starting the nvmeof gateways, I don't see a reason why we would need to build the proto files in ubuntu... |
09d6e15 to
10e189c
Compare
5eba9f7 to
8846e2b
Compare
Signed-off-by: Paulo E. Castro <pecastro@wormholenet.com>
* Rewrite grpc proto code generation logic to not depend on a venv installed during the build process. * Ensure OS packages are installed to support development. * Extend generation to cater with multiple OSs. * Move grpc generation it's own CmakeLists.txt. * Notes added to the generation script. Signed-off-by: Paulo E. Castro <pecastro@wormholenet.com>
Signed-off-by: Paulo E. Castro <pecastro@wormholenet.com>
openapi-check: install_deps /ceph/src/pybind/mgr/dashboard> python -I -m pip install -r requirements.txt -r requirements-test.txt -c constraints.txt
Looking in links: /ceph/src/pybind/mgr/dashboard/wheelhouse
Obtaining file:///ceph/src/python-common (from -r requirements.txt (line 7))
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
_ python setup.py egg_info did not run successfully.
│ exit code: 1
_─> [1 lines of output]
ERROR: Can not execute `setup.py` since setuptools is not available in the build environment.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
_ Encountered error while generating package metadata.
_─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
openapi-check: exit 1 (0.31 seconds) /ceph/src/pybind/mgr/dashboard> python -I -m pip install -r requirements.txt -r requirements-test.txt -c constraints.txt pid=270650
openapi-check: FAIL code 1 (0.33 seconds)
evaluation failed :( (0.36 seconds)
Signed-off-by: Paulo E. Castro <pecastro@wormholenet.com>
These all came about with the introduction of `sitepackages = true` in tox to allow the use system grpc python packages. Unfortunately these errors only seem to show up in old systems like Ubuntu 22.04 due most likely to the legacy version of some of the python libraries used. ``` 10: Requirement already satisfied: distlib<1,>=0.3.7 in /ceph/build/mgr-dashboard-lint-virtualenv/lib/python3.10/site-packages (from virtualenv>=20.27.1->tox) (0.3.9) 10: lint: install_deps /ceph/src/pybind/mgr/dashboard> python -I -m pip install -r requirements.txt -r requirements-lint.txt -c constraints.txt 10: lint: commands[0] /ceph/src/pybind/mgr/dashboard> flake8 10: lint: commands[1] /ceph/src/pybind/mgr/dashboard> flake8 --config=tox.ini ../../../../qa/tasks/mgr/dashboard 10: lint: commands[2] /ceph/src/pybind/mgr/dashboard> isort ../../../../qa/tasks/mgr/dashboard --check 10: lint: commands[3] /ceph/src/pybind/mgr/dashboard> pylint -rn --rcfile=.pylintrc --jobs=1 . api controllers plugins services tests ../../../../qa/tasks/mgr/dashboard 10: ************* Module controllers.osd 10: controllers/osd.py:9:0: E0401: Unable to import 'ceph.deployment.drive_group' (import-error) 10: ************* Module plugins.motd 10: plugins/motd.py:8:0: E0401: Unable to import 'ceph.utils' (import-error) 10: ************* Module services.orchestrator 10: services/orchestrator.py:7:0: E0401: Unable to import 'ceph.deployment.service_spec' (import-error) 10: ************* Module tests.test_osd 10: tests/test_osd.py:7:0: E0401: Unable to import 'ceph.deployment.drive_group' (import-error) 10: tests/test_osd.py:8:0: E0401: Unable to import 'ceph.deployment.service_spec' (import-error) ... ``` Signed-off-by: Paulo E. Castro <pecastro@wormholenet.com>
8846e2b to
55576f9
Compare
I thought so as well but ... The naive approach of just honouring this So, I can leave as is ( doing the "naughty" optional keyword removal ) or I can add another commit so it uses the existing generated files that exist in the repo ONLY for the Ubuntu case. |
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
|
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days. |
|
This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution! |
during the build process.
Fixes: #61089
Alternative fix to: https://tracker.ceph.com/issues/69995 & #61846
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windowsjenkins test rook e2e