Skip to content

pybind/mgr: Protobuffer file generation for grpc deps.#61089

Merged
nizamial09 merged 3 commits intoceph:mainfrom
pecastro:mgr_proto_build_generation
Feb 13, 2025
Merged

pybind/mgr: Protobuffer file generation for grpc deps.#61089
nizamial09 merged 3 commits intoceph:mainfrom
pecastro:mgr_proto_build_generation

Conversation

@pecastro
Copy link
Copy Markdown
Contributor

@pecastro pecastro commented Dec 14, 2024

Fixes: https://tracker.ceph.com/issues/64681

@nizamial09, @epuertat Please have a look and share your thoughts.
This aligns with the suggestion here #60719 (comment)

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@pecastro pecastro requested a review from a team as a code owner December 14, 2024 14:23
@pecastro pecastro requested review from aaSharma14 and ivoalmeida and removed request for a team December 14, 2024 14:23
@epuertat
Copy link
Copy Markdown
Member

That's great @pecastro ! The challenge with the proto-generated Python files is the import path created. I remember that by default it used some path that didn't work for the dashboard and required manual patching of the Python import header...

The fix involved invoking the script from a specific path location with some "../.." stuff around to force it use the correct relative paths... Sorry for my vague description. I managed to make it work time ago, but I don't remember the exact command 😓

@nizamial09 maybe you're luckier here?

@pecastro pecastro force-pushed the mgr_proto_build_generation branch 2 times, most recently from fc29622 to 09b01f1 Compare December 16, 2024 22:03
@pecastro pecastro force-pushed the mgr_proto_build_generation branch from 09b01f1 to ce834b5 Compare January 11, 2025 18:39
@ceph-jenkins
Copy link
Copy Markdown
Collaborator

Can one of the admins verify this patch?

@pecastro pecastro force-pushed the mgr_proto_build_generation branch 2 times, most recently from 3c778d1 to 5bb4120 Compare January 12, 2025 18:34
@github-actions github-actions bot added the tests label Jan 12, 2025
@pecastro pecastro force-pushed the mgr_proto_build_generation branch from 5bb4120 to ddfd508 Compare January 12, 2025 21:12
@pecastro
Copy link
Copy Markdown
Contributor Author

@epuertat @nizamial09 It built and it passed make check. 🎉

@nizamial09
Copy link
Copy Markdown
Member

nizamial09 commented Jan 15, 2025

The fix involved invoking the script from a specific path location with some "../.." stuff around to force it use the correct relative paths... Sorry for my vague description. I managed to make it work time ago, but I don't remember the exact command 😓

@nizamial09 maybe you're luckier here?

@epuertat I don't fully remember it and I can't track the command logs in history since I had to use container to make it work (my system python version was not the one used to build ceph on centos9).

And thanks @pecastro for the PR. I will run this through our build pipeline to test and generate an image to see whether the protobuf files are actually there and nothing's went wrong in the build. Sorry for making you wait here.

@epuertat
Copy link
Copy Markdown
Member

@epuertat I don't fully remember it and I can't track the command logs in history since I had to use container to make it work (my system python version was not the one used to build ceph on centos9).

Looks like @pecastro nailed it! Yeah, it had to do with the relative path you fed the protoc generation command.

@pecastro pecastro force-pushed the mgr_proto_build_generation branch from 5f97a2a to 2125187 Compare January 15, 2025 23:50
@nizamial09
Copy link
Copy Markdown
Member

a new shaman build triggered

@pecastro
Copy link
Copy Markdown
Contributor Author

Humm ... I think I've "streamlined" it a bit too much.

The import paths of the generated files are not the same as before therefore preventing the correct import of the model.

@pecastro pecastro force-pushed the mgr_proto_build_generation branch from 2125187 to d4985f5 Compare January 16, 2025 22:21
# protobuffer files generation
add_custom_command(
OUTPUT "${CMAKE_CURRENT_SOURCE_DIR}/services/proto/gateway_pb2.py" "${CMAKE_CURRENT_SOURCE_DIR}/services/proto/gateway_pb2_grpc.py"
COMMAND cp ${CEPH_NVME_GATEWAY_PROTO_DIR}/gateway.proto dashboard/services/proto/gateway.proto
Copy link
Copy Markdown
Contributor

@baum baum Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pecastro, thank you for exploring the streamlining suggestion! It seems relative paths can be a bit tricky for protoc, as highlighted by this error: 😀

protoc is too dumb to figure out when two paths (e.g., absolute and relative) are equivalent (it's harder than you think)

Perhaps we could address this by substituting the copying of the protobuf file during build time with creating a symbolic link as part of this PR as git handles symbolic links quite well. I tested this approach in the src/pybind/mgr/dashboard/services/proto directory:

$ ln -s ../../../../../../src/nvmeof/gateway/control/proto/gateway.proto .
$ ls -l gateway.proto
lrwxrwxrwx. 1 baum baum 64 Jan 17 06:01 gateway.proto -> ../../../../../../src/nvmeof/gateway/control/proto/gateway.proto
$ python -m grpc_tools.protoc --proto_path=. --python_out=. --grpc_python_out=. gateway.proto

Alternatively, passing a relative --proto_path seems to work in the src/pybind/mgr/dashboard/services/proto directory, eliminating the need for a protobuf file copy or symbolic link. Does it cause any issues with imports?

$ rm gateway.proto
$ python -m grpc_tools.protoc --proto_path=../../../../../../src/nvmeof/gateway/control/proto/ --python_out=. --grpc_python_out=. gateway.proto

Let me know what you think!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember I managed to make it work just by invoking the script from a specific location that led it to match the expected import path... I'll look for that line again.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're setting the WORKING_DIRECTORYin the add_custom_command(), you should be able to use relative paths there. It should be something like (maybe adding the dashboard subdir too):

python -m grpc_tools.protoc -I./services/proto \
  --python_out=. --grpc_python_out=. \
  services/proto/gateway.proto

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pecastro, thank you for exploring the streamlining suggestion! It seems relative paths can be a bit tricky for protoc, as highlighted by this error: 😀

@baum Where did that error come from exactly ?

I've seen it during development of the fix whilst I was trying different invocations of the final result but not since I committed and pushed.
The pipeline failed make check but on a test unrelated to this work.

image

The relevant test is run-tox-mgr-dashboard-openapi which passed with success if you check the console output.

https://jenkins.ceph.com/job/ceph-pull-requests/149745/consoleFull#-3553227439b08c573-3997-47f1-a8a3-7366b7cbd1e2

make check (arm 64) also passed in the github pipeline. :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same code, different fail in a different place.

image

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test failures looked unrelated so retriggered it.

@pecastro
Copy link
Copy Markdown
Contributor Author

jenkins retest this please

@nizamial09
Copy link
Copy Markdown
Member

jenkins test make check

@pecastro
Copy link
Copy Markdown
Contributor Author

but I'm not seeing a few expected references in any of the builds.

@pecastro Yeah I am not an expert on the shaman side either but I always thought that logs the build output of ceph but I was wrong. I checked the output of one of the jenkins run in this PR and I saw the lines you mentioned there.

Also, I tested with the image the shaman build created and I saw the proto files inside the services/proto files as well. Hence I concluded that the shaman also went fine. Its just that it is logging a different build log.

It can't log differently, the only difference I saw in your shaman logs would be the percentual completion on the CMake output vs saying [ 10/2231 ].

Can you check if the files inside services/proto are the newly generated ones vs being the old ones? The MR removes the old files but the added CMake code will create them in the same place.

@nizamial09
Copy link
Copy Markdown
Member

@pecastro you are right. the files inside the container was older ones.

So I've been trying to understand the build pipeline. So far and from what I gathered, apart from just generating the protobuf files during the build pipelines, we also need to install those generated protobuf files inside the ceph container so that those files will be there in the designated folder. We already do this for some things like the grafana: https://github.com/ceph/ceph/blob/main/ceph.spec.in#L1660

But the thing that I can't figure out is the proto file generation itself and where does that happen in the build pipeline. I don't see any traces of dashboard CMakelist getting executed there. Rather the dashboard is fed as an rpm after the cmake step. So I am not entirely sure how the CMakelist works here and I'll need help from @epuertat to figure that part out.

@pecastro
Copy link
Copy Markdown
Contributor Author

pecastro commented Feb 5, 2025

If WITH_MGR_DASHBOARD_FRONTEND is false https://github.com/pecastro/ceph/blob/mgr_proto_build_generation/src/pybind/mgr/dashboard/CMakeLists.txt#L7 the code block that generates the proto files will never run and the gateway_pb2*.py files will not be generated.

And this line https://github.com/ceph/ceph/blob/main/ceph.spec.in#L1466C7-L1466C34 is suspicious.

Using this branch though, means the old files will be removed.

If you are saying that the files inside the container are the old one, that means the container was generated off a different revision and not from this branch nor with these commits cherry-picked.

It would be great to understand the build process better. Is the shaman build script public ?!

I could always try to move the code that builds the proto files out of the if(WITH_MGR_DASHBOARD_FRONTEND) block...

@epuertat
Copy link
Copy Markdown
Member

epuertat commented Feb 6, 2025

@pecastro : WITH_MGR_DASHBOARD_FRONTEND only affects files under dashboard/frontend/ dir (NPM generated JS bundle). The (back-end) Python files/stuff should be executed anyway. If you execute manually the rpmbuild command you could verify that the new proto files are actually written to the RPM_BUILD_ROOT.

Fixes: https://tracker.ceph.com/issues/64681

Signed-off-by: Paulo E. Castro <pecastro@wormholenet.com>
…oto using the protobuf file from the gateway submodule.

- Remove old gateway-proto.
- Updated src/nvmeof/gateway to the latest sha-1 from devel.

Signed-off-by: Paulo E. Castro <pecastro@wormholenet.com>
- This ensures protos are built during the RPM build process.
- gitignore the sourced gateway.proto file.

Signed-off-by: Paulo E. Castro <pecastro@wormholenet.com>
@pecastro pecastro force-pushed the mgr_proto_build_generation branch from d4985f5 to 37e6333 Compare February 8, 2025 11:29
@pecastro
Copy link
Copy Markdown
Contributor Author

pecastro commented Feb 8, 2025

@nizamial09 @epuertat

I moved the build proto logic out of the if(WITH_MGR_DASHBOARD_FRONTEND) and running rpmbuild I could confirm the files are there.

@nizamial09 If you trigger a shaman build you'll be able to validate the new proto files with something like:

rpm2cpio ceph-mgr-dashboard-xxxxxxxxxx.noarch.rpm | cpio -iv --to-stdout ./usr/share/ceph/mgr/dashboard/services/proto/gateway_pb2.py | head -20
# -*- coding: utf-8 -*-
# Generated by the protocol buffer compiler.  DO NOT EDIT!
# NO CHECKED-IN PROTOBUF GENCODE
# source: dashboard/services/proto/gateway.proto
# Protobuf Python Version: 5.29.0
"""Generated protocol buffer code."""
from google.protobuf import descriptor as _descriptor
from google.protobuf import descriptor_pool as _descriptor_pool
from google.protobuf import runtime_version as _runtime_version
from google.protobuf import symbol_database as _symbol_database
from google.protobuf.internal import builder as _builder
_runtime_version.ValidateProtobufRuntimeVersion(
    _runtime_version.Domain.PUBLIC,
    5,
    29,
    0,
    '',
    'dashboard/services/proto/gateway.proto'
)
# @@protoc_insertion_point(imports)

VS old file

head -20 src/pybind/mgr/dashboard/services/proto/gateway_pb2.py
# -*- coding: utf-8 -*-
# Generated by the protocol buffer compiler.  DO NOT EDIT!
# source: dashboard/services/proto/gateway.proto
"""Generated protocol buffer code."""
from google.protobuf.internal import enum_type_wrapper
from google.protobuf import descriptor as _descriptor
from google.protobuf import message as _message
from google.protobuf import reflection as _reflection
from google.protobuf import symbol_database as _symbol_database
# @@protoc_insertion_point(imports)

_sym_db = _symbol_database.Default()




DESCRIPTOR = _descriptor.FileDescriptor(
  name='dashboard/services/proto/gateway.proto',
  package='',
  syntax='proto3',

@nizamial09
Copy link
Copy Markdown
Member

Thank you @pecastro. I am trying to schedule a shaman run but it looks like a long queue there. will see if it ever turns up there
https://shaman.ceph.com/builds/ceph/

@nizamial09
Copy link
Copy Markdown
Member

jenkins test make check arm64

@nizamial09
Copy link
Copy Markdown
Member

yup, now i am seeing the correct proto files in the image as well. thank you so much for taking care of it. @pecastro

─$ podman run -it quay.ceph.io/ceph-ci/ceph:65a61cbeece7ba88356b0632873f4e902987801a cat /usr/share/ceph/mgr/dashboard/services/proto/gateway_pb2.py
# -*- coding: utf-8 -*-
# Generated by the protocol buffer compiler.  DO NOT EDIT!
# NO CHECKED-IN PROTOBUF GENCODE
# source: dashboard/services/proto/gateway.proto
# Protobuf Python Version: 5.29.0
"""Generated protocol buffer code."""
from google.protobuf import descriptor as _descriptor
from google.protobuf import descriptor_pool as _descriptor_pool
from google.protobuf import runtime_version as _runtime_version
from google.protobuf import symbol_database as _symbol_database
from google.protobuf.internal import builder as _builder
_runtime_version.ValidateProtobufRuntimeVersion(
    _runtime_version.Domain.PUBLIC,
    5,
    29,
    0,
    '',
    'dashboard/services/proto/gateway.proto'
)
# @@protoc_insertion_point(imports)

_sym_db = _symbol_database.Default()

@nizamial09
Copy link
Copy Markdown
Member

this could be ready to merge once the test passes then.

@nizamial09 nizamial09 merged commit aa292e8 into ceph:main Feb 13, 2025
4 checks passed
@athanatos athanatos mentioned this pull request Feb 17, 2025
14 tasks
epuertat added a commit to rhcs-dashboard/ceph that referenced this pull request Feb 18, 2025
It disables the specific pinning to a Python xmlsec version.

Fixes: https://tracker.ceph.com/issues/70019
Fixes: 37e6333
Fixes: ceph#61089

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
add_custom_command(
OUTPUT ${CEPH_GRPC_VIRTUALENV}/lib/python*/site-packages/grpc_tools
COMMAND ${CMAKE_SOURCE_DIR}/src/tools/setup-virtualenv.sh --python=${Python3_EXECUTABLE} ${CEPH_GRPC_VIRTUALENV}
COMMAND ${CEPH_GRPC_VIRTUALENV}/bin/pip install grpcio-tools
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't have this here. All Python deps should be installed from requirements.txt files.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, IMHO the proto files should be generated with each distro's grpc-tools packages. That's the only way to ensure the generated code is compatible with each distro's version.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW ... #61089 (comment)

I can have a look into redoing some of this in that direction.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least in RPM-based systems (DEB-based systems have a similar mechanism), build-specific deps should be installed through BuildRequires: in the ceph.spec.in file.

lee-j-sanders pushed a commit to ceph/ceph-ci that referenced this pull request Feb 18, 2025
It disables the specific pinning to a Python xmlsec version.

Fixes: https://tracker.ceph.com/issues/70019
Fixes: 37e6333
Fixes: ceph/ceph#61089

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
@pecastro
Copy link
Copy Markdown
Contributor Author

@epuertat @nizamial09 This was what the MR added https://github.com/ceph/ceph/blame/main/src/pybind/mgr/dashboard/CMakeLists.txt

The if(WITH_TESTS) block has been there a long time and that just generates the Makefile targets that ctest then uses to run the tests. The tests block doesn't need moving.

Also, moving the proto generation to inside the tox tests is not going to work because if the proto files are not generated as part of the build they'll never make it to the rpm. Before being generated as of now, they were part of the repository.
@nizamial09 During development,if you remember from the conversation on the that MR. I had to move the block of code responsible for the proto generation out of the if(WITH_MGR_DASHBOARD_FRONTEND or otherwise they wouldn't be created.

I'd remove the pip install grpcio-tools from the CMakelists, and adapt that part too to invoke this

I agree with the first part of the sentence and looking at some of the later comments @epuertat left on the original PR

BTW, IMHO the proto files should be generated with each distro's grpc-tools packages. That's the only way to ensure the generated code is compatible with each distro's version.

I found this a bit problematic. I kept on bumping into an error generating the protos when using older versions of the grpc python module (ubuntu22.04) which was in a way invalidating the premise of using the distro package to generate the code.
Using generated proto files with newer versions of the grpc meant older systems like ubuntu22.04 wouldn't be able to load them.

So, under catch 22 I decided to fix forward.
I need some time to look into this better, maybe I'm missing something.

@epuertat
Copy link
Copy Markdown
Member

@pecastro:
The if(WITH_TESTS) block has been there a long time and that just generates the Makefile targets that ctest then uses to run the tests. The tests block doesn't need moving.

Yes, that's because the rpmbuild is generated without "make-check" tests.

Also, moving the proto generation to inside the tox tests is not going to work because if the proto files are not generated as part of the build they'll never make it to the rpm. Before being generated as of now, they were part of the repository. @nizamial09 During development,if you remember from the conversation on the that MR. I had to move the block of code responsible for the proto generation out of the if(WITH_MGR_DASHBOARD_FRONTEND or otherwise they wouldn't be created.

We probably abused of tox for non-test tasks (OpenAPI spec generation, etc) but it's a good framework to quickly run Python reproducible environments without having to worry about deps, etc. Keep in mind that (unfortunately) Ceph CI is not yet virtualized/containerized.

I'd remove the pip install grpcio-tools from the CMakelists, and adapt that part too to invoke this

The install-deps.sh script takes care of downloading all requirements*.txt, and cmake scripts also deal with virtualenv creation.

I agree with the first part of the sentence and looking at some of the later comments @epuertat left on the original PR

BTW, IMHO the proto files should be generated with each distro's grpc-tools packages. That's the only way to ensure the generated code is compatible with each distro's version.

I found this a bit problematic. I kept on bumping into an error generating the protos when using older versions of the grpc python module (ubuntu22.04) which was in a way invalidating the premise of using the distro package to generate the code. Using generated proto files with newer versions of the grpc meant older systems like ubuntu22.04 wouldn't be able to load them.

So, under catch 22 I decided to fix forward. I need some time to look into this better, maybe I'm missing something.

The question is that proto stuff is generated during the package creation, so it will always run along with the same grpc package versions.

dnyanee1997 pushed a commit to rhcs-dashboard/ceph that referenced this pull request Feb 22, 2025
It disables the specific pinning to a Python xmlsec version.

Fixes: https://tracker.ceph.com/issues/70019
Fixes: 37e6333
Fixes: ceph#61089

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
dnyanee1997 pushed a commit to rhcs-dashboard/ceph that referenced this pull request Feb 23, 2025
It disables the specific pinning to a Python xmlsec version.

Fixes: https://tracker.ceph.com/issues/70019
Fixes: 37e6333
Fixes: ceph#61089

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
@pecastro pecastro deleted the mgr_proto_build_generation branch February 23, 2025 23:34
dnyanee1997 pushed a commit to rhcs-dashboard/ceph that referenced this pull request Feb 24, 2025
It disables the specific pinning to a Python xmlsec version.

Fixes: https://tracker.ceph.com/issues/70019
Fixes: 37e6333
Fixes: ceph#61089

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
@nizamial09 nizamial09 mentioned this pull request Mar 4, 2025
14 tasks
Comment on lines +24 to +28
add_custom_target(mgr-dashboard-services-build
ALL
DEPENDS mgr-dashboard-services-deps)

add_dependencies(tests mgr-dashboard-services-build)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pecastro a slack user reports that this new dashboard stuff breaks the build when WITH_NVMEOF_GATEWAY_MONITOR_CLIENT=OFF

can you please add if(WITH_NVMEOF_GATEWAY_MONITOR_CLIENT) where necessary to fix this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cbodley, @nizamial09 is in the process of reverting the commits from this PR here #62103. So that will stop being an issue once that is done.

I'll factor that WITH_NVMEOF_GATEWAY_MONITOR_CLIENT in the new PR #61967 that is being worked to make the GRPC components more streamlined and stable.

harriscr pushed a commit to ceph/ceph-ci that referenced this pull request May 15, 2025
It disables the specific pinning to a Python xmlsec version.

Fixes: https://tracker.ceph.com/issues/70019
Fixes: 37e6333
Fixes: ceph/ceph#61089

Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

6 participants