
[python] Support importing without proto file name knowledge in Python generated protobuf code#275

Merged
karenyrx merged 7 commits into opensearch-project:main from karenyrx:pythonfile on Nov 7, 2025


@karenyrx
Collaborator

@karenyrx karenyrx commented Nov 5, 2025

Description

Users of the generated Python code should not need to know which protobuf file a message is defined in (i.e. whether it is in document.proto, search.proto, or common.proto). This PR adds support for importing messages without referencing the file names. The previous import style remains supported for backward compatibility.

Method 1: File Name Dependent (current import style)

from opensearch.protobufs.schemas import document_pb2, common_pb2

requestBody = document_pb2.BulkRequestBody()
params = common_pb2.GlobalParams()

Method 2: No File Name Dependency (new import style added in this PR)

from opensearch.protobufs.schemas import BulkRequestBody, GlobalParams

requestBody = BulkRequestBody()
params = GlobalParams()
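
To illustrate why both styles can coexist, here is a minimal sketch that builds a throwaway package on disk and checks that the two import paths resolve to the same class object. The package name `demo_schemas` and the plain stand-in class are assumptions made for the demo; real `*_pb2.py` modules are generated by protoc and are not reproduced here.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package. "demo_schemas" and the stand-in class are
# hypothetical; they only mimic the layout described in this PR.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_schemas"
pkg.mkdir()
(pkg / "common_pb2.py").write_text("class GlobalParams:\n    pass\n")
# The package __init__.py re-exports everything public from the module.
(pkg / "__init__.py").write_text("from .common_pb2 import *\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Method 1: file-name dependent.
from demo_schemas import common_pb2
# Method 2: no file-name dependency.
from demo_schemas import GlobalParams

# Both names point at the same class object, so the styles can be mixed.
same_class = common_pb2.GlobalParams is GlobalParams
print(same_class)
```

Because the re-export only binds an additional name to the same object, code using either style can pass messages to code using the other.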

Test plan

  1. Tested that the new imports work (without the file name)
(venv) ~/opensearch-protobufs on [pythonfile] % PYTHONPATH=bazel-bin python3 -c "from opensearch.protobufs.schemas import (
    BulkRequest,
    BulkRequestBody,
    IndexDocumentRequest,
    UpdateDocumentRequest,
    DeleteDocumentRequest,
    GetDocumentRequest,
    GlobalParams,
    Script,
    ObjectMap,
    SourceConfig,
    OpType,
    Refresh,
    Result,
) 
from opensearch.protobufs.services import DocumentServiceStub, SearchServiceStub
print('schemas and services imports both work')
"
schemas and services imports both work
  2. Tested that the old imports also work (with the file name), for backward compatibility
(venv) ~/opensearch-protobufs on [pythonfile] % PYTHONPATH=bazel-bin python3 -c "
from opensearch.protobufs.schemas import document_pb2, common_pb2
from opensearch.protobufs.services import DocumentServiceStub, SearchServiceStub
old_bulk_request = document_pb2.BulkRequestBody()
old_global_params = common_pb2.GlobalParams()
print('old imports work')
"
old imports work

How it works

  1. generate_init_files.py will:
  • discover all *_pb2.py files in the schemas and services directories
  • find all protobuf message classes, enums, and constants
  • create init files that re-export all discovered classes
  2. BUILD.bazel integrates the generated init files through the fix_python_imports genrule
  3. The system will create a package structure like:
opensearch/
├── __init__.py                    # Top-level package
└── protobufs/
    ├── __init__.py               # Protobufs package
    ├── schemas/
    │   ├── __init__.py           # Auto-generated with 240+ exports
    │   ├── common_pb2.py
    │   ├── document_pb2.py
    │   └── search_pb2.py
    └── services/
        ├── __init__.py           # Auto-generated service exports
        ├── document_service_pb2.py
        ├── document_service_pb2_grpc.py
        ├── search_service_pb2.py
        └── search_service_pb2_grpc.py
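
The discovery and re-export steps can be sketched roughly as follows. This is not the actual `generate_init_files.py`; the function name `build_init_source` and the glob pattern (which also picks up `*_pb2_grpc.py` files) are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path


def build_init_source(package_dir, pattern="*_pb2*.py"):
    """Return __init__.py text that star-imports every matching module.

    Hypothetical helper: the pattern is an assumption, not the rule the
    real script uses.
    """
    modules = sorted(p.stem for p in Path(package_dir).glob(pattern))
    lines = ["# Auto-generated sketch; do not edit by hand."]
    lines += [f"from .{name} import *" for name in modules]
    return "\n".join(lines) + "\n"


# Demo directory with empty placeholder files instead of protoc output.
demo_dir = Path(tempfile.mkdtemp())
for name in ("common_pb2.py", "document_pb2.py", "search_pb2.py", "notes.txt"):
    (demo_dir / name).touch()

init_source = build_init_source(demo_dir)
print(init_source)
```

Sorting the module names keeps the generated file stable between builds, which matters when the output is produced inside a build rule.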

@karenyrx karenyrx changed the title Eliminate file name dependencies in Python generated protobuf code Support importing without proto file name knowledge in Python generated protobuf code Nov 5, 2025
Signed-off-by: Karen X <karenxyr@gmail.com>
@karenyrx karenyrx changed the title Support importing without proto file name knowledge in Python generated protobuf code [python] Support importing without proto file name knowledge in Python generated protobuf code Nov 5, 2025
Signed-off-by: Karen X <karenxyr@gmail.com>
Signed-off-by: Karen X <karenxyr@gmail.com>
Signed-off-by: Karen X <karenxyr@gmail.com>
@karenyrx
Collaborator Author

karenyrx commented Nov 5, 2025

Surprisingly, the updates to build-protobufs-python.yml in this PR decreased the Python CI run time from ~11m to ~4m
https://github.com/opensearch-project/opensearch-protobufs/actions/workflows/build-protobufs-python.yml

Screenshot 2025-11-04 at 8 47 37 PM

@karenyrx
Collaborator Author

karenyrx commented Nov 5, 2025

The Mend Security Check failures are unrelated to this PR, as they are for a TypeScript dependency (eslint-plugin-jest-28.8.0.tgz), not a Python one. They will be addressed in a separate PR. They're failing on this PR because it's the first time the Mend check has run in a long time (this PR triggered the run due to GHA changes).

Fixed in #277

@karenyrx karenyrx marked this pull request as ready for review November 5, 2025 04:50
Signed-off-by: Karen X <karenxyr@gmail.com>
Signed-off-by: Karen X <karenxyr@gmail.com>
@finnegancarroll
Contributor

Gave this change a try with OSB and the new import scheme works fine.

I notice the __init__.py files in the package are still relatively simple and achieve the goal of removing file names from end-user imports. I'm wondering why tools/generate_init_files.py needs to parse the contents of .proto files if all we need is the file name to export all definitions?

from .common_pb2 import *
from .document_pb2 import *
from .search_pb2 import *

Signed-off-by: karenx <karenx@uber.com>
@karenyrx
Collaborator Author

karenyrx commented Nov 7, 2025

@finnegancarroll Ah, it was part of a legacy implementation where I was populating the __init__.py files with every single protobuf type, like:

__all__ = [...]  # all protobuf message / enum types

Have removed this, thanks so much for catching this!
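
For context on the trade-off discussed above, this small sketch shows how Python's star import behaves with and without `__all__`. The module `fake_common_pb2` is a hypothetical stand-in built in memory, not real protoc output, and `star_import` is a helper written only for this demo.

```python
import sys
import types


def star_import(module_name):
    """Return the set of names that `from <module_name> import *` binds."""
    namespace = {}
    exec(f"from {module_name} import *", namespace)
    namespace.pop("__builtins__", None)  # added by exec, not by the import
    return set(namespace)


# Hypothetical stand-in for a generated module.
fake = types.ModuleType("fake_common_pb2")
exec("class GlobalParams: pass\nclass Script: pass\n_private = 1", fake.__dict__)
sys.modules["fake_common_pb2"] = fake

# Without __all__, every name that does not start with an underscore is exported.
without_all = star_import("fake_common_pb2")
print(sorted(without_all))

# With __all__, only the listed names are exported.
fake.__all__ = ["Script"]
with_all = star_import("fake_common_pb2")
print(sorted(with_all))
```

When a module defines no `__all__`, a plain `from .module import *` already re-exports its public names, so an explicit list is only needed to restrict or reorder what is exposed.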

@karenyrx karenyrx merged commit 20ba15e into opensearch-project:main Nov 7, 2025
7 checks passed
3 participants