Skip to content

feat: add gemini structure output#60

Merged
galzilber merged 8 commits intomainfrom
gz/add-gemini-stracture-output
Aug 6, 2025
Merged

feat: add gemini structure output#60
galzilber merged 8 commits intomainfrom
gz/add-gemini-stracture-output

Conversation

@galzilber
Copy link
Contributor

@galzilber galzilber commented Aug 6, 2025

Important

Adds support for structured output in chat requests using JSON schemas, with updates to GeminiSchema and related functions, and new tests for validation.

  • Behavior:
    • Adds response_mime_type and response_schema to GenerationConfig in models.rs for structured output.
    • Implements GeminiSchema enum with types STRING, NUMBER, INTEGER, BOOLEAN, ARRAY, and OBJECT.
    • Adds from_value_with_fallback() to GeminiSchema for schema conversion.
    • Updates to_openai_with_structured_output() in GeminiChatResponse to handle JSON validation.
  • Provider:
    • Modifies chat_completions() in provider.rs to handle structured output based on response_schema.
  • Tests:
    • Adds tests in tests.rs for schema conversion and structured response handling.

This description was created by Ellipsis for bea9a26. You can customize this summary. It will automatically update as commits are pushed.


Summary by CodeRabbit

  • New Features

    • Enhanced support for structured output by allowing JSON schema-based response formats in chat requests.
    • Improved handling of response formats, including detection and conversion of OpenAI JSON schemas to Gemini-compatible schemas.
    • Added capability to return responses as raw JSON when structured output is requested.
  • Tests

    • Introduced comprehensive unit tests to validate schema conversion and structured response handling.
  • Chores

    • Updated file ignore settings to exclude certain configuration files from version control.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Aug 6, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

The changes introduce structured output support for Gemini chat requests by adding optional fields for response MIME type and schema in the configuration. A function for converting OpenAI JSON schemas to Gemini-compatible formats is implemented. The request/response handling logic is updated accordingly, and comprehensive tests are added to validate schema conversion and structured output integration.

Changes

Cohort / File(s) Change Summary
Git Ignore Update
.gitignore
Added service.json to the ignore list under the environment variables section.
Gemini Structured Output Support
src/providers/vertexai/models.rs
Added response_mime_type and response_schema fields to GenerationConfig. Introduced GeminiSchema enum representing Gemini JSON schema types with variants for STRING, NUMBER, INTEGER, BOOLEAN, ARRAY, and OBJECT, including recursive conversion from arbitrary JSON schema Value with fallback. Updated Gemini chat request conversion to handle response formats and attach MIME type and schema. Added method to GeminiChatResponse for OpenAI-compatible output supporting structured output detection and JSON validation.
Provider Response Handling
src/providers/vertexai/provider.rs
Added detection of structured output by checking for response_schema in the Gemini chat request's generation config. Updated response conversion to use the new structured output-aware method with a boolean flag indicating structured output presence. Applied changes in both test mode and normal response handling.
Schema Conversion and Structured Output Tests
src/providers/vertexai/tests.rs
Added extensive unit tests covering conversion of basic and complex JSON schemas to Gemini schemas, including handling of descriptions, arrays, objects, required fields, unsupported types, and property ordering. Added tests for chat completion request conversions with various response formats (json_schema, json_object, none), verifying correct MIME type, schema attachment, and structured output flag propagation.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Provider
    participant Gemini

    Client->>Provider: Send ChatCompletionRequest (may include response_format/schema)
    Provider->>Provider: Convert request to GeminiChatRequest
    Provider->>Provider: If response_format is json_schema, convert schema (OpenAI -> Gemini)
    Provider->>Gemini: Send GeminiChatRequest (with optional response_mime_type/schema)
    Gemini-->>Provider: Return GeminiChatResponse
    Provider->>Provider: Convert GeminiChatResponse to OpenAI format (structured output aware)
    Provider-->>Client: Return ChatCompletion (with/without structured output)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

In the warren where schemas grow,
Rabbits hop and schemas flow.
From OpenAI to Gemini’s den,
Structured outputs leap again.
Tests abound, conversions gleam,
JSON fields in every stream.
With every hop, the code is bright—
Gemini’s structured future in sight! 🐇✨

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch gz/add-gemini-stracture-output

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed 78a699f in 1 minute and 4 seconds. Click for details.
  • Reviewed 515 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 7 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/providers/vertexai/models.rs:143
  • Draft comment:
    In convert_openai_schema_to_gemini, the mapping of types (e.g. 'string' -> 'STRING') looks correct. Consider adding brief inline comments explaining why additionalProperties is ignored, to aid future maintainers.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. src/providers/vertexai/models.rs:380
  • Draft comment:
    In to_openai_with_structured_output, when is_structured_output is true and JSON-parsing the trimmed text fails, consider logging the error. This could aid debugging when the structured format isn’t valid JSON.
  • Reason this comment was not posted:
    Comment was on unchanged code.
3. src/providers/vertexai/provider.rs:55
  • Draft comment:
    In get_auth_token, the use of expect for missing credentials (either in config or env) may be too aggressive. Consider returning an error instead of panicking so that the provider handles misconfiguration more gracefully.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. src/providers/vertexai/provider.rs:161
  • Draft comment:
    The logic distinguishing between test mode and normal mode using environment variables (e.g. RECORD_MODE or use_test_auth) is well documented. Consider adding a brief comment near the endpoint construction explaining the URL structure for clarity.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. src/providers/vertexai/tests.rs:90
  • Draft comment:
    The tests cover many aspects (schema conversion, tool handling, system messages). Consider splitting this test file into sub-modules or adding section comments to improve maintainability as the test suite grows.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. src/providers/vertexai/tests.rs:705
  • Draft comment:
    Using #[should_panic] with an expected panic message is brittle – any change in the error string will break this test. Consider using a pattern match or custom error type to make the test less sensitive to exact wording.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. src/providers/vertexai/tests.rs:1180
  • Draft comment:
    Tests like test_gemini_request_with_array_content correctly combine array parts into a single string. Consider commenting that the parts are concatenated with a space to clarify intended behavior.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_1kYpdmPVHNAxukK1

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed everything up to 0dd99a4 in 1 minute and 46 seconds. Click for details.
  • Reviewed 671 lines of code in 3 files
  • Skipped 1 files when reviewing.
  • Skipped posting 4 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/providers/vertexai/models.rs:371
  • Draft comment:
    When concatenating parts for structured output in to_openai_with_structured_output, multiple JSON fragments might be merged into a single string. Consider handling multi-part JSON responses explicitly to avoid invalid JSON.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The code already has a reasonable approach - it validates each part individually and only treats it as JSON if it's complete and valid JSON. If a part isn't valid JSON, it's treated as plain text. This seems like a safe approach that avoids the issue the comment is worried about. The comment appears to be speculative rather than identifying a clear problem. The comment could be right that there are edge cases where the JSON handling could be improved, but it doesn't provide a concrete example or clear improvement suggestion. The current implementation already handles JSON validation appropriately by checking each part individually. Without a concrete example of a problem, the comment is too speculative. Delete the comment. It raises a speculative concern without clear evidence of an actual problem, and the code already has reasonable JSON validation logic.
2. src/providers/vertexai/models.rs:218
  • Draft comment:
    The conversion function ignores 'additionalProperties'. While this matches Gemini’s spec, consider logging a warning (or documenting) in case important metadata is dropped.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The code change is explicitly handling additionalProperties by ignoring it, with a clear comment explaining why. The reviewer's suggestion to add a warning is a nice-to-have but not critical since: 1) The code already documents this limitation in the comment 2) This is an internal schema conversion function 3) The behavior is intentional and matches Gemini's capabilities 4) Adding warnings could add noise for an expected behavior. The comment does identify a potential information loss during schema conversion that could be important to API users. Maybe some applications rely on additionalProperties behavior. While additionalProperties is important in some cases, this is an intentional limitation due to Gemini's capabilities, and is already documented in the code comment. Adding warnings would just add noise for an expected behavior. Delete the comment. The code already documents this limitation appropriately, and adding warnings would add unnecessary noise for an intentional, documented behavior.
3. src/providers/vertexai/tests.rs:85
  • Draft comment:
    Using an 'unsafe' block for std::env::set_var is unnecessary since set_var is a safe function. Remove the unsafe block.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. src/providers/vertexai/tests.rs:705
  • Draft comment:
    The expected panic message in test_invalid_location_format is very specific and might be brittle. Consider matching a substring or using a more flexible check.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_ZJyMLNUBU9OeZmuu

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
.gitignore (1)

31-33: Broaden the ignore rule and add a documented stub

Adding service.json here is sensible for keeping service-account credentials out of the repo.
Two small tweaks will make it more robust and developer-friendly:

- service.json
+# ignore the file at any depth
+**/service.json
+
+# provide a template developers can copy & fill in
+!**/service.json.example

**/service.json also catches copies placed in sub-directories (common in multi-crate repos).
• Checking-in a redacted service.json.example avoids build breaks and clarifies the expected shape of the file.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 29dcca9 and 78a699f.

📒 Files selected for processing (4)
  • .gitignore (1 hunks)
  • src/providers/vertexai/models.rs (5 hunks)
  • src/providers/vertexai/provider.rs (3 hunks)
  • src/providers/vertexai/tests.rs (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/providers/vertexai/tests.rs (1)
src/providers/vertexai/models.rs (3)
  • convert_openai_schema_to_gemini (143-237)
  • from (240-356)
  • from (467-505)
src/providers/vertexai/models.rs (3)
src/providers/vertexai/tests.rs (3)
  • serde_json (62-62)
  • serde_json (113-113)
  • serde_json (582-582)
src/providers/vertexai/provider.rs (1)
  • new (104-125)
src/providers/registry.rs (1)
  • new (16-32)
🪛 GitHub Actions: build
src/providers/vertexai/models.rs

[error] 169-221: Clippy lint error: this if let can be collapsed into the outer if let. Suggestion to modify the outer pattern to include the inner pattern to simplify code. See https://rust-lang.github.io/rust-clippy/master/index.html#collapsible_match


[error] 175-182: Clippy lint error: this if let can be collapsed into the outer if let. Suggestion to modify the outer pattern to include the inner pattern to simplify code. See https://rust-lang.github.io/rust-clippy/master/index.html#collapsible_match

🔇 Additional comments (6)
src/providers/vertexai/provider.rs (2)

171-175: LGTM: Clean structured output detection logic.

The implementation correctly derives the structured output flag from the presence of response_schema in the generation config, using safe Option chaining with appropriate fallback.


255-258: LGTM: Consistent structured output handling.

Both test mode and normal response paths correctly use the new to_openai_with_structured_output method with the appropriate structured output flag. This ensures consistent behavior across all code paths.

Also applies to: 275-276

src/providers/vertexai/tests.rs (1)

16-16: Excellent comprehensive test coverage for structured output feature.

The test suite thoroughly validates:

  • Schema conversion for all basic and complex types
  • Error handling for unsupported schema types
  • Request conversion with different response format configurations
  • Required fields handling and property ordering
  • Integration between OpenAI and Gemini schema formats

This provides strong confidence in the correctness of the new structured output functionality.

Also applies to: 1227-1678

src/providers/vertexai/models.rs (3)

87-90: LGTM: Well-structured new fields for structured output support.

The new response_mime_type and response_schema fields are properly defined with appropriate serde attributes for JSON serialization and camelCase conversion to match the Gemini API format.


288-323: LGTM: Robust response format handling.

The implementation correctly handles different response format types:

  • json_schema: Converts OpenAI schema to Gemini format and sets MIME type
  • json_object: Sets JSON MIME type without schema
  • Other/None: No structured output configuration

The error handling is appropriate, falling back to None values when schema conversion fails.


360-463: LGTM: Well-designed structured output response handling.

The new method maintains backward compatibility while adding structured output support. The approach of:

  1. Delegating the original to_openai method to the new method with false
  2. Validating JSON-like text structure before parsing
  3. Falling back to original text if JSON parsing fails

This is a robust and safe implementation that preserves existing behavior while adding the new functionality.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed a0dd4e0 in 58 seconds. Click for details.
  • Reviewed 106 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/providers/vertexai/models.rs:168
  • Draft comment:
    Excellent change: using 'if let Some(Value::Object(props_obj))' simplifies nested conditionals for properties extraction.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. src/providers/vertexai/models.rs:172
  • Draft comment:
    Consider reviewing the two separate iterations over required fields and remaining keys for property ordering. If performance becomes critical, a single-pass partitioning might be more efficient.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_DDix3BKv9GaSliKx

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skipped PR review on 7b02f94 because no changed files had a supported extension. If you think this was in error, please contact us and we'll fix it right away.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed 4e275f1 in 53 seconds. Click for details.
  • Reviewed 26 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/providers/vertexai/models.rs:172
  • Draft comment:
    Simplified extraction of required_fields using direct pattern matching is clearer. Ensure that non-array values for 'required' are gracefully handled (as before).
  • Reason this comment was not posted:
    Comment looked like it was already resolved.

Workflow ID: wflow_hnD3cUgri67XD3TC

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/providers/vertexai/models.rs (2)

439-474: Consider enhancing JSON detection for structured output.

The structured output handling correctly validates JSON content, but the current detection logic only handles objects and arrays. Consider supporting other valid JSON root types (numbers, strings, booleans, null) if the Gemini API can return them.

-                            if (trimmed_text.starts_with('{') && trimmed_text.ends_with('}'))
-                                || (trimmed_text.starts_with('[') && trimmed_text.ends_with(']'))
+                            // Try to parse any potential JSON, not just objects/arrays
+                            match serde_json::from_str::<serde_json::Value>(trimmed_text)

This would handle cases where the structured output might be a JSON primitive value.


396-397: Minor: Consider removing unnecessary clones.

The response_mime_type and response_schema values are cloned unnecessarily since they're being moved into the struct.

-            response_mime_type: response_mime_type.clone(),
-            response_schema: response_schema.clone(),
+            response_mime_type,
+            response_schema,
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 78a699f and 5858d53.

📒 Files selected for processing (3)
  • .gitignore (1 hunks)
  • src/providers/vertexai/models.rs (5 hunks)
  • src/providers/vertexai/tests.rs (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/providers/vertexai/tests.rs
🔇 Additional comments (7)
src/providers/vertexai/models.rs (7)

87-91: LGTM! Well-structured optional fields for structured output.

The new optional fields response_mime_type and response_schema are properly configured with serde annotations to support structured output requests. The field naming follows Rust conventions while the serde rename ensures compatibility with Gemini's API expectations.


93-127: Excellent enum design for Gemini schema representation.

The GeminiSchema enum effectively models JSON schema types with proper Rust idioms:

  • Tagged union approach correctly handles type discrimination
  • Recursive ARRAY type properly uses Box<GeminiSchema> to prevent infinite size
  • OBJECT type comprehensively supports properties, ordering, and required fields
  • Serde annotations ensure proper serialization/deserialization behavior

179-191: LGTM! Proper trait implementation with safe fallback.

The From trait implementation correctly handles the conversion from OpenAI JSON schema to Gemini schema format, with appropriate fallback to a basic string schema when no schema is provided.


193-314: Robust schema conversion with comprehensive type handling.

The from_value_with_fallback method effectively handles recursive schema conversion with several strengths:

  • Safe fallback to STRING type for unsupported or malformed schemas
  • Proper handling of nested ARRAY and OBJECT types
  • Required fields are correctly prioritized in property ordering
  • Description preservation throughout the conversion process

317-320: Good backward compatibility helper function.

The helper function provides a clean JSON conversion interface while maintaining backward compatibility with existing tests.


371-388: Well-implemented response format handling.

The logic correctly processes different response format types:

  • Properly distinguishes between json_schema and json_object formats
  • Applies schema conversion only when appropriate
  • Sets correct MIME type for JSON responses
  • Handles edge cases with proper fallbacks

435-437: Good refactoring for backward compatibility.

The existing to_openai method is properly refactored to maintain backward compatibility while delegating to the new structured output method.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Changes requested ❌

Reviewed 5858d53 in 2 minutes and 24 seconds. Click for details.
  • Reviewed 486 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/providers/vertexai/models.rs:372
  • Draft comment:
    The conversion branch for 'json_schema' now directly uses GeminiSchema::from(json_schema), which simplifies error handling. Ensure that silently falling back to STRING on missing schema is acceptable.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
2. src/providers/vertexai/models.rs:226
  • Draft comment:
    In the 'object' branch of from_value_with_fallback, property ordering is computed by iterating over required fields and then properties using contains checks. Consider using a HashSet for required_fields to improve lookup efficiency, even if property counts are small.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% While technically correct that HashSet would provide O(1) lookups vs O(n), this is likely premature optimization. Schema objects typically have very few properties and required fields (usually <10). The overhead of creating a HashSet may outweigh benefits at this scale. The code is also in a request processing path where a few extra array lookups are negligible compared to network I/O. The comment identifies a real algorithmic inefficiency. I could be wrong about the typical size of schema objects - if they commonly have hundreds of properties, the O(n) lookups could matter. Even with large schemas, this code runs once per request during schema conversion. The performance impact would be minimal compared to the actual API calls and response processing. The comment suggests a premature optimization that would add complexity for negligible benefit in typical usage. The current Vec implementation is simpler and sufficient.
3. src/providers/vertexai/tests.rs:1538
  • Draft comment:
    The test for schema conversion with required and additional properties ignores 'additionalProperties'. It might be useful to document that additionalProperties is intentionally not supported by Gemini schemas.
  • Reason this comment was not posted:
    Comment was on unchanged code.
4. src/providers/vertexai/tests.rs:1351
  • Draft comment:
    The test for unsupported schema types falls back to a STRING schema. This fallback behavior is validated here, but consider adding tests for other non-object schema values as well.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.

Workflow ID: wflow_tWMwmyGJS5JJegVH

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

}

// Helper function for backward compatibility with tests
pub fn convert_openai_schema_to_gemini_json(schema: &Value) -> Result<Value, String> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a brief doc comment to the new helper function convert_openai_schema_to_gemini_json to clarify that unsupported or missing types fall back to a STRING schema.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Changes requested ❌

Reviewed 2622d6d in 1 minute and 36 seconds. Click for details.
  • Reviewed 161 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/providers/vertexai/models.rs:311
  • Draft comment:
    The legacy helper function 'convert_openai_schema_to_gemini_json' was removed, which cleans up the code. This removal is appropriate given the new public API.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. src/providers/vertexai/tests.rs:1229
  • Draft comment:
    Updated tests to use 'GeminiSchema::from_value_with_fallback' directly instead of the removed helper. The refactoring is consistent across schema conversion tests.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, describing what was done in the code without providing any actionable feedback or suggestions. It does not align with the rules for useful comments.

Workflow ID: wflow_FopOUz4dAXXESLuy

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

}

impl GeminiSchema {
pub fn from_value_with_fallback(schema: &Value, fallback_description: Option<String>) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making from_value_with_fallback public is a useful change. Consider adding a Rustdoc comment to explain its purpose and behavior for external callers.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/providers/vertexai/models.rs (2)

384-385: Verify the necessity of cloning response_mime_type and response_schema.

The .clone() calls on response_mime_type and response_schema may be unnecessary since these values are being moved into the struct and not used again.

Consider removing the unnecessary clones:

-            response_mime_type: response_mime_type.clone(),
-            response_schema: response_schema.clone(),
+            response_mime_type,
+            response_schema,

442-462: Validate JSON parsing approach for structured output.

The structured output handling attempts to detect and validate JSON content, but the current approach has some limitations:

  1. The JSON validation is done after trimming, which could potentially alter valid JSON
  2. Only basic bracket checking is performed before JSON parsing
  3. The fallback behavior may not be optimal for malformed JSON

Consider improving the JSON validation approach:

-                        if is_structured_output {
-                            // Check if the text looks like JSON and try to parse it
-                            let trimmed_text = text.trim();
-                            if (trimmed_text.starts_with('{') && trimmed_text.ends_with('}'))
-                                || (trimmed_text.starts_with('[') && trimmed_text.ends_with(']'))
-                            {
-                                // Validate that it's proper JSON
-                                match serde_json::from_str::<serde_json::Value>(trimmed_text) {
-                                    Ok(_) => {
-                                        message_text.push_str(trimmed_text);
-                                    }
-                                    Err(_) => {
-                                        message_text.push_str(&text);
-                                    }
-                                }
-                            } else {
-                                message_text.push_str(&text);
-                            }
-                        } else {
-                            message_text.push_str(&text);
-                        }
+                        if is_structured_output {
+                            let trimmed_text = text.trim();
+                            // Directly attempt JSON parsing without pre-checking brackets
+                            match serde_json::from_str::<serde_json::Value>(trimmed_text) {
+                                Ok(_) => {
+                                    message_text.push_str(trimmed_text);
+                                }
+                                Err(_) => {
+                                    // For structured output, we might want to log this as a warning
+                                    // since the model was expected to return valid JSON
+                                    message_text.push_str(&text);
+                                }
+                            }
+                        } else {
+                            message_text.push_str(&text);
+                        }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2622d6d and bea9a26.

📒 Files selected for processing (2)
  • src/providers/vertexai/models.rs (5 hunks)
  • src/providers/vertexai/tests.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/providers/vertexai/tests.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/providers/vertexai/models.rs (3)
src/providers/vertexai/provider.rs (1)
  • new (104-125)
src/providers/registry.rs (1)
  • new (16-32)
src/providers/vertexai/tests.rs (3)
  • serde_json (62-62)
  • serde_json (113-113)
  • serde_json (582-582)
🔇 Additional comments (5)
src/providers/vertexai/models.rs (5)

87-91: LGTM! Well-structured optional fields for structured output support.

The new fields are properly configured with appropriate serde attributes for JSON serialization/deserialization. The rename attributes ensure compatibility with Gemini's API naming conventions.


93-127: Well-designed schema enum with comprehensive type coverage.

The GeminiSchema enum effectively covers all necessary JSON schema types with proper serde configuration. The use of Box for the items field in ARRAY prevents infinite size issues, and the OBJECT variant includes all necessary fields for complex schema definitions.


179-300: Robust schema conversion with comprehensive fallback handling.

The from_value_with_fallback method correctly handles the conversion from OpenAI JSON schemas to Gemini schemas with proper fallback mechanisms. The implementation handles nested structures, required fields, and property ordering appropriately.


351-376: Proper handling of response format configuration.

The logic correctly processes different response format types (json_schema and json_object) and appropriately sets the MIME type and schema. The fallback handling ensures graceful degradation when schema information is missing.


423-431: Good refactoring to support structured output while maintaining backward compatibility.

The refactoring of to_openai to delegate to to_openai_with_structured_output maintains backward compatibility while adding the new structured output functionality.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed bea9a26 in 2 minutes and 42 seconds. Click for details.
  • Reviewed 153 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/providers/vertexai/models.rs:352
  • Draft comment:
    When converting a ChatCompletionRequest, the new branch for a 'json_schema' response_format returns (MIME, None) if the json_schema.schema is absent. Previously, a default STRING schema (with description) was used. Confirm that this change in fallback behavior is intentional and documented.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% While the comment accurately identifies a behavior change, it violates several rules: 1) It asks for confirmation of intention ("Confirm that...") 2) It asks for documentation rather than specific code changes 3) It's speculative about whether this change needs attention rather than identifying a clear issue. The change itself seems intentional - removing special case handling in favor of a simpler approach. The behavior change could potentially break clients expecting the old behavior. Maybe this deserves more scrutiny? The old behavior was actually more complex and potentially surprising - creating a schema when none was provided. The new behavior is simpler and more intuitive - if no schema is provided, don't create one. This is better design. Delete this comment. While it correctly identifies a change, it violates rules by asking for confirmation and documentation rather than suggesting specific code changes. The change itself appears to be an intentional improvement.
2. src/providers/vertexai/tests.rs:1380
  • Draft comment:
    Consider adding a test case for the scenario where response_format is of type 'json_schema' but the json_schema.schema field is missing. This would help confirm the intended behavior of returning only the MIME type with a None schema.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_EyUOXRXhgzCL60qT

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@galzilber galzilber merged commit f16296e into main Aug 6, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants