Support for Nested Pydantic Models in Schemas #107
Merged
simonw merged 2 commits into simonw:main on Oct 11, 2025
…erence other models), the Gemini API would reject requests with errors like:

```
Invalid JSON payload received. Unknown name "$defs" at 'generation_config.response_schema': Cannot find field.
Invalid JSON payload received. Unknown name "$ref" at 'generation_config.response_schema.properties[0].value.items': Cannot find field.
```

Added `test_cleanup_schema_with_refs` - a parametrized test with 4 test cases validating `$ref` resolution for each pattern:

1. Direct model reference (Person with Address)
2. List of models (Dogs with List[Dog])
3. Optional model field (Person with Optional[Company])
4. Nested composition (Customer → List[Order] → List[Item])

Added 4 integration tests using real Pydantic models with the Gemini API:

1. `test_nested_model_direct_reference` - Pattern 1
2. `test_prompt_with_multiple_dogs` - Pattern 2 (list of models)
3. `test_nested_model_optional` - Pattern 3
4. `test_nested_model_deep_composition` - Pattern 4

These tests validate that the schemas work end-to-end with the Gemini API.

(i.e., models that reference other models). To do this, I modified the `cleanup_schema` function in `llm_gemini.py` to resolve `$ref` references before sending the schema to Gemini:

1. Added a new helper function `_resolve_refs()` that recursively finds and replaces `$ref` references with their actual definitions
2. Updated `cleanup_schema()` to extract the `$defs` section (if present) and resolve all references using `_resolve_refs()`
3. Continue with the existing cleanup logic to remove other unsupported keys
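The resolution steps described above can be sketched roughly as follows. This is an illustrative approximation, not the code merged in this PR: the names `resolve_refs` and `inline_schema` are hypothetical stand-ins, and the sketch assumes every reference has the local form `#/$defs/Name` that Pydantic emits.

```python
import copy


def resolve_refs(node, defs):
    """Return a copy of `node` with every local "$ref" replaced by its definition.

    Assumption for this sketch: keys that sit next to "$ref" in the same
    object are dropped when the reference is inlined.
    """
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/$defs/"):
            name = ref[len("#/$defs/"):]
            if name in defs:
                # Copy so the shared definition is never mutated, then resolve
                # again because a definition may itself contain references.
                return resolve_refs(copy.deepcopy(defs[name]), defs)
        return {key: resolve_refs(value, defs) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, defs) for item in node]
    return node


def inline_schema(schema):
    """Split off "$defs" and inline every reference into the rest of the schema."""
    schema = copy.deepcopy(schema)  # leave the caller's schema untouched
    defs = schema.pop("$defs", {})
    return resolve_refs(schema, defs)


# Hand-written schema in the shape Pydantic produces for a list of models.
schema = {
    "type": "object",
    "properties": {"dogs": {"type": "array", "items": {"$ref": "#/$defs/Dog"}}},
    "$defs": {
        "Dog": {"type": "object", "properties": {"name": {"type": "string"}}}
    },
}
inlined = inline_schema(schema)
print(inlined)
```

Note that a sketch like this would recurse without end on a self-referencing model (for example, a tree node that contains a list of itself), so a real implementation would need some guard for that case.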
Owner
This is a great patch, thank you.
Owner
I'm going to land it and then add the VCR test.
simonw added a commit that referenced this pull request Oct 11, 2025
Owner
My own demo of this fix. Before applying the change:
python -c 'import llm
from pydantic import BaseModel
class Dog(BaseModel):
    name: str
class Dogs(BaseModel):
    dogs: list[Dog]
model = llm.get_model("gemini-2.5-flash")
print(model.prompt("invent 3 dogs", schema=Dogs))
'
Outputs:
After applying the change:
{"dogs":[{"name":"Buddy"},{"name":"Max"},{"name":"Bella"}]}
Contributor
It looks like this fix needs to be applied to tool schemas too.
werdnum pushed a commit to werdnum/llm-gemini that referenced this pull request Oct 13, 2025
The fix from PR simonw#107 resolved $ref references in response schemas but tool schemas were still passing input_schema directly without cleanup.

This applies cleanup_schema() to tool.input_schema, ensuring nested Pydantic models work correctly in tool parameters.

Adds test_tools_with_nested_pydantic_models() to verify that tools with nested models (PersonInput containing Address) properly resolve $ref references and work with the Gemini API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Contributor
Author
I've got travel coming up, so it might take me a week to get to this. If someone else wants to jump on it, go ahead.
simonw added a commit that referenced this pull request Nov 18, 2025
The fix from PR #107 resolved $ref references in response schemas but tool schemas were still passing input_schema directly without cleanup.

This applies cleanup_schema() to tool.input_schema, ensuring nested Pydantic models work correctly in tool parameters.

Adds test_tools_with_nested_pydantic_models() to verify that tools with nested models (PersonInput containing Address) properly resolve $ref references and work with the Gemini API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Andrew Garrett <andrewgarrett@google,com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Simon Willison <swillison@gmail.com>
Fix: Support for Nested Pydantic Models in Schemas
Problem
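When using Pydantic schemas with nested models (i.e., models that reference other models), the Gemini API would reject requests with errors like:
    Invalid JSON payload received. Unknown name "$defs" at 'generation_config.response_schema': Cannot find field.
    Invalid JSON payload received. Unknown name "$ref" at 'generation_config.response_schema.properties[0].value.items': Cannot find field.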
This affected several common Pydantic patterns:
Direct model references: A model with a field that is another model
Lists of models: A model containing a list of another model
Optional model fields: A model with an optional reference to another model
Deeply nested compositions: Multiple levels of model references
Root Cause
When Pydantic generates JSON schemas for nested models, it uses JSON Schema's $defs (definitions) and $ref (references) features for code reuse:
    {
      "properties": {
        "dogs": {
          "items": {"$ref": "#/$defs/Dog"}
        }
      },
      "$defs": {
        "Dog": {
          "properties": {
            "name": {"type": "string"}
          }
        }
      }
    }
The Gemini API does not support $defs or $ref - it requires schemas to be fully inlined.
Solution
Modified the cleanup_schema function in llm_gemini.py to resolve $ref references before sending the schema to Gemini:
1. Added a new helper function _resolve_refs() that recursively finds and replaces $ref references with their actual definitions
2. Updated cleanup_schema() to extract the $defs section (if present) and resolve all references using _resolve_refs()
3. Continue with the existing cleanup logic to remove other unsupported keys
Code Changes
File: llm_gemini.py (lines 206-249)
Added _resolve_refs() helper function:
Updated cleanup_schema() to use it:
The fix handles arbitrary nesting depth and uses copy.deepcopy() to avoid mutating the original definitions.
Tests Added
Unit Tests (all passing)
Added test_cleanup_schema_with_refs - a parametrized test with 4 test cases validating $ref resolution for each pattern:
1. Direct model reference (Person with Address)
2. List of models (Dogs with List[Dog])
3. Optional model field (Person with Optional[Company])
4. Nested composition (Customer → List[Order] → List[Item])
Integration Tests (skipped in CI, pass with real API key)
Added 4 integration tests using real Pydantic models with the Gemini API:
1. test_nested_model_direct_reference - Pattern 1
2. test_prompt_with_multiple_dogs - Pattern 2 (list of models)
3. test_nested_model_optional - Pattern 3
4. test_nested_model_deep_composition - Pattern 4
These tests validate that the schemas work end-to-end with the Gemini API.
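One way to express what the tests above check is a small helper that reports any `$ref` or `$defs` key still present in a schema. The sketch below is not taken from the PR's test suite: `find_unsupported_keys` is a hypothetical name, the `employer` field is invented for illustration, and the sample schemas are hand-written approximations of what Pydantic generates for an optional nested model.

```python
def find_unsupported_keys(node, path="$"):
    """Yield the location of every "$ref" or "$defs" key left in a schema."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key in ("$ref", "$defs"):
                yield child
            # Keep walking: definitions and subschemas can hide more references.
            yield from find_unsupported_keys(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_unsupported_keys(item, f"{path}[{index}]")


# Approximation of the schema for an optional nested model field (pattern 3).
raw = {
    "type": "object",
    "properties": {
        "employer": {"anyOf": [{"$ref": "#/$defs/Company"}, {"type": "null"}]}
    },
    "$defs": {
        "Company": {"type": "object", "properties": {"name": {"type": "string"}}}
    },
}

# The same schema after references have been inlined by hand.
inlined = {
    "type": "object",
    "properties": {
        "employer": {
            "anyOf": [
                {"type": "object", "properties": {"name": {"type": "string"}}},
                {"type": "null"},
            ]
        }
    },
}

leftovers = list(find_unsupported_keys(raw))
print(leftovers)
print(list(find_unsupported_keys(inlined)))
```

A check of this kind is independent of how the references were resolved, so it could be applied to the output of any cleanup function. It would misreport a model that has a field literally named `$ref` or `$defs`, which this sketch does not try to handle.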
VCR Cassette Recording Issue
Problem Encountered
When adding the new integration tests, we encountered an issue with pytest-recording not creating VCR cassettes for the new tests. The tests pass when run with a real API key (PYTEST_GEMINI_API_KEY set), but the cassettes are not being recorded to disk.
Attempted Solutions
--record-mode=once, --record-mode=new_episodes, --record-mode=rewrite
tests/conftest.py
Current Status
Unit tests for $ref resolution all pass ✅
Integration tests are marked with @pytest.mark.skip for CI until cassettes can be recorded
Workaround for Testing
Developers can verify the integration tests work by running:
PYTEST_GEMINI_API_KEY="$(llm keys get gemini)" pytest tests/test_gemini.py::test_nested_model_direct_reference -v
This appears to be a pytest-recording environment issue unrelated to the schema fix itself. The important validation is that:
Test Results
Breaking Changes
None. This is a backward-compatible bug fix that enables previously broken functionality.
Related Issues
Fixes issue with nested Pydantic models in schemas.