I'm using OpenRouter (OpenAI-compatible proxy) as an abstraction layer to call different LLM providers with a consistent interface, including Google's Gemini models. I've set up my code to use structured outputs with Pydantic models through OpenAI's beta.chat.completions.parse method.
While this works perfectly for other providers, and initially worked for Gemini too (for a simpler use case), it now fails for Google Gemini models with structured output: Gemini returns nulls instead of objects, e.g.
{annotations = ["null", "null", "null", "null"]}
Here is an example: the Annotations class below produces a JSON schema that uses refs.
from pydantic import BaseModel

# ColumnName is defined elsewhere (not shown here)
class Annotation(BaseModel):
    col: int
    comment: str | None = None
    header: ColumnName

class Annotations(BaseModel):
    annotations: list[Annotation]
I've tried using the standard approach:
response = client.beta.chat.completions.parse(
    model="google/gemini-2.0-flash",
    messages=messages,
    response_format=my_pydantic_model,
)
As far as I can tell, Gemini supports only a limited subset of the OpenAPI schema spec, see:
https://ai.google.dev/api/caching#Schema - many fields are not supported.
https://ai.google.dev/gemini-api/docs/structured-output?lang=python - the custom schema field propertyOrdering is required to preserve the order of output fields (critical for custom CoT, for example).
https://github.com/googleapis/python-genai/issues/460 - similar complaint about refs limitation.
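One way to sidestep the refs limitation is to inline every reference before the schema is sent. The following is only a rough sketch under stated assumptions: `inline_refs` is a hypothetical helper (not part of any SDK), the sample schema is hand-written in the shape Pydantic v2 emits, and whether Gemini via OpenRouter accepts the flattened result is not verified here.

```python
import copy
import json

REF_PREFIX = "#/$defs/"


def inline_refs(schema: dict) -> dict:
    """Return a copy of `schema` with local "#/$defs/..." references inlined."""
    schema = copy.deepcopy(schema)
    defs = schema.pop("$defs", {})

    def resolve(node):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith(REF_PREFIX):
                # Swap the reference for a copy of its target definition.
                # Caveat: recursive (self-referencing) models would never terminate.
                target = copy.deepcopy(defs[ref[len(REF_PREFIX):]])
                siblings = {k: v for k, v in node.items() if k != "$ref"}
                return resolve({**target, **siblings})
            return {key: resolve(value) for key, value in node.items()}
        if isinstance(node, list):
            return [resolve(item) for item in node]
        return node

    return resolve(schema)


# Hand-written example in the shape Pydantic v2 produces for nested models.
schema = {
    "$defs": {
        "Annotation": {
            "type": "object",
            "properties": {
                "col": {"type": "integer"},
                "header": {"type": "string"},
            },
            "required": ["col", "header"],
        }
    },
    "type": "object",
    "properties": {
        "annotations": {"type": "array", "items": {"$ref": "#/$defs/Annotation"}}
    },
    "required": ["annotations"],
}

flat = inline_refs(schema)
print(json.dumps(flat, indent=2))
```

The helper works on a deep copy, so the original schema dict is left untouched. It handles only local `#/$defs/...` references and does not add propertyOrdering.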
I found that the same Pydantic classes work with the native Google Gen AI SDK (google-genai) and with direct calls to Google AI Studio. I ended up with a hacky solution that I will post as an answer; hopefully it will be helpful, or someone can suggest a cleaner solution.