I'm using OpenRouter (OpenAI-compatible proxy) as an abstraction layer to call different LLM providers with a consistent interface, including Google's Gemini models. I've set up my code to use structured outputs with Pydantic models through OpenAI's beta.chat.completions.parse method.

This works perfectly for other providers, and it initially worked for Gemini in a simpler use case. With structured output, however, Google Gemini models return nulls instead of objects, e.g. {annotations = ["null", "null", "null", "null"]}

Example of an Annotations class whose JSON schema uses $ref references:

from pydantic import BaseModel

class Annotation(BaseModel):
    col: int
    comment: str | None = None
    header: ColumnName  # ColumnName is defined elsewhere (not shown here)

class Annotations(BaseModel):
    annotations: list[Annotation]

I've tried using the standard approach:

response = client.beta.chat.completions.parse(
    model="google/gemini-2.0-flash",
    messages=messages,
    response_format=my_pydantic_model,
)

I found that Gemini supports only a limited subset of the OpenAPI schema; see:
https://ai.google.dev/api/caching#Schema - many fields are not supported.
https://ai.google.dev/gemini-api/docs/structured-output?lang=python - the custom schema field propertyOrdering is required to preserve the order of output fields (critical, for example, for custom chain-of-thought).
https://github.com/googleapis/python-genai/issues/460 - a similar complaint about the refs limitation.

I found that the same Pydantic classes work with the native google-genai SDK (and with direct calls from Google AI Studio). I ended up with a hacky solution that I will post as an answer. Hopefully it will be helpful, or someone can suggest a cleaner solution.

1 Answer

The issue stems from Gemini's current JSON schema limitations: it accepts a narrower schema than the one the OpenAI SDK generates from a Pydantic model.

Here's a hack that worked for me:

from google import genai
# Note: _transformers is a private module and may change between google-genai releases
from google.genai._transformers import process_schema
from pydantic import BaseModel

def normalize_schema_for_gemini(response_format: type[BaseModel]) -> dict[str, object]:
    # Convert Pydantic model to JSON schema
    json_schema: dict[str, object] = response_format.model_json_schema()

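    # Use Google's own schema processing function (modifies json_schema in place).
    # It needs an API client object; a dummy key is used because only the schema is processed.
    google_fake_client = genai.Client(api_key="fake")._api_client
    process_schema(json_schema, google_fake_client, order_properties=True)

    # Wrap the processed schema in an OpenAI-style json_schema response_format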
    return {
        "type": "json_schema",
        "json_schema": {
            "schema": json_schema,
            "name": response_format.__name__,
            "strict": True,
        },
    }

# When making the API call:
if "google" in model:
    response_type = normalize_schema_for_gemini(response_format)
else:
    response_type = response_format

response = client.beta.chat.completions.parse(
    model=model,
    messages=messages,
    response_format=response_type,
)

parsed = response.choices[0].message.parsed
if parsed is None and isinstance(response_type, dict):
    parsed = response_format.model_validate_json(response.choices[0].message.content)

The key insight is using Google's own process_schema function from the genai package to prepare the schema before sending it.

The alternative would be to change the schema manually using Pydantic features, but I figured that calling Google's native methods would be the most robust option.
https://docs.pydantic.dev/latest/concepts/json_schema/#model-level-customization - extra configs.
https://github.com/pydantic/pydantic/issues/889 - denormalization of refs.
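
For the manual route, a minimal sketch of denormalizing refs might look like the following. It is an illustration, not what process_schema does internally: inline_refs is a hypothetical helper, the sample schema is hand-written to resemble what model_json_schema() emits for a nested model, and it only handles local "#/$defs/..." references. It does not add propertyOrdering or rewrite nullable fields, so it may not be sufficient for Gemini on its own.

```python
from copy import deepcopy
from typing import Any


def inline_refs(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `schema` with local "#/$defs/..." references inlined.

    Hypothetical helper for illustration; not part of pydantic or google-genai.
    """
    defs = schema.get("$defs", {})

    def resolve(node: Any, seen: tuple[str, ...] = ()) -> Any:
        if isinstance(node, list):
            return [resolve(item, seen) for item in node]
        if not isinstance(node, dict):
            return node
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/$defs/"):
            name = ref.removeprefix("#/$defs/")
            if name in seen:
                # Self-referencing models cannot be flattened this way.
                raise ValueError(f"recursive reference to {name!r} cannot be inlined")
            # Keys written next to $ref (e.g. description) override the definition.
            siblings = {k: v for k, v in node.items() if k != "$ref"}
            merged = {**deepcopy(defs[name]), **siblings}
            return resolve(merged, seen + (name,))
        return {key: resolve(value, seen) for key, value in node.items()}

    # Drop the top-level $defs table; everything it defined is now inlined.
    top_level = {k: v for k, v in schema.items() if k != "$defs"}
    return resolve(top_level)


# Hand-written schema resembling pydantic output for a model holding a list of sub-models.
schema = {
    "$defs": {
        "Annotation": {
            "type": "object",
            "properties": {"col": {"type": "integer"}},
            "required": ["col"],
        }
    },
    "type": "object",
    "properties": {
        "annotations": {"type": "array", "items": {"$ref": "#/$defs/Annotation"}}
    },
    "required": ["annotations"],
}

flat = inline_refs(schema)
```

In real use the input would come from response_format.model_json_schema(), and the flattened result would go into the "schema" field of the response_format dict shown above.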
