Gemini AI Structured Output with references via OpenAI SDK

Question

I'm using OpenRouter (an OpenAI-compatible proxy) as an abstraction layer to call different LLM providers through a consistent interface, including Google's Gemini models. My code uses structured outputs with Pydantic models through the OpenAI SDK's beta.chat.completions.parse method.

This works perfectly for other providers, and it initially worked for Gemini in a simpler use case. With the schema below, however, structured output breaks on Google Gemini models: Gemini returns nulls instead of objects, e.g. {annotations = ["null", "null", "null", "null"]}

Example of an Annotations class whose schema uses refs:

from pydantic import BaseModel

# ColumnName is defined elsewhere (not shown in the question).
class Annotation(BaseModel):
    col: int
    comment: str | None = None
    header: ColumnName

class Annotations(BaseModel):
    annotations: list[Annotation]

I've tried using the standard approach:

response = client.beta.chat.completions.parse(
    model="google/gemini-2.0-flash",
    messages=messages,
    response_format=my_pydantic_model,
)

I figured out that Gemini supports only a limited subset of the OpenAPI schema; see:
https://ai.google.dev/api/caching#Schema - many fields are not supported.
https://ai.google.dev/gemini-api/docs/structured-output?lang=python - the custom schema field propertyOrdering is required to preserve the order of output fields (critical for custom CoT, for example).
https://github.com/googleapis/python-genai/issues/460 - a similar complaint about the refs limitation.
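
To make the propertyOrdering point concrete, here is a minimal sketch that adds the field to every object in a JSON schema dict, using the order in which the properties were declared. The helper name add_property_ordering and the sample schema are hypothetical, and the sketch assumes propertyOrdering is a list of property names, as the linked Gemini docs describe. It only walks properties, items and anyOf, so it is not a full schema converter.

```python
from typing import Any


def add_property_ordering(schema: Any) -> Any:
    """Return a copy of `schema` with `propertyOrdering` on each object schema.

    Hypothetical helper: only `properties`, `items` and `anyOf` are walked.
    """
    if not isinstance(schema, dict):
        return schema  # booleans, lists, etc. are passed through unchanged

    result = dict(schema)
    properties = result.get("properties")
    if isinstance(properties, dict):
        result["properties"] = {
            name: add_property_ordering(sub) for name, sub in properties.items()
        }
        # Keep an ordering that is already present; otherwise use declaration order.
        result.setdefault("propertyOrdering", list(properties))
    if "items" in result:
        result["items"] = add_property_ordering(result["items"])
    if isinstance(result.get("anyOf"), list):
        result["anyOf"] = [add_property_ordering(sub) for sub in result["anyOf"]]
    return result


# Hand-written sample schema, for illustration only.
example = {
    "type": "object",
    "properties": {
        "annotations": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "col": {"type": "integer"},
                    "comment": {"type": "string"},
                },
            },
        }
    },
}
ordered = add_property_ordering(example)
```

The input dict is left untouched, so the same schema can still be sent unchanged to providers that do not know the field.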

I found that the same Pydantic classes work with the native google-genai SDK (and with a direct call through Google AI Studio). I ended up with a hacky solution that I will post as an answer. Hopefully it will be helpful, or someone can suggest a cleaner solution.

mishka · Accepted Answer · 2025-04-23 12:42:44Z

The issue stems from Gemini's current JSON schema support, which is more limited than what the OpenAI SDK generates from Pydantic models.

Here's a hack that worked for me:

from google import genai
from google.genai._transformers import process_schema
from pydantic import BaseModel

def normalize_schema_for_gemini(response_format: type[BaseModel]) -> dict[str, object]:
    # Convert Pydantic model to JSON schema
    json_schema: dict[str, object] = response_format.model_json_schema()

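    # Use Google's own schema processing function. It rewrites json_schema
    # in place; process_schema requires a client argument, so one is created
    # with a placeholder key.
    fake_api_client = genai.Client(api_key="fake")._api_client
    process_schema(json_schema, fake_api_client, order_properties=True)

    # Wrap the processed schema in an OpenAI-style json_schema response_format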
    return {
        "type": "json_schema",
        "json_schema": {
            "schema": json_schema,
            "name": response_format.__name__,
            "strict": True,
        },
    }

# When making the API call:
if "google" in model:
    response_type = normalize_schema_for_gemini(response_format)
else:
    response_type = response_format

response = client.beta.chat.completions.parse(
    model=model,
    messages=messages,
    response_format=response_type,
)

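parsed = response.choices[0].message.parsed
# With a dict response_format, parsed may come back as None, so validate
# the raw JSON content against the Pydantic model manually.
if parsed is None and isinstance(response_type, dict):
    parsed = response_format.model_validate_json(response.choices[0].message.content)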

The key insight is using Google's own process_schema function from the genai package to prepare the schema before sending it.

The alternative would be to change the schema manually using Pydantic features, but I figured that calling Google's native methods would be the most bulletproof.
https://docs.pydantic.dev/latest/concepts/json_schema/#model-level-customization - extra configs.
https://github.com/pydantic/pydantic/issues/889 - denormalization of refs.
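
For the manual route, a rough sketch of what denormalizing refs could look like on a plain schema dict is shown below. This is not what process_schema does internally; inline_refs and the sample schema are hypothetical, and the sketch assumes all references are local ones of the form #/$defs/Name, which is how Pydantic v2 emits them. Recursive models cannot be inlined this way, so they raise an error.

```python
from typing import Any

REF_PREFIX = "#/$defs/"


def inline_refs(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `schema` with local `#/$defs/...` references inlined."""
    defs = schema.get("$defs", {})

    def resolve(node: Any, seen: tuple[str, ...]) -> Any:
        if isinstance(node, list):
            return [resolve(item, seen) for item in node]
        if not isinstance(node, dict):
            return node
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith(REF_PREFIX):
            name = ref[len(REF_PREFIX):]
            if name in seen:
                raise ValueError(f"recursive reference cannot be inlined: {name}")
            target = resolve(defs[name], seen + (name,))
            # Keywords next to $ref (e.g. "description") override the target's.
            siblings = {k: resolve(v, seen) for k, v in node.items() if k != "$ref"}
            return {**target, **siblings}
        return {k: resolve(v, seen) for k, v in node.items()}

    # Drop the top-level $defs block; its contents are now inlined.
    top_level = {k: v for k, v in schema.items() if k != "$defs"}
    return resolve(top_level, ())


# Hand-written sample schema, for illustration only.
sample = {
    "$defs": {
        "Annotation": {
            "type": "object",
            "properties": {"col": {"type": "integer"}},
            "required": ["col"],
        }
    },
    "type": "object",
    "properties": {
        "annotations": {"type": "array", "items": {"$ref": "#/$defs/Annotation"}}
    },
}
flat = inline_refs(sample)
```

After this step the schema contains no $defs or $ref keys, which is the property the linked issues are after. Any other Gemini-specific adjustments would still have to be applied separately.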
