
Error code: 500 on simple pass-through ollama request #724

@codefromthecrypt

Description

FYI, I tried this and got a 500 error when attempting to route a simple request through to ollama. I'm running ENVOY_VERSION=1.34.1 aigw run ai-gateway.yaml

Here's my client-side failure:

$ uv run --exact -q --env-file env.local ../chat.py
Traceback (most recent call last):
  File "/Users/adriancole/oss/observability-examples/inference-platforms/aigw/../chat.py", line 56, in <module>
    main()
  File "/Users/adriancole/oss/observability-examples/inference-platforms/aigw/../chat.py", line 48, in main
    chat_completion = client.chat.completions.create(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/adriancole/.cache/uv/environments-v2/chat-939cac794e42803a/lib/python3.12/site-packages/openai/_utils/_utils.py", line 287, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/adriancole/.cache/uv/environments-v2/chat-939cac794e42803a/lib/python3.12/site-packages/openai/resources/chat/completions/completions.py", line 925, in create
    return self._post(
           ^^^^^^^^^^^
  File "/Users/adriancole/.cache/uv/environments-v2/chat-939cac794e42803a/lib/python3.12/site-packages/openai/_base_client.py", line 1242, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/adriancole/.cache/uv/environments-v2/chat-939cac794e42803a/lib/python3.12/site-packages/openai/_base_client.py", line 1037, in request
    raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Error code: 500

Repro steps:

env.local

OPENAI_BASE_URL=http://localhost:1975/v1
OPENAI_API_KEY=unused
CHAT_MODEL=qwen3:0.6B

chat.py

# /// script
# dependencies = [
#     "openai",
#     "elastic-opentelemetry",
#     "openinference-instrumentation-openai",
#     "opentelemetry-instrumentation-httpx"
# ]
# ///
import argparse
import os

import openai
# from opentelemetry.instrumentation import auto_instrumentation
#
# auto_instrumentation.initialize()

model = os.getenv("CHAT_MODEL", "gpt-4o-mini")


def main():
    parser = argparse.ArgumentParser(description="OpenTelemetry-Enabled OpenAI Test Client")
    parser.add_argument(
        "--use-responses-api", action="store_true", help="Use the responses API instead of chat completions."
    )
    args = parser.parse_args()

    client = openai.Client()

    messages = [
        {
            "role": "user",
            "content": "Answer in up to 3 words: Which ocean contains Bouvet Island?",
        }
    ]

    # vllm-specific switch to disable thinking, ignored by other inference platforms.
    # See https://qwen.readthedocs.io/en/latest/deployment/vllm.html#thinking-non-thinking-modes
    if "qwen3" in model.lower():
        extra_body = {"chat_template_kwargs": {"enable_thinking": False}}
    else:
        extra_body = {}
    if args.use_responses_api:
        response = client.responses.create(
            model=model, input=messages[0]["content"], temperature=0, extra_body=extra_body
        )
        print(response.output[0].content[0].text)
    else:
        chat_completion = client.chat.completions.create(
            model=model,
            messages=messages,
            temperature=0,
            extra_body=extra_body,
            # must match the Exact header rule in ai-gateway.yaml
            extra_headers={"x-ai-eg-model": "qwen3:0.6B"},
        )
        print(chat_completion.choices[0].message.content)


if __name__ == "__main__":
    main()
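For reference, the failing call above boils down to this raw HTTP request; a stdlib-only sketch (URL, key, and headers taken from env.local and chat.py, nothing in it is sent):

```python
import json
import urllib.request

# Sketch of the raw request behind the failing chat.completions.create call.
payload = {
    "model": "qwen3:0.6B",
    "messages": [
        {"role": "user", "content": "Answer in up to 3 words: Which ocean contains Bouvet Island?"}
    ],
    "temperature": 0,
}
req = urllib.request.Request(
    "http://localhost:1975/v1/chat/completions",  # OPENAI_BASE_URL
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer unused",  # OPENAI_API_KEY=unused
        "x-ai-eg-model": "qwen3:0.6B",  # Exact-match header in the AIGatewayRoute rule
    },
    method="POST",
)
print(req.full_url)
```

Calling urllib.request.urlopen(req) against the running gateway reproduces the 500.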

ai-gateway.yaml

apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: aigw-run
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: aigw-run
  namespace: default
spec:
  gatewayClassName: aigw-run
  listeners:
    - name: http
      protocol: HTTP
      port: 1975
  infrastructure:
    parametersRef:
      group: gateway.envoyproxy.io
      kind: EnvoyProxy
      name: envoy-ai-gateway
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: envoy-ai-gateway
  namespace: default
spec:
  logging:
    level:
      default: debug
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
  name: aigw-run
  namespace: default
spec:
  schema:
    name: OpenAI
  targetRefs:
    - name: aigw-run
      kind: Gateway
      group: gateway.networking.k8s.io
  rules:
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model
              value: qwen3:0.6B
      backendRefs:
        - name: ollama
          namespace: default
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIServiceBackend
metadata:
  name: ollama
  namespace: default
spec:
  timeouts:
    request: 3m
  schema:
    name: OpenAI
  backendRef:
    name: ollama
    kind: Backend
    group: gateway.envoyproxy.io
    namespace: default
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: Backend
metadata:
  name: ollama
  namespace: default
spec:
  endpoints:
    - ip:
        address: 127.0.0.1
        port: 11434
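To isolate whether the 500 originates in the gateway or in the backend, one sanity check (my addition, not part of the original repro) is to send the same OpenAI-style request straight to the ollama endpoint declared in the Backend above, bypassing the gateway. A normal completion here would point the 500 at the gateway path:

```python
import json
import urllib.error
import urllib.request

# Hit ollama directly at the Backend address (127.0.0.1:11434), no gateway.
payload = {
    "model": "qwen3:0.6B",
    "messages": [
        {"role": "user", "content": "Answer in up to 3 words: Which ocean contains Bouvet Island?"}
    ],
    "temperature": 0,
}
req = urllib.request.Request(
    "http://127.0.0.1:11434/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
try:
    with urllib.request.urlopen(req, timeout=10) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
except (urllib.error.URLError, OSError) as e:
    print(f"ollama unreachable: {e}")
```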

aigw (linux):

docker run -it --rm -p 1975:1975 golang:1.24
go install github.com/envoyproxy/ai-gateway/cmd/aigw@main
ENVOY_VERSION=1.34.1 aigw run ai-gateway.yaml

aigw (darwin):

go install github.com/envoyproxy/ai-gateway/cmd/aigw@main
ENVOY_VERSION=1.34.1 aigw run ai-gateway.yaml

Environment:

github.com/envoyproxy/ai-gateway/cmd/aigw@main
ENVOY_VERSION=1.34.1

OS: darwin or linux

Logs:

darwin-logs.txt
linux-logs.txt

Labels

bug (Something isn't working)