[Bug] Native Groq provider fails with 'Object of type ModelMetaclass is not JSON serializable' on every structured-output call (deriver/dialectic/summary/dream)

## Bug

Self-hosted Honcho v3.0.6 with `DERIVER_PROVIDER=groq` (and `DIALECTIC_LEVELS__*__PROVIDER=groq`, `SUMMARY_PROVIDER=groq`, `DREAM_PROVIDER=groq`) crashes on every structured-output LLM call with:

```
WARNING - Error on attempt 1/3 with groq/llama-3.3-70b-versatile: Object of type ModelMetaclass is not JSON serializable
INFO - Will retry with attempt 2/3
... (3 retries, all fail) ...
ERROR - Error processing representation batch for work unit representation:claude_code:anan-thinkpad-server:claude: RetryError[<Future at 0x... state=finished raised TypeError>]
```

Result: queue items pile up unprocessed, peer cards and observations are never extracted, and `/v3/.../peers/{name}/card` stays empty.

## Root cause

In `src/utils/clients.py` the Groq branch passes the Pydantic **class** verbatim to `response_format`:

- Line 2301 (non-streaming):
  ```python
  if response_model:
      groq_params["response_format"] = response_model
  ```
- Line 2559 (streaming): same pattern.

The Groq Python SDK then tries to JSON-serialize the request body, class included, and fails: the type of a Pydantic model class is `ModelMetaclass`, which the `json` encoder cannot handle. Compare the OpenAI branch in the same file, which calls `client.chat.completions.parse(response_format=response_model, ...)`. The OpenAI SDK accepts the class directly; the Groq SDK does not.
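
The error message itself can be reproduced without Groq or Pydantic installed. The sketch below is only an illustration: `ModelMetaclass` here is a stand-in metaclass (the real one lives in Pydantic), and `Representation` and `serialization_error` are hypothetical names. It shows that `json.dumps` reports the metaclass name when it is handed a class object instead of a dict.

```python
import json


class ModelMetaclass(type):
    """Stand-in for Pydantic's metaclass; only its name matters here."""


class Representation(metaclass=ModelMetaclass):
    """Hypothetical response model; the class object itself gets passed."""


def serialization_error(payload: dict) -> str:
    """Return the TypeError message json.dumps raises, or '' on success."""
    try:
        json.dumps(payload)
    except TypeError as exc:
        return str(exc)
    return ""


# Passing the class object mirrors what the Groq branch does today.
broken = serialization_error({"response_format": Representation})

# A plain dict payload serializes without complaint.
fixed = serialization_error({"response_format": {"type": "json_object"}})

print(broken)
```

The encoder names the type of the offending object, and the type of a class is its metaclass, which is why the log mentions `ModelMetaclass` rather than the model's own name.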

Groq's API expects either:
- `{"type": "json_object"}`, or
- `{"type": "json_schema", "json_schema": {"name": "...", "schema": <dict>, "strict": true}}`

Right now neither is sent — the raw Pydantic class is.

## Repro

`.env` (relevant slice):
```
LLM_GROQ_API_KEY=gsk_...
DERIVER_PROVIDER=groq
DERIVER_MODEL=llama-3.3-70b-versatile
DERIVER_FLUSH_ENABLED=true   # makes it fire on every message; bug also occurs without this, just slower to surface
DIALECTIC_LEVELS__minimal__PROVIDER=groq
DIALECTIC_LEVELS__minimal__MODEL=llama-3.3-70b-versatile
DIALECTIC_LEVELS__minimal__THINKING_BUDGET_TOKENS=0
DIALECTIC_LEVELS__minimal__MAX_TOOL_ITERATIONS=1
# (and same for low/medium/high/max, SUMMARY_*, DREAM_*)
```

1. Bring up Honcho v3.0.6 via the example compose (api+deriver from source).
2. POST a few messages to a session.
3. `docker compose logs deriver -f` shows the error above on every attempt.
4. Queue rows accumulate with `processed=false`:
   ```sql
   SELECT count(*) FILTER (WHERE processed) AS done,
          count(*) FILTER (WHERE NOT processed) AS pending
   FROM queue WHERE task_type = 'representation';
   ```

## Expected

Honcho should send a Groq-compatible `response_format` payload, parse the returned JSON, and validate it against `response_model` (the same way it already does in the post-response branch at lines ~2316–2334).
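
As a rough sketch of the "parse, then validate" half of that flow: the code below decodes the raw completion text and hands the result to the model's `model_validate` classmethod, which is the Pydantic v2 entry point the existing branch is described as using. `PeerCard` is a hypothetical stand-in written with a dataclass so the example runs without Pydantic, and `parse_structured` is an illustrative helper name, not something taken from `clients.py`.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class PeerCard:
    """Hypothetical stand-in that mimics Pydantic v2's `model_validate`."""

    name: str
    facts: list[str]

    @classmethod
    def model_validate(cls, data: dict[str, Any]) -> "PeerCard":
        # Minimal hand-rolled checks; a real Pydantic model does this itself.
        if not isinstance(data.get("name"), str):
            raise ValueError("name must be a string")
        facts = data.get("facts")
        if not isinstance(facts, list) or not all(isinstance(f, str) for f in facts):
            raise ValueError("facts must be a list of strings")
        return cls(name=data["name"], facts=facts)


def parse_structured(content: str, response_model: Any) -> Any:
    """Decode the raw completion text, then validate it against the model."""
    return response_model.model_validate(json.loads(content))


card = parse_structured('{"name": "claude", "facts": ["uses Honcho"]}', PeerCard)
print(card)
```

Because validation happens after the call, this step works the same whether the request used `json_object` or `json_schema` mode.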

## Suggested fix

Two safe options at lines 2301 and 2559:

**Option A — use Groq json_schema strict mode** (preferred — Groq supports it for several models including `llama-3.3-70b-versatile`):

```python
if response_model:
    groq_params["response_format"] = {
        "type": "json_schema",
        "json_schema": {
            "name": response_model.__name__,
            "schema": response_model.model_json_schema(),
            "strict": True,
        },
    }
elif json_mode:
    groq_params["response_format"] = {"type": "json_object"}
```

**Option B — fall through to plain json_object mode and rely on the existing post-call `model_validate`** (less strict, broader model compatibility):

```python
if response_model or json_mode:
    groq_params["response_format"] = {"type": "json_object"}
```

The existing parsing block at 2316–2334 already does `response_model.model_validate(json.loads(content))`, so Option B works without any other change. Option A gives the model a schema hint and tighter outputs.

The same change is needed at the streaming site (line 2559).
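
Since both call sites need the same logic, one possible shape is a small shared helper. This is a sketch under assumptions, not the project's code: `build_groq_response_format` and its `strict` flag are hypothetical names, and `Observation` is a stand-in class that exposes `model_json_schema()` the way a Pydantic v2 model does, so the example runs without Pydantic.

```python
import json
from typing import Any, Optional


def build_groq_response_format(
    response_model: Optional[Any],
    json_mode: bool,
    strict: bool = True,
) -> Optional[dict]:
    """Return a JSON-serializable response_format payload, or None."""
    if response_model is not None:
        if strict:
            # Option A: send the model's JSON schema in strict mode.
            return {
                "type": "json_schema",
                "json_schema": {
                    "name": response_model.__name__,
                    "schema": response_model.model_json_schema(),
                    "strict": True,
                },
            }
        # Option B: plain JSON mode; validation happens after the call.
        return {"type": "json_object"}
    if json_mode:
        return {"type": "json_object"}
    return None


class Observation:
    """Hypothetical stand-in for a Pydantic v2 response model."""

    @classmethod
    def model_json_schema(cls) -> dict:
        return {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
            "additionalProperties": False,
        }


payload = build_groq_response_format(Observation, json_mode=False)
# The payload is made of plain dicts, so it serializes cleanly.
print(json.dumps(payload, indent=2))
```

Each site would then assign the result to `groq_params["response_format"]` only when it is not `None`, which keeps the streaming and non-streaming paths from drifting apart.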

## Environment

- Honcho: v3.0.6 (commit 317b4a6)
- Deployment: self-hosted via included `docker-compose.yml.example` (built from source)
- Python: 3.13 (from upstream Dockerfile)
- `groq` package version: whatever ships with `uv sync` in v3.0.6
- Embeddings work fine (using `LLM_EMBEDDING_PROVIDER=openrouter`, no Groq involvement)
- `AUTH_USE_AUTH=true`, JWT-based auth working
- All non-Groq paths (peer create, message ingest, queue enqueue, reconciler `sync_vectors`, `cleanup_queue`) work normally
- Workaround: switch `*_PROVIDER` from `groq` to another provider (`openrouter` via `LLM_OPENAI_COMPATIBLE_*`, `anthropic`, `openai`, `gemini`) and the deriver processes successfully

## Impact

Anyone who picks Groq as a provider for a chat feature in self-hosted v3.0.6 will hit this on the very first message. The free Groq tier is the obvious choice for low-volume self-hosters, so this likely blocks a non-trivial slice of new self-hosted deployments.
