[Bug] Native Groq provider fails with 'Object of type ModelMetaclass is not JSON serializable' on every structured-output call (deriver/dialectic/summary/dream) #634

@Panandika

Bug

Self-hosted Honcho v3.0.6 with DERIVER_PROVIDER=groq (and DIALECTIC_LEVELS__*__PROVIDER=groq, SUMMARY_PROVIDER=groq, DREAM_PROVIDER=groq) crashes on every structured-output LLM call with:

WARNING - Error on attempt 1/3 with groq/llama-3.3-70b-versatile: Object of type ModelMetaclass is not JSON serializable
INFO - Will retry with attempt 2/3
... (3 retries, all fail) ...
ERROR - Error processing representation batch for work unit representation:claude_code:anan-thinkpad-server:claude: RetryError[<Future at 0x... state=finished raised TypeError>]

Result: queue items pile up unprocessed, peer card / observations never get extracted, /v3/.../peers/{name}/card stays empty.

Root cause

In src/utils/clients.py the Groq branch passes the Pydantic class verbatim to response_format:

  • Line 2301 (non-streaming):
    if response_model:
        groq_params["response_format"] = response_model
  • Line 2559 (streaming): same pattern.

The Groq Python SDK then tries to JSON-serialize that class to send over the wire, which fails because ModelMetaclass is not JSON-serializable. Compare to the OpenAI branch in the same file, which uses client.chat.completions.parse(response_format=response_model, ...) — the OpenAI SDK accepts the class directly, but the Groq SDK does not.

Groq's API expects either:

  • {"type": "json_object"}, or
  • {"type": "json_schema", "json_schema": {"name": "...", "schema": <dict>, "strict": true}}

Right now neither is sent — the raw Pydantic class is.
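
The failure mode can be reproduced with the standard library alone. The sketch below is illustrative only: `ModelMetaclass`, `BaseModel`, and `PeerCard` are hypothetical stand-ins that mimic Pydantic's naming (they are not the real classes), and it assumes the SDK ends up doing something equivalent to `json.dumps` on the request parameters.

```python
import json


# Stand-ins that only mimic Pydantic's class names; NOT the real library.
class ModelMetaclass(type):
    pass


class BaseModel(metaclass=ModelMetaclass):
    pass


class PeerCard(BaseModel):  # hypothetical response model
    pass


def try_serialize(params: dict) -> str:
    """Return the serialized body, or the TypeError message if encoding fails."""
    try:
        return json.dumps(params)
    except TypeError as exc:
        return str(exc)


# Passing the class object itself, as the Groq branch currently does.
bad = try_serialize({"response_format": PeerCard})

# Passing a plain dict, which is one of the shapes the API expects.
good = try_serialize({"response_format": {"type": "json_object"}})

print(bad)
print(good)
```

The encoder reports the type of the object it cannot handle, and the type of a class object is its metaclass, which is why the log names `ModelMetaclass` rather than the model itself.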

Repro

.env (relevant slice):

LLM_GROQ_API_KEY=gsk_...
DERIVER_PROVIDER=groq
DERIVER_MODEL=llama-3.3-70b-versatile
DERIVER_FLUSH_ENABLED=true   # makes it fire on every message; bug also occurs without this, just slower to surface
DIALECTIC_LEVELS__minimal__PROVIDER=groq
DIALECTIC_LEVELS__minimal__MODEL=llama-3.3-70b-versatile
DIALECTIC_LEVELS__minimal__THINKING_BUDGET_TOKENS=0
DIALECTIC_LEVELS__minimal__MAX_TOOL_ITERATIONS=1
# (and same for low/medium/high/max, SUMMARY_*, DREAM_*)

  1. Bring up Honcho v3.0.6 via the example compose (api+deriver from source).
  2. POST a few messages to a session.
  3. docker compose logs deriver -f shows the error above on every attempt.
  4. Queue rows accumulate with processed=false:
    SELECT count(*) FILTER (WHERE processed) AS done,
           count(*) FILTER (WHERE NOT processed) AS pending
    FROM queue WHERE task_type = 'representation';

Expected

Honcho should send a Groq-compatible response_format payload, parse the returned JSON, and validate it against response_model (the same way it already does in the post-response branch at lines ~2316–2334).
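
The parse-then-validate step can be sketched as follows. This is a minimal illustration, not Honcho's code: `parse_structured` is a hypothetical helper name, and `PeerCard` is a dataclass stand-in whose `model_validate` only imitates the shape of the Pydantic method so the example runs without third-party packages.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class PeerCard:
    """Hypothetical stand-in for a Pydantic response model."""

    name: str
    facts: list

    @classmethod
    def model_validate(cls, data: dict) -> "PeerCard":
        # Imitates the call shape of Pydantic's model_validate;
        # the real method performs far more thorough validation.
        if not isinstance(data.get("name"), str):
            raise ValueError("name must be a string")
        if not isinstance(data.get("facts"), list):
            raise ValueError("facts must be a list")
        return cls(name=data["name"], facts=list(data["facts"]))


def parse_structured(content: str, response_model: Any) -> Any:
    """Decode the model's JSON text, then validate it against response_model."""
    return response_model.model_validate(json.loads(content))


# In practice, content would be the message text returned by the provider.
card = parse_structured('{"name": "claude", "facts": ["example fact"]}', PeerCard)
print(card)
```

Because validation happens after the call, it works the same whether the request used `json_object` or `json_schema`.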

Suggested fix

Two safe options at lines 2301 and 2559:

Option A (preferred where the model supports it): use Groq json_schema strict mode. Groq's json_schema support varies by model, so confirm that llama-3.3-70b-versatile is covered before relying on strict mode:

if response_model:
    groq_params["response_format"] = {
        "type": "json_schema",
        "json_schema": {
            "name": response_model.__name__,
            "schema": response_model.model_json_schema(),
            "strict": True,
        },
    }
elif json_mode:
    groq_params["response_format"] = {"type": "json_object"}

Option B — fall through to plain json_object mode and rely on the existing post-call model_validate (less strict, broader model compatibility):

if response_model or json_mode:
    groq_params["response_format"] = {"type": "json_object"}

The existing parsing block at 2316–2334 already does response_model.model_validate(json.loads(content)), so Option B works without any other change. Option A gives the model a schema hint and tighter outputs.

The same change is needed at the streaming site (line 2559).
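
Since both call sites need identical logic, one way to avoid drift is a small shared helper. The sketch below is an assumption about how that could look, not existing Honcho code: `build_groq_response_format` and `FakeModel` are hypothetical names, and the helper only requires that `response_model` expose `__name__` and a `model_json_schema()` classmethod, as in Option A.

```python
import json
from typing import Any, Optional


def build_groq_response_format(
    response_model: Optional[type] = None,
    json_mode: bool = False,
    strict: bool = True,
) -> Optional[dict]:
    """Return a JSON-serializable response_format dict, or None if not needed."""
    if response_model is not None:
        return {
            "type": "json_schema",
            "json_schema": {
                "name": response_model.__name__,
                # model_json_schema() is the classmethod used in Option A.
                "schema": response_model.model_json_schema(),
                "strict": strict,
            },
        }
    if json_mode:
        return {"type": "json_object"}
    return None


class FakeModel:
    """Hypothetical stand-in exposing the one classmethod the helper needs."""

    @classmethod
    def model_json_schema(cls) -> dict:
        return {"type": "object", "properties": {"name": {"type": "string"}}}


payload = build_groq_response_format(FakeModel)

# Unlike the raw class, the resulting dict serializes cleanly.
print(json.dumps(payload))
```

Each site would then set `groq_params["response_format"]` only when the helper returns a dict, so the streaming and non-streaming paths cannot diverge.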

Environment

  • Honcho: v3.0.6 (commit 317b4a6)
  • Deployment: self-hosted via included docker-compose.yml.example (built from source)
  • Python: 3.13 (from upstream Dockerfile)
  • groq package version: whatever ships with uv sync in v3.0.6
  • Embeddings work fine (using LLM_EMBEDDING_PROVIDER=openrouter, no Groq involvement)
  • AUTH_USE_AUTH=true, JWT-based auth working
  • All non-Groq paths (peer create, message ingest, queue enqueue, reconciler sync_vectors, cleanup_queue) work normally
  • Workaround: switch *_PROVIDER from groq to another provider (openrouter via LLM_OPENAI_COMPATIBLE_*, anthropic, openai, gemini) and the deriver processes successfully

Impact

Anyone who selects Groq as the provider for the deriver, dialectic, summary, or dream features in self-hosted v3.0.6 will hit this on the first structured-output call. The free Groq tier is the obvious choice for low-volume self-hosters, so this likely blocks a non-trivial slice of new self-hosted deployments.
