feat(openai): extra_body passthrough + embeddings dimensions + json_object default by ZVin-Chen · Pull Request #670 · plastic-labs/honcho

ZVin-Chen · 2026-05-11T12:19:42Z

Summary

Three small, additive improvements to the OpenAI backend (src/llm/backends/openai.py + src/embedding_client.py) that broaden compatibility with OpenAI-compatible providers (DeepSeek, Alibaba Bailian, vLLM-hosted models, etc.) without changing behaviour on OpenAI itself.

1. `provider_params` → OpenAI SDK `extra_body` passthrough

ConfiguredModelSettings.overrides.provider_params is documented as a free-form passthrough for backend-specific knobs, but the OpenAI backend's _build_params was cherry-picking a small whitelist (top_p, frequency_penalty, presence_penalty, seed, verbosity) and silently dropping the rest. Any unrecognised key now flows through to the OpenAI SDK's extra_body parameter, which is the canonical way to pass provider-specific fields.

This unblocks e.g.:

[dialectic.levels.minimal.model_config.overrides]
provider_params = { thinking = { type = "disabled" } }

…on DeepSeek's v4 family, which defaults thinking.type=enabled (returns reasoning_content + rejects tool_choice=required).

2. Forward `dimensions` on embedding calls

embedding_client.py already validates that returned embedding length matches vector_dimensions, but it doesn't pass dimensions= on the wire. With Alibaba Bailian's text-embedding-v4, default output is 1024-d but the deployment's pgvector schema typically uses 1536. Calling with dimensions=self.vector_dimensions lets the operator-configured EMBEDDING_VECTOR_DIMENSIONS actually take effect — OpenAI's own text-embedding-3-* accepts the same parameter, so this is portable.

Applied at all three OpenAI embedding call sites (single embed, batch embed, batch embed with text ids).

3. Default structured-output to `json_object` + schema-in-prompt

When a Pydantic class is passed as response_format, the OpenAI backend was first trying chat.completions.parse() (Structured Outputs, json_schema mode), then falling back to a json_schema payload via _create_structured_response. Both paths use json_schema, which DeepSeek-v4 family and several Bailian / vLLM models reject (This response_format type is unavailable now / Failed to deserialize the JSON body).

Default now goes straight to {"type": "json_object"} (universally supported) with the target Pydantic schema injected as a system-message instruction. The existing repair_response_model_json machinery already handles minor JSON shape drift, so the looser enforcement is acceptable in exchange for portability. Streaming path (stream) gets the same treatment.

Trade-off: on OpenAI itself, this gives up strict-mode json_schema in favour of json_object + prompted schema. In practice the parsed output is virtually identical because gpt-* models comply with schema descriptions reliably; the repair logic catches any drift.

Why these three together

They surfaced together while wiring honcho into a stack that uses DeepSeek as the dialectic / deriver LLM and Alibaba Bailian (text-embedding-v4) for embeddings. Each change is independent and small but they share the same motivation: make the OpenAI backend gracefully cover the long tail of OpenAI-compatible providers.

End-to-end verification

Tested against:

LLM: DeepSeek v4-flash via https://api.deepseek.com/v1 with thinking={type:disabled} injected through provider_params
Embeddings: Alibaba Bailian text-embedding-v4 via https://dashscope.aliyuncs.com/compatible-mode/v1, dimensions=1536

Full dialectic chat across all 5 reasoning levels (minimal / low / medium / high / max) returns synthesized answers; deriver creates observations end-to-end; dreamer specialists (deduction + induction) complete a full update_peer_card cycle.

Notes

No new dependencies.
No behaviour change on OpenAI's own endpoints other than (a) embeddings now explicitly request dimensions (matches OpenAI's documented param) and (b) structured outputs route through json_object instead of json_schema (still produces valid Pydantic-parseable JSON via repair logic).
The submodule is wired into a downstream consumer (ZVin-Chen/emotional_agent#17) that exercises the stack end-to-end.

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Fixed embedding request dimension consistency across all operation types to ensure proper vector generation.
- Improved structured JSON output handling with better validation, error detection, and automatic correction mechanisms.
Improvements
- Enhanced provider parameter compatibility by expanding support for additional configuration options.

Provider-specific OpenAI-compatible knobs (DeepSeek's `thinking`, vLLM/SGLang options, etc.) currently can't reach the wire — `_build_params` hard-codes a small whitelist (top_p / freq / presence / seed / verbosity) and silently drops everything else from ModelConfig.provider_params. Forward any unrecognised provider_params key through to the OpenAI SDK's `extra_body` parameter so operators can inject provider-specific fields without backend changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…t to json_object Two related portability improvements for OpenAI-compatible providers that don't implement OpenAI's newest API surface: 1. embedding_client.py: forward `vector_dimensions` to the embeddings endpoint as the `dimensions` parameter on every OpenAI call. Without it, providers like Alibaba Bailian's text-embedding-v4 default to a different output size than the operator-configured EMBEDDING_VECTOR_DIMENSIONS, breaking the pgvector schema match. 2. backends/openai.py: switch structured-output (`response_format=<Pydantic class>`) to `{"type": "json_object"}` and inject the target schema as a system-message instruction, instead of OpenAI Structured Outputs' `json_schema`. Reasons: * DeepSeek's v4 family rejects `json_schema` outright. * Several Bailian / vLLM-hosted models only implement OpenAI's older JSON mode (`json_object`). * OpenAI itself accepts the new shape gracefully. Schema enforcement is a bit looser; `repair_response_model_json` already handles minor drift downstream, so the trade-off favours portability. Applies to both blocking complete() and streaming stream() paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-11T12:20:00Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 65f4d609-0bd2-41cc-a4f0-4997ec204dd8

📥 Commits

Reviewing files that changed from the base of the PR and between a4ae372 and f927a30.

📒 Files selected for processing (2)

src/embedding_client.py
src/llm/backends/openai.py

Disabled knowledge base sources:

Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

Walkthrough

This PR introduces two independent improvements to OpenAI integration: embedding API calls now explicitly pass the configured dimensionality parameter, and structured output handling is refactored from SDK-specific parsing to a portable schema-injection mechanism that works across OpenAI-compatible providers.

Changes

Embedding Dimensions Parameter

Layer / File(s)	Summary
Embedding API Call Sites `src/embedding_client.py`	Three OpenAI embeddings.create call sites now include `dimensions=self.vector_dimensions` parameter: single-query embedding, simple batch embedding, and batch processing.

OpenAI Structured Output Portability

Layer / File(s)	Summary
Backend Imports `src/llm/backends/openai.py`	Imports reworked to remove parse/validation utilities and retain only Pydantic BaseModel and local repair/exception utilities needed for the new portability mechanism.
Structured Response Creation `src/llm/backends/openai.py`	`_create_structured_response()` now forces `json_object` format, generates Pydantic model JSON schema, and injects a schema instruction into the system message.
Complete Method Structured Path `src/llm/backends/openai.py`	`complete()` method with `response_format` as BaseModel now calls `_create_structured_response()`, repairs JSON via `_parse_or_repair_structured_content()`, and normalizes the result, replacing `chat.completions.parse` and `validate_structured_output`.
Stream Method Structured Path `src/llm/backends/openai.py`	`stream()` method now forces `json_object` and injects JSON-schema instruction into the first system message (creating or updating as needed).
Extra Parameters Pass-through `src/llm/backends/openai.py`	`_build_params()` recognizes common OpenAI top-level fields and routes unrecognized `extra_params` keys to `params["extra_body"]` for provider-specific option pass-through.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

embedding_client.py: OpenAI path doesn't send dimensions, silently ignoring EMBEDDING_VECTOR_DIMENSIONS #601: Directly addresses the missing dimensions parameter in OpenAI embeddings API calls across all three embedding code paths.

Poem

🐰 A rabbit hops through vectors bright,
With dimensions tuned just right!
And JSON schemas, now portably borne,
Flow through the system, carefully sworn,
To compatible providers with nary a frown! 🎯

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

ZVin-Chen · 2026-05-11T12:20:30Z

Sorry, opened against the wrong base. Re-targeting to my fork's main.

ZVin-Chen and others added 2 commits May 11, 2026 17:21

ZVin-Chen closed this May 11, 2026

tgtjam mentioned this pull request May 31, 2026

fix(openai-backend): fallback json_schema→json_object for providers without structured output support #751

Open

This was referenced Jun 1, 2026

fix(llm): pass OpenAI provider extra options #760

Closed

fix(llm/openai): pass through extra_body from extra_params #722

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(openai): extra_body passthrough + embeddings dimensions + json_object default#670

feat(openai): extra_body passthrough + embeddings dimensions + json_object default#670
ZVin-Chen wants to merge 2 commits into
plastic-labs:mainfrom
ZVin-Chen:feature/openai-extra-body-passthrough

ZVin-Chen commented May 11, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 11, 2026 •

edited

Loading

Review failed

Uh oh!

ZVin-Chen commented May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ZVin-Chen commented May 11, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

1. provider_params → OpenAI SDK extra_body passthrough

2. Forward dimensions on embedding calls

3. Default structured-output to json_object + schema-in-prompt

Why these three together

End-to-end verification

Notes

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Poem

Uh oh!

ZVin-Chen commented May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ZVin-Chen commented May 11, 2026 •

edited by coderabbitai Bot

Loading

1. `provider_params` → OpenAI SDK `extra_body` passthrough

2. Forward `dimensions` on embedding calls

3. Default structured-output to `json_object` + schema-in-prompt

coderabbitai Bot commented May 11, 2026 •

edited

Loading