Skip to content

[Bug]: API server gateway does not expose model reasoning/thinking blocks in /v1/chat/completions responses #37044

@requiemforennui

Description

@requiemforennui

Bug Description

Description

When using Hermes Agent as a gateway fronting Open WebUI, reasoning blocks generated by the underlying model are displayed correctly in the Hermes CLI/TUI but are missing from the API response sent to Open WebUI. This makes reasoning/thinking traces invisible to downstream clients that rely on the OpenAI-compatible chat completions endpoint.

Environment

Relevant Config

yaml

model:
default: stepfun/step-3.7-flash:free
provider: nous
base_url: https://inference-api.nousresearch.com/v1

agent:
reasoning_effort: medium

display.show_reasoning: true is also enabled in the Hermes config.

Steps to Reproduce

  1. Configure Hermes Agent with a reasoning-capable model and display.show_reasoning: true.
  2. Connect Open WebUI to the Hermes gateway endpoint.
  3. Send a prompt that elicits a visible reasoning chain (e.g., a multi-step problem).
  4. Observe that the Hermes TUI displays the reasoning block, but Open WebUI shows only the final assistant message with no thinking/reasoning section.

Expected Behavior

The reasoning/thinking content should be included in the streamed or final API response payload so that Open WebUI and other downstream consumers can render it.

Actual Behavior

The gateway’s API server adapter strips or omits the reasoning data from the outbound response. Only the final assistant content is delivered. This appears isolated to the API server path; other adapters (TUI, messaging) display reasoning correctly.

Affected Component

Configuration (config.yaml, .env, hermes setup), Gateway (Telegram/Discord/Slack/WhatsApp)

Messaging Platform (if gateway-related)

No response

Debug Report

No debug report available - all analysis is from the Open WebUI chat.

Operating System

Windows 10

Python Version

3.11.15

Hermes Version

0.15.1

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

The API server adapter does not implement reasoning pass-through. There is no documented config toggle to enable it.

Proposed Fix (optional)

Add a config option such as gateway.expose_reasoning (default: false) and, when enabled, include the model’s reasoning content in the API server response stream. Alternatively, always include reasoning when the model returns it and display.show_reasoning is true.

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Metadata

Metadata

Assignees

No one assigned

    Labels

    P2Medium — degraded but workaround existscomp/gatewayGateway runner, session dispatch, deliverytype/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions