Clarify support for Qwen/vLLM reasoning-parser output in custom OpenAI-compatible providers

## Summary

I am investigating Hermes Agent with a local OpenAI-compatible vLLM endpoint serving Qwen reasoning models.

When vLLM is launched with `--reasoning-parser qwen3`, Qwen reasoning output may be separated from normal assistant `content` into reasoning-specific fields such as `reasoning`, `reasoning_content`, or streaming `delta.reasoning`.

Could Hermes clarify whether custom OpenAI-compatible providers are expected to support this response shape, ignore it, or require visible assistant output to be present in `content`?

## Current status

This is not currently reproduced in my stable runtime.

My current working configuration disables `--reasoning-parser qwen3`, so the stable runtime does not exercise the separated reasoning-output path.

This issue is based on an integration investigation and is primarily a support-boundary / documentation clarification question, not a confirmed Hermes bug report.

## Environment

* Hermes Agent: v0.14.0
* Backend: local OpenAI-compatible vLLM endpoint
* Model family: Qwen reasoning model
* vLLM configuration under investigation: `--reasoning-parser qwen3`
* Current stable workaround: reasoning-parser disabled
* Additional workaround: pass `chat_template_kwargs.enable_thinking=false` where supported so the model does not emit thinking/reasoning content

## Observed / suspected behavior

With Qwen/vLLM reasoning-parser enabled, visible assistant `content` may be empty while reasoning output is emitted separately.

If a custom OpenAI-compatible provider primarily reads `content`, the assistant response can appear empty or invalid even though the backend generated reasoning output.

## Workaround

The current stable local configuration avoids this path by:

1. Disabling `--reasoning-parser qwen3`
2. Passing `chat_template_kwargs.enable_thinking=false` to vLLM / Qwen where supported

With thinking disabled, normal assistant output is stable in the current runtime.

## Request

Could Hermes clarify the expected behavior for custom OpenAI-compatible providers when the backend returns reasoning-specific fields?

Specifically:

* Are `reasoning`, `reasoning_content`, or streaming `delta.reasoning` expected to be supported?
* Should Hermes ignore those fields and require visible output in `content`?
* If unsupported, could this be documented for Qwen/vLLM reasoning-parser users?
* If supported, should the custom provider normalize reasoning-separated output before deciding that the response is empty?

## Notes

I am intentionally not filing this as a confirmed Hermes bug, because my current stable runtime disables the reasoning-parser path.

If this response shape is unsupported by design, documenting that limitation would already be helpful.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clarify support for Qwen/vLLM reasoning-parser output in custom OpenAI-compatible providers #38360

Summary

Current status

Environment

Observed / suspected behavior

Workaround

Request

Notes

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Clarify support for Qwen/vLLM reasoning-parser output in custom OpenAI-compatible providers #38360

Description

Summary

Current status

Environment

Observed / suspected behavior

Workaround

Request

Notes

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions