Bug: Switching models mid-session causes API failures. #3304

@byLasri

What happened?

[Image attached to the original issue]

How to reproduce the bug

  1. Start a new session using a model/provider that supports reasoning or “thinking” output (e.g., Gemini).
  2. Send a simple prompt (e.g., "Say OK") and ensure the response includes thinking/reasoning content.
  3. Without clearing the session, switch models using /model to a different provider that strictly follows the OpenAI-compatible format (e.g., Mistral or Groq).
    (Note: Providers like OpenRouter or Gemini typically do not show this issue.)
  4. Send another prompt.

Result

  • The request fails, returning an error (varies depending on the provider).

Important notes

  • This issue is provider/API-related, not model-related.
  • It can occur with any “thinking” model if the target API does not accept reasoning content.
  • It does not occur when switching between providers that both tolerate or support reasoning content (e.g., OpenRouter → Gemini).
  • It does not occur if the session starts directly with the second provider.
  • It does not occur if the conversation is cleared before switching.
  • It only occurs when existing conversation history is reused across different providers.

Impact on users

  • Prevents continuing a conversation after switching providers
  • Forces users to clear the session, resulting in loss of context/history
  • Makes some providers effectively unusable in multi-provider workflows
  • Creates inconsistent behavior across “OpenAI-compatible” APIs, reducing reliability of model switching

Why it happens (step-by-step trace)

Step 1: A reasoning-capable model returns extra data

When you use a model that supports reasoning (like GPT-OSS), the API response includes extra fields beyond the standard message format.

Example prompt:

Say OK

Example response:

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "OK",
      "reasoning_content": "User asked me to say OK"
    }
  }]
}

The important part is the presence of reasoning_content, which is not part of the standard OpenAI message format.
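
As a rough illustration of the difference between the two message shapes, the sketch below models the standard and the extended assistant message and adds a type guard for telling them apart. The type and function names (StandardAssistantMessage, ReasoningAssistantMessage, hasReasoningContent) are hypothetical and do not come from the project's source.

```typescript
// Hypothetical types for illustration; not taken from the project's source.
interface StandardAssistantMessage {
  role: "assistant";
  content: string | null;
}

// Extended shape returned by some reasoning-capable providers.
interface ReasoningAssistantMessage extends StandardAssistantMessage {
  reasoning_content: string;
}

// Type guard: true when the message carries a non-empty reasoning_content string.
function hasReasoningContent(
  message: StandardAssistantMessage,
): message is ReasoningAssistantMessage {
  const value = (message as unknown as Record<string, unknown>)["reasoning_content"];
  return typeof value === "string" && value.length > 0;
}

const fromReasoningModel = {
  role: "assistant" as const,
  content: "OK",
  reasoning_content: "User asked me to say OK",
};
const fromPlainModel = { role: "assistant" as const, content: "OK" };

console.log(hasReasoningContent(fromReasoningModel)); // true
console.log(hasReasoningContent(fromPlainModel)); // false
```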


Step 2: The code captures reasoning_content

During response processing, this field is extracted and attached to the message. In converter.ts (lines 320-340), the code that processes the API response extracts the reasoning:

     const reasoningContent = thoughtParts.join('');
     if (reasoningContent) {
       message.reasoning_content = reasoningContent;
     }

This creates an extended message: it now carries the extra, non-standard reasoning_content field.


Step 3: The message is stored in conversation history

The conversation history keeps the full message:

conversationHistory = [
  { role: "user", content: "Say OK" },
  {
    role: "assistant",
    content: "OK",
    reasoning_content: "User asked me to say OK" // This field is stored
  }
];

At this point, the session is already no longer strictly OpenAI-compatible.
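
One way to make this visible is to scan the stored history for keys outside the standard message shape. The sketch below is only an illustration: the allow-list STANDARD_MESSAGE_KEYS is an assumption (a deliberately small subset of the fields OpenAI-style messages allow), and findNonStandardKeys is a hypothetical helper, not an existing function in the codebase.

```typescript
// Assumption: a small allow-list for illustration only.
const STANDARD_MESSAGE_KEYS = new Set([
  "role",
  "content",
  "name",
  "tool_calls",
  "tool_call_id",
]);

type HistoryMessage = Record<string, unknown>;

// Returns, per message index, the keys that fall outside the allow-list.
function findNonStandardKeys(history: HistoryMessage[]): Map<number, string[]> {
  const findings = new Map<number, string[]>();
  history.forEach((message, index) => {
    const extras = Object.keys(message).filter(
      (key) => !STANDARD_MESSAGE_KEYS.has(key),
    );
    if (extras.length > 0) {
      findings.set(index, extras);
    }
  });
  return findings;
}

const storedHistory: HistoryMessage[] = [
  { role: "user", content: "Say OK" },
  { role: "assistant", content: "OK", reasoning_content: "User asked me to say OK" },
];

// Reports reasoning_content on the message at index 1.
console.log(findNonStandardKeys(storedHistory));
```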


Step 4: The user switches provider

Now the user switches to another provider (e.g., Mistral or Groq) mid-session without clearing the session. The conversation history still contains the old messages with reasoning_content.

The same conversationHistory is reused.


Step 5: Sending messages to the newly switched provider

When sending the next prompt, in pipeline.ts (line 68), the code sends messages to the API:

const openaiResponse = await this.client.chat.completions.create({
  model: config.model,
  messages: messages
});

The payload still includes reasoning_content:

{
  "messages": [
    {
      "role": "assistant",
      "content": "OK",
      "reasoning_content": "User asked me to say OK."
    },
    {
      "role": "user",
      "content": "Say OK again"
    }
  ]
}

There is no sanitization step.


Step 6: The SDK forwards everything unchanged

The OpenAI SDK's client.chat.completions.create() sends the messages to the API endpoint. The SDK does not validate or strip unknown fields - it sends exactly what it receives.
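
The effect can be illustrated with plain JSON serialization, which keeps every property on the message objects. This sketch does not reproduce the SDK's internals; buildRequestBody and the model name "example-model" are placeholders made up for the example.

```typescript
// Illustration only: plain JSON serialization, not the SDK's internals.
const outgoingMessages = [
  { role: "assistant", content: "OK", reasoning_content: "User asked me to say OK" },
  { role: "user", content: "Say OK again" },
];

// Simulates building a request body from the in-memory history.
function buildRequestBody(model: string, messages: object[]): string {
  return JSON.stringify({ model, messages });
}

const requestBody = buildRequestBody("example-model", outgoingMessages);

// The extra field survives serialization unchanged.
console.log(requestBody.includes("reasoning_content")); // true
```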


Step 7: The new provider rejects the request

The strict provider's API validation layer receives the request and checks the message format. Its schema does not include reasoning_content as a valid field, so the request is rejected, for example:

{
  "error": {
    "message": "API Error: 422 status code (no body)",
    "type": "invalid_request_error"
  }
}

Why some providers work and others fail

  • Permissive providers

    • Ignore unknown fields
    • Request succeeds
  • Strict providers

    • Enforce schema validation
    • Request fails when encountering reasoning_content
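
The two behaviors can be mimicked with a pair of mock validators. This is a sketch under assumptions: the allow-list, the 422 status, and both function names are illustrative and do not describe how any specific provider validates requests.

```typescript
type MockMessage = Record<string, unknown>;
type ValidationResult =
  | { ok: true }
  | { ok: false; status: number; reason: string };

// Assumed allow-list for a hypothetical strict provider.
const ALLOWED_KEYS = new Set(["role", "content", "name", "tool_calls", "tool_call_id"]);

// Permissive behavior: unknown fields are silently ignored.
function validatePermissive(_messages: MockMessage[]): ValidationResult {
  return { ok: true };
}

// Strict behavior: any unknown field fails the whole request.
function validateStrict(messages: MockMessage[]): ValidationResult {
  for (const message of messages) {
    const unknownKey = Object.keys(message).find((key) => !ALLOWED_KEYS.has(key));
    if (unknownKey !== undefined) {
      return { ok: false, status: 422, reason: `unknown field: ${unknownKey}` };
    }
  }
  return { ok: true };
}

const reusedHistory: MockMessage[] = [
  { role: "assistant", content: "OK", reasoning_content: "User asked me to say OK" },
  { role: "user", content: "Say OK again" },
];

console.log(validatePermissive(reusedHistory).ok); // true
console.log(validateStrict(reusedHistory).ok); // false
```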

Core problem

  1. reasoning_content is added to messages
  2. It is stored in conversation history
  3. It is reused across providers
  4. It is never removed before sending

The system assumes all OpenAI-compatible APIs accept the same format, but in practice, they enforce different validation rules.


Summary

The failure occurs because:

  • Extra reasoning data is persisted
  • It is sent unchanged after switching providers
  • The new provider rejects the unknown field

This creates a hidden incompatibility that only appears when switching to some providers mid-session.

What did you expect to happen?

I expect to be able to switch providers safely mid-session, without the risk of being unable to use some models within the same session.

Client information

Qwen Code: 0.14.4 (40670d7)
Runtime: Node.js v24.14.1 / npm 11.11.0
OS: linux x64 (6.6.87.2-microsoft-standard-WSL2)
Auth: API Key - openai
Model: qwen/qwen3.5-397b-a17b
Fast Model: qwen/qwen3.5-397b-a17b
Session ID: 6e5c0e9f-b2b1-466c-8502-817535570ae1
Sandbox: no sandbox
Proxy: no proxy
Memory Usage: 476.1 MB

Login information

No response

Anything else we need to know?

How to fix

The fix is up to you, but I would recommend the following.


1. Add setting to ContentGeneratorConfig

File: contentGenerator.ts

export type ContentGeneratorConfig = {
  model: string;
  // ... existing fields ...

  // Controls whether reasoning/thinking history is forwarded
  // Default: false (safe for all providers)
  sendThinkingHistory?: boolean;
};

2. Extend provider interface

File: provider/types.ts

export interface OpenAICompatibleProvider {
  buildHeaders(): Record<string, string | undefined>;
  buildClient(): OpenAI;

  buildRequest(
    request: OpenAI.Chat.ChatCompletionCreateParams,
    userPromptId: string,
  ): OpenAI.Chat.ChatCompletionCreateParams;

  getDefaultGenerationConfig(): GenerateContentConfig;

  // Optional capability flag
  supportsReasoningContent?(): boolean;
}

3. Default provider behavior

File: provider/default.ts

export class DefaultOpenAICompatibleProvider {
  supportsReasoningContent(): boolean {
    return false; // safe default
  }
}

Only explicitly verified providers override this.
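
A minimal sketch of how such an override might look, assuming a simplified stand-in for the provider interface (all other members are omitted). The class names, the legacyProvider object, and providerSupportsReasoning are hypothetical and exist only for this example.

```typescript
// Simplified stand-in for the proposed interface (other members omitted).
interface ReasoningCapability {
  supportsReasoningContent?(): boolean;
}

class DefaultProviderSketch implements ReasoningCapability {
  supportsReasoningContent(): boolean {
    return false; // safe default
  }
}

// Hypothetical subclass for a provider verified to accept reasoning_content.
class VerifiedReasoningProviderSketch extends DefaultProviderSketch {
  supportsReasoningContent(): boolean {
    return true;
  }
}

// A provider object that does not define the optional flag at all.
const legacyProvider: ReasoningCapability = {};

// Treats a missing flag the same as "not supported".
function providerSupportsReasoning(provider: ReasoningCapability): boolean {
  return provider.supportsReasoningContent?.() ?? false;
}

console.log(providerSupportsReasoning(new DefaultProviderSketch())); // false
console.log(providerSupportsReasoning(new VerifiedReasoningProviderSketch())); // true
console.log(providerSupportsReasoning(legacyProvider)); // false
```

Because the method is optional and the fallback is false, providers that never implement the flag are treated as not supporting the field.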


4. Sanitize messages in pipeline (main fix)

File: pipeline.ts

private shouldStripReasoningContent(): boolean {
  const providerSupports =
    this.config.provider.supportsReasoningContent?.() ?? false;

  const userEnabled =
    this.contentGeneratorConfig.sendThinkingHistory ?? false;

  return !(providerSupports && userEnabled);
}

private sanitizeMessages(messages: OpenAI.Chat.ChatCompletionMessageParam[]) {
  if (!this.shouldStripReasoningContent()) {
    return messages;
  }

  return messages.map(({ reasoning_content, ...rest }: any) => rest);
}

Usage:

const sanitizedMessages = this.sanitizeMessages(messages);

const openaiRequest = {
  model: effectiveModel,
  messages: sanitizedMessages,
};
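
The decision logic and the stripping step can also be tried in isolation. The following is a self-contained sketch of the same idea with simplified types; ChatMessageSketch, shouldStrip, and sanitize are illustrative names, not the project's actual API.

```typescript
// Simplified message type for illustration.
interface ChatMessageSketch {
  role: "system" | "user" | "assistant";
  content: string;
  reasoning_content?: string;
}

// Strip unless BOTH the provider supports the field and the user opted in.
function shouldStrip(providerSupports: boolean, userEnabled: boolean): boolean {
  return !(providerSupports && userEnabled);
}

// Returns copies without reasoning_content; the stored history is not mutated.
function sanitize(messages: ChatMessageSketch[], strip: boolean): ChatMessageSketch[] {
  if (!strip) {
    return messages;
  }
  return messages.map((message) => {
    const copy = { ...message };
    delete copy.reasoning_content;
    return copy;
  });
}

const sessionHistory: ChatMessageSketch[] = [
  { role: "user", content: "Say OK" },
  { role: "assistant", content: "OK", reasoning_content: "User asked me to say OK" },
];

const outgoing = sanitize(sessionHistory, shouldStrip(false, false));

console.log(outgoing.some((m) => "reasoning_content" in m)); // false
console.log(sessionHistory.some((m) => "reasoning_content" in m)); // true
```

Working on copies keeps the reasoning in the stored history, so it would still be available if the user later switches back to a provider that accepts it.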

5. Update settings schema

File: settings.json

{
  "model": {
    "name": "mistral-large",
    "sendThinkingHistory": false
  }
}

Key design outcome

  • reasoning_content is opt-in only
  • Providers are safe by default
  • Message sanitization guarantees no cross-provider breakage
  • Switching models mid-session becomes fully safe and deterministic

Summary

The fix ensures that reasoning-related fields are only preserved when:

  • the provider explicitly supports them, and
  • the user explicitly enables them,

Otherwise they are always removed before sending requests. Providers verified to accept the field natively, such as Gemini, can have the capability flag set to true.

Additional benefit

  • Stripping reasoning_content reduces token usage, request size, and context window pressure, improving performance and allowing longer conversations without unnecessary context bloat
