Strict OpenAI-compatible chat completions endpoints reject replayed tool_calls.call_id / response_item_id fields #893

## Bug Description

`run_agent.py` leaks Codex-only replay metadata into the normal `chat.completions` payload. This is not just a Fireworks-specific issue. Any strict OpenAI-compatible chat-completions endpoint can reject the replayed assistant tool call once Hermes sends non-standard fields back in `messages[*].tool_calls[*]`.

Fireworks is a concrete reproduction of the bug and returns a `400` validation error. Other strict OpenAI-compatible endpoints may surface the same schema violation as `422 Unprocessable Entity` instead of `400 Bad Request`.

The offending extra fields are:

- `messages[*].tool_calls[*].call_id`
- `messages[*].tool_calls[*].response_item_id`

These fields are useful for the Codex Responses API replay path, but they are not valid in strict OpenAI-compatible chat-completions payloads.

## Concrete Reproduction

Observed in Hermes logs and request dumps:

```text
Error code: 400 - {'error': {'object': 'error', 'type': 'invalid_request_error', 'code': 'invalid_request_error', 'message': "2 request validation errors: Extra inputs are not permitted, field: 'messages[2].tool_calls[0].call_id', value: 'chatcmpl-tool-a5dbdbac9e5502d9'; Extra inputs are not permitted, field: 'messages[2].tool_calls[0].response_item_id', value: 'fc_chatcmpl-tool-a5dbdbac9e5502d9'"}}
```

I saw the same failure repeatedly on March 10-11, 2026 against:

- base URL: `https://api.fireworks.ai/inference/v1`
- model: `fireworks/minimax-m2p5`

One captured request dump also includes:

- request id: `fab01e5e-0dce-45ca-b07a-d8b28e29b33d`

## Why This Is Broader Than Fireworks

The payload bug is provider-agnostic: Hermes sends fields that are not part of the OpenAI chat-completions message schema, so any endpoint that validates the schema strictly can reject the request.

Observed concrete failure:

- Fireworks returns `400 invalid_request_error`

Expected equivalent failure mode on other strict OpenAI-compatible endpoints:

- `422 Unprocessable Entity`
- other schema-validation errors complaining about extra / unknown / disallowed fields under `messages[*].tool_calls[*]`
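
To check a request dump for this class of problem on any provider, you can scan the outgoing `messages` for tool-call keys outside the standard shape. The sketch below is illustrative only. `find_extra_tool_call_fields` and `STANDARD_TOOL_CALL_KEYS` are hypothetical names, not Hermes code, and the allowed-key set assumes the standard `id` / `type` / `function` shape; a given provider may accept more.

```python
# Hypothetical helper, not part of Hermes. The allowed-key set is an
# assumption based on the standard tool call shape (id, type, function).
STANDARD_TOOL_CALL_KEYS = frozenset({"id", "type", "function"})


def find_extra_tool_call_fields(messages):
    """Return dotted paths of tool-call keys outside the standard shape."""
    extras = []
    for m_idx, message in enumerate(messages):
        # Messages without tool calls (or with tool_calls=None) are skipped.
        for t_idx, tool_call in enumerate(message.get("tool_calls") or []):
            for key in tool_call:
                if key not in STANDARD_TOOL_CALL_KEYS:
                    extras.append(
                        f"messages[{m_idx}].tool_calls[{t_idx}].{key}"
                    )
    return extras
```

Run against the failing payload, this should report the same two paths that appear in the Fireworks error message.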

## Steps to Reproduce

1. Run Hermes on the normal chat-completions path, not `codex_responses`.
2. Use a strict OpenAI-compatible chat-completions endpoint.
3. A concrete reproduction is:
   - `base_url=https://api.fireworks.ai/inference/v1`
   - `model=fireworks/minimax-m2p5`
4. Ask Hermes to do something that triggers a tool call.
5. Let Hermes append the assistant tool call into conversation history.
6. Send the next turn / let Hermes continue the replay.
7. Observe that Hermes sends a `messages` payload whose assistant `tool_calls` carry the extra `call_id` and `response_item_id` fields.
8. Observe that the strict endpoint rejects the request with a schema-validation error.

## Minimal Payload Reproduction

This assistant message shape is enough to trigger the validation failure when replayed through a strict OpenAI-compatible chat-completions endpoint:

```json
{
  "role": "assistant",
  "content": "The update failed because it's looking for `upstream/main` but your git remote is set to `origin`, not `upstream`. Let me check the git setup and fix this.",
  "tool_calls": [
    {
      "id": "chatcmpl-tool-a5dbdbac9e5502d9",
      "call_id": "chatcmpl-tool-a5dbdbac9e5502d9",
      "response_item_id": "fc_chatcmpl-tool-a5dbdbac9e5502d9",
      "type": "function",
      "function": {
        "name": "terminal",
        "arguments": "{\"command\": \"cd ~/.hermes/hermes-agent && git remote -v && git branch -vv\"}"
      }
    }
  ]
}
```

The matching tool result is standard and valid:

```json
{
  "role": "tool",
  "tool_call_id": "chatcmpl-tool-a5dbdbac9e5502d9",
  "content": "{\"output\":\"...\",\"exit_code\":0,\"error\":null}"
}
```

If I remove `call_id` and `response_item_id` from the assistant tool call before sending the chat-completions request, the validation error goes away.

## Expected Behavior

Hermes should preserve Codex-specific replay metadata only for the Codex Responses API path.

For OpenAI-compatible `chat.completions` providers, Hermes should send only the standard tool call shape:

```json
{
  "id": "...",
  "type": "function",
  "function": { "name": "...", "arguments": "..." }
}
```

## Actual Behavior

Hermes replays assistant tool calls from prior turns with Codex-only fields still attached, and strict chat-completions endpoints reject the request as invalid.

This breaks tool follow-up turns on strict OpenAI-compatible chat-completions providers even though the first tool call itself succeeds.

## Suspected Root Cause

`run_agent.py::_build_api_kwargs()` appears to reuse stored assistant messages directly for the chat-completions path. Those stored messages may contain replay metadata produced for the Codex Responses path:

- `codex_reasoning_items`
- `tool_calls[*].call_id`
- `tool_calls[*].response_item_id`

That metadata should not be passed through unchanged to strict chat-completions providers.

## Proposed Fix

In the non-`codex_responses` branch of `_build_api_kwargs()`:

1. Deep-copy outgoing messages before sending them to `chat.completions`.
2. Strip `codex_reasoning_items` from assistant messages.
3. Strip `call_id` and `response_item_id` from each `tool_calls[*]`.
4. Keep the standard `tool_calls[*].id`, `type`, and `function` fields intact, along with any other fields the target provider accepts.
5. Do not mutate the stored conversation history, because the Codex Responses replay path still needs this metadata.
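
The steps above could be sketched roughly as follows. This is a minimal sketch, not the actual Hermes implementation. `sanitize_messages_for_chat_completions` and the two key tuples are hypothetical names, and the code assumes stored messages and tool calls are plain dicts.

```python
import copy

# Assumed lists of Codex-only keys, taken from the fields named in this report.
CODEX_ONLY_MESSAGE_KEYS = ("codex_reasoning_items",)
CODEX_ONLY_TOOL_CALL_KEYS = ("call_id", "response_item_id")


def sanitize_messages_for_chat_completions(messages):
    """Return a sanitized deep copy; the stored history is left untouched."""
    sanitized = copy.deepcopy(messages)
    for message in sanitized:
        if message.get("role") != "assistant":
            continue
        # Drop message-level Codex replay metadata.
        for key in CODEX_ONLY_MESSAGE_KEYS:
            message.pop(key, None)
        # Drop tool-call-level Codex replay metadata; every other key
        # (id, type, function, provider extras) is kept as-is.
        for tool_call in message.get("tool_calls") or []:
            for key in CODEX_ONLY_TOOL_CALL_KEYS:
                tool_call.pop(key, None)
    return sanitized
```

Removing only the known Codex-only keys, rather than allow-listing `id` / `type` / `function`, keeps any provider-specific fields intact, at the cost of having to extend the lists if more Codex-only metadata is added later.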

## Regression Test Suggestion

Add a provider-parity test that:

1. Builds a chat-completions request with an assistant tool call containing:
   - `id`
   - `call_id`
   - `response_item_id`
2. Verifies the outgoing `messages` payload still contains:
   - `id`
   - `function.name`
   - `function.arguments`
3. Verifies the outgoing payload does **not** contain:
   - `call_id`
   - `response_item_id`
   - `codex_reasoning_items`
4. Verifies the original stored history is unchanged.
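
A rough shape for such a test is sketched below. It is an illustration under assumptions, not a drop-in test. `check_chat_completions_messages` is a hypothetical name, and `build_messages` stands in for whatever code turns stored history into the outgoing `messages` list, because the exact call signature of `_build_api_kwargs()` is not shown in this report.

```python
import copy


def check_chat_completions_messages(build_messages):
    """Run the parity checks against a message-building callable.

    `build_messages` (assumption) maps stored history to the outgoing
    chat-completions `messages` list.
    """
    history = [
        {"role": "user", "content": "check the git remotes"},
        {
            "role": "assistant",
            "content": "",
            "codex_reasoning_items": [{"type": "reasoning"}],
            "tool_calls": [
                {
                    "id": "call_1",
                    "call_id": "call_1",
                    "response_item_id": "fc_call_1",
                    "type": "function",
                    "function": {"name": "terminal", "arguments": "{}"},
                }
            ],
        },
        {"role": "tool", "tool_call_id": "call_1", "content": "{}"},
    ]
    snapshot = copy.deepcopy(history)

    outgoing = build_messages(history)
    assistant = outgoing[1]
    tool_call = assistant["tool_calls"][0]

    # Standard fields survive.
    assert tool_call["id"] == "call_1"
    assert tool_call["function"]["name"] == "terminal"
    assert tool_call["function"]["arguments"] == "{}"

    # Codex-only replay metadata is gone from the outgoing payload.
    assert "call_id" not in tool_call
    assert "response_item_id" not in tool_call
    assert "codex_reasoning_items" not in assistant

    # The stored history is unchanged.
    assert history == snapshot
```

In a real test, `build_messages` would wrap the request-building code under test and return the `messages` value it produces. An implementation that passes history through unchanged, or that strips the fields in place, should fail these checks.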

## Environment

- Repo: `NousResearch/hermes-agent`
- Reproduced against upstream `main` before local patching
- Provider path: standard `chat.completions`
- Concrete failing endpoint: `https://api.fireworks.ai/inference/v1`
- Concrete model: `fireworks/minimax-m2p5`
- Fireworks observed status: `400`
- Other strict OpenAI-compatible endpoints may return: `422`
- Observed on March 10-11, 2026

