fix: extract real token usage in /v1/chat/completions response (closes #38735) by Br1an67 · Pull Request #46293 · openclaw/openclaw

Br1an67 · 2026-03-14T15:35:21Z

Summary

Problem: /v1/chat/completions returns hardcoded zero usage tokens
What changed: Usage now extracted from agentCommandFromIngress result.meta
What did NOT change: Chat completion response format, streaming behavior

Change Type

Bug fix

Linked Issue/PR

Closes #38735 (/v1/chat/completions returns hardcoded usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 })

Security Impact

All No

Evidence

pnpm build + pnpm check + pnpm test all passing

Compatibility / Migration

Backward compatible? Yes — same response shape, just with real values

Failure Recovery

How to revert: revert this commit

This PR was AI-assisted (fully tested with pnpm build/check/test).

greptile-apps · 2026-03-14T15:39:36Z

Greptile Summary

This PR fixes a long-standing bug (#38735) where /v1/chat/completions always returned hardcoded zero token counts by extracting real usage from result.meta.agentMeta.usage. The non-streaming fix is correct and straightforward. However, the streaming fix has a critical flaw that makes it effectively inoperative in normal usage, and the streaming path also deviates from the OpenAI spec around when usage should be emitted.

Key changes:

Adds resolveUsage(result) helper to extract prompt_tokens, completion_tokens, and total_tokens from the agent result's metadata — defensive and correctly typed.
Non-streaming path (lines 500–518): Works correctly. resolveUsage is called on the resolved result and the real token counts are included in the response.
Streaming path (lines 604–614): The new usage SSE chunk is unreachable in normal operation. emitAgentEvent dispatches listeners synchronously (agent-events.ts), so the lifecycle "end" event fires and sets closed = true while agentCommandFromIngress is still resolving. By the time the await resumes, the if (closed) return guard at line 583 exits early — the usage chunk is never written.
Spec deviation: The OpenAI streaming spec only includes a usage chunk when the request sets stream_options: { include_usage: true }. This PR writes the usage chunk unconditionally, which may confuse spec-compliant clients.

Confidence Score: 2/5

The streaming usage fix is effectively a no-op in normal operation due to a synchronous event dispatch ordering issue; merging as-is would give a false sense that streaming usage is fixed when it is not.
The non-streaming fix is correct and safe. However, the streaming half of the fix has a confirmed logic bug: emitAgentEvent is synchronous, so the lifecycle "end" handler runs and sets closed = true before agentCommandFromIngress returns, causing the if (closed) return early-exit to silently drop the usage chunk on every normal streaming call. The bug does not cause data corruption or a crash, but it means the stated goal of the PR (real token counts in streaming responses) is not achieved.
src/gateway/openai-http.ts — specifically the streaming async IIFE (lines 579–639) and the interaction between the lifecycle event handler and the closed guard.
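
The ordering problem described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `emitEvent`, `runAgent`, and `streamingHandler` are hypothetical stand-ins for `emitAgentEvent`, `agentCommandFromIngress`, and the streaming IIFE, and only the synchronous dispatch behaviour is taken from the review.

```typescript
// Hypothetical stand-ins; only the ordering of events matters here.
type AgentEvent = { stream: string; data: { phase: string } };
type Listener = (evt: AgentEvent) => void;

const listeners: Listener[] = [];
const log: string[] = [];
let closed = false;

// Dispatches to listeners synchronously, as the review says emitAgentEvent does.
function emitEvent(evt: AgentEvent): void {
  for (const listener of listeners) listener(evt);
}

// Emits lifecycle "end" first and only afterwards resolves with its result.
async function runAgent(): Promise<{ usage: { input: number; output: number } }> {
  emitEvent({ stream: "lifecycle", data: { phase: "end" } });
  return { usage: { input: 12, output: 5 } };
}

// Plays the role of the lifecycle handler that closes the stream.
listeners.push((evt) => {
  if (evt.stream === "lifecycle" && evt.data.phase === "end") {
    closed = true;
    log.push("handler: closed = true");
  }
});

async function streamingHandler(): Promise<void> {
  const result = await runAgent();
  if (closed) {
    log.push("after await: closed already true, usage chunk skipped");
    return;
  }
  log.push(`usage written: ${result.usage.input + result.usage.output}`);
}

void streamingHandler();
// The listener has already run by the time the call above returns.
log.push("caller: streamingHandler() returned");
```

In this sketch the listener entry is logged before the caller's entry, and the branch that would write usage is never taken, which mirrors the behaviour the review attributes to the streaming path.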

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/gateway/openai-http.ts
Line: 604-614

Comment:
**Streaming usage chunk is dead code in normal operation**

`emitAgentEvent` (in `agent-events.ts`) dispatches listeners **synchronously** via a `for` loop. In `agentCommandInternal` (`agent.ts` line 810), the lifecycle `"end"` event is emitted *before* the function returns — meaning the `onAgentEvent` handler in this file runs synchronously, sets `closed = true`, calls `writeDone(res)`, and calls `res.end()` **while `agentCommandFromIngress` is still resolving its Promise**.

By the time `await agentCommandFromIngress(...)` at line 581 resumes, `closed` is already `true`. The check at line 583 (`if (closed) return;`) then exits early, so the usage chunk below is never reached in any normal streaming run.

Evidence from `agent.ts`:
```
// line 810 — emitted synchronously, BEFORE agentCommandInternal returns
emitAgentEvent({ runId, stream: "lifecycle", data: { phase: "end", endedAt: Date.now() } });

// line 819 — execution continues (persist transcript, build return value) ...
// ... then agentCommandFromIngress finally resolves
```

One approach to fix this: move the usage write to *before* the `if (closed) return` guard, using `res.writableEnded` as the safety check instead of `closed`:

```
const result = await agentCommandFromIngress(commandInput, defaultRuntime, deps);

// Write usage before the early return so it is sent even when the lifecycle
// handler has already set `closed = true`. Guard on the actual socket state instead.
if (!res.writableEnded) {
  const usage = resolveUsage(result);
  writeSse(res, {
    id: runId,
    object: "chat.completion.chunk",
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [{ index: 0, delta: {}, finish_reason: "stop" }],
    usage,
  });
}

if (closed) {
  return;
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/gateway/openai-http.ts
Line: 604-614

Comment:
**`stream_options.include_usage` not checked**

The OpenAI streaming spec only includes a usage chunk when the request contains `stream_options: { include_usage: true }`. Emitting the usage chunk unconditionally means callers that do not set this option will receive an unexpected extra SSE event, which could confuse well-behaved OpenAI-compatible clients that follow the spec strictly.

Consider gating the streaming usage chunk on the request field:

```typescript
const includeUsage =
  typeof (payload as { stream_options?: { include_usage?: unknown } }).stream_options
    ?.include_usage === "boolean"
    ? (payload as { stream_options: { include_usage: boolean } }).stream_options.include_usage
    : false;
```

Then only write the usage chunk when `includeUsage` is `true`. The non-streaming path is unaffected because usage is always included there per the spec.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: f667b11

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f667b11d39

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-03-14T15:39:41Z

+  return {
+    prompt_tokens: prompt,
+    completion_tokens: completion,
+    total_tokens: prompt + completion,


Preserve usage.total when mapping OpenAI token counts

result.meta.agentMeta.usage can carry a total value that is not equal to input + output (for example when cache tokens are counted, or when only total usage is available), but this code always recomputes total_tokens from prompt/completion only. That causes /v1/chat/completions to under-report total usage and can still emit 0 totals for runs that reported non-zero totals, so the mapping should prefer usage.total when present and only fall back to the sum.

greptile-apps · 2026-03-14T15:39:40Z

+      if (!closed) {
+        const usage = resolveUsage(result);
+        writeSse(res, {
+          id: runId,
+          object: "chat.completion.chunk",
+          created: Math.floor(Date.now() / 1000),
+          model,
+          choices: [{ index: 0, delta: {}, finish_reason: "stop" }],
+          usage,
+        });
+      }


Streaming usage chunk is dead code in normal operation

emitAgentEvent (in agent-events.ts) dispatches listeners synchronously via a for loop. In agentCommandInternal (agent.ts line 810), the lifecycle "end" event is emitted before the function returns — meaning the onAgentEvent handler in this file runs synchronously, sets closed = true, calls writeDone(res), and calls res.end() while agentCommandFromIngress is still resolving its Promise.

By the time await agentCommandFromIngress(...) at line 581 resumes, closed is already true. The check at line 583 (if (closed) return;) then exits early, so the usage chunk below is never reached in any normal streaming run.

Evidence from agent.ts:

    // line 810: emitted synchronously, BEFORE agentCommandInternal returns
    emitAgentEvent({ runId, stream: "lifecycle", data: { phase: "end", endedAt: Date.now() } });

    // line 819: execution continues (persist transcript, build return value) ...
    // ... then agentCommandFromIngress finally resolves

One approach to fix this: move the usage write to before the if (closed) return guard, using res.writableEnded as the safety check instead of closed:

    const result = await agentCommandFromIngress(commandInput, defaultRuntime, deps);

    // Write usage before the early return so it is sent even when the lifecycle
    // handler has already set `closed = true`. Guard on the actual socket state instead.
    if (!res.writableEnded) {
      const usage = resolveUsage(result);
      writeSse(res, {
        id: runId,
        object: "chat.completion.chunk",
        created: Math.floor(Date.now() / 1000),
        model,
        choices: [{ index: 0, delta: {}, finish_reason: "stop" }],
        usage,
      });
    }

    if (closed) {
      return;
    }

greptile-apps · 2026-03-14T15:39:41Z

+      if (!closed) {
+        const usage = resolveUsage(result);
+        writeSse(res, {
+          id: runId,
+          object: "chat.completion.chunk",
+          created: Math.floor(Date.now() / 1000),
+          model,
+          choices: [{ index: 0, delta: {}, finish_reason: "stop" }],
+          usage,
+        });
+      }


stream_options.include_usage not checked

The OpenAI streaming spec only includes a usage chunk when the request contains stream_options: { include_usage: true }. Emitting the usage chunk unconditionally means callers that do not set this option will receive an unexpected extra SSE event, which could confuse well-behaved OpenAI-compatible clients that follow the spec strictly.

Consider gating the streaming usage chunk on the request field:

    const includeUsage =
      typeof (payload as { stream_options?: { include_usage?: unknown } }).stream_options
        ?.include_usage === "boolean"
        ? (payload as { stream_options: { include_usage: boolean } }).stream_options.include_usage
        : false;

Then only write the usage chunk when includeUsage is true. The non-streaming path is unaffected because usage is always included there per the spec.
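
As an illustration of that gating, here is a small sketch of how the tail of the SSE stream could be assembled. `wantsUsageChunk` and `buildStreamTail` are hypothetical helper names, not functions from the PR. The sketch follows the OpenAI convention that the usage-only chunk carries an empty `choices` array and is sent just before `data: [DONE]`, which differs from the `finish_reason: "stop"` chunk in the diff above.

```typescript
type Usage = { prompt_tokens: number; completion_tokens: number; total_tokens: number };

// Reads stream_options.include_usage defensively; anything other than `true` disables it.
function wantsUsageChunk(payload: unknown): boolean {
  if (typeof payload !== "object" || payload === null) return false;
  const opts = (payload as { stream_options?: unknown }).stream_options;
  if (typeof opts !== "object" || opts === null) return false;
  return (opts as { include_usage?: unknown }).include_usage === true;
}

// Builds the final SSE frames of a stream: an optional usage chunk, then [DONE].
function buildStreamTail(payload: unknown, id: string, model: string, usage: Usage): string[] {
  const frames: string[] = [];
  if (wantsUsageChunk(payload)) {
    const chunk = {
      id,
      object: "chat.completion.chunk",
      created: Math.floor(Date.now() / 1000),
      model,
      choices: [], // the usage-only chunk carries no choices
      usage,
    };
    frames.push(`data: ${JSON.stringify(chunk)}\n\n`);
  }
  frames.push("data: [DONE]\n\n");
  return frames;
}
```

With a request body of `{ stream: true }` the tail is just the `[DONE]` frame; adding `stream_options: { include_usage: true }` yields one extra frame ahead of it.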

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c49a5c52db

chatgpt-codex-connector · 2026-03-14T16:35:17Z

+  return {
+    prompt_tokens: prompt,
+    completion_tokens: completion,
+    total_tokens: meta?.total ?? prompt + completion,


Count cache tokens in fallback total_tokens calculation

When agentMeta.usage.total is missing, this fallback computes total_tokens as prompt + completion, but prompt is taken from usage.input only. In this codebase, usage can include cacheRead/cacheWrite separately (for example, CLI parsing in src/agents/cli-runner/helpers.ts populates cache fields while total remains optional), so cache-hit runs will still under-report usage.total_tokens on /v1/chat/completions. The fallback should include cache fields (or reuse a shared usage normalizer) so totals stay accurate when total is absent.
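
One way to address both total-related comments is sketched below. This is an assumption-laden illustration rather than the PR's `resolveUsage`: the `AgentUsage` field names (`input`, `output`, `cacheRead`, `cacheWrite`, `total`) are taken from the review text, and folding cache tokens into `prompt_tokens` is a design choice made for this sketch, not something the source confirms.

```typescript
// Assumed shape of result.meta.agentMeta.usage, based on the review comments.
type AgentUsage = {
  input?: number;
  output?: number;
  cacheRead?: number;
  cacheWrite?: number;
  total?: number;
};

type OpenAiUsage = { prompt_tokens: number; completion_tokens: number; total_tokens: number };

// Coerces missing, negative, or non-finite values to 0.
function count(value: unknown): number {
  return typeof value === "number" && Number.isFinite(value) && value > 0 ? Math.floor(value) : 0;
}

// Hypothetical mapper: prefer the reported total, otherwise sum all known parts.
function resolveUsageSketch(usage: AgentUsage | undefined): OpenAiUsage {
  const prompt = count(usage?.input) + count(usage?.cacheRead) + count(usage?.cacheWrite);
  const completion = count(usage?.output);
  const reportedTotal = count(usage?.total);
  return {
    prompt_tokens: prompt,
    completion_tokens: completion,
    total_tokens: reportedTotal > 0 ? reportedTotal : prompt + completion,
  };
}
```

For example, `resolveUsageSketch({ input: 10, output: 5, cacheRead: 100 })` falls back to a computed total of 115, while `resolveUsageSketch({ total: 42 })` keeps the reported 42 even though no breakdown is available.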

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dc1ff0d0d1

chatgpt-codex-connector · 2026-03-15T07:59:47Z

+      if (!closed) {
+        const usage = resolveUsage(result);


Emit streaming usage before lifecycle closes response

In the streaming path, this usage chunk is guarded by if (!closed), but closed is set to true as soon as a lifecycle phase: "end"/"error" event arrives, and agentCommandFromIngress emits one of those before returning in normal runs (src/commands/agent.ts). That means the new usage write is skipped for real streamed completions, so clients still won’t receive token usage despite this change. Consider deferring res.end() until after post-run usage emission or moving usage emission into the lifecycle-end handling path.
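
The first option, deferring `res.end()` until after usage emission, could look roughly like the sketch below. `DeferredStreamFinalizer` and `SseSink` are hypothetical names introduced for illustration; the real handler writes to a Node `ServerResponse`. Note that a guard on `res.writableEnded` alone would likely not be enough if the lifecycle handler has already called `res.end()`, because `writableEnded` is true once `end()` has been called.

```typescript
type Usage = { prompt_tokens: number; completion_tokens: number; total_tokens: number };

// Minimal stand-in for the HTTP response used by the streaming path.
interface SseSink {
  write(frame: string): void;
  end(): void;
}

class DeferredStreamFinalizer {
  private readonly sink: SseSink;
  private readonly includeUsage: boolean;
  private lifecycleEnded = false;
  private finalized = false;

  constructor(sink: SseSink, includeUsage: boolean) {
    this.sink = sink;
    this.includeUsage = includeUsage;
  }

  // The lifecycle "end"/"error" listener calls this instead of ending the response.
  markLifecycleEnded(): void {
    this.lifecycleEnded = true;
  }

  // Content deltas are only accepted while the run is still in progress.
  acceptsContent(): boolean {
    return !this.lifecycleEnded && !this.finalized;
  }

  // Called once the agent command has resolved (or failed, with no usage).
  finalize(usage?: Usage): void {
    if (this.finalized) return; // idempotent: a second call writes nothing
    this.finalized = true;
    if (this.includeUsage && usage) {
      const chunk = { object: "chat.completion.chunk", choices: [], usage };
      this.sink.write(`data: ${JSON.stringify(chunk)}\n\n`);
    }
    this.sink.write("data: [DONE]\n\n");
    this.sink.end();
  }
}

// Usage: the lifecycle event arrives first, the usage still goes out before [DONE].
const frames: string[] = [];
const state = { ended: false };
const finalizer = new DeferredStreamFinalizer(
  { write: (frame) => { frames.push(frame); }, end: () => { state.ended = true; } },
  true,
);
finalizer.markLifecycleEnded();
finalizer.finalize({ prompt_tokens: 12, completion_tokens: 5, total_tokens: 17 });
```

The design choice here is that exactly one code path owns `[DONE]` and `end()`, so the order in which the lifecycle event and the resolved result arrive no longer matters.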

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0fdd003b2e

chatgpt-codex-connector · 2026-03-15T22:24:46Z

+    id === defaultSlotIdForKey("contextEngine") &&
+    normalizedOwner !== CORE_CONTEXT_ENGINE_OWNER
+  ) {
+    return { ok: false, existingOwner: CORE_CONTEXT_ENGINE_OWNER };


Register default context engine through privileged owner

This guard blocks registrations for defaultSlotIdForKey("contextEngine") unless the owner is core, but built-in startup still registers legacy via registerLegacyContextEngine() → registerContextEngine("legacy", ...) (public owner), and that failure result is ignored. As a result the default legacy engine never enters the registry, so resolveContextEngine() throws Context engine "legacy" is not registered on normal agent paths that call ensureContextEnginesInitialized() before resolution, turning default runs into hard failures unless a custom engine is configured.
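
A minimal sketch of the fix direction implied by this comment is shown below: the built-in engine is registered under the privileged owner, and a failed registration is surfaced rather than ignored. Everything here is hypothetical scaffolding. The registry, `registerEngine`, `registerDefaultEngine`, and the `"core"`/`"public"` owner values are stand-ins for `registerContextEngine`, `registerLegacyContextEngine`, and `CORE_CONTEXT_ENGINE_OWNER`, whose real signatures are not shown in the source.

```typescript
// Hypothetical constants; the real values live in the project's registry module.
const CORE_OWNER = "core";
const DEFAULT_ENGINE_ID = "legacy";

type EngineFactory = () => unknown;
type RegisterResult = { ok: true } | { ok: false; existingOwner: string };

const registry = new Map<string, { owner: string; factory: EngineFactory }>();

// Mirrors the guard in the diff: only the core owner may claim the default slot.
function registerEngine(id: string, factory: EngineFactory, owner: string = "public"): RegisterResult {
  if (id === DEFAULT_ENGINE_ID && owner !== CORE_OWNER) {
    return { ok: false, existingOwner: CORE_OWNER };
  }
  const existing = registry.get(id);
  if (existing && existing.owner !== owner) {
    return { ok: false, existingOwner: existing.owner };
  }
  registry.set(id, { owner, factory });
  return { ok: true };
}

// Built-in startup path: register through the privileged owner and fail loudly.
function registerDefaultEngine(factory: EngineFactory): void {
  const result = registerEngine(DEFAULT_ENGINE_ID, factory, CORE_OWNER);
  if (!result.ok) {
    throw new Error(`default context engine registration failed (owned by ${result.existingOwner})`);
  }
}
```

Under these assumptions a public registration of `"legacy"` is still rejected, while `registerDefaultEngine` places the engine in the registry so that later resolution can find it.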

Br1an67 · 2026-03-17T10:01:55Z

Closing to manage active PR count. Will reopen when slot is available.

openclaw-barnacle Bot added the gateway (Gateway runtime) and size: XS labels on Mar 14, 2026

chatgpt-codex-connector Bot reviewed Mar 14, 2026

greptile-apps Bot reviewed Mar 14, 2026

Br1an67 force-pushed the fix/38735 branch from f667b11 to c49a5c5 on March 14, 2026 16:29

chatgpt-codex-connector Bot reviewed Mar 14, 2026

Br1an67 force-pushed the fix/38735 branch from c49a5c5 to dc1ff0d on March 15, 2026 07:54

chatgpt-codex-connector Bot reviewed Mar 15, 2026

fix: extract real token usage in /v1/chat/completions response (closes …

0fdd003

…openclaw#38735) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Br1an67 force-pushed the fix/38735 branch from dc1ff0d to 0fdd003 on March 15, 2026 22:19

openclaw-barnacle Bot added the docs (Improvements or additions to documentation) and size: S labels and removed the size: XS label on Mar 15, 2026

chatgpt-codex-connector Bot reviewed Mar 15, 2026

Br1an67 closed this Mar 17, 2026

Lellansin mentioned this pull request Apr 8, 2026

fix(gateway): return real usage for OpenAI-compatible chat completions #62986

Merged

25 tasks

Br1an67 wants to merge 1 commit into openclaw:main from Br1an67:fix/38735
