Context
DataDesigner currently records model token usage as input/output tokens only. For reasoning models, providers often expose reasoning tokens as a breakdown inside output/completion usage rather than as an additional total.
Current code path:
- Usage only carries input_tokens, output_tokens, total_tokens, and image usage.
- extract_usage() maps prompt_tokens/completion_tokens or input_tokens/output_tokens, but does not read provider-specific reasoning-token breakdowns.
- TokenUsageStats only exposes input/output tokens.
Provider behavior observed from docs
- OpenAI Chat Completions reports completion_tokens_details.reasoning_tokens; those reasoning tokens are included in completion_tokens.
- OpenAI Responses reports output_tokens_details.reasoning_tokens; those reasoning tokens are included in output_tokens.
- Anthropic extended thinking is charged as output-token usage, although the visible thinking content may be summarized or omitted.
- vLLM exposes reasoning content via message.reasoning; we should confirm from a representative vLLM response whether its reported completion_tokens includes reasoning tokens in our supported server configuration.
Problem
DataDesigner likely reports total provider-billed output tokens correctly for OpenAI/Anthropic-style usage, but it drops the separate reasoning-token count. As a result, users cannot see how much of the output usage came from reasoning (hidden or visible) versus final-answer tokens.
Proposal
Add optional reasoning-token tracking through the canonical model usage path:
- Preserve existing output_tokens behavior exactly. output_tokens should continue to mean whatever the provider reports as output/completion tokens.
- Extend provider usage parsing to capture completion_tokens_details.reasoning_tokens, output_tokens_details.reasoning_tokens, and any top-level provider variant if needed.
- Add a reasoning_tokens field to canonical usage stats as a separate breakdown only. Do not add it again to output_tokens or total_tokens, since providers already include reasoning tokens in output/completion token counts when they report usage that way.
- Keep telemetry out of scope for this issue.
- Add provider-shape tests for OpenAI Chat Completions, OpenAI Responses-style usage, Anthropic thinking usage, and a captured vLLM response if available.
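A minimal sketch of the parsing step, assuming usage arrives as a plain dict. The helper name extract_reasoning_tokens and the sample token counts are hypothetical and are not DataDesigner's actual extract_usage() signature or real provider output; only the two nested field names come from the provider behavior listed above.

```python
from typing import Any, Mapping, Optional


def _as_count(value: Any) -> Optional[int]:
    """Accept only real integers (bool is an int subclass, so exclude it)."""
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    return None


# Hypothetical helper: the name and the dict-based input are assumptions.
def extract_reasoning_tokens(usage: Mapping[str, Any]) -> Optional[int]:
    """Return the provider-reported reasoning-token count, or None if absent."""
    # Nested breakdowns: Chat Completions first, then Responses-style.
    for details_key in ("completion_tokens_details", "output_tokens_details"):
        details = usage.get(details_key)
        if isinstance(details, Mapping):
            count = _as_count(details.get("reasoning_tokens"))
            if count is not None:
                return count
    # Top-level variant, in case a provider reports it that way.
    return _as_count(usage.get("reasoning_tokens"))


# Illustrative payloads with made-up numbers.
chat_usage = {
    "prompt_tokens": 12,
    "completion_tokens": 80,
    "total_tokens": 92,
    "completion_tokens_details": {"reasoning_tokens": 64},
}
responses_usage = {
    "input_tokens": 12,
    "output_tokens": 80,
    "total_tokens": 92,
    "output_tokens_details": {"reasoning_tokens": 64},
}
no_breakdown_usage = {"input_tokens": 12, "output_tokens": 80}
```

Returning None rather than 0 when no breakdown is present keeps "provider did not report it" distinguishable from "provider reported zero reasoning tokens".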
Acceptance criteria
- Existing output_tokens totals remain backward compatible.
- Reasoning token counts are preserved when providers report them.
- Reasoning tokens are not double-counted in output_tokens or total_tokens.
- Telemetry schema/events are not changed as part of this issue.
- Behavior is documented clearly enough that users understand output_tokens includes reasoning tokens for providers that report usage that way.
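The no-double-counting criterion can be illustrated with a small sketch. UsageSketch is a hypothetical stand-in for the canonical usage type, not the existing Usage class, and the derived answer_tokens property is an assumption added only to show that reasoning_tokens is a breakdown of output_tokens rather than an extra total. The token counts are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UsageSketch:
    """Hypothetical canonical usage record with an optional breakdown field."""

    input_tokens: int
    output_tokens: int  # whatever the provider reports, reasoning included
    total_tokens: int
    reasoning_tokens: Optional[int] = None  # breakdown only, never summed in

    @property
    def answer_tokens(self) -> Optional[int]:
        """Output tokens not attributed to reasoning, when a breakdown exists."""
        if self.reasoning_tokens is None:
            return None
        # Clamp at zero in case a provider reports an inconsistent breakdown.
        return max(self.output_tokens - self.reasoning_tokens, 0)


with_breakdown = UsageSketch(
    input_tokens=12, output_tokens=80, total_tokens=92, reasoning_tokens=64
)
without_breakdown = UsageSketch(input_tokens=12, output_tokens=80, total_tokens=92)

# Totals are identical with or without the breakdown: nothing is counted twice.
assert with_breakdown.total_tokens == without_breakdown.total_tokens
assert with_breakdown.output_tokens == without_breakdown.output_tokens
```

A provider-shape test could assert the same invariant: adding reasoning_tokens to a payload changes neither output_tokens nor total_tokens in the canonical record.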