fix: cache hit rate stuck at 0% + image_generation request counters + auto-precision display #423
Merged
icebear0828 merged 2 commits into dev on Apr 27, 2026
…n request counters
Three orthogonal dashboard accuracy issues, fixed together:
1) **Cache hit rate stuck at 0%** — OpenAI / Anthropic / Gemini upstream
adapters were synthesizing `response.completed` with hardcoded
`input_tokens_details: {}`, dropping the cache hit info that the native
APIs do return. Now extract `prompt_tokens_details.cached_tokens`
(OpenAI), `cache_read_input_tokens` (Anthropic message_start +
message_delta), and `cachedContentTokenCount` (Gemini explicit caches),
and surface them under the standard Codex shape so the existing parsers
pick them up.
2) **No image generation request counter** — PR #422 added image token
tracking but no count of image requests (success vs failure). Now
detect `tools[].type === "image_generation"` at request parse time,
propagate `expectsImageGen` through proxy-handler, and on every
release call site (success / EmptyResponse / upstream errors)
classify as success (image_output_tokens > 0) or failed. Adds:
- AccountUsage: image_request_count, image_request_failed_count
(+ window mirrors)
- UsageSnapshot/Baseline/DataPoint/Summary: same
- /admin/usage-stats/summary: total_image_request_count,
total_image_request_failed_count
- Dashboard "Image Requests" card showing N ok · M failed
- AccountCard window image requests row when activity present
3) **Hit rate display rounded sub-0.05% values to 0.0%** — formatHitRate
now uses auto precision: ≥1% one decimal, 0.01-1% two decimals,
>0 but <0.01% shows "<0.01%", =0 shows "0%".
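
The precision bands above can be sketched as follows. This is a minimal illustration, not the PR's code: the real `formatHitRate` signature is not shown here, so the two-argument form (cached tokens, input tokens) is an assumption.

```typescript
// Hypothetical re-implementation of the auto-precision rule described above.
// Assumption: the function receives raw token counts; the real signature may differ.
function formatHitRate(cachedTokens: number, inputTokens: number): string {
  // No input or no cached tokens: show a plain zero.
  if (inputTokens <= 0 || cachedTokens <= 0) return "0%";
  const pct = (cachedTokens / inputTokens) * 100;
  if (pct >= 1) return `${pct.toFixed(1)}%`;    // >= 1%: one decimal
  if (pct >= 0.01) return `${pct.toFixed(2)}%`; // 0.01% to 1%: two decimals
  return "<0.01%";                              // non-zero but below display precision
}

// The case from the user report: 2.6K cached out of 17.0M input.
console.log(formatHitRate(2600, 17000000)); // "0.02%"
```

With one-decimal rounding the same input rendered as 0.0%, which is what made the dashboard value look fake.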
Backward compat: old usage-history.json snapshots without the new
image_request fields read as 0 via `?? 0`.
Tests:
- 7 new unit tests for upstream cache extraction (OpenAI/Anthropic/Gemini)
- 5 new unit tests for image_request counter logic in recordUsage
- 2 new unit tests for usage-stats image_request aggregation
- tests/real/image-generation.test.ts e2e block now asserts
total_image_request_count == +1 after a single successful gen
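
A rough sketch of the image request counter logic, including the `?? 0` fallback for old snapshots. The field names `image_request_count`, `image_request_failed_count` and `expectsImageGen` come from the description above. The `ReleaseInfo` shape, the helper names, and the choice to count every attempt in `image_request_count` (with failures as a subset) are assumptions made for illustration; the commit message does not say whether the count is total attempts or successes only.

```typescript
// Hypothetical shapes for illustration only.
interface ImageCounters {
  image_request_count?: number;        // absent in old usage-history.json snapshots
  image_request_failed_count?: number; // absent in old usage-history.json snapshots
}

interface ReleaseInfo {
  expectsImageGen: boolean;  // request carried tools[].type === "image_generation"
  imageOutputTokens: number; // 0 on EmptyResponse or upstream error
}

// Detect the image_generation tool at request parse time.
function hasImageGenTool(tools: Array<{ type: string }> | undefined): boolean {
  return (tools ?? []).some((t) => t.type === "image_generation");
}

// Update the counters at a release call site.
function recordImageRequest(
  usage: ImageCounters,
  info: ReleaseInfo,
): Required<ImageCounters> {
  // Old snapshots lack the fields, so they read as 0.
  const count = usage.image_request_count ?? 0;
  const failed = usage.image_request_failed_count ?? 0;
  if (!info.expectsImageGen) {
    return { image_request_count: count, image_request_failed_count: failed };
  }
  const succeeded = info.imageOutputTokens > 0;
  return {
    // Assumption: count tracks all attempts, failed is a subset of it.
    image_request_count: count + 1,
    image_request_failed_count: failed + (succeeded ? 0 : 1),
  };
}
```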
…tempts
Self-review followups on PR #423:
- Anthropic message_delta: take Math.max(start, delta) for
cache_read_input_tokens so a future API change emitting 0 in delta
can't clobber a real hit reported in message_start. Adds defensive
unit test covering the regression.
- /v1/responses/compact (handleCompact): detect image_generation tool
at request time and synthesize an
`image_request_attempted=true, image_request_succeeded=false` usage
on every release site. Compact doesn't surface tool_usage.image_gen,
so any image_generation tool forwarded here is always classified as
failed; at least the dashboard now catches accidental misuse rather
than silently dropping the signal.
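
The Math.max guard for Anthropic can be illustrated with a small sketch. The `AnthropicUsage` type and the `mergeCacheRead` helper are hypothetical names; only the `cache_read_input_tokens` field and the max-of-both rule come from the commit message.

```typescript
// Hypothetical minimal view of the usage object on message_start / message_delta.
interface AnthropicUsage {
  cache_read_input_tokens?: number;
}

// Take the larger of the two reported values so a 0 (or missing field) in
// message_delta cannot overwrite a real hit reported in message_start.
function mergeCacheRead(
  start: AnthropicUsage | undefined,
  delta: AnthropicUsage | undefined,
): number {
  return Math.max(
    start?.cache_read_input_tokens ?? 0,
    delta?.cache_read_input_tokens ?? 0,
  );
}

// The regression scenario: the start event reports a hit, the delta event reports 0.
console.log(mergeCacheRead({ cache_read_input_tokens: 1200 }, { cache_read_input_tokens: 0 })); // 1200
```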
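User feedback
The dashboard's "Cache Hit Rate: 0.0% (2.6K cached / 17.0M input)" looks fake, image generation doesn't show how many generations ran, and failures aren't recorded either. This PR fixes all three at once.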
Summary
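Cache hit rate underreported as 0%:
The OpenAI/Anthropic/Gemini upstream adapters hardcoded input_tokens_details: {} when synthesizing response.completed, dropping the cache fields the upstream had returned. After the fix:
- openai-upstream.ts extracts usage.prompt_tokens_details.cached_tokens
- anthropic-upstream.ts extracts cache_read_input_tokens from message_start.usage + message_delta.usage
- gemini-upstream.ts extracts usageMetadata.cachedContentTokenCount (Gemini's explicit-cache field)
Image generation request counting + failure tracking:
PR #422 (feat: track image_generation tokens separately + real-upstream stress test) added image tokens; this PR adds request counting plus a success/failure split:
- Detect tools[].type === "image_generation" and pass it through to release as expectsImageGen: boolean
- At release, split on tool_usage.image_gen.output_tokens > 0
- New fields on AccountUsage/UsageSnapshot/UsageBaseline/UsageDataPoint/UsageSummary, including window mirrors
Adaptive hit rate display precision (`formatHitRate`):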
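
As a rough sketch of the upstream cache extraction, each adapter reads its provider's cache field and writes it into `input_tokens_details` instead of the hardcoded `{}`. The provider field names are the ones listed in this description. The `UpstreamUsage` union, the `toCodexInputDetails` helper, and the target key `cached_tokens` are assumptions, since the exact Codex shape is not spelled out here.

```typescript
// Hypothetical union over the three upstream usage payloads.
type UpstreamUsage =
  | { provider: "openai"; usage?: { prompt_tokens_details?: { cached_tokens?: number } } }
  | { provider: "anthropic"; usage?: { cache_read_input_tokens?: number } }
  | { provider: "gemini"; usageMetadata?: { cachedContentTokenCount?: number } };

// Assumed target shape for the synthesized response.completed usage.
interface CodexInputDetails {
  input_tokens_details: { cached_tokens: number };
}

function toCodexInputDetails(u: UpstreamUsage): CodexInputDetails {
  let cached = 0;
  switch (u.provider) {
    case "openai":
      cached = u.usage?.prompt_tokens_details?.cached_tokens ?? 0;
      break;
    case "anthropic":
      cached = u.usage?.cache_read_input_tokens ?? 0;
      break;
    case "gemini":
      // Only populated when an explicit cache was used.
      cached = u.usageMetadata?.cachedContentTokenCount ?? 0;
      break;
  }
  return { input_tokens_details: { cached_tokens: cached } };
}
```

A missing field falls back to 0, so a response without cache information produces the same 0% as before rather than an error.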
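Changed files
Upstream cache extraction
Image request counting: types + accumulation
Image request counting: persistence / Summary
Image request counting: routing / detection
Display
Tests
Verification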
```json
{
"total_image_input_tokens": 2671,
"total_image_output_tokens": 29394,
"total_image_request_count": 17,
"total_image_request_failed_count": 0,
...
}
```
Out of scope