fix: cache hit rate stuck at 0% + image_generation request counters + auto-precision display#423

Merged
icebear0828 merged 2 commits into dev from fix/cache-hit-rate-image-request-counters
Apr 27, 2026
Conversation

@icebear0828
Owner

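User feedback

The dashboard's "Cache Hit Rate: 0.0% (2.6K cached / 17.0M input)" looks bogus, image generation shows no count of how many images were generated, and failures are not recorded either. This PR fixes all three at once.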

Summary

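  1. Cache hit rate underreported as 0%: the OpenAI / Anthropic / Gemini upstream adapters hardcoded input_tokens_details: {} when synthesizing response.completed, dropping the cache fields the upstream had returned. After the fix, each adapter reads:

    • openai-upstream.ts: usage.prompt_tokens_details.cached_tokens
    • anthropic-upstream.ts: message_start.usage + message_delta.usage, field cache_read_input_tokens
    • gemini-upstream.ts: usageMetadata.cachedContentTokenCount (Gemini's explicit-cache field)
  2. Image generation request counting + failure tracking: the earlier PR "feat: track image_generation tokens separately + real-upstream stress test" #422 added image tokens; this PR adds a request count plus a success/failure split:

    • At request time, detect tools[].type === "image_generation" and pass it through to release as expectsImageGen: boolean
    • At every release point (streaming + non-streaming + EmptyResponse + upstream 4xx/5xx + timeout), classify by whether tool_usage.image_gen.output_tokens > 0
    • On failure paths (usage is undefined), synthesize a minimal attempted=true / succeeded=false signal so that Free accounts whose tool was silently stripped, upstream errors, and empty responses all land in the failed counter
    • The new fields run through AccountUsage / UsageSnapshot / UsageBaseline / UsageDataPoint / UsageSummary, including the window mirrors
    • The Dashboard gains an "Image Requests" card showing `N ok · M failed`; AccountCard shows a window request success/failure row when the account has activity
  3. Adaptive precision for the hit rate display (`formatHitRate`):

    • ≥1%: one decimal (12.3%)
    • 0.01-1%: two decimals (0.02%)
    • >0 but <0.01%: shows `<0.01%`
    • =0: shows `0%`
    • Previously `pct.toFixed(1)` flattened everything below 0.05% to "0.0%", hiding the real value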
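
The four precision tiers for `formatHitRate` can be sketched as below. This is a minimal illustration of the rules listed above, not the code in `web/src/pages/UsageStats.tsx`; it assumes the argument is already a percentage (12.3 means 12.3%) and that non-finite or negative input should fall back to `0%`, neither of which the PR states.

```typescript
// Hypothetical re-implementation of the display rules; the real signature may differ.
// Assumption: `pct` is a percentage value, not a 0..1 ratio.
function formatHitRate(pct: number): string {
  // Assumption: treat NaN, Infinity and negatives like an exact zero.
  if (!Number.isFinite(pct) || pct <= 0) return "0%";
  // Non-zero but below the smallest value two decimals can show.
  if (pct < 0.01) return "<0.01%";
  // 0.01% up to 1%: two decimals keep small hit rates visible.
  if (pct < 1) return `${pct.toFixed(2)}%`;
  // 1% and above: one decimal is enough.
  return `${pct.toFixed(1)}%`;
}

// The figures from the user report: 2.6K cached out of 17.0M input tokens.
const reported = (2600 / 17000000) * 100;
console.log(formatHitRate(reported)); // "0.02%" instead of the old "0.0%"
```

With the old `pct.toFixed(1)` the same input (about 0.0153%) rendered as "0.0%", which is what made the dashboard look broken.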

Changed files

Upstream cache extraction

  • `src/proxy/openai-upstream.ts`: reads `prompt_tokens_details.cached_tokens`; when >0, writes `input_tokens_details.cached_tokens`
  • `src/proxy/anthropic-upstream.ts`: reads `cache_read_input_tokens` (falling back across both message_start and message_delta) and writes it out the same way
  • `src/proxy/gemini-upstream.ts`: reads `cachedContentTokenCount` and writes it out the same way
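
A rough sketch of the extraction for the OpenAI and Gemini adapters follows. The upstream field names (`prompt_tokens_details.cached_tokens`, `cachedContentTokenCount`) and the output field (`input_tokens_details.cached_tokens`) come from this PR; the interface and function names are hypothetical, and the real adapters work on full streaming payloads rather than these reduced shapes.

```typescript
// Output shape named in the PR; only the cache field is modelled here.
interface InputTokensDetails {
  cached_tokens?: number;
}

// Upstream usage shapes, reduced to the cache fields this PR reads.
interface OpenAIUsage {
  prompt_tokens_details?: { cached_tokens?: number };
}
interface GeminiUsageMetadata {
  cachedContentTokenCount?: number;
}

// Hypothetical shared helper: emit cached_tokens only when the upstream
// reported a positive count, otherwise keep the empty object.
function toInputTokensDetails(cached: number | undefined): InputTokensDetails {
  return typeof cached === "number" && cached > 0 ? { cached_tokens: cached } : {};
}

// Hypothetical per-adapter wrappers.
function openAIInputDetails(usage?: OpenAIUsage): InputTokensDetails {
  return toInputTokensDetails(usage?.prompt_tokens_details?.cached_tokens);
}

function geminiInputDetails(meta?: GeminiUsageMetadata): InputTokensDetails {
  return toInputTokensDetails(meta?.cachedContentTokenCount);
}

console.log(openAIInputDetails({ prompt_tokens_details: { cached_tokens: 2600 } }));
console.log(geminiInputDetails({})); // missing field falls back to {}
```

Writing the field only when it is positive keeps the synthesized `response.completed` identical to the old output for requests that had no cache hit.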

Image request counting: types + accumulation

  • `src/auth/types.ts`: `AccountUsage` gains `image_request_count` / `image_request_failed_count` plus window versions
  • `src/auth/account-registry.ts`: `recordUsage` accepts `image_request_attempted` / `image_request_succeeded` booleans and routes them to the two counters by rule
  • `src/auth/account-lifecycle.ts` / `src/auth/account-pool.ts`: release signatures extended to match
  • `src/translation/codex-event-extractor.ts`: `UsageInfo` gains the two booleans
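
The routing of the two booleans into the two counters might look roughly like this. The field names are taken from this PR; `applyImageSignal` is a hypothetical stand-in for the relevant part of `recordUsage`, window mirrors are left out, and the sketch assumes `image_request_count` counts successes only (consistent with the `N ok · M failed` display, but not confirmed here).

```typescript
// Counter fields added to AccountUsage by this PR (window mirrors omitted).
interface ImageCounters {
  image_request_count: number;
  image_request_failed_count: number;
}

// The two booleans carried on the usage record.
interface ImageSignal {
  image_request_attempted?: boolean;
  image_request_succeeded?: boolean;
}

// Hypothetical helper. Assumption: a success bumps image_request_count,
// a failure bumps image_request_failed_count, and a request that never
// asked for image generation touches neither.
function applyImageSignal(c: ImageCounters, s: ImageSignal): ImageCounters {
  if (!s.image_request_attempted) return c;
  return s.image_request_succeeded
    ? { ...c, image_request_count: c.image_request_count + 1 }
    : { ...c, image_request_failed_count: c.image_request_failed_count + 1 };
}

let counters: ImageCounters = { image_request_count: 0, image_request_failed_count: 0 };
counters = applyImageSignal(counters, { image_request_attempted: true, image_request_succeeded: true });
counters = applyImageSignal(counters, { image_request_attempted: true, image_request_succeeded: false });
counters = applyImageSignal(counters, {}); // plain text request, no change
console.log(counters);
```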

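Image request counting: persistence / Summary

  • `src/auth/usage-stats.ts`: UsageSnapshot/Baseline/DataPoint/Summary gain the new fields; poolTotals/getSummary/recordSnapshot/recoverBaseline/getHistory/bucketize are all updated; the constructor and load path default with `?? 0`, so an old `usage-history.json` that lacks the new fields reads them as 0 (backward compatible)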
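
The `?? 0` fallback on the load path can be illustrated as follows. Only the two field names are from this PR; the `StoredSnapshot` / `Snapshot` types and the `normalizeSnapshot` function are hypothetical, and a real snapshot carries many more fields.

```typescript
// What may be on disk: older files predate the image request fields.
interface StoredSnapshot {
  image_request_count?: number;
  image_request_failed_count?: number;
}

// What the rest of the code works with: both fields always present.
interface Snapshot {
  image_request_count: number;
  image_request_failed_count: number;
}

// Hypothetical loader step: absent (or null) fields become 0.
function normalizeSnapshot(raw: StoredSnapshot): Snapshot {
  return {
    image_request_count: raw.image_request_count ?? 0,
    image_request_failed_count: raw.image_request_failed_count ?? 0,
  };
}

// An old snapshot written before this PR has neither field.
const legacy: StoredSnapshot = JSON.parse("{}");
console.log(normalizeSnapshot(legacy));
```

Using `??` rather than `||` is a small design point: both map a missing field to 0 here, but `??` only replaces `null` and `undefined`, so the intent (fill in what is absent) is explicit.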

Image request counting: routing / detection

  • `src/routes/responses.ts`: `responsesHandler` detects `expectsImageGen` and adds it to proxyReq
  • `src/routes/shared/proxy-handler.ts`: `ProxyRequest` gains an `expectsImageGen` field; new `annotateImageGenOutcome` helper; all 7 release points are wrapped (streaming success / streaming error / decision.respond / decision.releaseBeforeRetry / non-streaming success / EmptyResponse retry / EmptyResponse retry-error / EmptyResponse final fail)
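
Detection and outcome annotation could be sketched like this. `expectsImageGen`, `annotateImageGenOutcome`, and the two boolean fields are names from this PR, but the signature, the reduced `UsageInfo` shape, the `image_output_tokens` field used for the check, and `detectExpectsImageGen` are assumptions made for illustration only.

```typescript
// Reduced, hypothetical UsageInfo; the real type has more fields.
interface UsageInfo {
  input_tokens: number;
  output_tokens: number;
  image_output_tokens?: number; // assumed to mirror tool_usage.image_gen.output_tokens
  image_request_attempted?: boolean;
  image_request_succeeded?: boolean;
}

interface RequestBody {
  tools?: Array<{ type?: string }>;
}

// Hypothetical detector for tools[].type === "image_generation".
function detectExpectsImageGen(body: RequestBody): boolean {
  return body.tools?.some((t) => t.type === "image_generation") ?? false;
}

// Assumed signature: wraps the usage handed to release at each call site.
function annotateImageGenOutcome(
  usage: UsageInfo | undefined,
  expectsImageGen: boolean,
): UsageInfo | undefined {
  if (!expectsImageGen) return usage;
  if (usage === undefined) {
    // Failure path: synthesize a minimal signal so the attempt is still counted.
    return {
      input_tokens: 0,
      output_tokens: 0,
      image_request_attempted: true,
      image_request_succeeded: false,
    };
  }
  return {
    ...usage,
    image_request_attempted: true,
    image_request_succeeded: (usage.image_output_tokens ?? 0) > 0,
  };
}

const expects = detectExpectsImageGen({ tools: [{ type: "image_generation" }] });
console.log(annotateImageGenOutcome(undefined, expects));
```

Classifying on image output tokens rather than on HTTP status is what lets a response that completed normally, but with the tool silently stripped, count as failed.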

Display

  • `web/src/pages/UsageStats.tsx`: `formatHitRate` auto precision; Summary card grid `grid-cols-6 → 7`, with a new "Image Requests" card showing `N / M` plus an `{ok} ok · {failed} failed` hint
  • `web/src/components/AccountCard.tsx`: new `hasImageActivity` check; when there is activity, shows two rows (Window Image Tokens + Window Image Requests) and appends `· N/M img` to the cumulative row
  • `shared/hooks/use-usage-stats.ts` + `shared/types.ts`: TS mirrors kept in sync
  • `shared/i18n/translations.ts`: adds `imageRequests` / `imageRequestsHint` / `windowImageRequests` (Chinese and English)

Tests

  • New `tests/unit/proxy/upstream-cache-tokens.test.ts`: 7 cases covering cache field extraction in the three upstream adapters (happy path / missing-field fallback / message_delta precedence)
  • `tests/unit/auth/account-pool.test.ts`: 5 cases covering `recordUsage` for attempted+succeeded / attempted+failed / pure-failure / not attempted / repeated accumulation
  • `tests/unit/auth/usage-stats.test.ts`: all existing strict-equal assertions gain image_request_count / image_request_failed_count = 0; 2 new cases cover summary aggregation + snapshot totals persistence
  • `tests/real/image-generation.test.ts`: the e2e block adds assertions that `total_image_request_count == +1` and that `total_image_request_failed_count` is unchanged

Verification

  • `npx tsc --noEmit`: 0 errors
  • `npm test`: 1643 passed | 1 skipped (14 new, up from 1629)
  • `npm run test:real -- image-generation`: 5/5 tests passed (16 matrix images + 1 e2e image); the end-to-end assertion that image_request_count rises by 1 passes
  • After the run, `/admin/usage-stats/summary` reported:
    ```json
    {
    "total_image_input_tokens": 2671,
    "total_image_output_tokens": 29394,
    "total_image_request_count": 17,
    "total_image_request_failed_count": 0,
    ...
    }
    ```
  • The `request_count` delta (5553 - 5536 = 17) lines up exactly with `image_request_count` (0 → 17)

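Out of scope

  • Gemini's existing `cachedContentTokenCount` field (explicit caching) is included. What is not done: Gemini implicit caching has no standard field in the SSE usageMetadata, so it is skipped
  • `request_count` / `empty_response_count` are untouched (the user explicitly said failure counting applies only to images)
  • The direct-request (adapter / api-key) path does not go through `accountPool`, so there is no counter to update. This is a known gap that the docs do not call out separately
  • No success-rate percentage is shown, only the absolute numbers `N ok · M failed` (the user did not ask for it)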

…n request counters

Three orthogonal dashboard accuracy issues, fixed together:

1) **Cache hit rate stuck at 0%** — OpenAI / Anthropic / Gemini upstream
   adapters were synthesizing `response.completed` with hardcoded
   `input_tokens_details: {}`, dropping the cache hit info that the native
   APIs do return. Now extract `prompt_tokens_details.cached_tokens`
   (OpenAI), `cache_read_input_tokens` (Anthropic message_start +
   message_delta), and `cachedContentTokenCount` (Gemini explicit caches),
   and surface them under the standard Codex shape so the existing parsers
   pick them up.

2) **No image generation request counter** — PR #422 added image token
   tracking but no count of image requests (success vs failure). Now
   detect `tools[].type === "image_generation"` at request parse time,
   propagate `expectsImageGen` through proxy-handler, and on every
   release call site (success / EmptyResponse / upstream errors)
   classify as success (image_output_tokens > 0) or failed. Adds:
   - AccountUsage: image_request_count, image_request_failed_count
     (+ window mirrors)
   - UsageSnapshot/Baseline/DataPoint/Summary: same
   - /admin/usage-stats/summary: total_image_request_count,
     total_image_request_failed_count
   - Dashboard "Image Requests" card showing N ok · M failed
   - AccountCard window image requests row when activity present

3) **Hit rate display rounded sub-0.05% values to 0.0%** — formatHitRate
   now uses auto precision: ≥1% one decimal, 0.01-1% two decimals,
   >0 but <0.01% shows "<0.01%", =0 shows "0%".

Backward compat: old usage-history.json snapshots without the new
image_request fields read as 0 via `?? 0`. Tests:
- 7 new unit tests for upstream cache extraction (OpenAI/Anthropic/Gemini)
- 5 new unit tests for image_request counter logic in recordUsage
- 2 new unit tests for usage-stats image_request aggregation
- tests/real/image-generation.test.ts e2e block now asserts
  total_image_request_count == +1 after a single successful gen
…tempts

Self-review followups on PR #423:

- Anthropic message_delta: take Math.max(start, delta) for
  cache_read_input_tokens so a future API change emitting 0 in delta
  can't clobber a real hit reported in message_start. Adds defensive
  unit test covering the regression.
- /v1/responses/compact (handleCompact): detect image_generation tool
  at request time and synthesize an `image_request_attempted=true,
  image_request_succeeded=false` usage on every release site. Compact
  doesn't surface tool_usage.image_gen, so any image_generation tool
  forwarded here is always classified as failed — at least the
  dashboard now catches accidental misuse rather than silently dropping
  the signal.
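
The `Math.max(start, delta)` rule from the first follow-up item can be sketched as below. `cache_read_input_tokens` is the Anthropic usage field named in this PR; `mergeCacheRead` and the reduced `AnthropicUsage` type are hypothetical, and treating a missing value as 0 is an assumption.

```typescript
// Usage payload reduced to the one field of interest.
interface AnthropicUsage {
  cache_read_input_tokens?: number;
}

// Hypothetical merge step: take the larger of the two reports so that a
// 0 (or a missing value) in message_delta cannot overwrite a real hit
// already seen in message_start.
function mergeCacheRead(start?: AnthropicUsage, delta?: AnthropicUsage): number {
  return Math.max(
    start?.cache_read_input_tokens ?? 0,
    delta?.cache_read_input_tokens ?? 0,
  );
}

// message_start reported a hit, message_delta later reports 0.
console.log(mergeCacheRead({ cache_read_input_tokens: 1200 }, { cache_read_input_tokens: 0 }));
```

A plain "delta wins when present" rule would turn the case above into 0, which is the regression the defensive unit test is described as covering.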
@icebear0828 icebear0828 merged commit 9d63abe into dev Apr 27, 2026
1 check passed
@icebear0828 icebear0828 deleted the fix/cache-hit-rate-image-request-counters branch April 27, 2026 21:01