feat: architecture audit — disguise, update automation, robustness #2

Merged
icebear0828 merged 1 commit into master from feat/architecture-audit on Feb 17, 2026

Conversation

@icebear0828
Owner

Summary

  • Disguise hardening: enforce HTTP header order, add browser-level headers (Accept-Encoding, Accept-Language), add Codex-specific request body fields (tools), protect /debug/fingerprint endpoint
  • Update automation: integrate appcast version checker into server (30-min polling), externalize model catalog to config/models.yaml, fail fast on critical extraction failures
  • Robustness: 5xx retry with exponential backoff, HTML file read try-catch, config load error handling, persist error logging, token refresh retry
  • Code hygiene: delete leftover WHAM dist files, mark backward-compat shims as @deprecated

Test plan

  • TypeScript compiles with 0 errors
  • 41/41 unit verification tests pass
  • Server starts without crash
  • /health returns update state
  • /v1/models returns 8 models from YAML config
  • Streaming chat completion works (codex model)
  • Non-streaming chat completion works
  • Model alias resolution works (codex-mini → gpt-5.1-codex-mini)

🤖 Generated with Claude Code

…ustness

Disguise hardening (P0):
- Enforce HTTP header order from fingerprint.yaml config
- Add browser-level headers (Accept-Encoding, Accept-Language)
- Add Codex-specific request body fields (tools, previous_response_id)
- Protect /debug/fingerprint endpoint (dev/localhost only)
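
A minimal sketch of what the header-order enforcement could look like, assuming fingerprint.yaml exposes an ordered list of header names plus browser-level defaults (the config shape and the `applyHeaderOrder` helper are illustrative, not the actual implementation):

```typescript
// Illustrative sketch — config shape and helper name are assumptions.
interface FingerprintConfig {
  header_order: string[];            // e.g. ["host", "accept", "accept-encoding", ...]
  headers: Record<string, string>;   // browser-level defaults (Accept-Encoding, Accept-Language, ...)
}

function applyHeaderOrder(
  cfg: FingerprintConfig,
  requestHeaders: Record<string, string>,
): Record<string, string> {
  const merged = { ...cfg.headers, ...requestHeaders };
  const ordered: Record<string, string> = {};
  // Emit headers in the configured order first; typical Node HTTP clients
  // send headers in object insertion order.
  for (const name of cfg.header_order) {
    const key = Object.keys(merged).find((k) => k.toLowerCase() === name.toLowerCase());
    if (key) ordered[key] = merged[key];
  }
  // Then append anything the order list does not mention, so nothing is dropped.
  for (const [key, value] of Object.entries(merged)) {
    if (!(key in ordered)) ordered[key] = value;
  }
  return ordered;
}
```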

Update automation (P0-P1):
- Integrate appcast version checker into server process (30-min polling)
- Expose update state via /health endpoint
- Externalize model catalog to config/models.yaml
- Fail fast on critical extraction failures (originator, api_base_url)
- Update apply-update.ts to compare models against YAML config
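
A rough sketch of the in-process polling loop, assuming a `checkAppcast()` helper that resolves to the latest advertised version string (names and state shape are illustrative, not the real module):

```typescript
// Illustrative sketch — checkAppcast() and the state shape are assumptions.
interface UpdateState {
  lastCheckedAt: string | null;
  latestVersion: string | null;
  updateAvailable: boolean;
}

export const updateState: UpdateState = {
  lastCheckedAt: null,
  latestVersion: null,
  updateAvailable: false,
};

const THIRTY_MINUTES_MS = 30 * 60 * 1000;

export function startUpdatePolling(
  currentVersion: string,
  checkAppcast: () => Promise<string>,   // resolves to the latest version in the appcast
): NodeJS.Timeout {
  const poll = async () => {
    try {
      const latest = await checkAppcast();
      updateState.latestVersion = latest;
      updateState.updateAvailable = latest !== currentVersion;
    } catch (err) {
      // A failed check must never take the server down; log and retry next tick.
      console.error("appcast check failed:", err);
    } finally {
      updateState.lastCheckedAt = new Date().toISOString();
    }
  };
  void poll();                                          // run once at startup
  const timer = setInterval(poll, THIRTY_MINUTES_MS);   // then every 30 minutes
  timer.unref?.();                                      // don't keep the process alive just for polling
  return timer;
}
```

The /health handler can then embed `updateState` in its JSON response.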

Robustness (P0-P1):
- Add 5xx retry with exponential backoff in chat route (max 2 retries)
- Wrap HTML file reads in try-catch to prevent server crashes
- Add config load try-catch with friendly error messages
- Log persistence errors instead of silently swallowing
- Add retry (1 attempt, 5s delay) for token refresh failures
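
The 5xx retry shape, as a hedged sketch assuming fetch-based upstream calls (constants and function name are illustrative; the real chat route differs):

```typescript
// Illustrative sketch — constants and function name are assumptions.
const MAX_RETRIES = 2;
const BASE_DELAY_MS = 500;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function fetchUpstreamWithRetry(url: string, init: RequestInit): Promise<Response> {
  let lastResponse!: Response;
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    if (attempt > 0) {
      // Exponential backoff between retries: 500ms, then 1000ms.
      await sleep(BASE_DELAY_MS * 2 ** (attempt - 1));
    }
    lastResponse = await fetch(url, init);
    // Retry only on upstream 5xx; success and 4xx return immediately.
    if (lastResponse.status < 500) return lastResponse;
  }
  return lastResponse;
}
```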

Code hygiene:
- Delete leftover WHAM dist files
- Mark 6 backward-compat shim methods as @deprecated
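
For reference, the deprecation marker is the standard JSDoc tag, which editors and generated docs surface at call sites (the names below are made up, not the actual shim methods):

```typescript
class ChatClient {
  createCompletion(prompt: string): Promise<string> {
    return Promise.resolve(`completion for: ${prompt}`);
  }
}

class LegacyClientShim {
  /**
   * @deprecated Backward-compat shim; call ChatClient.createCompletion() directly.
   */
  createCompletionLegacy(prompt: string): Promise<string> {
    return new ChatClient().createCompletion(prompt);
  }
}
```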

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
icebear0828 merged commit 86fc406 into master on Feb 17, 2026
icebear0828 deleted the feat/architecture-audit branch on February 17, 2026 at 18:07
icebear0828 added a commit that referenced this pull request May 5, 2026
Address review feedback #2 on PR #442: detect silently-broken pooled
connections proactively instead of waiting for the next real request
to discover them via code=1006.

Track lastActivityAt — updated by ANY pong or data message from the
peer (both prove the connection is alive). On each ping tick, if
now - lastActivityAt > livenessTimeoutMs, markDead the WS. Default
threshold is 2.5x pingIntervalMs (~62.5s with default 25s ping):
tolerates one missed pong (network blip) but evicts before a third
would tick, at which point the connection is almost certainly dead
and reusing it would cost a real-request cache miss.
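
A condensed sketch of that logic, assuming a pooled-connection wrapper class (field and method names are illustrative; the real class and its markDead() integration differ):

```typescript
import { WebSocket } from "ws";

// Illustrative sketch — not the real pooled-connection class.
class PooledWs {
  private lastActivityAt = Date.now();

  constructor(
    private readonly ws: WebSocket,
    pingIntervalMs = 25_000,
    // Default threshold: 2.5x the ping interval (~62.5s at the default 25s ping).
    private readonly livenessTimeoutMs = 2.5 * pingIntervalMs,
  ) {
    // ANY pong or data message from the peer proves the connection is alive.
    ws.on("pong", () => { this.lastActivityAt = Date.now(); });
    ws.on("message", () => { this.lastActivityAt = Date.now(); });
  }

  // Called on every keepalive ping tick.
  onPingTick(): void {
    const silentForMs = Date.now() - this.lastActivityAt;
    if (this.livenessTimeoutMs > 0 && silentForMs > this.livenessTimeoutMs) {
      this.markDead();          // evict before reuse costs a real-request cache miss
      return;
    }
    this.ws.ping();
  }

  private markDead(): void {
    this.ws.terminate();        // the pool observes the close and discards the entry
  }
}
```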

Counter-based "missed pings" alternative was rejected: it would
false-positive on healthy streaming sessions where the server sends
data but no separate pong, dragging working connections offline.

Verified end-to-end from device a (Mac mini, 192.168.10.2) →
proxy 192.168.10.6:8080 → chatgpt.com via a 10-turn pinned-session
load script with a 70s idle gap between turns 5 and 6. Turn 6 stayed
on the same pooled WS as turns 1-5 and hit 99.6% cache (matching
pre-gap turn 5), with zero liveness-timeout markDead events — the
keepalive pings carried the connection across the LB idle window
unharmed.

WsLike interface gains `on("pong", listener)`. Real ws.WebSocket
already emits "pong" per RFC 6455 §5.5.3.

Tests added (6 new):
- liveness > marks dead when peer stays silent past timeout
- liveness > pong resets the clock
- liveness > data message resets the clock
- liveness > livenessTimeoutMs=0 disables
- liveness > default multiple keeps healthy WS alive across many cycles
- ping > skips while busy (active stream keeps LB alive)
icebear0828 added a commit that referenced this pull request May 5, 2026
…442)

* perf(proxy): WebSocket keepalive ping prevents middlebox idle drops

Pooled WSes were silently RST'd by upstream LB / NAT / firewall idle
timeouts after ~30-60s with no traffic, surfacing as code=1006 on the
next turn. Each drop forced a fresh WebSocket against a different
backend instance, losing the prompt cache prefix and dragging hit rates
back to 5-9%.

Send a ws-level ping frame every 25s (configurable, 0 disables) so the
middlebox NAT/connection-tracker keeps the mapping alive. Real-traffic
verification: single pooled WS sustained 22+ consecutive turns at
88-94% hit, vs the prior pattern of single-use WS dying after one
request.
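
As a sketch, the keepalive loop could look roughly like this (the connection object shape is an assumption; the busy-skip shown here is the refinement described in the next commit):

```typescript
// Illustrative sketch — the pooled-connection shape is an assumption.
import type { WebSocket } from "ws";

interface PooledConn {
  ws: WebSocket;
  busy: boolean;    // true while a request/stream is in flight on this WS
}

function startKeepalive(conn: PooledConn, pingIntervalMs = 25_000): NodeJS.Timeout | undefined {
  if (pingIntervalMs <= 0) return undefined;   // 0 disables keepalive entirely
  const timer = setInterval(() => {
    if (conn.busy) return;                     // an active stream already keeps idle timers fresh
    try {
      conn.ws.ping();                          // empty ws-level ping frame
    } catch {
      // A failing ping must never crash the interval loop; the liveness
      // check evicts the connection if the peer has actually gone away.
    }
  }, pingIntervalMs);
  timer.unref?.();                             // don't hold the process open just for keepalive
  return timer;
}
```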

* perf(proxy): skip keepalive ping while WS is busy + harden tests

Address review feedback on PR #442:

- sendKeepalivePing returns early when this.busy is true. The active
  stream's data frames already keep the upstream LB / NAT idle timers
  fresh, so emitting a ping during streaming would be redundant
  bandwidth on chatty sessions.
- Strengthen the error-swallow test to assert pingCount=1 after the
  swallowed throw — a bare not.toThrow() would have missed a
  regression that crashes the interval loop after one bad ping.
- Add a regression test for the busy-skip behavior.
- Inline comment on WsLike.ping() to flag the narrowed signature
  versus real ws.WebSocket.ping(data?, mask?, callback?).

* perf(proxy): add WS liveness check (pong/message tracking)

Address review feedback #2 on PR #442: detect silently-broken pooled
connections proactively instead of waiting for the next real request
to discover them via code=1006.

Track lastActivityAt — updated by ANY pong or data message from the
peer (both prove the connection is alive). On each ping tick, if
now - lastActivityAt > livenessTimeoutMs, markDead the WS. Default
threshold is 2.5x pingIntervalMs (~62.5s with default 25s ping):
tolerates one missed pong (network blip) but evicts before a third
would tick, at which point the connection is almost certainly dead
and reusing it would cost a real-request cache miss.

Counter-based "missed pings" alternative was rejected: it would
false-positive on healthy streaming sessions where the server sends
data but no separate pong, dragging working connections offline.

Verified end-to-end from device a (Mac mini, 192.168.10.2) →
proxy 192.168.10.6:8080 → chatgpt.com via a 10-turn pinned-session
load script with a 70s idle gap between turns 5 and 6. Turn 6 stayed
on the same pooled WS as turns 1-5 and hit 99.6% cache (matching
pre-gap turn 5), with zero liveness-timeout markDead events — the
keepalive pings carried the connection across the LB idle window
unharmed.

WsLike interface gains `on("pong", listener)`. Real ws.WebSocket
already emits "pong" per RFC 6455 §5.5.3.

Tests added (6 new):
- liveness > marks dead when peer stays silent past timeout
- liveness > pong resets the clock
- liveness > data message resets the clock
- liveness > livenessTimeoutMs=0 disables
- liveness > default multiple keeps healthy WS alive across many cycles
- ping > skips while busy (active stream keeps LB alive)

---------

Co-authored-by: icebear0828 <icebear0828@users.noreply.github.com>