feat: architecture audit — disguise, update automation, robustness #2

Merged
icebear0828 merged 1 commit into master from feat/architecture-audit on Feb 17, 2026

Conversation

@icebear0828
Owner

Summary

  • Disguise hardening: enforce HTTP header order, add browser-level headers (Accept-Encoding, Accept-Language), add Codex-specific request body fields (tools), protect /debug/fingerprint endpoint
  • Update automation: integrate appcast version checker into server (30-min polling), externalize model catalog to config/models.yaml, fail fast on critical extraction failures
  • Robustness: 5xx retry with exponential backoff, HTML file read try-catch, config load error handling, persist error logging, token refresh retry
  • Code hygiene: delete leftover WHAM dist files, mark backward-compat shims as @deprecated

Test plan

  • TypeScript compiles with 0 errors
  • 41/41 unit verification tests pass
  • Server starts without crash
  • /health returns update state
  • /v1/models returns 8 models from YAML config
  • Streaming chat completion works (codex model)
  • Non-streaming chat completion works
  • Model alias resolution works (codex-mini → gpt-5.1-codex-mini)

🤖 Generated with Claude Code

…ustness

Disguise hardening (P0):
- Enforce HTTP header order from fingerprint.yaml config
- Add browser-level headers (Accept-Encoding, Accept-Language)
- Add Codex-specific request body fields (tools, previous_response_id)
- Protect /debug/fingerprint endpoint (dev/localhost only)
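
A minimal sketch of what the header-order enforcement could look like, assuming fingerprint.yaml exposes an ordered list of header names plus browser-level defaults (the config shape and the `applyHeaderOrder` helper are illustrative, not the actual implementation):

```typescript
// Illustrative sketch — config shape and helper name are assumptions.
interface FingerprintConfig {
  header_order: string[];            // e.g. ["host", "accept", "accept-encoding", ...]
  headers: Record<string, string>;   // browser-level defaults (Accept-Encoding, Accept-Language, ...)
}

function applyHeaderOrder(
  cfg: FingerprintConfig,
  requestHeaders: Record<string, string>,
): Record<string, string> {
  const merged = { ...cfg.headers, ...requestHeaders };
  const ordered: Record<string, string> = {};
  // Emit headers in the configured order first; typical Node HTTP clients
  // send headers in object insertion order.
  for (const name of cfg.header_order) {
    const key = Object.keys(merged).find((k) => k.toLowerCase() === name.toLowerCase());
    if (key) ordered[key] = merged[key];
  }
  // Then append anything the order list does not mention, so nothing is dropped.
  for (const [key, value] of Object.entries(merged)) {
    if (!(key in ordered)) ordered[key] = value;
  }
  return ordered;
}
```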

Update automation (P0-P1):
- Integrate appcast version checker into server process (30-min polling)
- Expose update state via /health endpoint
- Externalize model catalog to config/models.yaml
- Fail fast on critical extraction failures (originator, api_base_url)
- Update apply-update.ts to compare models against YAML config
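
A rough sketch of the in-process polling loop, assuming a `checkAppcast()` helper that resolves to the latest advertised version string (names and state shape are illustrative, not the real module):

```typescript
// Illustrative sketch — checkAppcast() and the state shape are assumptions.
interface UpdateState {
  lastCheckedAt: string | null;
  latestVersion: string | null;
  updateAvailable: boolean;
}

export const updateState: UpdateState = {
  lastCheckedAt: null,
  latestVersion: null,
  updateAvailable: false,
};

const THIRTY_MINUTES_MS = 30 * 60 * 1000;

export function startUpdatePolling(
  currentVersion: string,
  checkAppcast: () => Promise<string>,   // resolves to the latest version in the appcast
): NodeJS.Timeout {
  const poll = async () => {
    try {
      const latest = await checkAppcast();
      updateState.latestVersion = latest;
      updateState.updateAvailable = latest !== currentVersion;
    } catch (err) {
      // A failed check must never take the server down; log and retry next tick.
      console.error("appcast check failed:", err);
    } finally {
      updateState.lastCheckedAt = new Date().toISOString();
    }
  };
  void poll();                                          // run once at startup
  const timer = setInterval(poll, THIRTY_MINUTES_MS);   // then every 30 minutes
  timer.unref?.();                                      // don't keep the process alive just for polling
  return timer;
}
```

The /health handler can then embed `updateState` in its JSON response.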

Robustness (P0-P1):
- Add 5xx retry with exponential backoff in chat route (max 2 retries)
- Wrap HTML file reads in try-catch to prevent server crashes
- Add config load try-catch with friendly error messages
- Log persistence errors instead of silently swallowing
- Add retry (1 attempt, 5s delay) for token refresh failures
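
The 5xx retry shape, as a hedged sketch assuming fetch-based upstream calls (constants and function name are illustrative; the real chat route differs):

```typescript
// Illustrative sketch — constants and function name are assumptions.
const MAX_RETRIES = 2;
const BASE_DELAY_MS = 500;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function fetchUpstreamWithRetry(url: string, init: RequestInit): Promise<Response> {
  let lastResponse!: Response;
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    if (attempt > 0) {
      // Exponential backoff between retries: 500ms, then 1000ms.
      await sleep(BASE_DELAY_MS * 2 ** (attempt - 1));
    }
    lastResponse = await fetch(url, init);
    // Retry only on upstream 5xx; success and 4xx return immediately.
    if (lastResponse.status < 500) return lastResponse;
  }
  return lastResponse;
}
```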

Code hygiene:
- Delete leftover WHAM dist files
- Mark 6 backward-compat shim methods as @deprecated
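
For reference, the deprecation marker is the standard JSDoc tag, which editors and generated docs surface at call sites (the names below are made up, not the actual shim methods):

```typescript
class ChatClient {
  createCompletion(prompt: string): Promise<string> {
    return Promise.resolve(`completion for: ${prompt}`);
  }
}

class LegacyClientShim {
  /**
   * @deprecated Backward-compat shim; call ChatClient.createCompletion() directly.
   */
  createCompletionLegacy(prompt: string): Promise<string> {
    return new ChatClient().createCompletion(prompt);
  }
}
```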

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
icebear0828 merged commit 86fc406 into master on Feb 17, 2026
icebear0828 deleted the feat/architecture-audit branch on February 17, 2026 at 18:07
icebear0828 added a commit that referenced this pull request May 5, 2026
Address review feedback #2 on PR #442: detect silently-broken pooled
connections proactively instead of waiting for the next real request
to discover them via code=1006.

Track lastActivityAt — updated by ANY pong or data message from the
peer (both prove the connection is alive). On each ping tick, if
now - lastActivityAt > livenessTimeoutMs, markDead the WS. Default
threshold is 2.5x pingIntervalMs (~62.5s with default 25s ping):
tolerates one missed pong (network blip) but evicts before a third
would tick, at which point the connection is almost certainly dead
and reusing it would cost a real-request cache miss.
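
A condensed sketch of that logic, assuming a pooled-connection wrapper class (field and method names are illustrative; the real class and its markDead() integration differ):

```typescript
import { WebSocket } from "ws";

// Illustrative sketch — not the real pooled-connection class.
class PooledWs {
  private lastActivityAt = Date.now();

  constructor(
    private readonly ws: WebSocket,
    pingIntervalMs = 25_000,
    // Default threshold: 2.5x the ping interval (~62.5s at the default 25s ping).
    private readonly livenessTimeoutMs = 2.5 * pingIntervalMs,
  ) {
    // ANY pong or data message from the peer proves the connection is alive.
    ws.on("pong", () => { this.lastActivityAt = Date.now(); });
    ws.on("message", () => { this.lastActivityAt = Date.now(); });
  }

  // Called on every keepalive ping tick.
  onPingTick(): void {
    const silentForMs = Date.now() - this.lastActivityAt;
    if (this.livenessTimeoutMs > 0 && silentForMs > this.livenessTimeoutMs) {
      this.markDead();          // evict before reuse costs a real-request cache miss
      return;
    }
    this.ws.ping();
  }

  private markDead(): void {
    this.ws.terminate();        // the pool observes the close and discards the entry
  }
}
```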

Counter-based "missed pings" alternative was rejected: it would
false-positive on healthy streaming sessions where the server sends
data but no separate pong, dragging working connections offline.

Verified end-to-end from device a (Mac mini, 192.168.10.2) →
proxy 192.168.10.6:8080 → chatgpt.com via a 10-turn pinned-session
load script with a 70s idle gap between turns 5 and 6. Turn 6 stayed
on the same pooled WS as turns 1-5 and hit 99.6% cache (matching
pre-gap turn 5), with zero liveness-timeout markDead events — the
keepalive pings carried the connection across the LB idle window
unharmed.

WsLike interface gains `on("pong", listener)`. Real ws.WebSocket
already emits "pong" per RFC 6455 §5.5.3.

Tests added (6 new):
- liveness > marks dead when peer stays silent past timeout
- liveness > pong resets the clock
- liveness > data message resets the clock
- liveness > livenessTimeoutMs=0 disables
- liveness > default multiple keeps healthy WS alive across many cycles
- ping > skips while busy (active stream keeps LB alive)
icebear0828 added a commit that referenced this pull request May 5, 2026
…442)

* perf(proxy): WebSocket keepalive ping prevents middlebox idle drops

Pooled WSes were silently RST'd by upstream LB / NAT / firewall idle
timeouts after ~30-60s with no traffic, surfacing as code=1006 on the
next turn. Each drop forced a fresh WebSocket against a different
backend instance, losing the prompt cache prefix and dragging hit rates
back to 5-9%.

Send a ws-level ping frame every 25s (configurable, 0 disables) so the
middlebox NAT/connection-tracker keeps the mapping alive. Real-traffic
verification: single pooled WS sustained 22+ consecutive turns at
88-94% hit, vs the prior pattern of single-use WS dying after one
request.
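
As a sketch, the keepalive loop could look roughly like this (the connection object shape is an assumption; the busy-skip shown here is the refinement described in the next commit):

```typescript
// Illustrative sketch — the pooled-connection shape is an assumption.
import type { WebSocket } from "ws";

interface PooledConn {
  ws: WebSocket;
  busy: boolean;    // true while a request/stream is in flight on this WS
}

function startKeepalive(conn: PooledConn, pingIntervalMs = 25_000): NodeJS.Timeout | undefined {
  if (pingIntervalMs <= 0) return undefined;   // 0 disables keepalive entirely
  const timer = setInterval(() => {
    if (conn.busy) return;                     // an active stream already keeps idle timers fresh
    try {
      conn.ws.ping();                          // empty ws-level ping frame
    } catch {
      // A failing ping must never crash the interval loop; the liveness
      // check evicts the connection if the peer has actually gone away.
    }
  }, pingIntervalMs);
  timer.unref?.();                             // don't hold the process open just for keepalive
  return timer;
}
```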

* perf(proxy): skip keepalive ping while WS is busy + harden tests

Address review feedback on PR #442:

- sendKeepalivePing returns early when this.busy is true. The active
  stream's data frames already keep the upstream LB / NAT idle timers
  fresh, so emitting a ping during streaming would be redundant
  bandwidth on chatty sessions.
- Strengthen the error-swallow test to assert pingCount=1 after the
  swallowed throw — a bare not.toThrow() would have missed a
  regression that crashes the interval loop after one bad ping.
- Add a regression test for the busy-skip behavior.
- Inline comment on WsLike.ping() to flag the narrowed signature
  versus real ws.WebSocket.ping(data?, mask?, callback?).

* perf(proxy): add WS liveness check (pong/message tracking)

Address review feedback #2 on PR #442: detect silently-broken pooled
connections proactively instead of waiting for the next real request
to discover them via code=1006.

Track lastActivityAt — updated by ANY pong or data message from the
peer (both prove the connection is alive). On each ping tick, if
now - lastActivityAt > livenessTimeoutMs, markDead the WS. Default
threshold is 2.5x pingIntervalMs (~62.5s with default 25s ping):
tolerates one missed pong (network blip) but evicts before a third
would tick, at which point the connection is almost certainly dead
and reusing it would cost a real-request cache miss.

Counter-based "missed pings" alternative was rejected: it would
false-positive on healthy streaming sessions where the server sends
data but no separate pong, dragging working connections offline.

Verified end-to-end from device a (Mac mini, 192.168.10.2) →
proxy 192.168.10.6:8080 → chatgpt.com via a 10-turn pinned-session
load script with a 70s idle gap between turns 5 and 6. Turn 6 stayed
on the same pooled WS as turns 1-5 and hit 99.6% cache (matching
pre-gap turn 5), with zero liveness-timeout markDead events — the
keepalive pings carried the connection across the LB idle window
unharmed.

WsLike interface gains `on("pong", listener)`. Real ws.WebSocket
already emits "pong" per RFC 6455 §5.5.3.

Tests added (6 new):
- liveness > marks dead when peer stays silent past timeout
- liveness > pong resets the clock
- liveness > data message resets the clock
- liveness > livenessTimeoutMs=0 disables
- liveness > default multiple keeps healthy WS alive across many cycles
- ping > skips while busy (active stream keeps LB alive)

---------

Co-authored-by: icebear0828 <icebear0828@users.noreply.github.com>