feat: Codex CLI WebSocket support for /responses #37
Merged
Design for accepting Codex CLI's WebSocket transport on /v1/responses and /responses by terminating the WS at lunaroute and driving the existing HTTP Responses pipeline per response.create frame. Preserves LUNAROUTE markers, session recording, metrics, and provider registry unchanged; no new egress code required.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eight-task plan covering: tokio-tungstenite workspace dep, refactor responses_passthrough streaming branch into reusable responses_sse_stream, frame parser with unit tests, WebSocket handler + read loop, router wiring, full end-to-end integration tests, prometheus WS metrics, and a smoke-test runbook for Codex CLI.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pure code motion: streaming pipeline for /v1/responses now lives in a reusable helper returning Stream<SseEvent>. HTTP handler wraps it in axum Sse as before. No behavior change; all existing integration tests pass unchanged. Prepares for WebSocket ingress on /responses.
…SSE events
Addresses two regressions from refactor 21f4085:
- Provider error JSON is forwarded verbatim with provider status code (no
  re-wrapping in lunaroute's error envelope).
- Synthetic stream-error SSE events use empty event name so clients
  consuming default 'message' events keep receiving them.
WebSocket handler (future task) keeps meaningful event names via the SseEvent struct; HTTP handler strips empty names when wrapping.
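The empty-name rule can be illustrated with a small sketch. It assumes a simplified `SseEvent` with two string fields and a hypothetical `to_sse_wire` helper; the real struct and the axum wrapping code are not shown in this PR. In the SSE wire format, an event without an `event:` line is dispatched as the default `message` type, which is why dropping the empty name keeps default-event consumers working.

```rust
/// Hypothetical stand-in for the `SseEvent` struct named in the commit;
/// the real field layout is not shown in this PR.
#[derive(Debug, Clone, PartialEq)]
pub struct SseEvent {
    /// Empty string means "no explicit event name".
    pub event: String,
    pub data: String,
}

/// Render one event in SSE wire format. The `event:` line is omitted when
/// the name is empty, so clients treat it as a default `message` event.
pub fn to_sse_wire(ev: &SseEvent) -> String {
    let mut out = String::new();
    if !ev.event.is_empty() {
        out.push_str("event: ");
        out.push_str(&ev.event);
        out.push('\n');
    }
    // Each line of the payload gets its own `data:` field.
    for line in ev.data.split('\n') {
        out.push_str("data: ");
        out.push_str(line);
        out.push('\n');
    }
    // A blank line terminates the event.
    out.push('\n');
    out
}
```

A WebSocket-side consumer would read `ev.event` directly instead of rendering it, which is how the same struct can carry a meaningful name on one transport and none on the other.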
…aming path
When upstream returns a non-JSON error body (plain text, HTML),
responses_sse_stream now wraps it in the same {"error": ...}
object shape used by the non-streaming branch of responses_passthrough
instead of emitting a bare JSON string. Restores byte-compatible
error responses across both code paths.
Adds responses_ws module to lunaroute-ingress with:
- ClientEvent enum (ResponseCreate variant)
- FrameError enum (MalformedJson, MissingField, UnsupportedType)
- parse_client_frame() for response.create frames
- error_frame() helper for structured WS error payloads
- 6 unit tests (all passing)
Suppresses dead_code lints until Task 4 wires the handler. Foundation for Task 4 WebSocket handler.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
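A rough sketch of how `FrameError` and `error_frame()` might fit together, using only the standard library. The variant payloads, the `code()`/`message()` helpers, and the JSON shape (`{"type":"error","error":{...}}`) are assumptions for illustration; the PR only confirms that error frames carry `type=error` and a code such as `unsupported_event_type`.

```rust
/// Parse failures named in the commit message. The payload types carried
/// by each variant are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum FrameError {
    MalformedJson,
    MissingField(&'static str),
    UnsupportedType(String),
}

impl FrameError {
    /// Machine-readable code. `unsupported_event_type` appears in the PR's
    /// tests; the other two mappings are assumptions.
    pub fn code(&self) -> &'static str {
        match self {
            FrameError::MalformedJson => "malformed_json",
            FrameError::MissingField(_) => "invalid_request",
            FrameError::UnsupportedType(_) => "unsupported_event_type",
        }
    }

    /// Human-readable message (wording is hypothetical).
    pub fn message(&self) -> String {
        match self {
            FrameError::MalformedJson => "frame is not valid JSON".to_string(),
            FrameError::MissingField(f) => format!("missing required field `{f}`"),
            FrameError::UnsupportedType(t) => format!("unsupported event type `{t}`"),
        }
    }
}

/// Minimal JSON string escaping so the sketch needs no external crates.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the text payload of a structured WS error frame.
/// The nesting under `error` is an assumed shape, not taken from the PR.
pub fn error_frame(err: &FrameError) -> String {
    format!(
        "{{\"type\":\"error\",\"error\":{{\"code\":\"{}\",\"message\":\"{}\"}}}}",
        err.code(),
        json_escape(&err.message())
    )
}
```

Keeping the code mapping on the enum means the same string can serve as both the error-frame code and a metrics label, though whether lunaroute shares them that way is not stated here.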
- Always inject content-type: application/json on the upstream POST built
from the WS upgrade headers (upgrade GET has no body, so the header is
typically absent).
- Overwrite response.stream unconditionally so clients that send
{"stream": false} don't break the inherently-streaming WS transport.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
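The header half of this fix could look roughly like the sketch below. `upstream_headers` is a hypothetical name, headers are modeled as plain string pairs rather than an HTTP library type, and the list of dropped hop-by-hop and WebSocket handshake headers is an assumption; the commit only states that `content-type: application/json` is always injected. The `stream` overwrite is not shown because it needs a JSON library.

```rust
/// Build headers for the upstream POST from the headers of the WS upgrade GET.
/// The set of headers dropped here is an assumption for illustration.
pub fn upstream_headers(upgrade: &[(String, String)]) -> Vec<(String, String)> {
    let mut out: Vec<(String, String)> = upgrade
        .iter()
        .filter(|(name, _)| {
            let n = name.to_ascii_lowercase();
            // Drop handshake-only headers and any stale body headers.
            !(n == "connection"
                || n == "upgrade"
                || n == "content-type"
                || n == "content-length"
                || n.starts_with("sec-websocket-"))
        })
        .cloned()
        .collect();
    // The upgrade GET has no body, so the header is usually absent:
    // always set it explicitly for the JSON POST.
    out.push(("content-type".to_string(), "application/json".to_string()));
    out
}
```

Removing any inherited `content-type` before pushing the new one avoids sending two conflicting values upstream.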
Adds responses_websocket.rs with three tests that spin up a real axum TCP listener, drive it with a tokio-tungstenite client, and assert full-pipeline behavior against a wiremock upstream:
- ws_response_create_streams_events_and_records_session: verifies 3 SSE
  events are forwarded as WS text frames and session events (Started,
  RequestRecorded, Completed) are stored.
- ws_runs_two_response_creates_sequentially_on_one_connection: sends two
  response.create frames on one connection; wiremock .expect(2) asserts
  both upstream POSTs land.
- ws_sends_error_frame_for_unsupported_event_type: confirms a
  response.cancel frame is rejected with type=error /
  code=unsupported_event_type before hitting the upstream.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…oute
- First test's mock now requires content-type: application/json and
  stream=true in the upstream POST; the client frame sends stream=false,
  proving the WS bridge overwrites it.
- Unsupported event type test now asserts zero upstream requests, proving
  malformed frames don't leak into upstream.
- New test exercises the /responses compat path (no /v1 prefix), pinning
  the dual-route wiring.
Parses event type from frame payload so ws_frames_total labels map to
real Responses API events (response.create/completed/...) rather than
WebSocket transport placeholders ("text", "message"). Extends
instrumentation to outbound error frames and inbound binary frames so
abnormal sessions show up in dashboards.
Addresses roborev findings on 932e2fd.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Record ws_frames_total on every inbound text frame, labeling parse failures as malformed_json/unsupported_event_type/invalid_request so bad-traffic volume is visible alongside successful response.create events. Adds unit tests pinning event_label precedence (JSON type > SSE event name > "message"). Addresses roborev findings on c1f54b5.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
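The stated precedence (JSON type > SSE event name > "message") can be sketched as a small pure function. The signature below is an assumption: it takes the already-extracted JSON `type` field and SSE event name as options, and treats empty strings as absent, which the PR does not spell out.

```rust
/// Pick the metrics label for a frame.
/// Precedence: JSON `type` field, then SSE event name, then "message".
/// Treating empty strings as missing is an assumption of this sketch.
pub fn event_label<'a>(json_type: Option<&'a str>, sse_event: Option<&'a str>) -> &'a str {
    json_type
        .filter(|t| !t.is_empty())
        .or(sse_event.filter(|e| !e.is_empty()))
        .unwrap_or("message")
}
```

For example, `event_label(Some("response.completed"), Some("delta"))` yields `response.completed`, and `event_label(None, None)` falls back to `message`.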
Use `codex exec` (the documented subcommand form) instead of bare `codex "…"`, and replace the non-existent `lunaroute-server logs -f` with running the server in the foreground at debug level so WS lifecycle logs are actually visible. Addresses roborev findings on c66dd76.
lunaroute-server builds its log filter from config + LUNAROUTE_LOG_LEVEL, not RUST_LOG. Fix the command so debug-level WS lifecycle logs actually appear. Addresses roborev finding on 7a738c0.
- lunaroute-routing: SwitchReason is Copy, use the copy instead of .clone() in the test
- integration tests: use struct-update syntax for HttpClientConfig instead of mut + default+reassign

These predate the branch but fail CI under -D warnings. Fixing here so the Codex WS PR can merge.
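Both cleanups follow common Rust idioms, sketched below with stand-in types. The `SwitchReason` variants and the `HttpClientConfig` fields and defaults are hypothetical; only the type names come from the commit message.

```rust
/// Stand-in for the real enum; the variants are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SwitchReason {
    RateLimited,
    Manual,
}

/// Stand-in for the real config; the fields and defaults are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct HttpClientConfig {
    pub timeout_secs: u64,
    pub max_retries: u32,
    pub pool_size: usize,
}

impl Default for HttpClientConfig {
    fn default() -> Self {
        Self { timeout_secs: 30, max_retries: 3, pool_size: 10 }
    }
}

/// Struct-update syntax: override one field and take the rest from
/// `Default`, instead of `let mut c = Default::default(); c.timeout_secs = 5;`.
pub fn test_config() -> HttpClientConfig {
    HttpClientConfig { timeout_secs: 5, ..Default::default() }
}

/// A `Copy` type is duplicated by plain assignment; no `.clone()` is needed
/// and the original binding stays usable.
pub fn duplicate(reason: SwitchReason) -> (SwitchReason, SwitchReason) {
    let copy = reason;
    (reason, copy)
}
```

Neither change alters behavior; they only remove patterns that clippy warns about when warnings are promoted to errors.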
Rust 1.95's clippy::unnecessary_sort_by flags comparator closures that reduce to a Reverse key. CI runs clippy on 1.95 so both sorts in router.rs now fail -D warnings. Local stable is 1.94 which didn't catch this. Semantics unchanged.
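The kind of rewrite this lint asks for looks roughly like the following. The `Route` struct and its `priority` field are hypothetical; the actual sorts in router.rs are not shown here. Both forms are stable sorts that order by descending key, so ties keep their original relative order in either version.

```rust
use std::cmp::Reverse;

/// Hypothetical element type; router.rs sorts something else.
#[derive(Debug, Clone, PartialEq)]
pub struct Route {
    pub name: &'static str,
    pub priority: u32,
}

/// Comparator form: swapping `a` and `b` gives descending order.
/// This is the shape the lint flags.
pub fn sort_with_comparator(routes: &mut [Route]) {
    routes.sort_by(|a, b| b.priority.cmp(&a.priority));
}

/// Key form: the same descending order expressed with `Reverse`.
pub fn sort_with_reverse_key(routes: &mut [Route]) {
    routes.sort_by_key(|r| Reverse(r.priority));
}
```

The key form states the intent (sort by priority, descending) directly instead of encoding it in the argument order of the comparator.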
Summary
- …`GET /v1/responses` and `GET /responses` so Codex CLI with `supports_websockets = true` can connect.
- …`response.create` frames, drives the existing HTTP Responses pipeline upstream (`stream=true` always), and translates SSE events back as WS text frames. No new egress code.
- …`codex_auth`, header filtering. Sequential in-flight per connection.

What's in the diff
- …`responses_sse_stream` from `responses_passthrough`: pure refactor, HTTP behavior unchanged
- `responses_ws.rs` with frame parser, handler, read loop, terminal-event detection, structured error frames
- …`/responses` and `/v1/responses`
- `lunaroute_ws_connections_{opened,closed}_total`, `lunaroute_ws_connection_duration_seconds`, `lunaroute_ws_frames_total{endpoint,direction,type}`; labels use parsed Responses event types (`response.create`, `response.completed`, …), plus `malformed_json` / `unsupported_event_type` / `invalid_request` / `binary` / `error` for abnormal paths
- …`/responses` compat path
- …`event_label` precedence

Design: `docs/superpowers/specs/2026-04-16-codex-websocket-responses-design.md`
Plan: `docs/superpowers/plans/2026-04-16-codex-websocket-responses.md`
Test plan
🤖 Generated with Claude Code