api-proxy sidecar does not support WebSocket upgrades (Codex /v1/responses streaming fails) #1485

@lpcox


Problem

The api-proxy sidecar (containers/api-proxy/server.js) does not handle HTTP WebSocket upgrade requests. When the Codex CLI connects to the OpenAI proxy at ws://172.30.0.30:10000/v1/responses to stream responses via WebSocket, the connection fails because the proxy treats the upgrade request as a normal HTTP request.

Observed behavior (Smoke Codex CI)

From run 23688803182:

ERROR: Reconnecting... 2/5
ERROR: Reconnecting... 3/5
ERROR: Reconnecting... 4/5
ERROR: Reconnecting... 5/5
WARN: falling back to HTTP

After exhausting WebSocket retries the Codex CLI falls back to HTTP, but the agent ultimately produces no safe outputs (no PR comments, no labels), causing the smoke test to fail.

Root cause

The api-proxy is a plain http.createServer() with only request/response handling:

  1. No .on('upgrade') event handler — WebSocket upgrades are not intercepted
  2. No ws or WebSocket library in dependencies
  3. No HTTP 101 Switching Protocols handling
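For context on item 3: a server that completes the 101 handshake by hand has to return a Sec-WebSocket-Accept value derived from the client's Sec-WebSocket-Key, as RFC 6455 specifies. The following is a minimal sketch using only Node's crypto module. The function names computeAcceptKey and buildSwitchingProtocolsResponse are hypothetical and are not part of server.js.

```javascript
const crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Hypothetical helper: derive Sec-WebSocket-Accept from the client's key.
function computeAcceptKey(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WS_GUID)
    .digest('base64');
}

// Hypothetical helper: build the raw 101 response for a given client key.
function buildSwitchingProtocolsResponse(secWebSocketKey) {
  return [
    'HTTP/1.1 101 Switching Protocols',
    'Upgrade: websocket',
    'Connection: Upgrade',
    `Sec-WebSocket-Accept: ${computeAcceptKey(secWebSocketKey)}`,
    '',
    '',
  ].join('\r\n');
}
```

A WebSocket library such as ws performs this step internally, so the sketch mainly matters if the handshake is written by hand.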

When a WebSocket upgrade request arrives (Connection: Upgrade, Upgrade: websocket):

  1. The proxy treats it as a normal HTTP GET
  2. Collects the (empty) body via req.on('data') / req.on('end')
  3. Forwards via https.request() to the upstream API — which fails because the WebSocket handshake is lost
  4. The upstream API rejects it (not a valid HTTP request) and the connection drops
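The header check described above can be sketched as a small predicate, which could be used, for example, to log upgrade requests that reach the normal request path. This is an illustration only; isWebSocketUpgrade is a hypothetical name and it assumes Node-style lowercased header names as found in req.headers.

```javascript
// Hypothetical helper: true when the request headers ask for a WebSocket upgrade.
// Expects Node-style headers (lowercased names), e.g. req.headers.
function isWebSocketUpgrade(headers) {
  const connection = String(headers.connection || '').toLowerCase();
  const upgrade = String(headers.upgrade || '').toLowerCase();
  // Connection is a comma-separated token list, e.g. "keep-alive, Upgrade".
  const tokens = connection.split(',').map((t) => t.trim());
  return tokens.includes('upgrade') && upgrade === 'websocket';
}
```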

Impact

  • Codex CLI's preferred transport (WebSocket streaming via /v1/responses) is broken
  • HTTP fallback works for the API call itself, but it is reached only after ~90 seconds of failed reconnection attempts, which wastes time and makes runs unstable
  • Smoke Codex tests fail intermittently because the agent runs out of time budget after retry delays

Proposed fix

Option A: WebSocket tunnel forwarding (recommended)

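Add an upgrade event handler to each HTTP server that opens an upstream WebSocket with the auth header injected, completes the client handshake once the upstream connection is open, and relays messages between the two connections:

const { WebSocket, WebSocketServer } = require('ws');
// HttpsProxyAgent and proxyUrl (the Squid URL) are assumed to already exist in server.js

// noServer mode lets ws complete the client handshake (including Sec-WebSocket-Accept)
const wss = new WebSocketServer({ noServer: true });

server.on('upgrade', (req, socket, head) => {
  // Build upstream WebSocket URL
  const upstreamUrl = `wss://${OPENAI_API_TARGET}${req.url}`;

  // Create upstream WebSocket with auth headers. Client headers are not spread in:
  // they would override the injected key and leak the client's Host and handshake headers.
  const upstream = new WebSocket(upstreamUrl, {
    headers: { 'Authorization': `Bearer ${OPENAI_API_KEY}` },
    agent: new HttpsProxyAgent(proxyUrl), // route through Squid
  });

  // On upstream open, complete the client handshake
  upstream.on('open', () => {
    wss.handleUpgrade(req, socket, head, (client) => {
      // Relay messages bidirectionally, preserving text vs. binary
      client.on('message', (data, isBinary) => upstream.send(data, { binary: isBinary }));
      upstream.on('message', (data, isBinary) => client.send(data, { binary: isBinary }));
      client.on('close', () => upstream.close());
      upstream.on('close', () => client.close());
      client.on('error', () => upstream.terminate());
    });
  });

  upstream.on('error', (err) => {
    logRequest('error', 'websocket_upgrade_failed', { error: err.message });
    socket.destroy();
  });
});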

Dependencies to add: ws (npm package)
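If client headers are forwarded to the upstream WebSocket at all, handshake and hop-specific headers generally should not be copied, and the injected Authorization should win over anything the client sent. A minimal sketch of such a filter follows. The name buildUpstreamHeaders and the exact drop list are assumptions for illustration, not existing server.js code.

```javascript
// Headers that belong to the client-to-proxy hop or to the client's own
// handshake; names are lowercased, as in Node's req.headers.
const DROPPED_HEADERS = new Set([
  'host', 'connection', 'upgrade', 'authorization',
  'sec-websocket-key', 'sec-websocket-version',
  'sec-websocket-extensions', 'sec-websocket-accept',
  // Subprotocols would be passed via the ws client's protocols argument instead.
  'sec-websocket-protocol',
]);

// Hypothetical helper: copy safe client headers, then inject the proxy's key.
function buildUpstreamHeaders(clientHeaders, apiKey) {
  const out = {};
  for (const [name, value] of Object.entries(clientHeaders)) {
    const lower = name.toLowerCase();
    if (!DROPPED_HEADERS.has(lower)) out[lower] = value;
  }
  // Set last so a client-supplied credential can never replace it.
  out.authorization = `Bearer ${apiKey}`;
  return out;
}
```

The result could be passed as the headers option when constructing the upstream WebSocket.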

Option B: Raw socket tunnel (simpler, no ws dependency)

Use Node.js http/tls to establish a raw TCP tunnel to the upstream server through Squid (via CONNECT), then replay the upgrade request with injected auth headers:

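const http = require('http');
const tls = require('tls');

server.on('upgrade', (req, socket, head) => {
  // Establish CONNECT tunnel through Squid to upstream
  const connectReq = http.request({
    host: SQUID_HOST, port: SQUID_PORT,
    method: 'CONNECT',
    path: `${OPENAI_API_TARGET}:443`,
  });
  connectReq.on('connect', (res, upstream) => {
    // Squid must accept the tunnel before anything is sent through it
    if (res.statusCode !== 200) { upstream.destroy(); socket.destroy(); return; }
    // TLS upgrade on the tunnel
    const tlsSocket = tls.connect({ socket: upstream, servername: OPENAI_API_TARGET });
    // Replay the upgrade request with auth. req.headers keys are lowercase, so the
    // lowercase host and authorization here replace the client's values instead of duplicating them.
    const headers = { ...req.headers, host: OPENAI_API_TARGET, authorization: `Bearer ${OPENAI_API_KEY}` };
    tlsSocket.write(`GET ${req.url} HTTP/1.1\r\n`);
    for (const [k, v] of Object.entries(headers)) tlsSocket.write(`${k}: ${v}\r\n`);
    tlsSocket.write('\r\n'); // blank line ends the header block
    if (head.length > 0) tlsSocket.write(head); // bytes the client sent right after its headers
    // Bidirectional pipe; the upstream's 101 response flows back to the client unchanged
    tlsSocket.pipe(socket);
    socket.pipe(tlsSocket);
    tlsSocket.on('error', () => socket.destroy());
    socket.on('error', () => tlsSocket.destroy());
  });
  connectReq.on('error', () => socket.destroy());
  connectReq.end();
});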

This avoids adding the ws dependency but requires more careful error handling.
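One way to approach the more careful error handling mentioned above is to make sure both halves of the tunnel are torn down together whenever either side errors or closes. The helper below is a sketch under that assumption. linkSockets is a hypothetical name and it relies only on the standard pipe, destroy, and event APIs of Node streams.

```javascript
// Hypothetical helper: pipe two sockets together and tear both down when
// either side errors or closes, so neither half of the tunnel is leaked.
function linkSockets(client, upstream) {
  const teardown = () => {
    client.destroy();
    upstream.destroy();
  };
  for (const s of [client, upstream]) {
    s.on('error', teardown);
    s.on('close', teardown);
  }
  client.pipe(upstream);
  upstream.pipe(client);
}
```

In Option B this could stand in for the two pipe() calls, for example linkSockets(socket, tlsSocket).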

Additional considerations

  • All four proxy servers (OpenAI:10000, Anthropic:10001, Copilot:10002, OpenCode:10004) should add upgrade handling, even if only OpenAI uses it today
  • Rate limiting should apply to the initial upgrade request (not per-frame)
  • Logging should record WebSocket upgrade attempts and outcomes
  • The Squid proxy already allows CONNECT tunnels for HTTPS, so WebSocket-over-TLS should work through the existing ACL rules
  • Tests should verify WebSocket upgrade + auth injection + Squid routing

Labels

bug (Something isn't working)
