Conversation
Add proxyWebSocket() function using raw CONNECT tunnel through Squid (no new dependencies). Handles WebSocket upgrade requests on all four proxy servers: OpenAI (10000), Anthropic (10001), Copilot (10002), OpenCode (10004). - Validates upgrade type and URL (rejects non-WebSocket, SSRF prevention) - Rate-limits the initial upgrade request - Opens CONNECT tunnel through Squid to upstream:443 - TLS-upgrades the tunnel and replays Upgrade request with auth injected - Strips client-supplied auth headers; injects proxy credentials - Bidirectionally pipes raw TCP sockets for the WebSocket session - Logs upgrade start/complete/failed events with metrics - Adds 12 unit tests covering validation and tunnel paths" Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/f1b847f0-b4fd-412c-b244-1132754fe11b Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
…clarity - Use separate close handlers for socket/tlsSocket to avoid double-destroy - Add comment explaining WebSocket frame bytes in test fixture Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/f1b847f0-b4fd-412c-b244-1132754fe11b Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
✅ Coverage Check PassedOverall Coverage
📁 Per-file Coverage Changes (1 files)
Coverage comparison generated by |
The detection job (added by gh-aw v0.64.2 compiler) runs npm ci and npm run build but has no Checkout repository step, causing npm ci to fail with EUSAGE (missing package-lock.json). Add checkout step to the lock file and update the postprocessing script to automatically inject checkout steps before awf build steps in any job that lacks one. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Smoke Test Results — PASS
|
🤖 Smoke Test Results✅ GitHub MCP — Last 2 merged PRs: #1484 "[WIP] Fix failing GitHub Actions workflow agent", #1483 "fix: recompile smoke-codex workflow with gh-aw v0.64.2 to unblock github.com" Overall: PASS | Author:
|
Chroot Version Comparison Results
Overall: ❌ FAILED — Python and Node.js versions differ between host and chroot.
|
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
|
There was a problem hiding this comment.
Pull request overview
Adds WebSocket upgrade handling to the api-proxy sidecar so Codex CLI /v1/responses streaming works correctly through the Squid proxy, and updates CI smoke workflow post-processing to ensure the repo is checked out before local AWF build steps.
Changes:
- Add
proxyWebSocket(...)implementation and registerupgradehandlers for OpenAI/Anthropic/Copilot/OpenCode proxy servers. - Add unit tests covering WebSocket upgrade validation, CONNECT/TLS failure modes, header injection/stripping, and
headforwarding. - Ensure smoke workflows include a checkout step before the injected local build steps (and apply this to smoke-codex.lock.yml).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
containers/api-proxy/server.js |
Implements WebSocket upgrade tunneling via CONNECT→TLS through Squid; wires upgrade handlers. |
containers/api-proxy/server.test.js |
Adds unit tests for proxyWebSocket validation and failure/success paths. |
scripts/ci/postprocess-smoke-workflows.ts |
Injects checkout steps before “Install awf dependencies” when missing. |
.github/workflows/smoke-codex.lock.yml |
Adds a checkout step before npm ci/npm run build in the smoke codex workflow. |
Comments suppressed due to low confidence (1)
containers/api-proxy/server.js:631
- The client
socketonly gets an'error'handler inside thesecureConnectcallback. If the client disconnects or the socket errors earlier (during CONNECT/TLS), Node can emit an'error'event with no listener, which can crash the process and/or leakactive_requests(finalize never runs). Attachsocket.once('error', …)/socket.once('close', …)immediately after validation (and have it callabort/finalize) so early failures are handled safely.
// Finalize once when either side closes; destroy the other side.
socket.once('close', () => { finalize(false); tlsSocket.destroy(); });
tlsSocket.once('close', () => { finalize(false); socket.destroy(); });
// Suppress unhandled-error crashes; destroy triggers the close handler.
socket.on('error', () => socket.destroy());
tlsSocket.on('error', () => tlsSocket.destroy());
});
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const checkoutStep = [ | ||
| `${indent}- name: Checkout repository`, | ||
| `${indent} uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2`, | ||
| `${indent} with:`, | ||
| `${indent} persist-credentials: false`, | ||
| ].join('\n'); | ||
| lines.splice(i, 0, checkoutStep); | ||
| injectedCheckouts++; | ||
| i += 4; // Skip past the inserted lines |
There was a problem hiding this comment.
checkoutStep is built as a multi-line string and inserted via lines.splice(i, 0, checkoutStep), which adds only one array element containing embedded \ns. That makes i += 4 incorrect (it didn’t insert 4 elements) and can cause the loop to skip/behave incorrectly when there are multiple install steps in a file. Build checkoutStepLines as an array and splice with ...checkoutStepLines, then advance i by the number of inserted lines (or remove the manual increment).
| const checkoutStep = [ | |
| `${indent}- name: Checkout repository`, | |
| `${indent} uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2`, | |
| `${indent} with:`, | |
| `${indent} persist-credentials: false`, | |
| ].join('\n'); | |
| lines.splice(i, 0, checkoutStep); | |
| injectedCheckouts++; | |
| i += 4; // Skip past the inserted lines | |
| const checkoutStepLines = [ | |
| `${indent}- name: Checkout repository`, | |
| `${indent} uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2`, | |
| `${indent} with:`, | |
| `${indent} persist-credentials: false`, | |
| ]; | |
| lines.splice(i, 0, ...checkoutStepLines); | |
| injectedCheckouts++; |
| // ── Step 1: CONNECT tunnel through Squid to targetHost:443 ──────────── | ||
| const connectReq = http.request({ | ||
| host: proxyHost, | ||
| port: proxyPort, | ||
| method: 'CONNECT', | ||
| path: `${targetHost}:443`, | ||
| headers: { 'Host': `${targetHost}:443` }, | ||
| }); | ||
|
|
||
| connectReq.once('error', (err) => abort(`CONNECT error: ${err.message}`)); | ||
|
|
||
| connectReq.once('connect', (connectRes, tunnel) => { | ||
| if (connectRes.statusCode !== 200) { | ||
| abort(`CONNECT failed: HTTP ${connectRes.statusCode}`, tunnel); | ||
| return; | ||
| } | ||
|
|
||
| // ── Step 2: TLS-upgrade the raw tunnel ────────────────────────────── | ||
| const tlsSocket = tls.connect({ socket: tunnel, servername: targetHost, rejectUnauthorized: true }); | ||
|
|
||
| // Pre-TLS error handler: removed once TLS is established. | ||
| const onTlsError = (err) => abort(`TLS handshake error: ${err.message}`, tunnel); | ||
| tlsSocket.once('error', onTlsError); | ||
|
|
There was a problem hiding this comment.
The CONNECT and TLS handshake path has no explicit timeouts (connectReq or the TLS socket). If Squid/upstream stalls mid-handshake, the sockets (and active_requests gauge) can hang indefinitely. Consider setting reasonable timeouts for CONNECT and TLS negotiation and calling abort(...) on timeout to ensure resources/metrics are released.
| describe('proxy configuration errors', () => { | ||
| it('returns 502 when HTTPS_PROXY is not configured', () => { | ||
| // The module was loaded without HTTPS_PROXY; proxyWebSocket should fail-safe. | ||
| const socket = makeMockSocket(); | ||
| proxyWebSocket(makeUpgradeReq(), socket, Buffer.alloc(0), 'api.openai.com', {}, 'openai'); | ||
| expect(socket.write).toHaveBeenCalledWith(expect.stringContaining('HTTP/1.1 502 Bad Gateway')); | ||
| expect(socket.destroy).toHaveBeenCalled(); | ||
| }); |
There was a problem hiding this comment.
This test assumes proxyWebSocket was loaded with HTTPS_PROXY unset, but ./server is required at the top of the file so the module-level HTTPS_PROXY constant will reflect whatever env the test runner already has. In environments where HTTPS_PROXY is set by default, this will fail. Make the test deterministic by deleting process.env.HTTPS_PROXY/HTTP_PROXY before requiring the module (e.g. via jest.isolateModules/jest.resetModules) and restoring afterward.
| return new Promise(resolve => setTimeout(() => { | ||
| // The upgrade request is written as a string to tlsSocket | ||
| const upgradeWrite = tlsSocket.write.mock.calls.find( | ||
| c => typeof c[0] === 'string' && c[0].startsWith('GET ') | ||
| ); | ||
| expect(upgradeWrite).toBeDefined(); | ||
| const upgradeReqStr = upgradeWrite[0]; | ||
| expect(upgradeReqStr).toContain('Authorization: Bearer secret'); | ||
| expect(upgradeReqStr).toContain('host: api.openai.com'); | ||
| // Both sides should be piped | ||
| expect(tlsSocket.pipe).toHaveBeenCalledWith(socket); | ||
| expect(socket.pipe).toHaveBeenCalledWith(tlsSocket); | ||
| resolve(); | ||
| }, 30)); |
There was a problem hiding this comment.
Several of these tests rely on small fixed setTimeout(..., 30) delays to wait for async events, which can be flaky on slower CI machines. Prefer awaiting a specific event/condition (e.g., resolve a Promise when tlsSocket.write is called or when secureConnect is emitted) rather than sleeping for an arbitrary duration.
| const indent = installMatch[1]; | ||
| const checkoutStep = [ | ||
| `${indent}- name: Checkout repository`, | ||
| `${indent} uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2`, |
There was a problem hiding this comment.
This injects actions/checkout@de0fac2e... with a # v6.0.2 comment. Elsewhere in the repo the same SHA is commented as # v4 (e.g. .github/workflows/build.yml:28), so the version annotation is currently inconsistent. Consider using the same comment format/version across workflows to avoid confusion about what the SHA represents.
| `${indent} uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2`, | |
| `${indent} uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v4`, |
| // Finalize once when either side closes; destroy the other side. | ||
| socket.once('close', () => { finalize(false); tlsSocket.destroy(); }); | ||
| tlsSocket.once('close', () => { finalize(false); socket.destroy(); }); |
There was a problem hiding this comment.
finalize(false) is triggered solely by socket close events, so an upstream handshake rejection (e.g. 401/403/400 instead of 101 Switching Protocols) will still be logged/recorded as websocket_upgrade_complete with status_class: '1xx'. If you want metrics/logs to reflect actual upgrade success, consider parsing the first upstream response line/headers to confirm status=101 before marking success; otherwise treat early close as an upgrade failure.
This issue also appears on line 624 of the same file.
| // Finalize once when either side closes; destroy the other side. | |
| socket.once('close', () => { finalize(false); tlsSocket.destroy(); }); | |
| tlsSocket.once('close', () => { finalize(false); socket.destroy(); }); | |
| // Track whether the upstream actually completed the HTTP 101 upgrade. | |
| let upgradeEstablished = false; | |
| const inspectStatusLine = (chunk) => { | |
| // Only need the first response chunk to determine if the upgrade succeeded. | |
| try { | |
| const str = chunk.toString('ascii'); | |
| const firstLineEnd = str.indexOf('\r\n'); | |
| const statusLine = firstLineEnd === -1 ? str : str.slice(0, firstLineEnd); | |
| const match = /^HTTP\/1\.[01]\s+(\d{3})/.exec(statusLine); | |
| if (match && match[1] === '101') { | |
| upgradeEstablished = true; | |
| } | |
| } finally { | |
| // Ensure we don't keep inspecting further data chunks. | |
| tlsSocket.removeListener('data', inspectStatusLine); | |
| } | |
| }; | |
| tlsSocket.once('data', inspectStatusLine); | |
| // Finalize once when either side closes; destroy the other side. | |
| socket.once('close', () => { finalize(!upgradeEstablished); tlsSocket.destroy(); }); | |
| tlsSocket.once('close', () => { finalize(!upgradeEstablished); socket.destroy(); }); |
| forwardHeaders[name] = value; | ||
| } | ||
| } | ||
| Object.assign(forwardHeaders, injectHeaders); |
There was a problem hiding this comment.
proxyWebSocket generates a requestId when the client doesn't supply a valid x-request-id, but that generated ID is never forwarded upstream. In proxyRequest you always set headers['x-request-id'] = requestId; doing the same here would keep tracing/log correlation consistent for WebSocket upgrades too.
| Object.assign(forwardHeaders, injectHeaders); | |
| Object.assign(forwardHeaders, injectHeaders); | |
| forwardHeaders['x-request-id'] = requestId; // Ensure consistent tracing for WebSocket upgrades |
| - name: Install Codex CLI | ||
| run: npm install -g @openai/codex@latest | ||
| - name: Checkout repository | ||
| uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 |
There was a problem hiding this comment.
The pinned actions/checkout@de0fac2e... SHA is annotated here as # v6.0.2, but the same SHA is annotated as # v4 in other workflows (e.g. .github/workflows/build.yml:28). Please align the version comment with the repo’s standard so the pin is unambiguous.
| uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 | |
| uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v4 |
| opencodeServer.on('upgrade', (req, socket, head) => { | ||
| const anthropicHeaders = { 'x-api-key': ANTHROPIC_API_KEY }; | ||
| if (!req.headers['anthropic-version']) { | ||
| anthropicHeaders['anthropic-version'] = '2023-06-01'; | ||
| } | ||
| proxyWebSocket(req, socket, head, ANTHROPIC_API_TARGET, anthropicHeaders, 'opencode'); | ||
| }); |
There was a problem hiding this comment.
In the OpenCode server, the new WebSocket upgrade path uses provider: 'opencode', but the existing HTTP handler just above still calls proxyRequest(...) without passing a provider (so metrics/rate limiting labels differ and may end up as provider=undefined). Consider updating the HTTP path to pass 'opencode' (and basePath if relevant) so both HTTP and WS requests are attributed consistently.
The api-proxy sidecar had no
upgradeevent handler, so WebSocket connections from Codex CLI (ws://172.30.0.30:10000/v1/responses) were silently mishandled — exhausting retries (~90s) before falling back to broken HTTP, causing smoke test failures.Approach
Raw CONNECT tunnel through Squid (Node.js built-ins only, no new dependencies):
websocketonly) and URL (relative paths, SSRF prevention)GET … HTTP/1.1 / Upgrade: websocketrequest with auth injected, client-supplied auth strippedChanges
server.js:proxyWebSocket(req, socket, head, targetHost, injectHeaders, provider, basePath)— full metrics (gaugeInc/Dec,requests_total,request_duration_ms) and structured logging (websocket_upgrade_start/complete/failed)upgradehandlers: OpenAI :10000, Anthropic :10001, Copilot :10002, OpenCode :10004; unconfigured OpenAI server returns 503server.test.js: 12 new unit tests — validation rejection, CONNECT/TLS failure paths, auth injection, client-auth stripping,headbyte forwarding, Squid proxy routing📍 Connect Copilot coding agent with Jira, Azure Boards or Linear to delegate work to Copilot in one click without leaving your project management tool.