feat(byte-codec): route all transports through byte-faithful codec (xxj.2)#63
Merged
Conversation
…[LAW:single-enforcer] (xxj.2)
Replace per-site TextDecoder calls with bytesToLatin1 from byte-codec.ts:
- connectors/websocket/transport.ts: remove BINARY_DECODER (the active UTF-8 bug for binary frames); decodeFrame now uses bytesToLatin1 for ArrayBuffer and ArrayBufferView paths
- connectors/websocket/server.ts: Uint8Array text-frame path uses bytesToLatin1 (bridge JSON is pure-ASCII; latin1 = UTF-8 for all current protocol values)
- transport/spawn.ts: setEncoding('latin1') documented as [LAW:single-enforcer] exception — Node-only Buffer.toString('latin1') is genuinely 1:1 byte↔code-unit (unlike browser TextDecoder('latin1') which is windows-1252)
Tests updated: ArrayBuffer test now probes the 0x80–0x9F landmine range directly (bytesToLatin1 vs TextDecoder('latin1') diverge exactly there) rather than asserting UTF-8 round-trip behavior.
There was a problem hiding this comment.
Pull request overview
Routes all transport-layer byte→string conversions through the byte-faithful bytesToLatin1 codec, fixing a UTF-8 corruption bug for binary WebSocket frames in the websocketTransport adapter and aligning the bridge server's Uint8Array text-frame path. Also documents spawn.ts's setEncoding('latin1') as a sanctioned Node-only exception to the single-enforcer law.
Changes:
- Replace
TextDecoderwithbytesToLatin1inwebsocket/transport.ts(bothArrayBufferandArrayBufferViewbranches) and update its test to probe the 0x80–0x9F windows-1252 landmine range. - Replace
new TextDecoder().decode(data)withbytesToLatin1for theUint8Arraytext-frame path inwebsocket/server.ts. - Add
[LAW:single-enforcer] exceptioncomment inspawn.tsdocumenting whysetEncoding('latin1')is allowed.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/connectors/websocket/transport.ts | Removes BINARY_DECODER; routes both binary frame branches through bytesToLatin1. |
| src/connectors/websocket/server.ts | Switches the Uint8Array text-frame fallback to bytesToLatin1. |
| src/transport/spawn.ts | Documents setEncoding('latin1') as a Node-only single-enforcer exception. |
| tests/unit/websocket-transport.test.ts | Updates ArrayBuffer/typed-array frame tests to assert byte-faithful decoding across 0x80–0x9F. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
brandon-fryslie
added a commit
that referenced
this pull request
May 27, 2026
…terals PaneBytesEnvelope = PaneOutputMessage requires type: "output" since PR #63 aligned the wire envelope with the protocol type. The satisfies checks in the renderer-side tests were missing the discriminant field.
brandon-fryslie
added a commit
that referenced
this pull request
May 27, 2026
#64) * test(byte-codec): cross-transport byte-faithfulness contract — [LAW:behavior-not-structure] (xxj.3) Adds 8 contract tests asserting ∀ byte b in probe payload, every transport's onData delivers a string where charCodeAt(i) === b. Node file covers websocketTransport (ArrayBuffer, Uint8Array, sub-array view with non-zero byteOffset) and the spawn transport's setEncoding('latin1') path via a PassThrough stream. Browser file runs the same websocketTransport assertions in happy-dom, guarding against future TextDecoder('latin1') regression (windows-1252 remaps 0x80-0x9F in browsers). Canonical probe payload in _helpers/probe-bytes.ts: windows-1252 landmine bytes 0x80/0x81/0x8D/0x8F/0x9D/0x9F, emoji U+1F600 (F0 9F 98 80), CJK U+4E2D (E4 B8 AD), ASCII baseline. [LAW:single-enforcer] probe payload defined once in probe-bytes.ts [LAW:behavior-not-structure] tests assert the per-byte contract, not decode impl * fix(lint): pre-existing ESLint errors blocking CI ReadonlyArray<T> → readonly T[], type → interface for PaneMeta, remove unused PaneBytesEnvelope import in electron/main.ts. None of these were introduced by this PR; all existed on master. * fix(test): align FakeWebSocket signatures and correct TextDecoder comment - send()/close() now match BrowserWebSocketLike parameter signatures - Comment corrected: TextDecoder('latin1') is windows-1252 everywhere per the WHATWG Encoding Standard, not just in browsers; only String.fromCharCode (bytesToLatin1) is byte-faithful in all runtimes * fix(test): clarify TextDecoder vs Node stream latin1 in spawn test comments TextDecoder('latin1') is windows-1252 in ALL WHATWG runtimes (not just browsers). Node stream setEncoding('latin1') is byte-faithful but Node-only. String.fromCharCode (bytesToLatin1) is the portable cross-runtime path. * fix(format): apply prettier to pre-existing formatting issues These files had pre-existing prettier violations that caused format:check to fail in CI. No logic changes — formatting only. * fix(typecheck): add missing type discriminant to PaneBytesEnvelope literals PaneBytesEnvelope = PaneOutputMessage requires type: "output" since PR #63 aligned the wire envelope with the protocol type. The satisfies checks in the renderer-side tests were missing the discriminant field. * refactor(test-helpers): extract assertByteFaithful + FakeWebSocket to shared helper [LAW:single-enforcer] The two contract test files defined identical assertByteFaithful and FakeWebSocket. Extracting to _helpers/websocket-fake.ts ensures a single change point if BrowserWebSocketLike evolves.
brandon-fryslie
added a commit
that referenced
this pull request
May 27, 2026
…[LAW:single-enforcer] (xxj.2) (#63) Replace per-site TextDecoder calls with bytesToLatin1 from byte-codec.ts: - connectors/websocket/transport.ts: remove BINARY_DECODER (the active UTF-8 bug for binary frames); decodeFrame now uses bytesToLatin1 for ArrayBuffer and ArrayBufferView paths - connectors/websocket/server.ts: Uint8Array text-frame path uses bytesToLatin1 (bridge JSON is pure-ASCII; latin1 = UTF-8 for all current protocol values) - transport/spawn.ts: setEncoding('latin1') documented as [LAW:single-enforcer] exception — Node-only Buffer.toString('latin1') is genuinely 1:1 byte↔code-unit (unlike browser TextDecoder('latin1') which is windows-1252) Tests updated: ArrayBuffer test now probes the 0x80–0x9F landmine range directly (bytesToLatin1 vs TextDecoder('latin1') diverge exactly there) rather than asserting UTF-8 round-trip behavior.
brandon-fryslie
added a commit
that referenced
this pull request
May 27, 2026
#64) * test(byte-codec): cross-transport byte-faithfulness contract — [LAW:behavior-not-structure] (xxj.3) Adds 8 contract tests asserting ∀ byte b in probe payload, every transport's onData delivers a string where charCodeAt(i) === b. Node file covers websocketTransport (ArrayBuffer, Uint8Array, sub-array view with non-zero byteOffset) and the spawn transport's setEncoding('latin1') path via a PassThrough stream. Browser file runs the same websocketTransport assertions in happy-dom, guarding against future TextDecoder('latin1') regression (windows-1252 remaps 0x80-0x9F in browsers). Canonical probe payload in _helpers/probe-bytes.ts: windows-1252 landmine bytes 0x80/0x81/0x8D/0x8F/0x9D/0x9F, emoji U+1F600 (F0 9F 98 80), CJK U+4E2D (E4 B8 AD), ASCII baseline. [LAW:single-enforcer] probe payload defined once in probe-bytes.ts [LAW:behavior-not-structure] tests assert the per-byte contract, not decode impl * fix(lint): pre-existing ESLint errors blocking CI ReadonlyArray<T> → readonly T[], type → interface for PaneMeta, remove unused PaneBytesEnvelope import in electron/main.ts. None of these were introduced by this PR; all existed on master. * fix(test): align FakeWebSocket signatures and correct TextDecoder comment - send()/close() now match BrowserWebSocketLike parameter signatures - Comment corrected: TextDecoder('latin1') is windows-1252 everywhere per the WHATWG Encoding Standard, not just in browsers; only String.fromCharCode (bytesToLatin1) is byte-faithful in all runtimes * fix(test): clarify TextDecoder vs Node stream latin1 in spawn test comments TextDecoder('latin1') is windows-1252 in ALL WHATWG runtimes (not just browsers). Node stream setEncoding('latin1') is byte-faithful but Node-only. String.fromCharCode (bytesToLatin1) is the portable cross-runtime path. * fix(format): apply prettier to pre-existing formatting issues These files had pre-existing prettier violations that caused format:check to fail in CI. No logic changes — formatting only. * fix(typecheck): add missing type discriminant to PaneBytesEnvelope literals PaneBytesEnvelope = PaneOutputMessage requires type: "output" since PR #63 aligned the wire envelope with the protocol type. The satisfies checks in the renderer-side tests were missing the discriminant field. * refactor(test-helpers): extract assertByteFaithful + FakeWebSocket to shared helper [LAW:single-enforcer] The two contract test files defined identical assertByteFaithful and FakeWebSocket. Extracting to _helpers/websocket-fake.ts ensures a single change point if BrowserWebSocketLike evolves.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
BINARY_DECODER = new TextDecoder()inwebsocket/transport.ts— the active UTF-8 bug for binary WebSocket frames — and replaces bothArrayBufferandArrayBufferViewpaths indecodeFramewithbytesToLatin1frombyte-codec.tsnew TextDecoder().decode(data)inwebsocket/server.tsfor theUint8Arraytext-frame path withbytesToLatin1(bridge protocol frames are pure-ASCII JSON; latin1 = UTF-8 for all current values)setEncoding('latin1')inspawn.tsas a[LAW:single-enforcer] exception— Node'sBuffer.toString('latin1')is genuinely 1:1 byte↔code-unit (unlike browserTextDecoder('latin1')which is windows-1252 for 0x80–0x9F)Test plan
TextDecoder("latin1")andbytesToLatin1diverge); asserts they arrive as the exact same latin1 code unitspnpm run test:all(unit + integration): 565 tests passpnpm run build: clean