Skip to content

feat(byte-codec): route all transports through byte-faithful codec (xxj.2)#63

Merged
brandon-fryslie merged 1 commit into
masterfrom
tmux-byte-codec-xxj.2
May 27, 2026
Merged

feat(byte-codec): route all transports through byte-faithful codec (xxj.2)#63
brandon-fryslie merged 1 commit into
masterfrom
tmux-byte-codec-xxj.2

Conversation

@brandon-fryslie

Copy link
Copy Markdown
Contributor

Summary

  • Removes BINARY_DECODER = new TextDecoder() in websocket/transport.ts — the active UTF-8 bug for binary WebSocket frames — and replaces both ArrayBuffer and ArrayBufferView paths in decodeFrame with bytesToLatin1 from byte-codec.ts
  • Replaces new TextDecoder().decode(data) in websocket/server.ts for the Uint8Array text-frame path with bytesToLatin1 (bridge protocol frames are pure-ASCII JSON; latin1 = UTF-8 for all current values)
  • Documents setEncoding('latin1') in spawn.ts as a [LAW:single-enforcer] exception — Node's Buffer.toString('latin1') is genuinely 1:1 byte↔code-unit (unlike browser TextDecoder('latin1') which is windows-1252 for 0x80–0x9F)

Test plan

  • ArrayBuffer decode test updated to probe bytes 0x80–0x9F (the windows-1252 landmine range where TextDecoder("latin1") and bytesToLatin1 diverge); asserts they arrive as the exact same latin1 code units
  • pnpm run test:all (unit + integration): 565 tests pass
  • pnpm run build: clean

…[LAW:single-enforcer] (xxj.2)

Replace per-site TextDecoder calls with bytesToLatin1 from byte-codec.ts:
- connectors/websocket/transport.ts: remove BINARY_DECODER (the active UTF-8 bug for binary frames); decodeFrame now uses bytesToLatin1 for ArrayBuffer and ArrayBufferView paths
- connectors/websocket/server.ts: Uint8Array text-frame path uses bytesToLatin1 (bridge JSON is pure-ASCII; latin1 = UTF-8 for all current protocol values)
- transport/spawn.ts: setEncoding('latin1') documented as [LAW:single-enforcer] exception — Node-only Buffer.toString('latin1') is genuinely 1:1 byte↔code-unit (unlike browser TextDecoder('latin1') which is windows-1252)

Tests updated: ArrayBuffer test now probes the 0x80–0x9F landmine range directly (bytesToLatin1 vs TextDecoder('latin1') diverge exactly there) rather than asserting UTF-8 round-trip behavior.
Copilot AI review requested due to automatic review settings May 27, 2026 19:46

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Routes all transport-layer byte→string conversions through the byte-faithful bytesToLatin1 codec, fixing a UTF-8 corruption bug for binary WebSocket frames in the websocketTransport adapter and aligning the bridge server's Uint8Array text-frame path. Also documents spawn.ts's setEncoding('latin1') as a sanctioned Node-only exception to the single-enforcer law.

Changes:

  • Replace TextDecoder with bytesToLatin1 in websocket/transport.ts (both ArrayBuffer and ArrayBufferView branches) and update its test to probe the 0x80–0x9F windows-1252 landmine range.
  • Replace new TextDecoder().decode(data) with bytesToLatin1 for the Uint8Array text-frame path in websocket/server.ts.
  • Add [LAW:single-enforcer] exception comment in spawn.ts documenting why setEncoding('latin1') is allowed.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
src/connectors/websocket/transport.ts Removes BINARY_DECODER; routes both binary frame branches through bytesToLatin1.
src/connectors/websocket/server.ts Switches the Uint8Array text-frame fallback to bytesToLatin1.
src/transport/spawn.ts Documents setEncoding('latin1') as a Node-only single-enforcer exception.
tests/unit/websocket-transport.test.ts Updates ArrayBuffer/typed-array frame tests to assert byte-faithful decoding across 0x80–0x9F.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@brandon-fryslie brandon-fryslie merged commit 28cd39f into master May 27, 2026
1 of 2 checks passed
@brandon-fryslie brandon-fryslie deleted the tmux-byte-codec-xxj.2 branch May 27, 2026 19:49
brandon-fryslie added a commit that referenced this pull request May 27, 2026
…terals

PaneBytesEnvelope = PaneOutputMessage requires type: "output" since PR #63
aligned the wire envelope with the protocol type. The satisfies checks in
the renderer-side tests were missing the discriminant field.
brandon-fryslie added a commit that referenced this pull request May 27, 2026
#64)

* test(byte-codec): cross-transport byte-faithfulness contract — [LAW:behavior-not-structure] (xxj.3)

Adds 8 contract tests asserting ∀ byte b in probe payload, every transport's
onData delivers a string where charCodeAt(i) === b.

Node file covers websocketTransport (ArrayBuffer, Uint8Array, sub-array view
with non-zero byteOffset) and the spawn transport's setEncoding('latin1') path
via a PassThrough stream. Browser file runs the same websocketTransport
assertions in happy-dom, guarding against future TextDecoder('latin1')
regression (windows-1252 remaps 0x80-0x9F in browsers).

Canonical probe payload in _helpers/probe-bytes.ts: windows-1252 landmine
bytes 0x80/0x81/0x8D/0x8F/0x9D/0x9F, emoji U+1F600 (F0 9F 98 80), CJK
U+4E2D (E4 B8 AD), ASCII baseline.

[LAW:single-enforcer] probe payload defined once in probe-bytes.ts
[LAW:behavior-not-structure] tests assert the per-byte contract, not decode impl

* fix(lint): pre-existing ESLint errors blocking CI

ReadonlyArray<T> → readonly T[], type → interface for PaneMeta,
remove unused PaneBytesEnvelope import in electron/main.ts.
None of these were introduced by this PR; all existed on master.

* fix(test): align FakeWebSocket signatures and correct TextDecoder comment

- send()/close() now match BrowserWebSocketLike parameter signatures
- Comment corrected: TextDecoder('latin1') is windows-1252 everywhere per
  the WHATWG Encoding Standard, not just in browsers; only String.fromCharCode
  (bytesToLatin1) is byte-faithful in all runtimes

* fix(test): clarify TextDecoder vs Node stream latin1 in spawn test comments

TextDecoder('latin1') is windows-1252 in ALL WHATWG runtimes (not just
browsers). Node stream setEncoding('latin1') is byte-faithful but Node-only.
String.fromCharCode (bytesToLatin1) is the portable cross-runtime path.

* fix(format): apply prettier to pre-existing formatting issues

These files had pre-existing prettier violations that caused format:check
to fail in CI. No logic changes — formatting only.

* fix(typecheck): add missing type discriminant to PaneBytesEnvelope literals

PaneBytesEnvelope = PaneOutputMessage requires type: "output" since PR #63
aligned the wire envelope with the protocol type. The satisfies checks in
the renderer-side tests were missing the discriminant field.

* refactor(test-helpers): extract assertByteFaithful + FakeWebSocket to shared helper

[LAW:single-enforcer] The two contract test files defined identical
assertByteFaithful and FakeWebSocket. Extracting to _helpers/websocket-fake.ts
ensures a single change point if BrowserWebSocketLike evolves.
brandon-fryslie added a commit that referenced this pull request May 27, 2026
…[LAW:single-enforcer] (xxj.2) (#63)

Replace per-site TextDecoder calls with bytesToLatin1 from byte-codec.ts:
- connectors/websocket/transport.ts: remove BINARY_DECODER (the active UTF-8 bug for binary frames); decodeFrame now uses bytesToLatin1 for ArrayBuffer and ArrayBufferView paths
- connectors/websocket/server.ts: Uint8Array text-frame path uses bytesToLatin1 (bridge JSON is pure-ASCII; latin1 = UTF-8 for all current protocol values)
- transport/spawn.ts: setEncoding('latin1') documented as [LAW:single-enforcer] exception — Node-only Buffer.toString('latin1') is genuinely 1:1 byte↔code-unit (unlike browser TextDecoder('latin1') which is windows-1252)

Tests updated: ArrayBuffer test now probes the 0x80–0x9F landmine range directly (bytesToLatin1 vs TextDecoder('latin1') diverge exactly there) rather than asserting UTF-8 round-trip behavior.
brandon-fryslie added a commit that referenced this pull request May 27, 2026
#64)

* test(byte-codec): cross-transport byte-faithfulness contract — [LAW:behavior-not-structure] (xxj.3)

Adds 8 contract tests asserting ∀ byte b in probe payload, every transport's
onData delivers a string where charCodeAt(i) === b.

Node file covers websocketTransport (ArrayBuffer, Uint8Array, sub-array view
with non-zero byteOffset) and the spawn transport's setEncoding('latin1') path
via a PassThrough stream. Browser file runs the same websocketTransport
assertions in happy-dom, guarding against future TextDecoder('latin1')
regression (windows-1252 remaps 0x80-0x9F in browsers).

Canonical probe payload in _helpers/probe-bytes.ts: windows-1252 landmine
bytes 0x80/0x81/0x8D/0x8F/0x9D/0x9F, emoji U+1F600 (F0 9F 98 80), CJK
U+4E2D (E4 B8 AD), ASCII baseline.

[LAW:single-enforcer] probe payload defined once in probe-bytes.ts
[LAW:behavior-not-structure] tests assert the per-byte contract, not decode impl

* fix(lint): pre-existing ESLint errors blocking CI

ReadonlyArray<T> → readonly T[], type → interface for PaneMeta,
remove unused PaneBytesEnvelope import in electron/main.ts.
None of these were introduced by this PR; all existed on master.

* fix(test): align FakeWebSocket signatures and correct TextDecoder comment

- send()/close() now match BrowserWebSocketLike parameter signatures
- Comment corrected: TextDecoder('latin1') is windows-1252 everywhere per
  the WHATWG Encoding Standard, not just in browsers; only String.fromCharCode
  (bytesToLatin1) is byte-faithful in all runtimes

* fix(test): clarify TextDecoder vs Node stream latin1 in spawn test comments

TextDecoder('latin1') is windows-1252 in ALL WHATWG runtimes (not just
browsers). Node stream setEncoding('latin1') is byte-faithful but Node-only.
String.fromCharCode (bytesToLatin1) is the portable cross-runtime path.

* fix(format): apply prettier to pre-existing formatting issues

These files had pre-existing prettier violations that caused format:check
to fail in CI. No logic changes — formatting only.

* fix(typecheck): add missing type discriminant to PaneBytesEnvelope literals

PaneBytesEnvelope = PaneOutputMessage requires type: "output" since PR #63
aligned the wire envelope with the protocol type. The satisfies checks in
the renderer-side tests were missing the discriminant field.

* refactor(test-helpers): extract assertByteFaithful + FakeWebSocket to shared helper

[LAW:single-enforcer] The two contract test files defined identical
assertByteFaithful and FakeWebSocket. Extracting to _helpers/websocket-fake.ts
ensures a single change point if BrowserWebSocketLike evolves.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants