Skip to content

v1.38.1.0 fix wave: surrogate-safe page captures (#1440), Implementation Tasks across review skills (#1454), root-level artifact patterns (#1452)#1504

Merged
garrytan merged 5 commits into
mainfrom
garrytan/copenhagen-v4
May 15, 2026
Merged

v1.38.1.0 fix wave: surrogate-safe page captures (#1440), Implementation Tasks across review skills (#1454), root-level artifact patterns (#1452)#1504
garrytan merged 5 commits into
mainfrom
garrytan/copenhagen-v4

Conversation

@garrytan

Copy link
Copy Markdown
Owner

Summary

Three filed issues land together as a coordinated bug-fix wave.

Browse — Page captures with mixed-script Unicode round-trip cleanly to the Claude API (#1440)

  • browse/src/sanitize.ts (new) — two surrogate-stripping utilities (stripLoneSurrogates, stripLoneSurrogateEscapes) plus sanitizeBody that picks the right pass for text/plain vs application/json bodies.
  • buildCommandResponse extracted from handleCommand and exported for testability. Sanitizes every response body before new Response().
  • /batch was bypassing the chokepoint via its own JSON.stringify({results, ...}). Each per-result string is now sanitized AND the envelope gets a second stripLoneSurrogateEscapes pass on the JSON-escape form.
  • Defense in depth at the highest-volume DOM-text emitters: getCleanText, getCleanTextWithStripping, html, accessibility, all three return points in snapshot.ts.

Review skills — Implementation Tasks checklist + JSONL handoff (#1454)

  • plan-ceo-review, plan-design-review, plan-eng-review, plan-devex-review each emit a markdown ## Implementation Tasks section AND write a jq-built JSONL artifact to ~/.gstack/projects/$SLUG/tasks-{phase}-{datetime}.jsonl. Never hand-rolled echo — jq -nc keeps the file valid when titles/findings contain quotes, newlines, or backslashes.
  • /autoplan Phase 4 reads all four JSONL files, scopes by current branch + 5-commit window, dedupes on exact (component, sorted(files), title), renders aggregated tasks inside the Final Approval Gate.
  • Standalone review runs (/plan-eng-review alone, etc.) produce their own task list and JSONL even outside autoplan — the JSONL is the handoff contract.
  • New resolvers: scripts/resolvers/tasks-section.ts (emit + aggregate), scripts/task-emission-schema.ts (shared schema).

Federation sync — Root-level design + test-plan docs (#1452)

  • bin/gstack-artifacts-init adds projects/*/*-design-*.md and projects/*/*-test-plan-*.md to all three managed blocks: .brain-allowlist, .brain-privacy-map.json (class artifact), .gitattributes (merge=union).
  • Idempotent migration gstack-upgrade/migrations/v1.36.0.0.sh patches existing installs in place. Uses jq for the JSON file (preserves validity). NEVER calls gstack-artifacts-init — that would git commit + push and clobber user state.

Issue closed without code: #1441 — verified already fixed in v1.27.0.0 (alignment of setup-gbrain and gstack-brain-sync on the artifacts_sync_mode key, plus the v1.27.0.0 migration). Comment posted, issue closed.

Follow-up filed: #1503 — the v1.34.2.0 fix for #1428 uses a dual-path approach where the bare path no longer carries the filesystem boundary. Surfaces a token-efficiency concern when diffs touch .claude/skills/. Documented, not blocking this release.

Test Coverage

[+] browse/src/sanitize.ts                  [★★★ TESTED]  browse/test/sanitize.test.ts (18 cases)
[+] browse/src/server.ts buildCommandResponse [★★★ TESTED] browse/test/build-command-response.test.ts (7 cases)
[+] browse/src/server.ts /batch sanitize    [★★ TESTED]    Existing batch.test.ts + manual smoke
[+] browse/src/read-commands.ts wraps       [★ COVERED]    via getCleanText round-trip
[+] browse/src/snapshot.ts wraps            [★ COVERED]    via existing snapshot tests (44 pass)
[+] bin/gstack-artifacts-init blocks        [★★★ TESTED]  test/artifacts-init-migration.test.ts (7 cases)
[+] gstack-upgrade/migrations/v1.36.0.0.sh  [★★★ TESTED]  same — idempotency, missing-file, jq-absent paths
[+] scripts/resolvers/tasks-section.ts      [★★ TESTED]    via gen-skill-docs tests (385 pass) + template-render

COVERAGE: 8/8 new code paths tested (100%)  |  Code: 5/5  |  Skills: 3/3
QUALITY: ★★★:4 ★★:3 ★:1  |  GAPS: 0

Tests: full free-tier bun test green on the impacted files (sanitize ×18, build-command-response ×7, artifacts-init-migration ×7, host-config ×73, gen-skill-docs ×385, content-security ×58, snapshot ×44 pass + 1 pre-existing closeTab failure unrelated to this work).

Pre-Landing Review

No structural issues. Diff has no SQL, no auth surface, no new LLM trust boundary. The sanitizer regexes use ES2018+ negative lookbehind, which Bun supports. The migration is per-file independent (allowlist missing doesn't skip privacy-map repair) and idempotent (re-runs are no-ops). /codex review ran twice during planning (28 findings total, all incorporated except those that became #1503 follow-up after the v1.34.2.0 collision).

Plan Completion

Plan at ~/.claude/plans/system-instruction-you-are-working-sorted-teacup.md. All planned items shipped:

TODOS

No TODO items completed in this PR.

Test plan

🤖 Generated with Claude Code

garrytan and others added 2 commits May 14, 2026 13:58
…oint + /batch envelope (#1440)

Page captures with mixed-script Unicode round-trip cleanly to the Claude API.
Two new utilities in browse/src/sanitize.ts: stripLoneSurrogates for raw UTF-16
strings, stripLoneSurrogateEscapes for \uXXXX JSON escape text. sanitizeBody
picks the right pass based on cr.json.

buildCommandResponse is extracted from handleCommand (now exported) and
applies sanitization before new Response(). /batch was bypassing this
chokepoint via direct JSON.stringify, so it sanitizes each cr.result before
pushing AND wraps the envelope with stripLoneSurrogateEscapes. Defense in
depth wraps at getCleanText, getCleanTextWithStripping, html, accessibility,
and snapshot.ts return points so downstream consumers (datamarking, envelope
wrapping) see sanitized text before the response is built.

25 new unit tests across sanitize.test.ts and build-command-response.test.ts.
content-security.test.ts updated to accept either pre- or post-sanitize form
of the snapshot scoped branch (source-level regression check).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ns, surrogate-safe page captures (#1440 #1452 #1454)

Three filed issues land together:

#1440 — Page captures from real-world HTML hit 'API Error 400: no low
surrogate in string'. Sanitizers + buildCommandResponse extraction shipped in
the prior commit; this commit adds the migration script that patches existing
brain-allowlist/privacy-map/gitattributes installs and the supporting tests.

#1452 — Federation sync was silently skipping root-level design and test-plan
docs. bin/gstack-artifacts-init adds two patterns to all three managed blocks
(.brain-allowlist, .brain-privacy-map.json, .gitattributes). Idempotent
migration v1.36.0.0.sh repairs existing installs in place via jq (preserves
JSON validity) — no commit + push from the migration.

#1454 — All four review skills (CEO/design/eng/DX) emit an Implementation
Tasks markdown section AND write a jq-built JSONL artifact per phase.
/autoplan reads all four files, scopes by current branch + 5-commit window,
dedupes on exact (component, sorted(files), title), and renders an aggregated
list in the Final Approval Gate.

New tests:
- browse/test/sanitize.test.ts (18 cases)
- browse/test/build-command-response.test.ts (7 cases)
- test/artifacts-init-migration.test.ts (7 cases)

VERSION → 1.36.0.0. Skips the v1.34.x slot taken by 'gstack consumable as
submodule' and the v1.35.0.0 slot taken by /document-generate. #1428 was
shipped separately by v1.34.2.0 with a different approach; follow-up #1503
filed for the bare-path filesystem boundary concern surfaced during our
analysis.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions

github-actions Bot commented May 14, 2026

Copy link
Copy Markdown

E2E Evals: ✅ PASS

20/20 tests passed | $3.00 total cost | 12 parallel runners

Suite Result Status Cost
e2e-browse 2/2 $0.15
e2e-deploy 2/2 $0.31
e2e-design 2/2 $0.35
e2e-plan 6/6 $1.27
e2e-qa-workflow 1/1 $0.56
e2e-review 1/1 $0.1
e2e-workflow 1/1 $0.16
llm-judge 5/5 $0.1

12x ubicloud-standard-8 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite

@NikhileshNanduri NikhileshNanduri left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: v1.36.0.0 bug-fix wave

Solid PR overall. The surrogate sanitization is well-designed (chokepoint + defense in depth + /batch envelope), the migration is idempotent, and the test coverage is thorough. A few findings below.

Summary:

  • Bug (near-dup annotation): The "possible-duplicate marker injected by the renderer" comment + schema docstring promise behavior that the jq implementation doesn't deliver. Near-duplicates surface without any cross-reference marker.
  • Dead code: dedupKey in task-emission-schema.ts is exported but never called.
  • Accumulation: JSONL task files grow unboundedly with no housekeeping.
  • Migration sed: The sed -i.bak inline insert syntax has a subtle portability note.


# Exact-match dedup by (component, sorted(files), title). Non-matches kept
# separately with a possible-duplicate marker injected by the renderer.
AGGREGATED_TASKS=$(jq -s \\

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Near-duplicate annotation is mentioned but never implemented.

The comment says "possible-duplicate marker injected by the renderer" but the jq that follows does:

group_by([.component, (.files | sort), .title])
| map(sort_by(...) | .[0])

This is exact-match dedup only. Tasks with the same logical intent but slightly different wording or file lists (e.g., the same sanitizer bug flagged by both CEO and eng review, one listing browse/src/sanitize.ts and the other also adding browse/src/server.ts) get neither collapsed nor annotated.

The PR description says:

Near-duplicates surface separately with a possible-duplicate note for human resolution.

Neither the template code nor the generated SKILL.md injects such a note. Either:

  1. Remove the "possible-duplicate marker injected by the renderer" comment (and fix the matching schema docstring at scripts/task-emission-schema.ts:56), or
  2. Implement the annotation: after group_by, tasks that share the same title but differ in files could emit a sibling with a note field appended to the markdown line.

*/
export function dedupKey(t: Pick<ImplementationTask, 'component' | 'files' | 'title'>): string {
return JSON.stringify({
component: t.component,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dedupKey is exported but never called — it's dead code.

The aggregator in scripts/resolvers/tasks-section.ts reimplements the dedup logic inline as a jq filter. This TypeScript function is never imported or invoked. If it's meant as documentation of the dedup algorithm, a comment on the jq block would be clearer. If it's meant for a future TypeScript-based aggregator, it should be marked // TODO: wire up when aggregator moves to TypeScript.

As-is, a future author could change the jq key and forget to update this, creating silent divergence.

# dedupe by (component, sorted(files), title) — exact match only.
# Sort by priority (P1 > P2 > P3) then by phase order.
ALL_JSONL=$(mktemp -t autoplan-tasks.XXXXXXXX)
for phase in ceo-review design-review eng-review devex-review; do

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JSONL files accumulate indefinitely in ~/.gstack/projects/$SLUG/.

Each review run writes a new tasks-{phase}-{datetime}.jsonl. The aggregator reads the latest run_id and ignores older files, but they're never deleted. After 50+ autoplan runs over a few months, this directory fills with hundreds of JSONL files.

Consider one of:

  1. Use a canonical filename like tasks-{phase}-latest.jsonl (overwrite, not append) — simpler, and the timestamp in run_id inside each record already provides the freshness information the aggregator needs.
  2. Prune stale files at write time: before writing the new file, delete any existing tasks-{phase}-*.jsonl for the same phase.
  3. At minimum, document the accumulation behavior so users know to occasionally rm ~/.gstack/projects/$SLUG/tasks-*.jsonl.

printf '%s\n' "${PATTERN}" >> "${ALLOWLIST}"
added_any=1
fi
fi

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sed -i.bak inline insert: verify on GNU sed / Alpine.

The multi-line form:

sed -i.bak "/^# ---- USER ADDITIONS BELOW/i\\
${PATTERN}
" "${ALLOWLIST}"

works on macOS BSD sed and GNU sed on Ubuntu/Debian, but some minimal Linux distributions (Alpine with musl sed, BusyBox) interpret the i\ command differently — they require the text on the SAME line as the command, not on the next line.

The current patterns (projects/*/*-design-*.md, projects/*/*-test-plan-*.md) don't contain backslashes or & chars so the pattern itself is safe. But the portability concern means any future caller who extends NEW_PATTERNS with a path containing \ or adds a Linux CI matrix on Alpine would hit this.

Safer cross-platform alternative using awk:

awk -v pat="${PATTERN}" '/^# ---- USER ADDITIONS BELOW/ { print pat } { print }' "${ALLOWLIST}" > "${ALLOWLIST}.tmp" && mv "${ALLOWLIST}.tmp" "${ALLOWLIST}"

No backup file to clean up, works on all POSIX awk variants.

Comment thread browse/src/sanitize.ts
// Matches \uD8XX-\uDFXX escape text where the pair is not completed by an
// adjacent \uDC00-\uDFFF (high) or preceded by \uD800-\uDBFF (low).
const LONE_SURROGATE_HIGH_ESCAPE = /\\u[Dd][89ABab][0-9A-Fa-f]{2}(?!\\u[Dd][C-Fc-f][0-9A-Fa-f]{2})/g;
const LONE_SURROGATE_LOW_ESCAPE = /(?<!\\u[Dd][89ABab][0-9A-Fa-f]{2})\\u[Dd][C-Fc-f][0-9A-Fa-f]{2}/g;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[C-Fc-f] character class in LONE_SURROGATE_HIGH_ESCAPE — correct but worth a note.

[C-Fc-f] matches C, D, E, F (uppercase) and c, d, e, f (lowercase), which correctly covers the second hex digit of low-surrogate escapes (\uDC00\uDFFF). The regex is correct.

However, the lookahead in LONE_SURROGATE_HIGH_ESCAPE checks (?!\\u[Dd][C-Fc-f][0-9A-Fa-f]{2}) — this correctly validates that a high-surrogate escape is NOT followed by a low-surrogate escape, leaving valid pairs (\uD83D\uDE00 = 😀) untouched.

Minor: adding a test case for the boundary escape \uDBFF\uDC00 (both boundary values of valid pair) would close the coverage gap on the edge of the character class ranges. The existing test covers \uD83D\uDE00 which is mid-range; the boundary pair verifies no off-by-one in the [89ABab]/[C-Fc-f] classes.

….38.0.0

Main shipped v1.37.0.0 (split-engine gbrain) while this PR was in review.
Merge resolved cleanly on bin/gstack-artifacts-init (allowlist patterns from
both branches coexist). CHANGELOG entry retained for our changes, version
header bumped from 1.36.0.0 to 1.38.0.0. Migration script renamed:
gstack-upgrade/migrations/v1.36.0.0.sh -> v1.38.0.0.sh, with internal
references and the test file updated to match.

VERSION + package.json: 1.38.0.0.
Regenerated SKILL.md across 12 hosts and refreshed golden fixtures.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@garrytan garrytan changed the title v1.36.0.0 fix wave: surrogate-safe page captures (#1440), Implementation Tasks across review skills (#1454), root-level artifact patterns (#1452) v1.38.0.0 fix wave: surrogate-safe page captures (#1440), Implementation Tasks across review skills (#1454), root-level artifact patterns (#1452) May 15, 2026
VERSION + package.json + CHANGELOG header + migration filename + test
reference all consistently at v1.38.1.0. Migration renamed:
gstack-upgrade/migrations/v1.38.0.0.sh -> v1.38.1.0.sh.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@garrytan garrytan changed the title v1.38.0.0 fix wave: surrogate-safe page captures (#1440), Implementation Tasks across review skills (#1454), root-level artifact patterns (#1452) v1.38.1.0 fix wave: surrogate-safe page captures (#1440), Implementation Tasks across review skills (#1454), root-level artifact patterns (#1452) May 15, 2026
…op of choke-point sanitization

Main shipped v1.38.0.0 with surrogate sanitization at the handleCommandInternal
choke point (cleaner architecture than ours). This merge keeps both:

- v1.38.0.0's handleCommandInternal sanitizing wrapper around
  handleCommandInternalImpl (choke point, all callers benefit automatically).
- This branch's buildCommandResponse extraction (exported, unit-testable) +
  stripLoneSurrogateEscapes for \uXXXX JSON-escape forms (handles bodies that
  were already stringified before reaching the choke point).

The two layers compose: choke point catches raw surrogates at result-build
time, response boundary catches escape-text forms.

CHANGELOG entry reframed to credit v1.38.0.0's choke-point fix and position
our additions as defense-in-depth. Net new in this release:
- Implementation Tasks across 4 review skills + autoplan JSONL aggregator (#1454)
- Root-level allowlist patterns + idempotent jq migration v1.38.1.0.sh (#1452)
- Defense-in-depth surrogate sanitization layer + buildCommandResponse
  extraction + 25 new unit tests (#1440 follow-up to v1.38.0.0).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@garrytan garrytan merged commit ea51b45 into main May 15, 2026
23 checks passed
RyotaKun pushed a commit to RyotaKun/gstack that referenced this pull request May 18, 2026
…lementation Tasks across review skills (garrytan#1454), root-level artifact patterns (garrytan#1452) (garrytan#1504)

* fix(browse): sanitize lone Unicode surrogates at commandResult chokepoint + /batch envelope (garrytan#1440)

Page captures with mixed-script Unicode round-trip cleanly to the Claude API.
Two new utilities in browse/src/sanitize.ts: stripLoneSurrogates for raw UTF-16
strings, stripLoneSurrogateEscapes for \uXXXX JSON escape text. sanitizeBody
picks the right pass based on cr.json.

buildCommandResponse is extracted from handleCommand (now exported) and
applies sanitization before new Response(). /batch was bypassing this
chokepoint via direct JSON.stringify, so it sanitizes each cr.result before
pushing AND wraps the envelope with stripLoneSurrogateEscapes. Defense in
depth wraps at getCleanText, getCleanTextWithStripping, html, accessibility,
and snapshot.ts return points so downstream consumers (datamarking, envelope
wrapping) see sanitized text before the response is built.

25 new unit tests across sanitize.test.ts and build-command-response.test.ts.
content-security.test.ts updated to accept either pre- or post-sanitize form
of the snapshot scoped branch (source-level regression check).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat: bug fix wave v1.36.0.0 — Implementation Tasks, allowlist patterns, surrogate-safe page captures (garrytan#1440 garrytan#1452 garrytan#1454)

Three filed issues land together:

garrytan#1440 — Page captures from real-world HTML hit 'API Error 400: no low
surrogate in string'. Sanitizers + buildCommandResponse extraction shipped in
the prior commit; this commit adds the migration script that patches existing
brain-allowlist/privacy-map/gitattributes installs and the supporting tests.

garrytan#1452 — Federation sync was silently skipping root-level design and test-plan
docs. bin/gstack-artifacts-init adds two patterns to all three managed blocks
(.brain-allowlist, .brain-privacy-map.json, .gitattributes). Idempotent
migration v1.36.0.0.sh repairs existing installs in place via jq (preserves
JSON validity) — no commit + push from the migration.

garrytan#1454 — All four review skills (CEO/design/eng/DX) emit an Implementation
Tasks markdown section AND write a jq-built JSONL artifact per phase.
/autoplan reads all four files, scopes by current branch + 5-commit window,
dedupes on exact (component, sorted(files), title), and renders an aggregated
list in the Final Approval Gate.

New tests:
- browse/test/sanitize.test.ts (18 cases)
- browse/test/build-command-response.test.ts (7 cases)
- test/artifacts-init-migration.test.ts (7 cases)

VERSION → 1.36.0.0. Skips the v1.34.x slot taken by 'gstack consumable as
submodule' and the v1.35.0.0 slot taken by /document-generate. garrytan#1428 was
shipped separately by v1.34.2.0 with a different approach; follow-up garrytan#1503
filed for the bare-path filesystem boundary concern surfaced during our
analysis.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: bump to v1.38.1.0

VERSION + package.json + CHANGELOG header + migration filename + test
reference all consistently at v1.38.1.0. Migration renamed:
gstack-upgrade/migrations/v1.38.0.0.sh -> v1.38.1.0.sh.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants