v1.38.1.0 fix wave: surrogate-safe page captures (#1440), Implementation Tasks across review skills (#1454), root-level artifact patterns (#1452)#1504
Conversation
…oint + /batch envelope (#1440) Page captures with mixed-script Unicode round-trip cleanly to the Claude API. Two new utilities in browse/src/sanitize.ts: stripLoneSurrogates for raw UTF-16 strings, stripLoneSurrogateEscapes for \uXXXX JSON escape text. sanitizeBody picks the right pass based on cr.json. buildCommandResponse is extracted from handleCommand (now exported) and applies sanitization before new Response(). /batch was bypassing this chokepoint via direct JSON.stringify, so it sanitizes each cr.result before pushing AND wraps the envelope with stripLoneSurrogateEscapes. Defense in depth wraps at getCleanText, getCleanTextWithStripping, html, accessibility, and snapshot.ts return points so downstream consumers (datamarking, envelope wrapping) see sanitized text before the response is built. 25 new unit tests across sanitize.test.ts and build-command-response.test.ts. content-security.test.ts updated to accept either pre- or post-sanitize form of the snapshot scoped branch (source-level regression check). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ns, surrogate-safe page captures (#1440 #1452 #1454) Three filed issues land together: #1440 — Page captures from real-world HTML hit 'API Error 400: no low surrogate in string'. Sanitizers + buildCommandResponse extraction shipped in the prior commit; this commit adds the migration script that patches existing brain-allowlist/privacy-map/gitattributes installs and the supporting tests. #1452 — Federation sync was silently skipping root-level design and test-plan docs. bin/gstack-artifacts-init adds two patterns to all three managed blocks (.brain-allowlist, .brain-privacy-map.json, .gitattributes). Idempotent migration v1.36.0.0.sh repairs existing installs in place via jq (preserves JSON validity) — no commit + push from the migration. #1454 — All four review skills (CEO/design/eng/DX) emit an Implementation Tasks markdown section AND write a jq-built JSONL artifact per phase. /autoplan reads all four files, scopes by current branch + 5-commit window, dedupes on exact (component, sorted(files), title), and renders an aggregated list in the Final Approval Gate. New tests: - browse/test/sanitize.test.ts (18 cases) - browse/test/build-command-response.test.ts (7 cases) - test/artifacts-init-migration.test.ts (7 cases) VERSION → 1.36.0.0. Skips the v1.34.x slot taken by 'gstack consumable as submodule' and the v1.35.0.0 slot taken by /document-generate. #1428 was shipped separately by v1.34.2.0 with a different approach; follow-up #1503 filed for the bare-path filesystem boundary concern surfaced during our analysis. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
E2E Evals: ✅ PASS20/20 tests passed | $3.00 total cost | 12 parallel runners
12x ubicloud-standard-8 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite |
NikhileshNanduri
left a comment
There was a problem hiding this comment.
Review: v1.36.0.0 bug-fix wave
Solid PR overall. The surrogate sanitization is well-designed (chokepoint + defense in depth + /batch envelope), the migration is idempotent, and the test coverage is thorough. A few findings below.
Summary:
- Bug (near-dup annotation): The "possible-duplicate marker injected by the renderer" comment + schema docstring promise behavior that the jq implementation doesn't deliver. Near-duplicates surface without any cross-reference marker.
- Dead code:
dedupKeyintask-emission-schema.tsis exported but never called. - Accumulation: JSONL task files grow unboundedly with no housekeeping.
- Migration sed: The
sed -i.bakinline insert syntax has a subtle portability note.
|
|
||
| # Exact-match dedup by (component, sorted(files), title). Non-matches kept | ||
| # separately with a possible-duplicate marker injected by the renderer. | ||
| AGGREGATED_TASKS=$(jq -s \\ |
There was a problem hiding this comment.
Near-duplicate annotation is mentioned but never implemented.
The comment says "possible-duplicate marker injected by the renderer" but the jq that follows does:
group_by([.component, (.files | sort), .title])
| map(sort_by(...) | .[0])This is exact-match dedup only. Tasks with the same logical intent but slightly different wording or file lists (e.g., the same sanitizer bug flagged by both CEO and eng review, one listing browse/src/sanitize.ts and the other also adding browse/src/server.ts) get neither collapsed nor annotated.
The PR description says:
Near-duplicates surface separately with a possible-duplicate note for human resolution.
Neither the template code nor the generated SKILL.md injects such a note. Either:
- Remove the "possible-duplicate marker injected by the renderer" comment (and fix the matching schema docstring at
scripts/task-emission-schema.ts:56), or - Implement the annotation: after
group_by, tasks that share the sametitlebut differ infilescould emit a sibling with anotefield appended to the markdown line.
| */ | ||
| export function dedupKey(t: Pick<ImplementationTask, 'component' | 'files' | 'title'>): string { | ||
| return JSON.stringify({ | ||
| component: t.component, |
There was a problem hiding this comment.
dedupKey is exported but never called — it's dead code.
The aggregator in scripts/resolvers/tasks-section.ts reimplements the dedup logic inline as a jq filter. This TypeScript function is never imported or invoked. If it's meant as documentation of the dedup algorithm, a comment on the jq block would be clearer. If it's meant for a future TypeScript-based aggregator, it should be marked // TODO: wire up when aggregator moves to TypeScript.
As-is, a future author could change the jq key and forget to update this, creating silent divergence.
| # dedupe by (component, sorted(files), title) — exact match only. | ||
| # Sort by priority (P1 > P2 > P3) then by phase order. | ||
| ALL_JSONL=$(mktemp -t autoplan-tasks.XXXXXXXX) | ||
| for phase in ceo-review design-review eng-review devex-review; do |
There was a problem hiding this comment.
JSONL files accumulate indefinitely in ~/.gstack/projects/$SLUG/.
Each review run writes a new tasks-{phase}-{datetime}.jsonl. The aggregator reads the latest run_id and ignores older files, but they're never deleted. After 50+ autoplan runs over a few months, this directory fills with hundreds of JSONL files.
Consider one of:
- Use a canonical filename like
tasks-{phase}-latest.jsonl(overwrite, not append) — simpler, and the timestamp inrun_idinside each record already provides the freshness information the aggregator needs. - Prune stale files at write time: before writing the new file, delete any existing
tasks-{phase}-*.jsonlfor the same phase. - At minimum, document the accumulation behavior so users know to occasionally
rm ~/.gstack/projects/$SLUG/tasks-*.jsonl.
| printf '%s\n' "${PATTERN}" >> "${ALLOWLIST}" | ||
| added_any=1 | ||
| fi | ||
| fi |
There was a problem hiding this comment.
sed -i.bak inline insert: verify on GNU sed / Alpine.
The multi-line form:
sed -i.bak "/^# ---- USER ADDITIONS BELOW/i\\
${PATTERN}
" "${ALLOWLIST}"works on macOS BSD sed and GNU sed on Ubuntu/Debian, but some minimal Linux distributions (Alpine with musl sed, BusyBox) interpret the i\ command differently — they require the text on the SAME line as the command, not on the next line.
The current patterns (projects/*/*-design-*.md, projects/*/*-test-plan-*.md) don't contain backslashes or & chars so the pattern itself is safe. But the portability concern means any future caller who extends NEW_PATTERNS with a path containing \ or adds a Linux CI matrix on Alpine would hit this.
Safer cross-platform alternative using awk:
awk -v pat="${PATTERN}" '/^# ---- USER ADDITIONS BELOW/ { print pat } { print }' "${ALLOWLIST}" > "${ALLOWLIST}.tmp" && mv "${ALLOWLIST}.tmp" "${ALLOWLIST}"No backup file to clean up, works on all POSIX awk variants.
| // Matches \uD8XX-\uDFXX escape text where the pair is not completed by an | ||
| // adjacent \uDC00-\uDFFF (high) or preceded by \uD800-\uDBFF (low). | ||
| const LONE_SURROGATE_HIGH_ESCAPE = /\\u[Dd][89ABab][0-9A-Fa-f]{2}(?!\\u[Dd][C-Fc-f][0-9A-Fa-f]{2})/g; | ||
| const LONE_SURROGATE_LOW_ESCAPE = /(?<!\\u[Dd][89ABab][0-9A-Fa-f]{2})\\u[Dd][C-Fc-f][0-9A-Fa-f]{2}/g; |
There was a problem hiding this comment.
[C-Fc-f] character class in LONE_SURROGATE_HIGH_ESCAPE — correct but worth a note.
[C-Fc-f] matches C, D, E, F (uppercase) and c, d, e, f (lowercase), which correctly covers the second hex digit of low-surrogate escapes (\uDC00–\uDFFF). The regex is correct.
However, the lookahead in LONE_SURROGATE_HIGH_ESCAPE checks (?!\\u[Dd][C-Fc-f][0-9A-Fa-f]{2}) — this correctly validates that a high-surrogate escape is NOT followed by a low-surrogate escape, leaving valid pairs (\uD83D\uDE00 = 😀) untouched.
Minor: adding a test case for the boundary escape \uDBFF\uDC00 (both boundary values of valid pair) would close the coverage gap on the edge of the character class ranges. The existing test covers \uD83D\uDE00 which is mid-range; the boundary pair verifies no off-by-one in the [89ABab]/[C-Fc-f] classes.
….38.0.0 Main shipped v1.37.0.0 (split-engine gbrain) while this PR was in review. Merge resolved cleanly on bin/gstack-artifacts-init (allowlist patterns from both branches coexist). CHANGELOG entry retained for our changes, version header bumped from 1.36.0.0 to 1.38.0.0. Migration script renamed: gstack-upgrade/migrations/v1.36.0.0.sh -> v1.38.0.0.sh, with internal references and the test file updated to match. VERSION + package.json: 1.38.0.0. Regenerated SKILL.md across 12 hosts and refreshed golden fixtures. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
VERSION + package.json + CHANGELOG header + migration filename + test reference all consistently at v1.38.1.0. Migration renamed: gstack-upgrade/migrations/v1.38.0.0.sh -> v1.38.1.0.sh. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…op of choke-point sanitization Main shipped v1.38.0.0 with surrogate sanitization at the handleCommandInternal choke point (cleaner architecture than ours). This merge keeps both: - v1.38.0.0's handleCommandInternal sanitizing wrapper around handleCommandInternalImpl (choke point, all callers benefit automatically). - This branch's buildCommandResponse extraction (exported, unit-testable) + stripLoneSurrogateEscapes for \uXXXX JSON-escape forms (handles bodies that were already stringified before reaching the choke point). The two layers compose: choke point catches raw surrogates at result-build time, response boundary catches escape-text forms. CHANGELOG entry reframed to credit v1.38.0.0's choke-point fix and position our additions as defense-in-depth. Net new in this release: - Implementation Tasks across 4 review skills + autoplan JSONL aggregator (#1454) - Root-level allowlist patterns + idempotent jq migration v1.38.1.0.sh (#1452) - Defense-in-depth surrogate sanitization layer + buildCommandResponse extraction + 25 new unit tests (#1440 follow-up to v1.38.0.0). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…lementation Tasks across review skills (garrytan#1454), root-level artifact patterns (garrytan#1452) (garrytan#1504) * fix(browse): sanitize lone Unicode surrogates at commandResult chokepoint + /batch envelope (garrytan#1440) Page captures with mixed-script Unicode round-trip cleanly to the Claude API. Two new utilities in browse/src/sanitize.ts: stripLoneSurrogates for raw UTF-16 strings, stripLoneSurrogateEscapes for \uXXXX JSON escape text. sanitizeBody picks the right pass based on cr.json. buildCommandResponse is extracted from handleCommand (now exported) and applies sanitization before new Response(). /batch was bypassing this chokepoint via direct JSON.stringify, so it sanitizes each cr.result before pushing AND wraps the envelope with stripLoneSurrogateEscapes. Defense in depth wraps at getCleanText, getCleanTextWithStripping, html, accessibility, and snapshot.ts return points so downstream consumers (datamarking, envelope wrapping) see sanitized text before the response is built. 25 new unit tests across sanitize.test.ts and build-command-response.test.ts. content-security.test.ts updated to accept either pre- or post-sanitize form of the snapshot scoped branch (source-level regression check). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat: bug fix wave v1.36.0.0 — Implementation Tasks, allowlist patterns, surrogate-safe page captures (garrytan#1440 garrytan#1452 garrytan#1454) Three filed issues land together: garrytan#1440 — Page captures from real-world HTML hit 'API Error 400: no low surrogate in string'. Sanitizers + buildCommandResponse extraction shipped in the prior commit; this commit adds the migration script that patches existing brain-allowlist/privacy-map/gitattributes installs and the supporting tests. garrytan#1452 — Federation sync was silently skipping root-level design and test-plan docs. bin/gstack-artifacts-init adds two patterns to all three managed blocks (.brain-allowlist, .brain-privacy-map.json, .gitattributes). Idempotent migration v1.36.0.0.sh repairs existing installs in place via jq (preserves JSON validity) — no commit + push from the migration. garrytan#1454 — All four review skills (CEO/design/eng/DX) emit an Implementation Tasks markdown section AND write a jq-built JSONL artifact per phase. /autoplan reads all four files, scopes by current branch + 5-commit window, dedupes on exact (component, sorted(files), title), and renders an aggregated list in the Final Approval Gate. New tests: - browse/test/sanitize.test.ts (18 cases) - browse/test/build-command-response.test.ts (7 cases) - test/artifacts-init-migration.test.ts (7 cases) VERSION → 1.36.0.0. Skips the v1.34.x slot taken by 'gstack consumable as submodule' and the v1.35.0.0 slot taken by /document-generate. garrytan#1428 was shipped separately by v1.34.2.0 with a different approach; follow-up garrytan#1503 filed for the bare-path filesystem boundary concern surfaced during our analysis. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * chore: bump to v1.38.1.0 VERSION + package.json + CHANGELOG header + migration filename + test reference all consistently at v1.38.1.0. Migration renamed: gstack-upgrade/migrations/v1.38.0.0.sh -> v1.38.1.0.sh. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Three filed issues land together as a coordinated bug-fix wave.
Browse — Page captures with mixed-script Unicode round-trip cleanly to the Claude API (#1440)
browse/src/sanitize.ts(new) — two surrogate-stripping utilities (stripLoneSurrogates,stripLoneSurrogateEscapes) plussanitizeBodythat picks the right pass for text/plain vs application/json bodies.buildCommandResponseextracted fromhandleCommandand exported for testability. Sanitizes every response body beforenew Response()./batchwas bypassing the chokepoint via its ownJSON.stringify({results, ...}). Each per-result string is now sanitized AND the envelope gets a secondstripLoneSurrogateEscapespass on the JSON-escape form.getCleanText,getCleanTextWithStripping,html,accessibility, all three return points insnapshot.ts.Review skills — Implementation Tasks checklist + JSONL handoff (#1454)
plan-ceo-review,plan-design-review,plan-eng-review,plan-devex-revieweach emit a markdown## Implementation Taskssection AND write ajq-built JSONL artifact to~/.gstack/projects/$SLUG/tasks-{phase}-{datetime}.jsonl. Never hand-rolled echo —jq -nckeeps the file valid when titles/findings contain quotes, newlines, or backslashes./autoplanPhase 4 reads all four JSONL files, scopes by current branch + 5-commit window, dedupes on exact(component, sorted(files), title), renders aggregated tasks inside the Final Approval Gate./plan-eng-reviewalone, etc.) produce their own task list and JSONL even outside autoplan — the JSONL is the handoff contract.scripts/resolvers/tasks-section.ts(emit + aggregate),scripts/task-emission-schema.ts(shared schema).Federation sync — Root-level design + test-plan docs (#1452)
bin/gstack-artifacts-initaddsprojects/*/*-design-*.mdandprojects/*/*-test-plan-*.mdto all three managed blocks:.brain-allowlist,.brain-privacy-map.json(classartifact),.gitattributes(merge=union).gstack-upgrade/migrations/v1.36.0.0.shpatches existing installs in place. Usesjqfor the JSON file (preserves validity). NEVER callsgstack-artifacts-init— that wouldgit commit + pushand clobber user state.Issue closed without code: #1441 — verified already fixed in v1.27.0.0 (alignment of
setup-gbrainandgstack-brain-syncon theartifacts_sync_modekey, plus the v1.27.0.0 migration). Comment posted, issue closed.Follow-up filed: #1503 — the v1.34.2.0 fix for #1428 uses a dual-path approach where the bare path no longer carries the filesystem boundary. Surfaces a token-efficiency concern when diffs touch
.claude/skills/. Documented, not blocking this release.Test Coverage
Tests: full free-tier
bun testgreen on the impacted files (sanitize ×18, build-command-response ×7, artifacts-init-migration ×7, host-config ×73, gen-skill-docs ×385, content-security ×58, snapshot ×44 pass + 1 pre-existing closeTab failure unrelated to this work).Pre-Landing Review
No structural issues. Diff has no SQL, no auth surface, no new LLM trust boundary. The sanitizer regexes use ES2018+ negative lookbehind, which Bun supports. The migration is per-file independent (allowlist missing doesn't skip privacy-map repair) and idempotent (re-runs are no-ops).
/codex reviewran twice during planning (28 findings total, all incorporated except those that became #1503 follow-up after the v1.34.2.0 collision).Plan Completion
Plan at
~/.claude/plans/system-instruction-you-are-working-sorted-teacup.md. All planned items shipped:TODOS
No TODO items completed in this PR.
Test plan
bun testpasses on browse/test/sanitize.test.ts (18 cases)bun testpasses on browse/test/build-command-response.test.ts (7 cases)bun testpasses on test/artifacts-init-migration.test.ts (7 cases)bun testpasses on test/host-config.test.ts (73 — golden fixtures refreshed)bun testpasses on test/gen-skill-docs.test.ts (385 — all hosts regenerated)bun testpasses on browse/test/content-security.test.ts (58 — source-level regression updated to accept post-sanitize form)bun run gen:skill-docsregenerates cleanly for all 12 hosts🤖 Generated with Claude Code