chore(deps): bump diff from 7.0.0 to 9.0.0 in the npm_and_yarn group across 1 directory by dependabot[bot] · Pull Request #1 · adamtasteslikegood/gstack

dependabot · 2026-05-06T19:35:52Z

Bumps the npm_and_yarn group with 1 update in the / directory: diff.

Updates diff from 7.0.0 to 9.0.0

Changelog

9.0.0

(All changes part of PR #672.)

ES5 support is dropped. parsePatch now uses TextDecoder and Uint8Array, which are not available in ES5, and TypeScript is now compiled with the "es6" target. From now on, I intend to freely use any features that are deemed "Widely available" by Baseline. Users who need ES5 support should stick to version 8.

C-style quoted strings in filename headers are now properly supported.

When the name of either the old or new file in a patch contains "special characters", both GNU diff and Git quote the filename in the patch's headers and escape special characters using the same escape sequences that are used in string literals in C, including octal escapes for all non-ASCII characters. Previously, jsdiff had very little support for this; parsePatch would remove the quotes, and unescape any escaped backslashes, but would not unescape other escape sequences. formatPatch, meanwhile, did not quote or escape special characters at all.

Now, parsePatch parses all the possible escape sequences that GNU diff (or Git) ever output, and formatPatch quotes and escapes filenames containing special characters in the same way GNU diff does.

formatPatch now omits file headers when oldFileName or newFileName in the provided patch object are undefined, regardless of the headerOptions parameter. (Previously, it would treat the absence of oldFileName or newFileName as indicating the filename was the word "undefined" and emit headers --- undefined / +++ undefined.)

formatPatch no longer outputs trailing tab characters at the end of ---/+++ headers.

Previously, if formatPatch was passed a patch object to serialize that had empty strings for the oldHeader or newHeader property, it would include a trailing tab character after the filename in the --- and/or +++ file header. Now, this scenario is treated the same as when oldHeader/newHeader is undefined - i.e. the trailing tab is omitted.

formatPatch no longer mutates its input when serializing a patch containing a hunk where either the old or new content contained zero lines. (Such a hunk occurs only when the hunk has no context lines and represents a pure insertion or pure deletion, which for instance will occur whenever one of the two files being diffed is completely empty.) Previously formatPatch would provide the correct output but also mutate the oldLines or newLines property on the hunk, changing the meaning of the underlying patch.

Git-style patches are now supported by parsePatch, formatPatch, and reversePatch.

Patches output by git diff can include some features that are unlike those output by GNU diff, and therefore not handled by an ordinary unified diff format parser. An ordinary diff simply describes the differences between the content of two files, but Git diffs can also indicate, via "extended headers", the creation or deletion of (potentially empty) files, indicate that a file was renamed, and contain information about file mode changes. Furthermore, when these changes appear in a diff in the absence of a content change (e.g. when an empty file is created, or a file is renamed without content changes), the patch will contain no associated ---/+++ file headers nor any hunks.

jsdiff previously did not support parsing Git's extended headers, nor hunkless patches. Now parsePatch parses some of the extended headers, parses hunkless Git patches, and can determine filenames (e.g. from the extended headers) when parsing a patch that includes no --- or +++ file headers. The additional information conveyed by the extended headers we support is recorded on new fields on the result object returned by parsePatch. See isGit and subsequent properties in the docs in the README.md file.

formatPatch now outputs extended headers based on these new Git-specific properties, and reversePatch respects them as far as possible (with one unavoidable caveat noted in the README.md file).

Unpaired file headers now cause parsePatch to throw.

It remains acceptable to have a patch with no file headers whatsoever (e.g. one that begins with a @@ hunk header on the very first line), but a patch with only a --- header or only a +++ header is now considered an error.

parsePatch is now more tolerant of "trailing garbage"

That is: after a patch, or between files/indexes in a patch, it is now acceptable to have arbitrary lines of "garbage" (so long as they unambiguously have no syntactic meaning - e.g. trailing garbage that leads with a +, -, or and thus is interpretable as part of a hunk still triggers a throw).

This means we no longer reject patches output by tools that include extra data in "garbage" lines not understood by generic unified diff parsers. (For example, SVN patches can include "Property changes on:" lines that generic unified diff parsers should discard as garbage; jsdiff previously threw errors when encountering them.)

This change brings jsdiff's behaviour more in line with GNU patch, which is highly permissive of "garbage".

The oldFileName and newFileName fields of StructuredPatch are now typed as string | undefined instead of string. This type change reflects the (pre-existing) reality that parsePatch can produce patches without filenames (e.g. when parsing a patch that simply contains hunks with no file headers).

8.0.4

#667 - fix another bug in diffWords when used with an Intl.Segmenter. If the text to be diffed included a combining mark after a whitespace character (i.e. roughly speaking, an accented space), diffWords would previously crash. Now this case is handled correctly.

8.0.3

#631 - fix support for using an Intl.Segmenter with diffWords. This has been almost completely broken since the feature was added in v6.0.0, since it would outright crash on any text that featured two consecutive newlines between a pair of words (a very common case).

#635 - small tweaks to tokenization behaviour of diffWords when used without an Intl.Segmenter. Specifically, the soft hyphen (U+00AD) is no longer considered to be a word break, and the multiplication and division signs (× and ÷) are now treated as punctuation instead of as letters / word characters.

... (truncated)

Commits

ed13aca Update version in package.json and in release notes (#683)
7a49317 Bump dependencies again (#682)
afe5aec Add Git support, and otherwise variously improve & fix parsePatch (and other ...
2e46779 Fix a typo (#679)
dd2f994 8.0.4 release (#678)
3cc4384 Update docs on releasing to reflect migration to yarn berry (#677)
6fc2aa6 yarn up '*' && yarn up -R '**' (#676)
af7393a yarn up '*' && yarn up -R '**' (#670)
4b5d180 Fix another bug in diffWords's "intlSegmenter" mode (#667)
10da50c yarn up '*' && yarn up -R '**' (#666)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR
@dependabot recreate will recreate this PR, overwriting any edits that have been made to it
@dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
@dependabot ignore <dependency name> major version will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself)
@dependabot ignore <dependency name> minor version will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself)
@dependabot ignore <dependency name> will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself)
@dependabot unignore <dependency name> will remove all of the ignore conditions of the specified dependency
@dependabot unignore <dependency name> <ignore condition> will remove the ignore condition of the specified dependency and ignore conditions
You can disable automated security fix PRs for this repo from the Security Alerts page.

Bumps the npm_and_yarn group with 1 update in the / directory: [diff](https://github.com/kpdecker/jsdiff). Updates `diff` from 7.0.0 to 9.0.0 - [Changelog](https://github.com/kpdecker/jsdiff/blob/master/release-notes.md) - [Commits](kpdecker/jsdiff@7.0.0...v9.0.0) --- updated-dependencies: - dependency-name: diff dependency-version: 9.0.0 dependency-type: direct:production dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com>

… rename (garrytan#1351) * feat: gstack-gbrain-mcp-verify helper for remote MCP probe Probes a remote gbrain MCP endpoint with bearer auth. POSTs initialize, classifies failures into NETWORK / AUTH / MALFORMED with one-line remediation hints, and runs a tools/list capability probe to detect sources_add MCP support (forward-compat for when gbrain ships URL ingest). Token consumed from GBRAIN_MCP_TOKEN env, never argv. Required to set both 'application/json' AND 'text/event-stream' in Accept; that gotcha costs 10 minutes of debugging when missed (regression-tested). Live-verified against wintermute (gbrain v0.27.1). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: gstack-artifacts-init + gstack-artifacts-url helpers artifacts-init replaces brain-init with provider choice (gh / glab / manual), per-user gstack-artifacts-$USER repo, HTTPS-canonical storage in ~/.gstack-artifacts-remote.txt, and a "send this to your brain admin" hookup printout. Always prints the command, never auto-executes — gbrain v0.26.x has no admin-scope MCP probe (codex Finding #3). artifacts-url centralizes HTTPS↔SSH/host/owner-repo conversion so callers don't each string-mangle (codex Finding garrytan#10). The remote-conflict check in artifacts-init compares at the canonical level so re-running with HTTPS input doesn't trip on a stored SSH URL for the same logical repo. The "URL form not supported" branch prints a two-line clone-then-path form for gbrain v0.26.x; the supported branch is a one-liner with --url ready for when gbrain ships URL ingest. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: extend gstack-gbrain-detect with mcp_mode + artifacts_remote Adds two new fields to detect's JSON output: - gbrain_mcp_mode: local-stdio | remote-http | none Resolved via 3-tier fallback (codex Finding D3): claude mcp get --json → claude mcp list text-grep → ~/.claude.json jq read. If Anthropic moves the file format, the first two tiers absorb it. - gstack_artifacts_remote: HTTPS URL from ~/.gstack-artifacts-remote.txt Falls back to ~/.gstack-brain-remote.txt during the v1.27.0.0 migration window so detect doesn't return empty between upgrade and migration. Existing detect tests still pass (15/15). New 19 tests cover every fallback tier independently, plus a schema regression for /sync-gbrain compat. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: setup-gbrain Path 4 (remote MCP) + artifacts rename Path 4 lets users paste an HTTPS MCP URL + bearer token and registers it as an HTTP-transport MCP without needing a local gbrain CLI install. The flow: - Step 2 gains a fourth option (Remote gbrain MCP) - Step 4 adds Path 4 sub-flow: collect URL, secret-read bearer, verify via gstack-gbrain-mcp-verify (NETWORK / AUTH / MALFORMED classifier) - Step 5 (local doctor), Step 7.5 (transcript ingest), Step 5a's stdio branch all skip on Path 4 - Step 5a adds an HTTP+bearer registration form: claude mcp add --transport http --header "Authorization: Bearer ..." - Step 7 renamed "session memory sync" → "artifacts sync" and now calls gstack-artifacts-init (which always prints the brain-admin hookup command — no auto-execute, codex Finding #3) - Step 8 CLAUDE.md block branches: remote-http includes URL + server version (never the token); local-stdio keeps engine + config-file - Step 9 smoke test on Path 4 prints the curl-equivalent for post-restart verification (MCP tools aren't visible mid-session) - Step 10 verdict block has separate templates per mode Idempotency: re-running with gbrain_mcp_mode=remote-http already in detect output skips Step 2 entirely and goes to verification. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor: rename gbrain_sync_mode → artifacts_sync_mode (v1.27.0.0 prep) Hard rename, no dual-read alias (codex Finding D4). The on-disk migration script (Phase C, separate commit) renames the config key in users' ~/.gstack/config.yaml and any CLAUDE.md blocks. Touched call sites: - bin/gstack-config defaults + validation + list/defaults output - bin/gstack-gbrain-detect (gstack_brain_sync_mode field still emitted with the same name for downstream-tool compat; reads new key) - bin/gstack-brain-sync, bin/gstack-brain-enqueue, bin/gstack-brain-uninstall - bin/gstack-timeline-log (comment ref) - scripts/resolvers/preamble/generate-brain-sync-block.ts: renames key, branches on gbrain_mcp_mode=remote-http to emit "ARTIFACTS_SYNC: remote-mode (managed by brain server <host>)" instead of the local mode/queue/last_push line (codex Finding garrytan#11) - bin/gstack-brain-restore + bin/gstack-gbrain-source-wireup: read ~/.gstack-artifacts-remote.txt with ~/.gstack-brain-remote.txt fallback during the migration window - bin/gstack-artifacts-init: tolerant of unrecognized URL forms (local paths, file://, self-hosted gitea) so test infrastructure and unusual remotes work without canonicalization - test/brain-sync.test.ts: gstack-brain-init → gstack-artifacts-init - test/skill-e2e-brain-privacy-gate.test.ts: artifacts_sync_mode keys - test/gen-skill-docs.test.ts: budget 35K → 36.5K for the new MCP-mode probe in the preamble resolver - health/SKILL.md.tmpl, sync-gbrain/SKILL.md.tmpl: comment + verdict line Hard delete: - bin/gstack-brain-init (replaced by bin/gstack-artifacts-init in v1.27.0.0) - test/gstack-brain-init-gh-mock.test.ts (replaced by gstack-artifacts-init.test.ts) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: regenerate SKILL.md files after artifacts-sync rename Mechanical regen via \`bun run gen:skill-docs --host all\`. All */SKILL.md files reflect the renamed config key (gbrain_sync_mode → artifacts_sync_mode), the renamed remote-helper file (~/.gstack-artifacts-remote.txt with brain fallback), the renamed init script (gstack-artifacts-init), and the new ARTIFACTS_SYNC: remote-mode status line that fires when a remote-http MCP is registered. Golden fixtures (test/fixtures/golden/*-ship-SKILL.md) refreshed to match the regenerated default-ship output. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: v1.27.0.0 migration — gstack-brain → gstack-artifacts rename Journaled, interruption-safe migration. Six steps, each writes to ~/.gstack/.migrations/v1.27.0.0.journal on success; re-entry resumes from the next un-done step. On final success, journal is replaced by ~/.gstack/.migrations/v1.27.0.0.done. Steps: 1. gh_repo_renamed gh/glab repo rename gstack-brain-$USER → gstack-artifacts-$USER (idempotent: detects already-renamed and skips) 2. remote_txt_renamed mv ~/.gstack-brain-remote.txt → artifacts file, rewriting URL path to match the new repo name 3. config_key_renamed sed -i in ~/.gstack/config.yaml flips gbrain_sync_mode → artifacts_sync_mode 4. claude_md_block sed flips "- Memory sync:" → "- Artifacts sync:" in cwd CLAUDE.md and ~/.gstack/CLAUDE.md 5. sources_swapped gbrain sources add NEW (verify) → remove OLD (codex Finding #6: add-before-remove ordering, no downtime window). On remote-MCP mode, prints commands for the brain admin instead of executing. 6. done touchfile + delete journal User opt-out: any "n" or "skip-for-now" answer at the initial prompt writes a marker file that prevents re-prompting; user can re-invoke via /setup-gbrain --rerun-migration. 11 unit tests cover: nothing-to-migrate, GitHub happy path, idempotent re-run, journal-resume mid-flight, remote-MCP print-only path, add-before-remove ordering verification, add-fail → old source stays registered, CLAUDE.md field rewrite. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: regression suite + E2E for v1.27.0.0 rename Three new regression tests guard the rename's blast radius (per codex Findings #1, garrytan#8, garrytan#9, garrytan#12): - test/no-stale-gstack-brain-refs.test.ts: greps bin/, scripts/, *.tmpl, test/ for forbidden identifiers (gstack-brain-init, gbrain_sync_mode); fails CI if any non-allowlisted file references them. - test/post-rename-doc-regen.test.ts: confirms gen-skill-docs output has no stale references in any */SKILL.md (the cross-product blind spot). - test/setup-gbrain-path4-structure.test.ts: structural lint over the Path 4 prose contract — STOP gates after verify failure, never-write- token rules, mode-aware CLAUDE.md block, bearer always via env-var. Two new gate-tier E2E tests (deterministic stub HTTP server, fixed inputs): - test/skill-e2e-setup-gbrain-remote.test.ts: Path 4 happy path. Stubs an HTTP MCP server, drives the skill via Agent SDK with a stubbed bearer, asserts claude.json gets the http MCP entry, CLAUDE.md gets the remote-http block, the secret token NEVER leaks to CLAUDE.md. - test/skill-e2e-setup-gbrain-bad-token.test.ts: stub server returns 401; asserts the AUTH classifier hint surfaces, no MCP registration occurs, CLAUDE.md is unchanged. Regression guard for the "verify failed → STOP" rule. touchfiles.ts: setup-gbrain-remote and setup-gbrain-bad-token added at gate-tier so CI catches Path 4 regressions on every PR. Plus a few comment refs flipped: bin/gstack-jsonl-merge, bin/gstack-timeline-log (legacy gstack-brain-init mentions in headers). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * release: v1.27.0.0 — /setup-gbrain Path 4 + brain → artifacts rename Bumps VERSION 1.26.4.0 → 1.27.0.0 (MINOR per CLAUDE.md scale-aware bump guidance: ~1500 line net change including a new path in /setup-gbrain, two new bin helpers, a journaled migration, 59 new tests, and a config key rename across the codebase). CHANGELOG entry covers: Path 4 (Remote MCP) end-to-end, the brain → artifacts rename, the journaled migration, the verify-helper error classifier, the artifacts-init multi-host provider choice. Includes the canonical Garry-voice headline + numbers table + audience close per the release-summary format. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: demote setup-gbrain Path 4 E2E to periodic-tier The Agent SDK E2E tests for Path 4 (skill-e2e-setup-gbrain-remote and skill-e2e-setup-gbrain-bad-token) are inherently non-deterministic — the model interprets "follow Path 4 only" prompts flexibly and can skip Step 8 (CLAUDE.md write) or shortcut past the verify helper, which makes the gate-tier assertions flaky. The deterministic gate coverage for Path 4 is in test/setup-gbrain-path4-structure.test.ts: a fast structural lint that catches AUQ-pacing regressions and prose contract drift in <200ms with zero token spend. That test is the right tool for catching the failure mode the gate-tier was meant to guard against. The Agent SDK E2E tests stay available on-demand for periodic-tier runs (EVALS=1 EVALS_TIER=periodic bun test test/skill-e2e-setup-gbrain-*.test.ts). Also tightened the verify-error assertion to the literal field shape ("error_class": "AUTH") instead of a substring match that false-matches the parent claude session's "needs-auth" MCP discovery markers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: sync package.json version to 1.27.0.0 VERSION was bumped to 1.27.0.0 in f6ec11e but package.json was not updated in the same commit. The gen-skill-docs.test.ts assertion "package.json version matches VERSION file" caught the drift. This is the DRIFT_STALE_PKG case the /ship Step 12 idempotency check is designed for; the fix is the documented sync-only repair (no re-bump, package.json synced to existing VERSION). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v1.26.5.0 fix wave: gbrain ingest writer (hybrid frontmatter) + gbrain-valid source ids (#1344) * fix: use correct `gbrain put <slug>` CLI verb in memory ingest `put_page` is the MCP tool name, not a CLI subcommand. The actual gbrain verb is `put <slug>` with content via stdin and tags in YAML frontmatter. Every transcript / memory ingest fails today on clean installs. Switch to the right verb and inject title/type/tags into the frontmatter that buildTranscriptPage / buildArtifactPage already produce. Bundled in the same function: - timeout: 30s → 60s. Auto-link reconciliation hits 30s once the brain has a few hundred pages. - maxBuffer: 1MB → 16MB. Without it Node truncates gbrain's stderr and callers see only `Command failed:` with no detail. - Surface stderr/stdout in the returned error instead of the bare exception. Verified: bun test test/gstack-memory-ingest.test.ts -> 15/15 pass. bun test on the three test files touching this path -> 362/362. * fix(sync-gbrain): generate gbrain-valid source ids for repos with dots or long names `deriveCodeSourceId` previously concatenated the canonicalized remote with only `/` and whitespace stripped, leaving dots from hostnames (`github.com`) and no length cap. gbrain rejects any source id containing characters outside [a-z0-9-] or longer than 32 chars, so `github.com/<org>/<repo>` produced `gstack-code-github.com-<org>-<repo>` (40 chars, plus dots) and registration failed: code source registration failed: Invalid source id "gstack-code-github.com-radubach-platform". Must be 1-32 lowercase alnum chars with optional interior hyphens. Fix: - Drop the host segment (`github.com` is the same for nearly every user and just consumes the 32-char budget). Use only the last two path segments (org-repo). - Sanitize any remaining non-alnum to hyphens, then collapse and trim. - For genuinely long org/repo names that still exceed the budget, keep the tail (most distinctive end of the slug) and append a 6-char sha1 hash for collision resistance. Adds a regression test that spawns the CLI in temp git repos with controlled remotes (dot in hostname, SCP-style, multi-dot host, long names forcing hash-truncation) and asserts every derived id is ≤32 chars and matches the gbrain validator regex. * fix(memory-ingest): hybrid frontmatter writer + tightened gbrain availability probe PR #1328 (merged in the prior commit) correctly injects title/type/tags into the YAML frontmatter that buildTranscriptPage already prepends. But buildArtifactPage emits raw markdown without frontmatter, so design-docs, learnings, and builder-profile-entries were landing in gbrain with empty title/type/tags. Add the no-frontmatter wrap branch so artifact pages get the same metadata the inject branch provides for transcripts. Also bring in gbrainAvailable()'s --help probe (originally proposed in PR #1341 by Alex Medina), with the regex tightened from /(^|\s)put(\s|$)/m to /^\s+put\s/m. Anchoring on the indented subcommand format gbrain's help actually uses keeps the probe from matching "put" appearing as prose in help text, while still failing fast with one clean error if a future gbrain renames or removes the put subcommand. Updates the V1.5 NOTE doc block at the top of the file to describe the current put-via-stdin shape rather than the legacy put_page flag form. Co-Authored-By: Alex Medina <oficina@puntoverdemc.com> * test+fix(memory-ingest): strengthen regression tests, fix inject for malformed-close frontmatter Imports the shim-based regression tests from PR #1341 (Alex Medina) and strengthens them to assert title, type, and tags actually arrive in put stdin — not just `agent: claude-code`. Asserting the metadata fields matches the regression class that's caused this fix wave: writers can "succeed" while metadata is silently lost. The original PR #1341 tests would have passed even with title/type/tags missing. Strengthening the test surfaced a deeper issue. buildTranscriptPage joins frontmatter array elements with "\n" and does not append a trailing newline, so the close fence is "\n---<content>" directly, not "\n---\n". PR #1328's inject branch searched for "\n---\n" and never matched — which means even with PR #1328 alone, transcript pages were landing in gbrain with no title/type/tags. Two-line fix: search for "\n---" only, since the inject lands before the close fence regardless of what follows it. Also imports PR #1341's V1.5 NOTE doc-block update and the section comment refresh so the prose stays accurate against the new writer shape. Co-Authored-By: Alex Medina <oficina@puntoverdemc.com> * fix+test(gbrain-sync): handle empty-slug edge in constrainSourceId, add no-origin and basename-empty regression tests PR #1330 (merged in the prior commit) addressed the dot-in-host and length-overflow cases for source-id derivation, but constrainSourceId silently returned "${prefix}-" when the input sanitized to an empty slug — invalid per gbrain's `^[a-z0-9](?:[a-z0-9-]{0,30}[a-z0-9])?$` validator on the trailing hyphen. Adds an explicit empty-slug branch that falls back to a sha1-prefixed id ("gstack-code-<6hex>") so the output stays gbrain-valid for every input shape. Two new regression tests cover the corners PR #1330's coverage left exposed: - no-origin fallback: a cwd repo with no `origin` remote configured must still derive a valid id from the basename. - basename-sanitizes-to-empty: a repo whose path basename is all non-alnum (e.g. "___") must produce the hash-only fallback, not an invalid trailing-hyphen id. Both run the CLI inside temp git repos for genuine end-to-end coverage (matches the pattern PR #1330 established for its own four remote-shape cases). Co-Authored-By: Richard Dubach <radubach@gmail.com> * chore: bump VERSION to 1.26.5.0 + CHANGELOG entry for fix wave PATCH bump. Three bug fixes (memory-ingest put_page CLI verb mismatch, hybrid frontmatter writer for transcripts AND artifacts, gbrain-valid source-id derivation for github-hosted repos), no new user capability. CHANGELOG release-summary leads with what users can now do (clean- install transcripts populate the brain, github-hosted repos register code sources) and tabulates before/after numbers from real gbrain v0.25.1 smoke output. Itemized changes credit @smithjoshua, @AZ-1224, and @radubach for the originating PRs plus the additional hybrid branch + strengthened tests added on top per Codex plan-review. * docs(todos): file P2 (gbrain install-pin staleness) + P3 (source-id host-collision) follow-ups Two follow-ups surfaced during the v1.26.5.0 fix-wave plan review. P2 — Issue #1305 part 2: bin/gstack-gbrain-install pins gbrain to v0.18.2 (commit 08b3698) but doesn't move when gstack ships features that depend on newer gbrain ops or schema. Fresh /setup-gbrain on v1.26.x lands users on schema 24 with v1.26 features expecting 32+. Captured for a future fix-wave. P3 — Codex P1.3 from the v1.26.5.0 plan review: deriveCodeSourceId drops the host segment to fit gbrain's 32-char source-id budget, which means github.com/acme/foo and gitlab.com/acme/foo collapse to the same source id. Real but rare; PR #1330 author explicitly considered this and chose budget over cross-host uniqueness. Captured as a long-tail concern. --------- Co-authored-by: Joshua Smith <joshualowellsmith@gmail.com> Co-authored-by: Richard Dubach <radubach@gmail.com> Co-authored-by: Alex Medina <oficina@puntoverdemc.com> * v1.27.0.0 feat: /setup-gbrain Path 4 (remote MCP) + brain → artifacts rename (#1351) * feat: gstack-gbrain-mcp-verify helper for remote MCP probe Probes a remote gbrain MCP endpoint with bearer auth. POSTs initialize, classifies failures into NETWORK / AUTH / MALFORMED with one-line remediation hints, and runs a tools/list capability probe to detect sources_add MCP support (forward-compat for when gbrain ships URL ingest). Token consumed from GBRAIN_MCP_TOKEN env, never argv. Required to set both 'application/json' AND 'text/event-stream' in Accept; that gotcha costs 10 minutes of debugging when missed (regression-tested). Live-verified against wintermute (gbrain v0.27.1). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: gstack-artifacts-init + gstack-artifacts-url helpers artifacts-init replaces brain-init with provider choice (gh / glab / manual), per-user gstack-artifacts-$USER repo, HTTPS-canonical storage in ~/.gstack-artifacts-remote.txt, and a "send this to your brain admin" hookup printout. Always prints the command, never auto-executes — gbrain v0.26.x has no admin-scope MCP probe (codex Finding #3). artifacts-url centralizes HTTPS↔SSH/host/owner-repo conversion so callers don't each string-mangle (codex Finding #10). The remote-conflict check in artifacts-init compares at the canonical level so re-running with HTTPS input doesn't trip on a stored SSH URL for the same logical repo. The "URL form not supported" branch prints a two-line clone-then-path form for gbrain v0.26.x; the supported branch is a one-liner with --url ready for when gbrain ships URL ingest. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: extend gstack-gbrain-detect with mcp_mode + artifacts_remote Adds two new fields to detect's JSON output: - gbrain_mcp_mode: local-stdio | remote-http | none Resolved via 3-tier fallback (codex Finding D3): claude mcp get --json → claude mcp list text-grep → ~/.claude.json jq read. If Anthropic moves the file format, the first two tiers absorb it. - gstack_artifacts_remote: HTTPS URL from ~/.gstack-artifacts-remote.txt Falls back to ~/.gstack-brain-remote.txt during the v1.27.0.0 migration window so detect doesn't return empty between upgrade and migration. Existing detect tests still pass (15/15). New 19 tests cover every fallback tier independently, plus a schema regression for /sync-gbrain compat. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: setup-gbrain Path 4 (remote MCP) + artifacts rename Path 4 lets users paste an HTTPS MCP URL + bearer token and registers it as an HTTP-transport MCP without needing a local gbrain CLI install. The flow: - Step 2 gains a fourth option (Remote gbrain MCP) - Step 4 adds Path 4 sub-flow: collect URL, secret-read bearer, verify via gstack-gbrain-mcp-verify (NETWORK / AUTH / MALFORMED classifier) - Step 5 (local doctor), Step 7.5 (transcript ingest), Step 5a's stdio branch all skip on Path 4 - Step 5a adds an HTTP+bearer registration form: claude mcp add --transport http --header "Authorization: Bearer ..." - Step 7 renamed "session memory sync" → "artifacts sync" and now calls gstack-artifacts-init (which always prints the brain-admin hookup command — no auto-execute, codex Finding #3) - Step 8 CLAUDE.md block branches: remote-http includes URL + server version (never the token); local-stdio keeps engine + config-file - Step 9 smoke test on Path 4 prints the curl-equivalent for post-restart verification (MCP tools aren't visible mid-session) - Step 10 verdict block has separate templates per mode Idempotency: re-running with gbrain_mcp_mode=remote-http already in detect output skips Step 2 entirely and goes to verification. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor: rename gbrain_sync_mode → artifacts_sync_mode (v1.27.0.0 prep) Hard rename, no dual-read alias (codex Finding D4). The on-disk migration script (Phase C, separate commit) renames the config key in users' ~/.gstack/config.yaml and any CLAUDE.md blocks. Touched call sites: - bin/gstack-config defaults + validation + list/defaults output - bin/gstack-gbrain-detect (gstack_brain_sync_mode field still emitted with the same name for downstream-tool compat; reads new key) - bin/gstack-brain-sync, bin/gstack-brain-enqueue, bin/gstack-brain-uninstall - bin/gstack-timeline-log (comment ref) - scripts/resolvers/preamble/generate-brain-sync-block.ts: renames key, branches on gbrain_mcp_mode=remote-http to emit "ARTIFACTS_SYNC: remote-mode (managed by brain server <host>)" instead of the local mode/queue/last_push line (codex Finding #11) - bin/gstack-brain-restore + bin/gstack-gbrain-source-wireup: read ~/.gstack-artifacts-remote.txt with ~/.gstack-brain-remote.txt fallback during the migration window - bin/gstack-artifacts-init: tolerant of unrecognized URL forms (local paths, file://, self-hosted gitea) so test infrastructure and unusual remotes work without canonicalization - test/brain-sync.test.ts: gstack-brain-init → gstack-artifacts-init - test/skill-e2e-brain-privacy-gate.test.ts: artifacts_sync_mode keys - test/gen-skill-docs.test.ts: budget 35K → 36.5K for the new MCP-mode probe in the preamble resolver - health/SKILL.md.tmpl, sync-gbrain/SKILL.md.tmpl: comment + verdict line Hard delete: - bin/gstack-brain-init (replaced by bin/gstack-artifacts-init in v1.27.0.0) - test/gstack-brain-init-gh-mock.test.ts (replaced by gstack-artifacts-init.test.ts) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: regenerate SKILL.md files after artifacts-sync rename Mechanical regen via \`bun run gen:skill-docs --host all\`. All */SKILL.md files reflect the renamed config key (gbrain_sync_mode → artifacts_sync_mode), the renamed remote-helper file (~/.gstack-artifacts-remote.txt with brain fallback), the renamed init script (gstack-artifacts-init), and the new ARTIFACTS_SYNC: remote-mode status line that fires when a remote-http MCP is registered. Golden fixtures (test/fixtures/golden/*-ship-SKILL.md) refreshed to match the regenerated default-ship output. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: v1.27.0.0 migration — gstack-brain → gstack-artifacts rename Journaled, interruption-safe migration. Six steps, each writes to ~/.gstack/.migrations/v1.27.0.0.journal on success; re-entry resumes from the next un-done step. On final success, journal is replaced by ~/.gstack/.migrations/v1.27.0.0.done. Steps: 1. gh_repo_renamed gh/glab repo rename gstack-brain-$USER → gstack-artifacts-$USER (idempotent: detects already-renamed and skips) 2. remote_txt_renamed mv ~/.gstack-brain-remote.txt → artifacts file, rewriting URL path to match the new repo name 3. config_key_renamed sed -i in ~/.gstack/config.yaml flips gbrain_sync_mode → artifacts_sync_mode 4. claude_md_block sed flips "- Memory sync:" → "- Artifacts sync:" in cwd CLAUDE.md and ~/.gstack/CLAUDE.md 5. sources_swapped gbrain sources add NEW (verify) → remove OLD (codex Finding #6: add-before-remove ordering, no downtime window). On remote-MCP mode, prints commands for the brain admin instead of executing. 6. done touchfile + delete journal User opt-out: any "n" or "skip-for-now" answer at the initial prompt writes a marker file that prevents re-prompting; user can re-invoke via /setup-gbrain --rerun-migration. 11 unit tests cover: nothing-to-migrate, GitHub happy path, idempotent re-run, journal-resume mid-flight, remote-MCP print-only path, add-before-remove ordering verification, add-fail → old source stays registered, CLAUDE.md field rewrite. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: regression suite + E2E for v1.27.0.0 rename Three new regression tests guard the rename's blast radius (per codex Findings #1, #8, #9, #12): - test/no-stale-gstack-brain-refs.test.ts: greps bin/, scripts/, *.tmpl, test/ for forbidden identifiers (gstack-brain-init, gbrain_sync_mode); fails CI if any non-allowlisted file references them. - test/post-rename-doc-regen.test.ts: confirms gen-skill-docs output has no stale references in any */SKILL.md (the cross-product blind spot). - test/setup-gbrain-path4-structure.test.ts: structural lint over the Path 4 prose contract — STOP gates after verify failure, never-write- token rules, mode-aware CLAUDE.md block, bearer always via env-var. Two new gate-tier E2E tests (deterministic stub HTTP server, fixed inputs): - test/skill-e2e-setup-gbrain-remote.test.ts: Path 4 happy path. Stubs an HTTP MCP server, drives the skill via Agent SDK with a stubbed bearer, asserts claude.json gets the http MCP entry, CLAUDE.md gets the remote-http block, the secret token NEVER leaks to CLAUDE.md. - test/skill-e2e-setup-gbrain-bad-token.test.ts: stub server returns 401; asserts the AUTH classifier hint surfaces, no MCP registration occurs, CLAUDE.md is unchanged. Regression guard for the "verify failed → STOP" rule. touchfiles.ts: setup-gbrain-remote and setup-gbrain-bad-token added at gate-tier so CI catches Path 4 regressions on every PR. Plus a few comment refs flipped: bin/gstack-jsonl-merge, bin/gstack-timeline-log (legacy gstack-brain-init mentions in headers). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * release: v1.27.0.0 — /setup-gbrain Path 4 + brain → artifacts rename Bumps VERSION 1.26.4.0 → 1.27.0.0 (MINOR per CLAUDE.md scale-aware bump guidance: ~1500 line net change including a new path in /setup-gbrain, two new bin helpers, a journaled migration, 59 new tests, and a config key rename across the codebase). CHANGELOG entry covers: Path 4 (Remote MCP) end-to-end, the brain → artifacts rename, the journaled migration, the verify-helper error classifier, the artifacts-init multi-host provider choice. Includes the canonical Garry-voice headline + numbers table + audience close per the release-summary format. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: demote setup-gbrain Path 4 E2E to periodic-tier The Agent SDK E2E tests for Path 4 (skill-e2e-setup-gbrain-remote and skill-e2e-setup-gbrain-bad-token) are inherently non-deterministic — the model interprets "follow Path 4 only" prompts flexibly and can skip Step 8 (CLAUDE.md write) or shortcut past the verify helper, which makes the gate-tier assertions flaky. The deterministic gate coverage for Path 4 is in test/setup-gbrain-path4-structure.test.ts: a fast structural lint that catches AUQ-pacing regressions and prose contract drift in <200ms with zero token spend. That test is the right tool for catching the failure mode the gate-tier was meant to guard against. The Agent SDK E2E tests stay available on-demand for periodic-tier runs (EVALS=1 EVALS_TIER=periodic bun test test/skill-e2e-setup-gbrain-*.test.ts). Also tightened the verify-error assertion to the literal field shape ("error_class": "AUTH") instead of a substring match that false-matches the parent claude session's "needs-auth" MCP discovery markers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: sync package.json version to 1.27.0.0 VERSION was bumped to 1.27.0.0 in f6ec11eb but package.json was not updated in the same commit. The gen-skill-docs.test.ts assertion "package.json version matches VERSION file" caught the drift. This is the DRIFT_STALE_PKG case the /ship Step 12 idempotency check is designed for; the fix is the documented sync-only repair (no re-bump, package.json synced to existing VERSION). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v1.27.1.0 fix: anti-shortcut clause + gate-tier AskUserQuestion floor tests for all plan-* skills (#1354) * feat(test/helpers): runPlanSkillFloorCheck — minimal AskUserQuestion-floor observer Adds a focused PTY observer that exits at the first non-permission numbered-option render. Catches the May 2026 transcript-bug class (model wrote plan + ExitPlanMode without firing any AUQ) without needing to fingerprint or navigate past the AUQ. Why separate from runPlanSkillCounting: plan-mode AUQs render every option on a single logical line via cursor-positioning escapes that stripAnsi can't simulate, so parseNumberedOptions returns < 2 options and never records a fingerprint. Counting tests work on 25-min budgets because eventually one frame parses cleanly; gate-tier floor tests need to exit early on the first observation. Trades fingerprint precision for early-exit reliability. Also drops COMPLETION_SUMMARY_RE check from this helper — it matches "GSTACK REVIEW REPORT" anywhere in the buffer including when the agent does recon by reading existing plan files. plan_ready (claude's actual "Ready to execute" confirmation) is the reliable terminal signal for "agent finished without asking." Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(resolvers): generateAntiShortcutClause shared resolver Adds {{ANTI_SHORTCUT_CLAUSE}} placeholder backed by a single resolver function in scripts/resolvers/review.ts. Plan-* review skills can now include the clause via one placeholder line in their .tmpl rather than cloning the paragraph four times. Future tightening edits one resolver, all four skills update on next gen-skill-docs. Wired into the existing RESOLVERS map alongside generateReviewDashboard and generatePlanFileReviewReport — no gen-skill-docs.ts change needed because the generator already does generic placeholder substitution against that map. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(plan-*-review): anti-shortcut clause in all four review skills Inserts {{ANTI_SHORTCUT_CLAUSE}} placeholder immediately after the **Anti-skip rule:** paragraph in plan-{eng,ceo,design,devex}-review SKILL.md.tmpl. The four templates use different surrounding section headers (eng "Review Sections (after scope is agreed)" vs ceo/design/devex variants), so anchoring on the paragraph rather than the heading works across all four. Closes the May 2026 transcript-bug loophole: existing STOP gates name forbidden actions only AFTER a per-section finding is identified. The anti-shortcut clause adds the pre-emptive rule — "the plan file is the OUTPUT of the interactive review, not a substitute for it" — covering the case the transcript exhibited (skip per-section walk, dump every finding into one plan write, call ExitPlanMode). Regenerated SKILL.md for all hosts via bun run gen:skill-docs --host all. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: gate-tier AskUserQuestion floor tests for all plan-* review skills Adds 4 finding-floor tests (one per plan-* skill) that catch the May 2026 transcript-bug class — model wrote a plan and called ExitPlanMode without firing any review-phase AskUserQuestion. Asserts via runPlanSkillFloorCheck that ANY non-permission AUQ render fires before the agent reaches plan_ready. Verified: - Eng floor: passed in 59s - CEO floor: passed in 197s - Design floor: passed - Devex floor: passed - Total ~$2-6 per CI run; only triggers on diff against the 4 plan-* templates, the shared resolver review.ts, the seeds fixture, or the PTY runner helper. Fixtures live in test/fixtures/forcing-finding-seeds.ts, one constant per skill. Each seed is engineered to force at least one obvious finding under that skill's review focus (architectural smell for eng, scope-creep for ceo, UI-slop for design, painful onboarding for devex). Touchfiles wiring: - E2E_TOUCHFILES: 4 plan-*-finding-floor entries with deps on the matching skill template, the shared resolver, the seeds fixture, and the PTY runner helper - E2E_TIERS: all 4 entries marked 'gate' - touchfiles.test.ts: count assertion bumped 21→22 with explicit plan-ceo-finding-floor containment check Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v1.27.1.0) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v1.28.0.0 feat: browse --headed/--proxy/--navigate + gstack/llms.txt + webdriver-only stealth (#1363) * feat(browse): SOCKS5 bridge with auth + cred redaction helper Adds browse/src/socks-bridge.ts: a 127.0.0.1-only SOCKS5 listener that accepts unauthenticated connections from Chromium and relays them through an authenticated upstream proxy. Chromium does not prompt for SOCKS5 auth at launch, so this bridge is the workaround for using auth-required residential SOCKS5 upstreams. - startSocksBridge({ upstream, port: 0 }) → ephemeral 127.0.0.1 listener - testUpstream({ upstream, retries: 3, backoffMs: 500, budgetMs: 5000 }) pre-flight that connects to a known endpoint (default 1.1.1.1:443) - Stream-error policy: kill affected client + upstream sockets on any error mid-stream; no transport retries (a transport-layer retry can corrupt browser traffic) Adds browse/src/proxy-redact.ts: single source of truth for redacting credentials in any logged proxy URL or upstream config. Every code path that prints proxy config goes through this helper. Adds the socks npm dep (~30KB) and 16 tests covering: 127.0.0.1-only bind, byte-for-byte round trip through the bridge, auth rejection, mid-stream upstream drop kills client conn, listener teardown, testUpstream success + retry-exhaust paths, redaction of every credential shape. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(browse): --proxy and --headed flags wire bridge into daemon Adds the global --proxy <url> and --headed flags to the browse CLI. Resolves cred policy and routes the daemon launch through the SOCKS5 bridge (or pass-through for HTTP/HTTPS) before chromium.launch(). CLI (cli.ts): - extractGlobalFlags() strips --proxy/--headed from argv, parses URL via Node URL class, validates D9 cred-mixing (env BROWSE_PROXY_USER/PASS + URL creds → exit 1 with hint), composes canonical proxy URL with resolved creds, computes a stable configHash for daemon-mismatch - ensureServer() now reads existing daemon's configHash from state file and refuses (exit 1 with disconnect hint) if --proxy/--headed mismatch the existing daemon. No silent restart that would drop tab state. - All proxy-related stderr lines go through redactProxyUrl proxy-config.ts (new): - parseProxyConfig() — URL parser + D9 cred-mixing detector + scheme allowlist - computeConfigHash() — stable hash of (proxy URL minus creds + headed flag) - toUpstreamConfig() — map ParsedProxyConfig → socks-bridge.UpstreamConfig Server (server.ts): - Reads BROWSE_PROXY_URL at startup; for SOCKS5+auth, runs testUpstream pre-flight (5s budget, 3 retries, 500ms backoff) and exits 1 on failure with redacted error - Spawns startSocksBridge() on 127.0.0.1:<ephemeral> and points Chromium at it via socks5://127.0.0.1:<port> - HTTP/HTTPS or unauth SOCKS5 → pass-through to chromium.launch proxy.server (with username/password if present) - State file gains optional configHash for daemon-mismatch check - Bridge tears down via process.on('exit') Browser manager (browser-manager.ts): - New setProxyConfig({ server, username, password }) called by server.ts before launch - chromium.launch() and both launchPersistentContext sites pass the proxy config through when set Tests: 22 new across proxy-config (parse + cred-mixing + hash stability) and extractGlobalFlags (flag stripping + cred-mixing rejection + cred rotation hash stability + redaction). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(browse): Xvfb auto-spawn with PID + start-time validation Adds browse/src/xvfb.ts: a Linux-only Xvfb auto-spawn module for running headed Chromium in containers without DISPLAY. The module walks a display range to pick a free one (never hardcodes :99) and validates orphan PIDs by BOTH /proc/<pid>/cmdline matching 'Xvfb' AND start-time matching the recorded value before sending any signal. Defends against PID reuse — refuses to kill anything that doesn't match both checks. - shouldSpawnXvfb(env, platform) — pure decision: skip on macOS/Windows, on Linux skip when DISPLAY or WAYLAND_DISPLAY is set (codex F2) - pickFreeDisplay(99..120) — probes via xdpyinfo - spawnXvfb(display) — returns { pid, startTime, display } handle - isOurXvfb(pid, startTime) — both-checks validator - cleanupXvfb(state) — best-effort, validates ownership before SIGTERM Wired into server.ts startup: when shouldSpawnXvfb says yes, picks a free display, spawns Xvfb, sets DISPLAY for chromium.launchHeaded, and records xvfbPid/xvfbStartTime/xvfbDisplay in the state file. Cleanup runs on process.on('exit'). The CLI's disconnect path also runs cleanupXvfb() in the force-cleanup branch when the server is dead. Disconnect now applies to any non-default daemon (headed mode OR configHash-tagged daemon — i.e. one started with --proxy/--headed), not just headed mode. Adds xvfb + x11-utils to .github/docker/Dockerfile.ci so CI exercises the Linux container --headed path on every run. Without it the most common production path would go untested. Tests: 17 new across decision logic, PID validation defenses (cmdline mismatch, start-time mismatch), no-op safety on bad inputs, and a Linux+Xvfb-installed gate for the spawn → validate → cleanup round trip. Tests skip on macOS/Windows automatically. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(browse): webdriver-mask stealth + Chromium-through-bridge e2e D7 (codex narrowing): mask navigator.webdriver only via addInitScript. The wintermute approach (fake plugins=[1..5], fake languages=['en-US', 'en'], stub window.chrome) is intentionally NOT applied — modern fingerprinters check consistency between plugins.length, languages, userAgent, and platform, and synthesizing fixed values can flag MORE bot-like, not less. The honest minimum is webdriver, which Chromium exposes as a known automation tell. Adds browse/src/stealth.ts: single source of truth for the stealth init script and launch args. Both browser-manager.launch() (headless) and launchHeaded() (persistent context with extension) call applyStealth(context) and pass STEALTH_LAUNCH_ARGS into chromium.launch. The pre-existing launchHeaded stealth that did fake plugins/languages is removed for the same reason. The cdc_/__webdriver runtime cleanup and Permissions API patch are kept — they remove automation-injected artifacts, not synthesize fake natural-browser values. Adds bridge-chromium-e2e.test.ts (codex F3): the test that proves the FEATURE works. Real Chromium with proxy.server = 'socks5://127.0.0.1: <bridgePort>' navigates to a local HTTP fixture; the auth upstream's connect counter and the HTTP fixture's hit counter both increment, proving traffic actually traversed bridge → auth-upstream → destination. Without this test, we could ship a working byte-relay and a broken Chromium integration and never know. Adds bridge-port-restart.test.ts (codex F1, reframed): old test assumed two daemons coexist, which contradicts D2 single-daemon model. Reframed as restart-then-restart, asserting fresh ephemeral ports (never the hardcoded 1090) on each spin-up. Adds stealth-webdriver.test.ts: navigator.webdriver=false in both fresh contexts and persistent contexts; navigator.plugins/languages are NOT replaced with the wintermute fake list (D7 verification). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(gstack): generate llms.txt — single-file capability index for AI agents Adds scripts/gen-llms-txt.ts: produces gstack/llms.txt at repo root, indexing every skill (47), every browse command (75), and design commands when the design CLI is present. Per the llmstxt.org convention, agents can read one file to learn what gstack offers instead of crawling 47 SKILL.md files. Sources: - skill SKILL.md.tmpl frontmatter (name + description block scalar) - browse/src/commands.ts COMMAND_DESCRIPTIONS (sorted by category) - design/src/commands.ts COMMAND_DESCRIPTIONS if present (best-effort) Wired into scripts/gen-skill-docs.ts as a post-step so it regenerates on every `bun run gen:skill-docs` (the same script that re-emits all SKILL.md files). Failures are non-fatal warnings, not build breaks — the generator never blocks SKILL.md regen. Strict mode (--strict, also used by tests) throws when a skill is missing name or description in its frontmatter, catching missing metadata before it ships. Tests: shape (top-level sections, sort order, single-line summary discipline), every-skill-and-command-appears, strict-mode rejection of incomplete frontmatter, and freshness check that the committed gstack/llms.txt matches what the generator produces now. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(browse): --navigate flag on download for browser-triggered files Adds the --navigate strategy from community PR #1355 (originally from @garrytan-agents). When set, download navigates to the URL with waitUntil:'commit' and captures the resulting browser download via page.waitForEvent('download'), then saves via download.saveAs(). Handles URLs that trigger files via Content-Disposition headers, multi-hop CDN redirects requiring browser cookies, or anti-bot CDN chains where page.request.fetch() can't follow the auth/redirect chain. Defaults still use the existing direct-fetch strategy. --navigate is opt-in. Goes through the same validateNavigationUrl SSRF gate as goto, so download --navigate cannot reach IPv4 metadata endpoints (AWS IMDSv1, GCP/Azure equivalents) or arbitrary internal hosts. Inferred content type from suggested filename for common extensions (epub, pdf, zip, gz, mp3/mp4, jpg/jpeg/png, txt, html, json) — falls back to application/octet-stream. Same 200MB cap as Strategy 1. Frames the use case generically (anti-bot CDN, Content-Disposition, redirect chains) rather than naming any specific site, per project voice rules. Co-Authored-By: @garrytan-agents Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: v1.28.0.0 — browse SKILL section + VERSION + CHANGELOG VERSION 1.27.1.0 → 1.28.0.0 (MINOR — substantial new capability: five new flags/features, ~600 LOC added, new socks dep, multiple new modules). browse/SKILL.md.tmpl: new "Headed Mode + Proxy + Anti-Bot Sites" section between User Handoff and Snapshot Flags. Documents --headed (auto-Xvfb on Linux), --proxy (with embedded SOCKS5 bridge for auth), download --navigate, the cred-mixing policy, daemon-discipline (refuse-on-mismatch), the narrowed webdriver-only stealth, container support caveats, and the fail-fast/no-retry failure modes. CHANGELOG entry follows the release-summary format from CLAUDE.md: two-line headline, lead paragraph, "The numbers that matter" table tied to specific test files that prove each capability, "What this means for AI agents" closing tied to a real workflow shift, then itemized Added/Changed/Fixed/For-contributors sections. Browse SKILL.md regenerated via bun run gen:skill-docs. gstack/llms.txt regenerated automatically from the same pipeline. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(browse): integration coverage for daemon mismatch + proxy fail-fast Adds two integration tests that exercise the full process boundary, not just the module-level wiring. daemon-mismatch-refuse.test.ts (D2): - Stubs a healthy state file with a fake configHash and a fake /health HTTP server, runs the actual cli.ts binary with a mismatching --proxy, asserts exit 1 + 'different config' / 'browse disconnect' hint in stderr. - Same shape with the plain-daemon-meets---headed case. - Positive case: matching configHash → CLI does NOT emit the mismatch hint (regardless of whether the actual command succeeds). server-proxy-fail-fast.test.ts: - Starts the rejecting SOCKS5 upstream, spawns server.ts with BROWSE_PROXY_URL pointing at it, BROWSE_HEADLESS_SKIP=1 to skip Chromium launch. - Asserts exit 1, 'FAIL upstream' in stderr (testUpstream pre-flight ran), no raw credential leakage in any output (redaction works on the failure path), and exit within 30s upper bound. Both tests use the existing spawn-bun-cli pattern from commands.test.ts so they run on the same CI infrastructure as the rest of the bun test suite. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(gen-skill-docs): keep module sync so test require() still works Two regressions caught by the full test suite after the v1.28.0.0 landing pass: 1) package.json version mismatch — VERSION was bumped to 1.28.0.0 but package.json still pinned to 1.27.1.0. test/gen-skill-docs.test.ts asserts they match. 2) Top-level await in scripts/gen-llms-txt.ts (CLI entry block) and scripts/gen-skill-docs.ts (post-step) made gen-skill-docs an async module. test/gen-skill-docs.test.ts uses require() to pull extractVoiceTriggers/processVoiceTriggers from gen-skill-docs, which Bun rejects on async modules with: "TypeError: require() async module ... unsupported. use 'await import()' instead." Fix: wrap the await blocks in void IIFEs so the modules remain sync from a require() perspective. After fix: all 379 gen-skill-docs tests pass, all 77 new feature tests pass (3 skipped on macOS — Linux+Xvfb gates). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(browse): apply codex adversarial findings on the new lifecycle Codex outside-voice review caught five real production-failure modes in the v1.28.0.0 proxy/headed lifecycle. Fixed: 1) `browse disconnect` skip-graceful for proxy-only daemons (browse/src/cli.ts). The graceful /command POST went out with stray `domains,` shorthand and (even fixed) the server's disconnect handler only tears down headed mode — proxy-only daemons returned 200 "Not in headed mode" while leaving the bridge running. Now disconnect short-circuits to force-cleanup for non-headed daemons, which kicks process.on('exit') in server.ts to close the bridge + Xvfb. 2) sendCommand crash retry preserves --proxy / --headed (browse/src/cli.ts). The ECONNRESET retry path called startServer() with no extraEnv, silently dropping the proxied flags. A daemon that died mid-command would silently restart in default direct/headless mode and bypass the SOCKS bridge. Now reapplies BROWSE_PROXY_URL, BROWSE_HEADED, and BROWSE_CONFIG_HASH from the resolved global flags. 3) `connect` honors --proxy (browse/src/cli.ts). The headed-mode `connect` command built its own serverEnv that didn't include BROWSE_PROXY_URL, so `browse --proxy <url> connect` launched headed Chromium without the proxy. Now threads proxyUrl + configHash into the connect serverEnv. 4) SOCKS5 bridge handles fragmented TCP frames (browse/src/socks-bridge.ts). Previously used once('data') and parsed each chunk as a complete SOCKS5 frame — TCP doesn't preserve message boundaries and split greetings/CONNECT requests caused intermittent handshake failures. Replaced with a single state machine that buffers chunks and uses size predicates on the SOCKS5 header to know when a complete frame has arrived. Pauses the client socket during upstream connect and replays any remainder bytes into the upstream on success. 5) Xvfb cleanup-then-state-delete ordering (browse/src/server.ts). emergencyCleanup() previously deleted the state file BEFORE any Xvfb cleanup could read it, orphaning Xvfb on uncaughtException / unhandledRejection. Now reads the state file first, calls cleanupXvfb() (which validates cmdline + start-time before kill), then deletes the state file. Adds a regression test for #4: writes the SOCKS5 greeting + CONNECT one byte at a time with 5ms ticks, asserts a clean round trip after the fragmented handshake. Codex's sixth finding (bridge advertises NO_AUTH on 127.0.0.1, so any co-located process can use the authenticated upstream) is documented as a known limitation — gstack's threat model assumes single-user hosts. Adding bridge-side auth is a separate change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: update BROWSER.md + TODOS.md for v1.28.0.0 BROWSER.md picks up a "Headed mode + proxy + browser-native downloads (v1.28.0.0)" subsection inside Real-browser mode plus the new source-map entries (socks-bridge.ts, proxy-config.ts, proxy-redact.ts, xvfb.ts, stealth.ts). TODOS.md anti-bot-stealth item updated to reflect the v1.28 narrowing — the "fake plugins" line is no longer accurate. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(ci): include bun.lock in image build for deterministic install CI evals all failed on PR #1363 with: error: Could not resolve: "smart-buffer". Maybe you need to "bun install"? error: Could not resolve: "ip-address". Maybe you need to "bun install"? at /opt/node_modules_cache/socks/build/client/socksclient.js:15 The cached node_modules layer in the pre-baked Docker image had `socks` (the new dep) but was missing its transitive deps (smart-buffer, ip-address). The image build copied only package.json into the build context — without bun.lock, `bun install` resolved a different tree than local `bun install` did, dropping required transitive deps. Reproduces locally as 229 packages (correct) when bun.lock is present or absent. Why CI diverged isn't fully understood — possibly Docker layer cache reuse across image rebuilds — but the deterministic fix is to include the lockfile in the image build context and use `--frozen-lockfile`, matching what every CI doc recommends. Changes: - .github/docker/Dockerfile.ci: COPY bun.lock alongside package.json, switch `bun install` → `bun install --frozen-lockfile` so any future lockfile drift fails loudly during image build instead of producing a partially-installed cache that breaks downstream eval jobs. - .github/workflows/evals.yml: include bun.lock in the image-tag hash so adding/removing a dep invalidates the image, AND copy bun.lock into the docker context alongside package.json. - .github/workflows/evals-periodic.yml: same updates. - .github/workflows/ci-image.yml: rebuild trigger now fires on bun.lock changes too; build context includes bun.lock. Image hash changes → fresh image gets built on next CI run → install matches the lockfile exactly → no missing transitive deps. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ci): use hardlink copy instead of symlink for node_modules cache After the bun.lock fix landed, the eval matrix STILL failed identically: Could not resolve: "smart-buffer" / "ip-address" at /opt/node_modules_cache/socks/build/client/socksclient.js But the hash-tagged image actually contains smart-buffer + ip-address + socks all flat in /opt/node_modules_cache (verified by pulling and inspecting the image). 207 packages, all present. Root cause: the workflow used `ln -s /opt/node_modules_cache node_modules` to restore deps. Bun build (and Node module resolution generally) walks a file's realpath to find sibling deps. From the symlinked /workspace/node_modules/socks/build/client/socksclient.js, realpath resolves to /opt/node_modules_cache/socks/build/client/socksclient.js, and walking up to find a node_modules/smart-buffer dir fails — there's no `node_modules` segment in the realpath. Switch `ln -s` → `cp -al` (hardlink-copy). Each file in the cache becomes a hardlink at /workspace/node_modules/<pkg>, sharing inodes (no data copy). Realpath of /workspace/node_modules/socks/.../socksclient.js stays inside /workspace/node_modules, so sibling deps resolve correctly. Speed is comparable to symlink — `cp -al` on ~200 packages on tmpfs is sub-second. Same caching story preserved. Both evals.yml and evals-periodic.yml updated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ci): cp -r instead of cp -al — /opt and /workspace are different filesystems The hardlink-copy fix landed and immediately broke with: cp: cannot create hard link 'node_modules/<file>' to '/opt/node_modules_cache/<file>': Invalid cross-device link GitHub Actions runners mount the workspace volume at /workspace (overlay-fs layered onto the runner image), and /opt is the runner image's own filesystem. Cross-filesystem hardlinks aren't supported. Switch `cp -al` → `cp -r`. Cost: ~5s for ~200 packages of small JS files vs ~0s for the broken symlink. Still cheaper than the ~15s `bun install` fallback. Realpath of /workspace/node_modules/<pkg>/... stays inside /workspace, so bun build's sibling-dep resolution works. Both evals.yml and evals-periodic.yml updated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v1.29.0.0 feat: worktree-aware gbrain code sources via path-hash IDs and CWD pin (#1382) * feat: worktree-aware gbrain code sources via path-hash IDs and CWD pin Conductor sibling worktrees of the same repo no longer collide on a shared gstack-code-<slug> source ID. /sync-gbrain now derives a path-hashed source ID per worktree, runs gbrain sources attach to write .gbrain-source in the worktree root, and removes the legacy unsuffixed source on first new-format sync to prevent orphan accumulation. Bug fixes surfaced by /codex during /ship: - Silent attach failure now treated as stage failure (no more ok:true while pin is missing → unqualified code-def hits wrong source). - Startup preamble checks .gbrain-source in the cwd worktree, not global state, so an unsynced worktree no longer claims "indexed" because a sibling synced. - Code stage no longer skipped on remote-MCP (Path 4); the early-exit was in the SKILL template, not the orchestrator. - Source registration routes through lib/gbrain-sources.ts only; deleted the near-duplicate ensureSourceRegisteredSync from the orchestrator. Requires gbrain v0.30.0+ (uses sources attach). Phase 0 spike report: ~/.gstack/projects/garrytan-gstack/2026-05-08-gbrain-split-engine-spike.md Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * chore: bump version and changelog (v1.29.0.0) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> * v1.30.0.0 fix wave: 21 community PRs + Windows CI extension + codex flag-semantics smoke (#1391) * fix(codex): use resume-compatible flags * fix: V-001 security vulnerability Automated security fix generated by Orbis Security AI * docs: align prompt-injection thresholds to security.ts (v1.6.4.0 catch-up) CLAUDE.md:290 and ARCHITECTURE.md:159 were missed when WARN was bumped 0.60 → 0.75 in d75402bb (v1.6.4.0, "cut Haiku classifier FP from 44% to 23%, gate now enforced", #1135). browse/src/security.ts:37 has WARN: 0.75 and BROWSER.md:743 was updated alongside that commit; CLAUDE.md and ARCHITECTURE.md still read 0.60. Also adds the SOLO_CONTENT_BLOCK: 0.92 entry to CLAUDE.md (already in security.ts:50 and BROWSER.md:745, missing from CLAUDE.md's threshold table). No code change. No behavior change. Pure doc-vs-code alignment. Verification: $ grep -n "WARN" browse/src/security.ts CLAUDE.md ARCHITECTURE.md BROWSER.md browse/src/security.ts:37: WARN: 0.75, CLAUDE.md:290: - \`WARN: 0.75\` ... ARCHITECTURE.md:159: ...>= \`WARN\` (0.75)... BROWSER.md:743: - \`WARN: 0.75\` ... Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: Korean/CJK IME input and rendering in Sidebar Terminal Fixes #1272 This commit addresses three separate Korean/CJK bugs in the Sidebar Terminal: **Bug 1 - IME Input**: Korean text typed via IME composition was not reaching the PTY correctly. Added compositionstart/compositionend event listeners to suppress partial jamo fragments and only send the final composed string. **Bug 2a - Font Rendering**: Added CJK monospace font fallbacks ("Noto Sans Mono CJK KR", "Malgun Gothic") to both the xterm.js fontFamily config and the CSS --font-mono variable. This ensures consistent cell-width calculations for Korean characters. **Bug 2b - UTF-8 Boundary Detection**: Added buffering logic to prevent multi-byte UTF-8 characters (Korean is 3 bytes) from being split across WebSocket chunks. This follows the same pattern as PR #1007 which fixed the sidebar-agent path, but extends it to the terminal-agent path. Special thanks to @ldybob for the excellent root cause analysis and proposed solutions in issue #1272. Tested on WSL2 + Windows 11 with Korean IME. * fix(ship): tighten Plan Completion gate (VAS-449 remediation) VAS-446 shipped with a PLAN.md acceptance criterion (domain-hq has /docs/dashboard.md) silently skipped. /ship's Plan Completion subagent existed at ship time (added in v1.4.1.0) but the gate let the failure through. Four structural fixes: 1. Path concreteness rule: items naming a concrete filesystem path MUST be classified DONE/NOT DONE via [ -f <path> ], never UNVERIFIABLE. 2. Validator detection: CONTENT-SHAPE items scan target repo's package.json for validate-* scripts and run them before falling back to UNVERIFIABLE. 3. Per-item UNVERIFIABLE confirmation: replaces blanket "I've checked each one" with per-item Y/N/D loop. The blanket-confirm path is the exact failure VAS-449 surfaced. 4. Subagent fail-closed: if Plan Completion subagent + inline fallback both fail, surface explicit AskUserQuestion instead of silent pass. Replaces the prior "Never block /ship on subagent failure" fail-open. Locked in by test/ship-plan-completion-invariants.test.ts (5 assertions, no LLM dependency, ~60ms). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(browse): bash.exe wrap for telemetry on Windows reportAttemptTelemetry() in browse/src/security.ts calls spawn(bin, args) where bin is the gstack-telemetry-log bash script. On Windows this fails silently with ENOENT — CreateProcess can't dispatch on shebang lines. Adopts v1.24.0.0's Bun.which + GSTACK_*_BIN override pattern (from browse/src/claude-bin.ts:resolveClaudeCommand, introduced in #1252) for resolving bash.exe. resolveBashBinary() honors GSTACK_BASH_BIN absolute-path or PATH-resolvable override, falling back to Bun.which('bash') which finds Git Bash on the standard Windows install. buildTelemetrySpawnCommand() wraps the script invocation on win32 only; POSIX path is bit-identical. Returns null when bash can't be resolved on Windows so caller skips spawn — local attempts.jsonl audit trail keeps working without surfacing a Windows-only failure. 8 new unit tests cover resolveBashBinary (POSIX bash, absolute override, quote-stripping, BASH_BIN fallback, empty-PATH null) and buildTelemetrySpawnCommand (POSIX pass-through, win32 bash wrap, win32 null on unresolvable, arg-array immutability). POSIX path is bit-identical — Bun.which('bash') on Linux/macOS returns the same /bin/bash or /usr/bin/bash that the old hardcoded spawn relied on. * fix(make-pdf): Bun.which-based binary resolution for browse + pdftotext on Windows Extends v1.24.0.0's Bun.which + GSTACK_*_BIN override pattern (introduced in browse/src/claude-bin.ts via #1252) to the two other binary resolvers in the codebase: make-pdf/src/browseClient.ts:resolveBrowseBin and make-pdf/src/pdftotext.ts:resolvePdftotext. Same Windows quirks (fs.accessSync(X_OK) degrades to existence-check; `which` isn't available outside Git Bash; bun --compile --outfile X emits X.exe), same Bun.which-based fix shape, same env override convention. Changes: - GSTACK_BROWSE_BIN / GSTACK_PDFTOTEXT_BIN as the v1.24-aligned overrides; BROWSE_BIN / PDFTOTEXT_BIN remain as back-compat aliases. - Bun.which() replaces execFileSync('which', ...) for PATH lookup. Handles Windows PATHEXT natively; no more `where`-vs-`which` branch. - findExecutable(base) helper exported from each module, probes .exe/.cmd/.bat after the bare-path miss on win32. Linux/macOS behavior is bit-identical (isExecutable short-circuits before the win32 branch ever runs). - macCandidates renamed posixCandidates (always was — /opt/homebrew, /usr/local, /usr/bin). No Windows candidates added; Poppler installs scatter across Scoop/Chocolatey/portable zips and guessing causes false positives. - Error messages get a Windows install hint (scoop install poppler / oschwartz10612) and `setx` example for GSTACK_*_BIN. - Pre-existing test 'honors BROWSE_BIN when it points at a real executable' was hardcoded /bin/sh — made cross-platform via a REAL_EXE constant (cmd.exe on win32, /bin/sh on POSIX). Was a Windows-CI blocker on its own. Coordination: PR #1094 (@BkashJEE) covered browseClient.ts independently with a narrower scope; this PR's pdftotext + cross-platform tests + GSTACK_*_BIN naming are additive. Either order of merge works. Test plan: - bun test make-pdf/test/browseClient.test.ts make-pdf/test/pdftotext.test.ts on win32 — 29 pass, 0 fail (12 new assertions: findExecutable POSIX/win32/null, resolveBrowseBin GSTACK_BROWSE_BIN + BROWSE_BIN + precedence + quote-strip, same shape for resolvePdftotext + Windows install hint in error message). - POSIX branch unchanged — fs.accessSync(X_OK) on Linux/macOS short-circuits before any win32 logic runs, matching the v1.24 claude-bin.ts pattern. * fix(browse): NTFS ACL hardening for Windows state files via icacls gstack's ~/.gstack/ state directory holds bearer tokens, canary tokens, agent queue contents (with prompt history), session state, security-decision logs, and saved cookie bundles — all written with { mode: 0o600 } / 0o700. On Windows, those mode bits are a silent no-op: Node's fs module doesn't translate POSIX modes to NTFS ACLs, and inherited ACLs leave every "restricted" file readable by other principals on the machine (verified via icacls — six ACEs, the intended user is the LAST of six). Threat model is non-trivial on: - Self-hosted CI runners (different service account on the same Windows box can read developer tokens, canary tokens, prompt history) - Shared development machines (agencies, studios, lab environments) - Multi-tenant servers with shared home directories Orthogonal to v1.24.0.0's binary-resolution work — complementary at the write side. v1.24's bin/gstack-paths resolves ~/.gstack/ correctly across plugin / global / local installs; this PR ensures files written into those resolved paths actually get the POSIX 0o600 semantic translated to NTFS. The fix: - New browse/src/file-permissions.ts (158 LOC, 5 public + 1 test-reset). restrictFilePermissions / restrictDirectoryPermissions wrap chmod (POSIX) or icacls /inheritance:r /grant:r <user>:(F) (Windows). writeSecureFile / appendSecureFile / mkdirSecure are drop-in wrappers for the common patterns. - 19 call sites converted across 9 source files: browser-manager.ts, browser-skill-write.ts, cli.ts, config.ts, meta-commands.ts, security-classifier.ts, security.ts (4 sites), server.ts (5 sites), terminal-agent.ts (8 sites), tunnel-denial-log.ts. - (OI)(CI) inheritance flags on directories mean files created via fs.write* *inside* an mkdirSecure-created dir inherit the owner-only ACL automatically — important for tunnel-denial-log.ts where appends use async fsp.appendFile. Error handling: icacls failures (nonexistent path, missing icacls.exe, hardened environments) log a one-shot warning to stderr and proceed. Once-per-process gating prevents log spam if the condition persists. Filesystem stays functional; the file just ends up with inherited ACLs. Test plan: - bun test browse/test/file-permissions.test.ts — 13 pass, 0 fail (POSIX mode-bit assertions, Windows no-throw, mkdir idempotence, recursive creation, Buffer payloads, append-creates-then-reapplies-once semantics) - bun test browse/test/security.test.ts — 38 pass, 0 fail (existing security test suite plus the bash-binary resolution tests added in fix #1119; the converted writeFileSync/appendFileSync/mkdirSync sites in security.ts integrate cleanly) - Empirical icacls before/after on a real file — 6 ACEs → 1 ACE - bun build typecheck on all modified files — clean (server.ts has a pre-existing playwright-core/electron resolution issue unrelated to this PR) POSIX behavior is bit-identical to old code — fs.chmodSync(path, 0o6XX) on the helper's POSIX branch matches the inline { mode: 0o6XX } it replaces. Linux and macOS see no behavior change. Inviting pushback on three judgment calls (in PR description): 1. icacls vs npm library 2. ACL scope — just user, or user + SYSTEM? 3. Graceful degradation — once-per-process warn, not silent, not hard-fail. * fix(browse): declare lastConsoleFlushed to restore console-log persistence flushBuffers() references a `lastConsoleFlushed` cursor at server.ts:337 and assigns it at :344, but the `let lastConsoleFlushed = 0;` declaration is missing — only the network and dialog siblings are declared at lines 327-328. Result: every 1-second flushBuffers tick (line 376) throws `ReferenceError: lastConsoleFlushed is not defined`, gets swallowed by the catch at line 369 ("[browse] Buffer flush failed: ..."), and the console branch's append never runs. browse-console.log is never written in any production deployment since this regressed. Discovered by stress-testing the daemon with 15 concurrent CLIs against cold state — the race surfaced the buffer-flush error spam in one spawned daemon's stderr. Verified by running the daemon against a real file:// page with console.log events: in-memory `browse console` returns the entries, but `.gstack/browse-console.log` is never created on disk. Regression introduced by 1a100a2a "fix: eliminate duplicate command sets in chain, improve flush perf and type safety" — the flush refactor switched from `Bun.write` to `fs.appendFileSync` and added the `lastConsoleFlushed` cursor pattern alongside its network/dialog siblings, but missed the matching `let` declaration. Tests don't currently exercise flushBuffers, so the regression shipped silently. Fix: - Declare `let lastConsoleFlushed = 0;` next to `lastNetworkFlushed` and `lastDialogFlushed` (browse/src/server.ts:327) - Add a source-level guard test (browse/test/server-flush-trackers.test.ts) that fails any future refactor that adds a fourth `last*Flushed` cursor without the matching declaration. Same pattern as terminal-agent.test.ts and dual-listener.test.ts — read source as text, assert invariant, no daemon required. Test plan: - [x] New regression test fails on current main, passes with the fix - [x] `bun run build` clean - [x] Manual smoke: spawn daemon -> goto file:// page with console.log -> wait 4s -> .gstack/browse-console.log now exists with the expected entries (163 bytes vs zero before) 🤖 Generated with [Claude Code](https://claude.com/claude-code) * fix(browse): per-process state-file temp path to fix concurrent-write ENOENT The daemon writes `.gstack/browse.json` via the standard atomic-rename pattern: `writeFileSync(tmp, …) → renameSync(tmp, stateFile)`. Four sites in server.ts use this pattern (initial daemon-startup state at :2002, /tunnel/start handler at :1479, BROWSE_TUNNEL=1 inline tunnel update at :2083, BROWSE_TUNNEL_LOCAL_ONLY=1 update at :2113), and all four hard-code the same temp filename `${stateFile}.tmp`. Under concurrent writers the shared filename races on the rename: t0 Writer A: writeFileSync(stateFile + '.tmp', payloadA) t1 Writer B: writeFileSync(stateFile + '.tmp', payloadB) // overwrites A t2 Writer A: renameSync(stateFile + '.tmp', stateFile) // moves B's payload t3 Writer B: renameSync(stateFile + '.tmp', stateFile) // ENOENT — file gone Reproduced empirically with 15 concurrent CLIs against a fresh `.gstack/`: [browse] Failed to start: ENOENT: no such file or directory, rename '…/.gstack/browse.json.tmp' -> '…/.gstack/browse.json' Pre-fix success rate: **0 / 15** under cold-start race. Post-fix success rate: **15 / 15**, zero ENOENT. Fix: - New `tmpStatePath()` helper (server.ts:333) returns `${stateFile}.tmp.${pid}.${randomBytes(4).toString('hex')}` - All 4 call sites use `tmpStatePath()` instead of the shared literal - Atomic rename still gives last-writer-wins semantics on the final state.json content; only behavior change is that concurrent writers no longer kill each other on the rename step Source-level guard test (browse/test/server-tmp-state-path.test.ts) locks two invariants: (1) no remaining `stateFile + '.tmp'` literals, (2) every state-write `writeFileSync` call uses `tmpStatePath()`. Same read-source-as-text pattern as terminal-agent.test.ts and dual-listener.test.ts — no daemon required, runs in tier-1 free. Test plan: - [x] Targeted source-level guard test passes (3 / 0) - [x] `bun run build` clean - [x] Live regression: 15 concurrent CLIs against cold state → 15 / 15 healthy, 0 ENOENT (vs 0 / 15 pre-fix) - [x] No `.tmp.*` orphans left behind after rename succeeds - [x] Related test cluster (server-auth, dual-listener, cdp-mutex, findport) — same pre-existing flakes as `main`, no new regressions introduced 🤖 Generated with [Claude Code](https://claude.com/claude-code) * fix(browse): clear refs when iframe auto-detaches in getActiveFrameOrPage Asymmetric cleanup between two equivalent staleness conditions: onMainFrameNavigated() → clearRefs() + activeFrame = null ✓ getActiveFrameOrPage() → activeFrame = null (refs NOT cleared) ✗ Both paths see the same staleness condition — refs were captured against a frame that no longer exists. The main-frame path correctly clears both pieces of state. The iframe-detach path nulls the frame but leaves the refMap intact. The lazy click-time check in `resolveRef` (tab-session.ts:97) partially saves us — `entry.locator.count()` on a detached-frame locator throws or returns 0, so the click errors out as "Ref X is stale". But the user has no signal that frame context silently changed underfoot: the next `snapshot` runs against `this.page` (main) while old iframe refs still litter `refMap` with the same role+name keys. New refs collide with stale ones, the resolver picks one at random, the user clicks the wrong element. TODOS.md line 816-820 documents "Detached frame auto-recovery" as a shipped iframe-support feature in v0.12.1.0. This restores the documented intent — the recovery should leave the session in a clean state, not a half-cleared one. Fix: 1 line — add `this.clearRefs()` next to `this.activeFrame = null` inside the if-branch. Test plan: - [x] New regression test: 4/4 pass - refs cleared when getActiveFrameOrPage detects detached iframe - refs preserved when active frame is still attached (no regression) - refs preserved when no frame set (page-level path untouched) - matches onMainFrameNavigated symmetry — both paths reach the same clean end state - [x] `bun run build` clean 🤖 Generated with [Claude Code](https://claude.com/claude-code) * fix(codex): resolve python for JSON parser * fix: add fail-fast probe for base branch in ship step 12 * fix(plan-devex-review): remove contradictory plan-mode handshake * fix(design): honor Retry-After header in variants 429 handler Closes #1244. The 429 handler in `generateVariant` discarded the `Retry-After` response header and fell straight through to a local exponential schedule (2s/4s/8s). In image-generation batches, that burns retry attempts inside the provider's cooldown window and the request never recovers. Now we parse `Retry-After` per RFC 7231 — both delta-seconds (`Retry-After: 5`) and HTTP-date (`Retry-After: Fri, 31 Dec 1999 23:59:59 GMT`). Honored waits are capped at 60s to bound stalls from hostile or buggy headers. Delta-seconds are validated as digits-only (rejects `2abc`). When `Retry-After` is honored (including 0 / past-date "retry now"), the next iteration's leading exponential sleep is skipped so we don't double-wait. Invalid or missing headers fall through to the existing exponential schedule unchanged. Behavior matrix: | Header | Behavior | |---------------------------------|-------------------------------------------| | Retry-After: 5 | wait 5s, skip leading on next attempt | | Retry-After: 999999 | capped to 60s, skip leading | | Retry-After: 2abc | invalid, fall through to exponential | | Retry-After: 0 | wait 0, skip leading (retry immediately) | | Retry-After: <past HTTP-date> | wait 0, skip leading | | Retry-After: <future date> | wait diff capped at 60s, skip leading | | no header | fall through to existing exponential | `generateVariant` now accepts an optional `fetchFn` parameter (defaults to `globalThis.fetch`) so tests can inject a stub. Production call sites are unchanged. Tests cover the five behavior buckets above, asserting both the 1st-to-2nd call timing gap and call counts. All five pass in ~8s. Co-Authored-…

* v1.41.1.0 fix wave: 7 HIGH bugs from external audit + regression tests (PR #1169 follow-up) (#1592) * fix(build-app): escape sed replacement metachars in Chromium rebrand build-app.sh injects \$APP_NAME directly into the replacement half of sed's s/// when patching Chromium's localized InfoPlist.strings. If \$APP_NAME ever carries '/', '&', or '\\' — the command either breaks or starts interpreting input as sed syntax. The trailing '|| true' would then silently hide the failure and ship a DMG that still says 'Google Chrome for Testing' in the menu bar. Escape replacement metachars before substitution. No change for the default name 'GStack Browser'. * fix(build-app): bail out if 'mktemp -d' fails instead of cp-ing into '/' The DMG creation step sets DMG_TMP from 'mktemp -d' with no error check. If mktemp fails (tmpfs full, permissions, TMPDIR misconfigured), DMG_TMP is empty and the very next line — 'cp -a "\$APP_DIR" "\$DMG_TMP/"' — expands to 'cp -a "<app>" "/"', which copies the bundle into the root of the filesystem. Refuse to continue unless mktemp produced a real directory. Defensive second check catches the (rare) case where mktemp succeeds but returns something that isn't a directory we can cp into. * fix(telemetry-sync): drop predictable $$ tmp-file fallback gstack-telemetry-sync tried 'mktemp /tmp/gstack-sync-XXXXXX' and on failure fell back to '/tmp/gstack-sync-$$'. $$ is the PID — predictable and reusable, so on shared hosts another user can pre-create or symlink the path and either steal the response body or clobber an unrelated file when curl writes through it. Drop the fallback. If mktemp cannot produce a unique file we just skip this sync cycle — the events stay on disk and the next run picks them up. Also install an EXIT trap so the response file is cleaned up on unexpected exit, not just on the happy path. * fix(verify-rls): drop predictable $$-based tmp file fallback Same shape as gstack-telemetry-sync: on mktemp failure the script fell back to '/tmp/verify-rls-$$-$TOTAL', which is fully predictable from the PID and a per-check counter. On a shared box another user can pre-create or symlink the path and either capture the HTTP response body (which may leak what the RLS tests revealed) or corrupt an unrelated file that curl writes through. Make mktemp strict. On failure return from the check function; the caller tallies a FAIL and the run moves on. * fix(security-classifier): close writer + delete tmp on download error downloadFile() opens an fs.WriteStream to '<dest>.tmp.<pid>' and drives it from a fetch body reader, but if reader.read() or writer.write() throws mid-download the writer is never closed. That leaks an FD per failed attempt and leaves the half-written tmp on disk. A later retry can land in renameSync(tmp, dest) with a truncated TestSavantAI / DeBERTa ONNX file — which then loads but produces garbage classifier verdicts until the user manually nukes the models cache. Wrap the download loop in try/catch. On failure, destroy() the writer and unlink the tmp before rethrowing, so the next attempt starts from a clean slate. * fix(meta-commands): guard JSON.parse in pdf --from-file parser parsePdfFromFile() runs JSON.parse on user-supplied file contents with no try/catch. A malformed payload surfaces as an uncaught SyntaxError from the 'pdf' command handler and the user sees an opaque stack trace instead of "this file isn't valid JSON". Worse, the same call path is used by make-pdf when header/footer HTML would overflow Windows' CreateProcess argv cap, so a corrupt payload file there can take down the make-pdf run. Wrap JSON.parse. Re-throw with a message that names the offending file and echoes the parser's own explanation. Also reject top-level non- objects (null, array, primitive) since the rest of the function treats json as an object — catching that here produces a clear error instead of a TypeError further down. * fix(global-discover): stop dropping sessions when header >8KB extractCwdFromJsonl() reads the first 8KB of each JSONL session file and runs JSON.parse on every newline-split line. When a session record happens to straddle the 8KB cap, the last line ends in a truncated JSON fragment, JSON.parse throws, the catch block 'continue's silently, and if that was the only line carrying 'cwd' the whole project gets dropped from the discovery output without a warning. Two independent hardening steps: 1. Raise the read cap to 64KB. Session headers observed in Claude Code / Codex / Gemini transcripts fit comfortably; this just moves the cliff out of the normal range. 2. Drop the final segment after splitting on '\\n'. If the read hit the cap mid-line, that segment is guaranteed incomplete; if the file ended inside the buffer, the split produces an empty final segment and dropping it is a no-op. Together these make the parser robust regardless of how verbose the leading records are. * test: export downloadFile, parsePdfFromFile, extractCwdFromJsonl These three internal helpers are now imported by regression tests landing in the next commits (PR #1169 follow-up). Pattern matches the existing normalizeRemoteUrl export in gstack-global-discover.ts which test/global-discover.test.ts already imports side-effect-free. No change to runtime behavior; gstack has no public package entrypoint that would re-export these, so the in-repo surface is unchanged for callers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(security-classifier): await writer close before unlinking tmp on error The earlier downloadFile() error-path cleanup hit a race: Node's createWriteStream lazily opens the FD and flushes buffered writes during destroy(), so a naive `fs.unlinkSync(tmp)` immediately after `writer.destroy()` hits ENOENT (file not yet on disk), then the writer's destroy finishes on the next tick and creates the file fresh — leaving the half-written tmp behind exactly as the original fix tried to prevent. The new sequence awaits the writer's 'close' event before unlinking, so the FD is fully torn down and no subsequent flush can re-create the path. Caught by browse/test/security-classifier-download-cleanup.test.ts in the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(browse): regression tests for downloadFile cleanup + parsePdfFromFile guard Covers PR #1169 bugs #6 and #7: - security-classifier-download-cleanup.test.ts pins downloadFile error-path cleanup against three failure shapes: reader rejects mid-stream, non-2xx response, missing body. Asserts the dest file is not created and no <dest>.tmp.* siblings remain (glob-matched, not exact path — codex push: if the fix later switches to mkdtempSync, the assertion still holds). Includes a happy-path case so the cleanup isn't fighting a correct download. - regression-pr1169-pdf-from-file-invalid-json.test.ts pins parsePdfFromFile to throw a helpful error for: invalid JSON, empty file, top-level array, top-level number, top-level string, top-level null, top-level boolean. Codex push: JSON.parse accepts primitives too, so Array.isArray + typeof guard must be tested separately from the JSON.parse try/catch. Both files use mkdtempSync(process.cwd()/...) for fixture isolation since SAFE_DIRECTORIES allows TEMP_DIR or cwd; cwd is universal across CI hosts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(global-discover): regression for extractCwdFromJsonl 64KB cap PR #1169 bug #8: the 8KB read cap landed mid-line on Claude Code session headers, JSON.parse threw on the truncated tail, the catch silently continued, and the project disappeared from /gstack discovery output. Six new cases under describe("extractCwdFromJsonl 64KB cap"): - happy path: small JSONL with obj.cwd returns it - 12KB first line with obj.cwd: returns cwd (the bug case) - 80KB single line overflowing 64KB: returns null without crashing - complete line followed by partial second line: trailing-partial-drop must not poison the result; returns first line's cwd - missing file: returns null (file read error swallowed) - malformed first line + valid second line within cap: skips bad, returns second's cwd Tests use the exported extractCwdFromJsonl (added in earlier export commit) and live in a separate describe block from the existing "4KB / 128KB buffer" tests, which exercise the unrelated scanCodex meta.payload.cwd path at L338 — different function, different bug. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: regression tests for shell-script bugs in PR #1169 (#2-#5) Two new test files pinning the four shell-script invariants from the external audit: regression-pr1169-build-app-sed.test.ts — bugs #2 + #3 - Runtime isolation: extracts the sed-escape sequence from build-app.sh and runs it against hostile $APP_NAME values ("Foo/Bar&Baz", "Cool\App", "A/B\C&D"). Asserts the literal hostile name round-trips through a real `sed s///` invocation, locking the metachar safety end-to-end. - Static check: the rebrand block must contain both the escape line AND the sed line referencing $APP_NAME_SED_ESCAPED; bare $APP_NAME interpolation directly into the s/// replacement is rejected. - Static check: DMG_TMP=$(mktemp -d) is followed by an explicit `|| { ... exit }` failure handler AND a `[ -z "$DMG_TMP" ] || [ ! -d "$DMG_TMP" ]` validation AND the cp -a appears AFTER both guards. - Runtime fake-bin: extracts the guard shape, runs with a fake mktemp that exits 1, asserts the script exits non-zero before any cp block can reach. regression-pr1169-mktemp-fallbacks.test.ts — bugs #4 + #5 - Per codex pushback, the invariant is "no `mktemp ... || echo <path>` fallback shape" — not just "no $$ token." That's a stronger invariant that catches future swaps to $RANDOM or hardcoded paths. - For each of bin/gstack-telemetry-sync and supabase/verify-rls.sh: - no echo-based fallback after mktemp - no $$ inside any /tmp path literal - mktemp failure path explicitly exits / returns non-zero - telemetry-sync also pins the `trap rm -f $RESP_FILE EXIT` cleanup so success paths don't leak the tmp on normal exit. All seven new test files are gate-tier (deterministic, sub-second, no LLM, no network). Runtime shell tests use fake-bin PATH stubs in temp dirs; no $HOME mutation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v1.41.1.0) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: RagavRida <ragavrida@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v1.42.0.0 Daegu wave: 23 community-filed bugs + PTY classifier enforcement (24 bisect commits) (#1594) * fix(gstack-paths): guard CLAUDE_PLUGIN_DATA against cross-plugin contamination (#1569) gstack-paths previously trusted CLAUDE_PLUGIN_DATA as a fallback for GSTACK_STATE_ROOT whenever GSTACK_HOME was unset. When another plugin (e.g. Codex) persists its own CLAUDE_PLUGIN_DATA into the session env via CLAUDE_ENV_FILE, gstack picked it up and wrote checkpoints, analytics, and learnings into that plugin's directory. Anyone with the Codex plugin installed alongside gstack hit this silently. Fix: guard the CLAUDE_PLUGIN_DATA branch so it only fires when CLAUDE_PLUGIN_ROOT confirms we're running as the gstack plugin (path contains "gstack"). Skill installs fall through to \$HOME/.gstack. Contributed by @ElliotDrel via #1570. Closes #1569. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(gbrain-sync): sourceLocalPath handles wrapped {sources:[...]} shape from gbrain v0.20+ gbrain v0.20+ changed `gbrain sources list --json` to return {sources: [...]} instead of a flat array. sourceLocalPath crashed upstream with `list.find is not a function` on every /sync-gbrain invocation against modern gbrain. Accept both shapes for forward/backward compat, matching probeSource/sourcePageCount in lib/gbrain-sources.ts. Contributed by @jakehann11 via #1571. Closes #1567. Supersedes #1564 (@tonyjzhou, same fix, different shape — credit retained). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(brain-context-load): probe gbrain via execFile, not shell builtin (#1559) gbrainAvailable() used `execFileSync("command", ["-v", "gbrain"])`, which fails in any environment where the `command` builtin isn't on the spawned process's PATH (most non-interactive shells). The probe then reported gbrain as missing even when it was installed, and context-load silently skipped vector/list queries. Fix: probe `gbrain --version` directly with a 500ms timeout (matching the rest of the file's MCP_TIMEOUT_MS). Same semantics, works everywhere execFile works. Contributed by @jbetala7 via #1560. Closes #1559. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(gbrain-doctor): pin schema_version:2 doctor parse path (#1418) Adds an exec-path regression test that runs a fake gbrain shim emitting the v0.25+ doctor JSON shape (schema_version: 2, status: "warnings", exit 1 for health_score < 100, no top-level `engine` field). Confirms freshDetectEngineTier recovers stdout from the non-zero exit and falls back to GBRAIN_HOME/config.json for the engine label. The pre-existing test for #1415 only stripped gbrain from PATH; this test exercises the actual doctor parse path, closing the gap that codex's plan review flagged. Also documents the schema_version separation in lib/gbrain-local-status.ts: the local CacheEntry stays at version 1, distinct from the doctor-output schema_version which we accept across versions in gstack-memory-helpers. Closes #1418 (credit @mvanhorn for surfacing the doctor + schema_v2 collapse). The fix landed pre-emptively in v1.29.x; this commit pins it with a stronger test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(memory-ingest): pin put_page regression + scrub stale name from --help and comments (#1346) #1346 reported that gstack-memory-ingest still called the renamed gbrain put_page subcommand on gbrain v0.18+. The actual code migrated to `gbrain put` and later to batch `gbrain import <dir>` before this report landed — only documentation lag remained. This commit: - Updates the --help string ("Skip gbrain put calls (still updates state file)") so user-facing docs match the shipped subcommand - Updates two inline comments that still referenced the old name - Adds test/memory-ingest-no-put_page.test.ts: a regression pin that strips comments from bin/gstack-memory-ingest.ts and fails the build if "put_page" appears in any active code or string literal, plus a sanity check that the file still calls a supported gbrain page-write verb (put or import) Closes #1346. Reporter @kylma-code surfaced the doc lag; the original code migration credit is on the v1.27.x wave. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(resolvers): rewrite all gbrain put_page instructions to canonical put <slug> scripts/resolvers/gbrain.ts emitted user-facing copy-paste instructions using the renamed `gbrain put_page` subcommand across 10 skills (office-hours, investigate, plan-ceo-review, retro, plan-eng-review, ship, cso, design-consultation, fallback, entity-stub). Every gstack user copying those snippets hit "unknown command: put_page" on gbrain v0.18+. This commit: - Rewrites all 10 instruction templates to use `gbrain put <slug> --content "$(cat <<EOF...EOF)"` with title/tags moved into YAML frontmatter inside --content, matching the v0.18+ subcommand shape - Updates README.md and USING_GBRAIN_WITH_GSTACK.md "common commands" table to reference `gbrain put` and `gbrain get` - Adds test/resolvers-gbrain-put-rewrite.test.ts pinning two invariants: (a) resolver source ships only canonical instructions, (b) every tracked SKILL.md file is free of `gbrain put_page` CHANGELOG entries are deliberately left untouched (historical record). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(build): extract package.json build to scripts/build.sh for Windows Bun compat (#1538, #1537, #1530, #1457, #1561) Bun's Windows shell parser rejects multiple constructs the inline package.json build chain used: brace groups `{ cmd; }`, subshells with redirection `( git ... ) > path/.version`, and (in Bun 1.3.x) subshells near redirections in general. Every Windows install + every auto-upgrade since v1.34.2.0 has failed on `bun run build`. Extracts the build chain to scripts/build.sh and the .version writes to scripts/write-version-files.sh. POSIX-portable, no Bun shell parsing involved. Also adds Windows-specific bun.exe handling for non-ASCII PATHs (a separate Windows footgun where Bun's --compile fails when the binary lives under a path with non-ASCII chars). Updates test/build-script-shell-compat.test.ts to assert the new shape: no subshells with redirections anywhere in the build chain, and build delegates to scripts/build.sh which delegates .version writes. Contributed by @Charlie-El via #1544. Supersedes #1531 (@scarson, fixed in build helper), #1480 (@mikepsinn, partial overlap), #1460 (@realcarsonterry, brace-group fix subsumed) — credit retained. Closes #1538, #1537, #1530, #1457, #1561. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(windows): .exe glob in .gitignore + .exe extension resolution in find-browse (#1554) bun build --compile on Windows appends .exe to the output filename, producing browse.exe instead of browse. find-browse's existsSync probe only checked the bare path and returned null on Windows even when the binary was correctly built. .gitignore similarly only excluded the bare bin/gstack-global-discover path, leaving the .exe variant tracked. This commit: - .gitignore: changes `bin/gstack-global-discover` → `bin/gstack-global-discover*` so the Windows .exe variant is ignored - browse/src/find-browse.ts: adds isExecutable + findExecutable helpers that fall back to .exe/.cmd/.bat probing on Windows, mirroring the same helper already in make-pdf/src/browseClient.ts and pdftotext.ts Contributed by @Mike-E-Log via #1554. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * ci(windows): add fresh-install E2E gate that runs bun run build on windows-latest Adds .github/workflows/windows-setup-e2e.yml as the gate that catches Bun shell-parser regressions in the build chain before they reach users. Triggers on PRs touching package.json, scripts/build.sh, scripts/write-version-files.sh, setup, browse cli/find-browse, or gstack-paths. What it verifies: 1. bun run build completes on Windows (the previously-broken path that #1538/#1537/#1530/#1457/#1561 reported) 2. All compiled binaries land on disk (browse.exe, find-browse.exe, design.exe, gstack-global-discover.exe) 3. find-browse resolves to the .exe variant on Windows (regression gate for #1554) 4. gstack-paths returns non-empty GSTACK_STATE_ROOT/PLAN_ROOT/TMP_ROOT on Windows (regression gate for #1570) Complements the existing windows-free-tests.yml (curated unit subset); this new workflow exercises the install path itself. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(codex): move diff scope into prompt instead of --base (Codex CLI 0.130+ argv conflict) (#1209) Codex CLI ≥ 0.130.0 rejects passing a custom prompt and --base together (mutually exclusive at argv level). Every /codex review, /review, and /ship structured Codex review call ended with an argv error before the model ran. Fix: scope the diff in prompt text using "Run git diff origin/<base>...HEAD 2>/dev/null || git diff <base>...HEAD" instead of `--base <base>`. Preserves the filesystem boundary instruction across all invocations and keeps Codex's review prompt tuning. Touches: - codex/SKILL.md.tmpl + regenerated codex/SKILL.md - scripts/resolvers/review.ts + regenerated review/SKILL.md, ship/SKILL.md - test/gen-skill-docs.test.ts: new regression that fails if any of the five known files still contain the prompt+--base shape - test/skill-validation.test.ts: corresponding negative + positive pin on the rendered SKILL.md files Contributed by @jbetala7 via #1209. Closes #1479. Supersedes #1527 (@mvanhorn — same intent, different patch shape, CONFLICTING) and #1449 (@Gujiassh — broader refactor, CONFLICTING). Credit retained in CHANGELOG. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(review): diff from git merge-base, not git diff origin/<base> (#1492) git diff origin/<base> shows everything since the common ancestor in both directions — it includes commits that landed on origin/<base> after this branch was created as deletions. That made /review and /ship's pre-landing structured review report inflated diff totals and flagged "removed" code that was actually still present in the working tree. Fix: compute DIFF_BASE via git merge-base origin/<base> HEAD and diff the working tree against that point. Same coverage of uncommitted edits, no phantom deletions from out-of-order base advancement. Applies to /review's Step 1 (diff existence check), Step 3 (get the diff), the build-on-intent scope-creep check, the structured review DIFF_INS/DIFF_DEL stats, and the Claude adversarial subagent prompt. Same change flows into ship/SKILL.md via the shared resolver. Touches: - review/SKILL.md.tmpl + regenerated review/SKILL.md, ship/SKILL.md - scripts/resolvers/review.ts - scripts/resolvers/review-army.ts Contributed by @mvanhorn via #1492. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(codex): pin filesystem-boundary preservation across all codex review surfaces (#1503, #1522) #1503 reported that the bare codex review --base path stripped the filesystem boundary instruction, letting Codex spend tokens reading .claude/skills/ and agents/. #1522 proposed adding a skill-path detector that switched to the custom-instructions route when the diff touched skill files. After C10 (#1209) restructured codex review to always carry the boundary in the prompt (the prompt+--base argv conflict forced the restructure), the skill-path detector becomes redundant — every default call already preserves the boundary. This commit pins the post-#1209 invariant with a test that fails the build if any future refactor strips the boundary from codex/SKILL.md, review/SKILL.md, or ship/SKILL.md. Closes #1503 by regression test. #1522 (@genisis0x) is superseded by #1209 (the prompt rewrite covers its safety concern); credit retained in CHANGELOG. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(skills): use command -v instead of which for codex detection (#1197) `which` is not on PATH in every shell — some Windows shells, BusyBox- only containers, and minimal CI images all fail when skills probe codex availability via `which codex`. `command -v` is a POSIX builtin and always available where the skill is running. Touched: - codex/SKILL.md.tmpl: CODEX_BIN=$(command -v codex || echo "") - scripts/resolvers/review.ts and scripts/resolvers/design.ts: 3 + 3 sites each rewritten to `command -v codex >/dev/null 2>&1` - Regenerated all 10 affected SKILL.md files (codex, review, ship, design-consultation, design-review, office-hours, plan-ceo-review, plan-design-review, plan-devex-review, plan-eng-review) - test/skill-validation.test.ts: updated pin + defensive regression test that fails if `which codex` returns to codex/SKILL.md - test/skill-e2e-plan.test.ts: updated summary regex Contributed by @mvanhorn via #1197. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(codex): surface non-zero exits so wrappers stop reading as silent stalls (#1467, #1327) When codex exits non-zero (parse errors, arg-shape breaks, model API errors that propagate as non-zero status), the calling agent previously saw an empty output and burned 30-60 minutes misdiagnosing as a silent model/API stall. The hang-detection block only caught exit 124 (the timeout-wrapper signal). Adds elif blocks in all four codex invocation sites (Review default, Challenge, Consult new-session, Consult resume) that: - Echo "[codex exit N] <stderr first line>" to stdout - Indent the first 20 stderr lines for inline context - Log codex_nonzero_exit telemetry tagged with the call site Contributed by @genisis0x via #1467. Closes #1327. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(design): disclose OpenAI key source + warn on cwd .env match (#1278, closes #1248) The design binary previously called process.env.OPENAI_API_KEY without checking where the key came from. If a user ran $D inside someone else's project that had OPENAI_API_KEY in its .env, the resulting generation billed that project's account. Silent and irreversible. Fix: resolveApiKeyInfo() returns both the key and its source. When the env-var path matches an OPENAI_API_KEY entry in the current directory's .env, .env.<NODE_ENV>, or .env.local file, we set a warning. requireApiKey() prints "Using OpenAI key from <source>" plus the warning before the run — never the key itself. Adds 6 unit tests covering: config-vs-env precedence, env-only (no match), env+cwd .env match, quoted/exported values, value-mismatch (no false positive), and the no-leak invariant for requireApiKey stderr output. Contributed by @jbetala7 via #1278. Closes #1248. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(browse): guard full-page screenshots against Anthropic vision API >2000px brick (#1214) Full-page screenshots of tall pages routinely exceeded 2000px on the longest dimension, silently bricking the agent's session: the resulting base64 reached the Anthropic vision API which rejected the oversized image, leaving the agent burning turns on a useless blob with no stderr trace from the browse side. Adds browse/src/screenshot-size-guard.ts as a shared helper: - guardScreenshotBuffer(buf) → downscales in-memory if max(w,h) > 2000 - guardScreenshotPath(path) → file-mode variant that rewrites in place - Aspect ratio preserved via sharp's resize fit:inside - Stderr diagnostic on any downscale so callers can see when it fired - Lazy sharp import so non-screenshot paths pay no startup cost Wires the guard into all three full-page callsites codex review flagged: - browse/src/snapshot.ts: annotated + heatmap fullPage captures - browse/src/meta-commands.ts: screenshot command (path + base64 fullPage modes) plus the responsive 3-viewport sweep - browse/src/write-commands.ts: prettyscreenshot fullPage path Covers seven unit cases (pass-through, downscale, aspect ratio, exactly-2000px edge, file-mode rewrite) plus a static invariant test that fails the build if any of the three callsites stops importing the guard. Closes #1214. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(security): add Node sidecar entry for L4 prompt-injection classifier (#1370) The L4 TestSavant classifier in browse/src/security-classifier.ts can't be imported into the compiled browse server (onnxruntime-node dlopen fails from Bun's compile extract dir per CLAUDE.md). The agent that used to host it (sidebar-agent.ts) was removed when the PTY proved out — leaving the classifier file shipped but with zero callers. Exactly the gap codex flagged in #1370. Adds browse/src/security-sidecar-entry.ts: a Node script that runs the classifier as a subprocess of the browse server. It reads NDJSON requests from stdin and writes id-correlated NDJSON responses to stdout, supporting: - op: "scan-page-content" — full L4 classifier scan - op: "ping" — liveness probe for the client's health check - op: "status" — classifier readiness (used by /pty-inject-scan to surface l4 { available: bool } in its response) Plus browse/src/find-security-sidecar.ts: a resolver that locates node + the bundled JS entry (browse/dist/security-sidecar.js, built in a follow-up package.json change) or falls back to the dev TS entry. Returns null cleanly when node isn't on PATH so the calling endpoint can degrade per D7 (extension WARN + user confirm). C17 of the security-stack wave. C18 adds the IPC client + lifecycle management; C19 wires the endpoint; C20 routes the extension through it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(security): sidecar IPC client with lifecycle + circuit breaker (#1370) Adds browse/src/security-sidecar-client.ts to manage the Node L4 classifier subprocess from the compiled browse server: - Lazy spawn on first scan; reuses the same process across requests - Id-correlated request/response via NDJSON over stdio - 5s default per-scan timeout; 64KB payload cap (short-circuits before spawn so oversized requests don't waste a process) - 3-in-10-minutes respawn cap → trips circuit breaker; subsequent scans throw immediately so the /pty-inject-scan endpoint can surface l4 { available: false } to the extension and degrade to WARN+confirm - process.on('exit') sends SIGTERM to the child for clean teardown - isSidecarAvailable() lets the endpoint probe before scan calls so the response shape reflects degraded mode honestly Unit tests cover the payload cap, the availability probe, and the breaker-doesn't-crash invariant under repeated rejected calls. C18 of the security-stack wave. C19 adds POST /pty-inject-scan; C20 routes the extension through it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(security): add POST /pty-inject-scan endpoint for pre-PTY-inject scans (#1370) The sidebar's gstackInjectToTerminal callers (toolbar Cleanup, Inspector "Send to Code") were piping page-derived text directly into the live claude PTY with ZERO classifier processing — the gap codex flagged in #1370. The documented sidebar security stack had a hole the size of every Cleanup-button click. Adds POST /pty-inject-scan to browse/src/server.ts: - Local-only binding (NOT in TUNNEL_PATHS — tunnel attempts get the general 404 path; never reaches the scan logic) - Root-token auth via existing validateAuth() — 401 on unauth - 64KB request cap → 413 + payload-too-large body - 5s scan timeout via sidecar client - URL-blocklist forced to BLOCK in PTY context (page-derived REPL input is higher-risk than ordinary tool output) - L4 ML classifier via the sidecar when available; degrades to WARN per D7 when sidecar is unavailable - Response goes through JSON.stringify(..., sanitizeReplacer) per v1.38.0.0 Unicode-egress hardening - Imports only from security-sidecar-client.ts, never directly from security-classifier.ts (which would brick the compiled Bun binary) Seven static-invariant tests pin the POST verb, auth gate, 64KB cap, tunnel-listener exclusion, sanitizeReplacer wrapping, l4 availability shape, and the no-direct-classifier-import rule. C19 of the security-stack wave. C20 routes the extension through it; C21 adds the invariant AST check. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(extension): route gstackInjectToTerminal through /pty-inject-scan (#1370) Closes the documented-vs-shipped gap codex flagged in #1370. The sidebar's two PTY-injection call sites (Inspector "Send to Code" and toolbar Cleanup) now pre-scan via the new /pty-inject-scan endpoint before writing to the live claude REPL. Adds window.gstackScanForPTYInject(text, origin) to extension/sidepanel-terminal.js: - Async, returns { allow, verdict, reasons, l4 } - POST to /pty-inject-scan with the existing root-token auth - WARN+confirm on scan failure (network down, sidecar absent, etc.) rather than silent PASS — D7 honest-degradation gstackInjectToTerminal stays synchronous, returns boolean. Per D6: keeping the inject sync means existing `const ok = ...?.()` callers don't break, and the invariant test in test/extension-pty-inject-invariant.test.ts can statically pin that every call goes through the scan first. extension/sidepanel.js call sites updated: - inspectorSendBtn click → await scan, BLOCK drops + WARN prompts via window.confirm, PASS injects silently - runCleanup() → same flow. Static cleanup prompt always PASSes but still routes through scan to honor the invariant. C20 of the security-stack wave. C21 adds the static invariant test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(security): invariant — extension PTY inject must be scan-gated (#1370) Static-analysis invariant test that fails the build if any extension/*.js path calls window.gstackInjectToTerminal without a preceding window.gstackScanForPTYInject in the same enclosing function. Closes the documented-vs-shipped gap codex demanded a machine check on. Rules: - Rule 1: any file that calls inject must also reference scan - Rule 2: in the enclosing function (function declaration, arrow, async (), event handler), a scan call must appear before the inject call by source position - Exemption: sidepanel-terminal.js (the file that DEFINES the inject function) is exempt from Rule 2 since the definition is not a call Plus two structural checks: - sidepanel-terminal.js defines both the inject and scan functions - inject stays SYNCHRONOUS (no `async` modifier) per D6 — async would silently break the `const ok = ...?.()` pattern at every caller C21 of the security-stack wave. The sidecar architecture (#1370) is complete: server-side L1-L3 + L4-via-sidecar (C17+C18+C19), extension pre-scan wiring (C20), and now the regression gate (C21). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(browse): opt-in extended stealth mode with 6 detection-vector patches (#1112) Rebases @garrytan's PR #1112 (Apr 2026, abandoned) onto the current browse/src/stealth.ts contract. The existing minimal "codex narrowed" stealth (webdriver-mask + AutomationControlled launch arg) stays the default. PR #1112's six additional patches are added behind an opt-in GSTACK_STEALTH=extended env flag. Extended-mode patches (applied AFTER the default mask, in order): 1. delete navigator.webdriver from prototype (not just the getter — detectors check `"webdriver" in navigator`) 2. WebGL renderer spoof to Apple M1 Pro (SwiftShader was the #1 software-GPU tell in containers) 3. navigator.plugins returns a PluginArray-prototype-passing array with MimeType objects and namedItem() 4. window.chrome populated with chrome.app, chrome.runtime, chrome.loadTimes(), chrome.csi() with realistic shapes 5. navigator.mediaDevices backfilled when headless drops it 6. CDP cdc_*-prefixed window globals cleared Why opt-in: the default mode's contract is fingerprint CONSISTENCY, which protects against detectors that flag spoofing mismatch. Extended mode actively lies about the environment; sites that reflect on these properties can break. Users who hit detection in default mode can flip GSTACK_STEALTH=extended for SannySoft 100% pass-rate. Twenty unit tests pin the env-flag semantics, all six patches' code presence, and the applyStealth wiring order. Live SannySoft pass-rate verification stays in the periodic-tier E2E suite. Contributed by @garrytan via #1112 (rebased — original PR opened before the codex-narrowed minimum landed; rebase preserves the narrowed default while adding the SannySoft-passing path as opt-in). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(fixtures): regenerate ship-SKILL.md golden baselines after C10-C13 + C16 templates Updates the three ship-SKILL.md golden baselines (claude, codex, factory hosts) to match the new shape produced by: - C10 #1209 codex argv (prompt + diff scope, no --base) - C11 #1492 merge-base diff (DIFF_BASE= preamble) - C13 #1197 command -v for codex detection - C12 + boundary preservation per regen-enforcing test Per CLAUDE.md SKILL.md workflow: edit the .tmpl, run gen:skill-docs, commit the regenerated outputs together. Goldens are part of the regen contract — without this commit, test/host-config.test.ts' golden-baseline checks fail with the diff codex review surfaced. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(release): v1.41.0.0 — Daegu wave (24 bisect commits, 14 user-facing fixes) Bumps VERSION 1.40.0.0 → 1.41.0.0. CHANGELOG entry follows the release-summary format in CLAUDE.md: two-line headline, lead paragraph, "The numbers that matter" table, "What this means for builders" closer, then itemized Added/Changed/Fixed/For contributors with inline credit to every PR author and original issue reporter. Scale-aware bump per CLAUDE.md: 24 commits, ~6000 LOC net, substantial new capability across security (PTY sidecar wiring), install (Windows build chain), compat (gbrain 0.18-0.35, Codex CLI 0.130+), and quality (screenshot guard, design key disclosure, extended stealth opt-in). MINOR is the right call. Closes for users: #1567, #1559, #1569, #1346, #1418, #1538, #1537, #1530, #1457, #1561, #1554, #1479, #1503, #1248, #1214, #1370, #1327, #1193 pattern, #1152 pattern. Credit retained inline. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(find-browse): resolve source-checkout layout <git-root>/browse/dist/browse[.exe] windows-setup-e2e.yml runs `bun browse/src/find-browse.ts` against a freshly-built repo where binaries land at browse/dist/browse.exe (no .claude/skills/gstack/ install layout). The previous markers chain only matched .codex/.agents/.claude prefixed paths, so find-browse exited "not found" even when the binary was present. Adds a source-checkout fallback after the marker scan: if no installed layout resolves but <git-root>/browse/dist/browse[.exe] exists, return that. Three real callers hit this path: - gstack repo dev workflow before `./setup` runs - windows-setup-e2e.yml CI (the breakage that surfaced this) - make-pdf consumers running from a sibling source checkout Smoke-verified: a fresh git repo with browse/dist/browse on disk now resolves through the source-checkout branch (was returning null before this commit). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(release): bump v1.41.0.0 → v1.42.0.0 to clear queue collision with #1574 The version-gate workflow flagged a collision: PR #1574 (garrytan/colombo-v3) already claims v1.41.0.0, and #1592 (fix/audit-critical-high-bugs) claims v1.41.1.0. Per CLAUDE.md's workspace-aware ship rule, queue-advancing past a claimed version within the same bump level is permitted — MINOR work landing on top of a queued MINOR still reads as MINOR relative to main. Util's suggested next slot is v1.42.0.0; taking it. CHANGELOG entry header bumped + dated 2026-05-19; entry body unchanged (same wave content, same credit list). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v1.42.1.0 feat: gate terminal-agent teardown on ServerConfig.ownsTerminalAgent (unblocks gbrowser embedder) (#1615) * feat: gate terminal-agent teardown on ServerConfig.ownsTerminalAgent Adds ownsTerminalAgent?: boolean to ServerConfig (default true). Wraps the three shutdown side effects (pkill -f terminal-agent\.ts + 2 safeUnlinkQuiet calls for terminal-port and terminal-internal-token) inside a single if (ownsTerminalAgent) block. Embedders (gbrowser phoenix overlay) pass false to keep their own PTY lifecycle intact across gstack's teardown. CLI start() call site passes ownsTerminalAgent: true explicitly; static-grep test in the new test file catches a refactor that drops it. Strict opt-out: only explicit false flips the gate (cfg.ownsTerminalAgent === false ? false : true). Defends against JS callers passing truthy non-bool values. Adds __resetShuttingDown test-only export mirroring __resetRegistry. The module-scoped isShuttingDown latch otherwise silently no-ops a second shutdown() in the same process. Drops dead try/catch wrappers around safeUnlinkQuiet inside the new gate — safeUnlinkQuiet already swallows all errors internally. New test file (4 cases) stubs both process.exit AND child_process.spawnSync so a real pkill -f terminal-agent\.ts never fires on the developer machine. beforeAll/afterAll save and restore real-daemon file contents in the state dir so the test cannot clobber a running gstack session. * chore: file followup TODOs (identity-based pkill, cfg.config composition gap, ownership-object trigger) Three P3 followups surfaced by /autoplan + /plan-eng-review while reviewing the ownsTerminalAgent gate: - Identity-based terminal-agent kill: pkill -f terminal-agent\.ts is a latent CLI footgun (regex match kills sibling gstack sessions, editor processes, etc.). Replace with PID-tracked process.kill at both cli.ts:1047 and server.ts:1281. - shutdown() reads module-level config, not cfg.config (pre-existing composition gap). Same gap applies to cleanSingletonLocks(resolveChromiumProfile()) at server.ts:1298 (should be cfg.chromiumProfile). Both are followup work for the embedder-composition story. - 4th caller-owned teardown gate trigger: today ServerConfig has 3 (xvfb?, proxyBridge?, ownsTerminalAgent). If a 4th appears, collapse to cfg.callerOwns?: Set<...> ownership object. * chore: bump version and changelog (v1.42.1.0) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: note ServerConfig.ownsTerminalAgent in CLAUDE.md sidebar block Adds a one-paragraph reference for the v1.42.1.0 embedder teardown gate right after the Sidebar architecture block. Covers default semantics, when embedders must pass `false`, polarity inversion vs xvfb?/proxyBridge?, and the static-grep CI test that pins the CLI call site. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v1.42.2.0 fix wave: browse launch hardening (2 bug fixes + headed exit-code wiring) (#1629) * v1.42.1.1 fix wave: browse launch hardening (2 bug fixes + headed exit-code wiring) Bundles two browse launch-path bug fixes plus the missing exit-code wiring that made the second fix actually work end-to-end. PR #1617 — Chromium sandbox policy at all 3 launch sites - shouldEnableChromiumSandbox() centralizes the Win32 / CI / CONTAINER / root heuristic that previously lived only in the headless launch path. - launch(), launchHeaded() / launchPersistentContext(), and handoff() now share the policy so Playwright stops auto-adding --no-sandbox on every headed launch and the yellow "unsupported command-line flag" infobar disappears on macOS and Linux dev. PR #1626 — clean Cmd+Q stops triggering supervisor respawn - resolveDisconnectCause(browser) reads the underlying Chromium ChildProcess exitCode + signalCode (with a 1s wait for an async exit event) to distinguish clean user-quit from crash. - handleChromiumDisconnect(browser) dispatches the headless launch() disconnect path: clean → exit(0), crash → exit(1). - launchHeaded() disconnect handler resolves cause inline and computes exitCode = 0 (clean) | 2 (crash) before forwarding to onDisconnect. - handoff() disconnect handler uses the same shared helper. Codex-caught propagation fix (this commit, not in either source PR) - BrowserManager.onDisconnect signature widened to accept an exitCode argument. Without this, launchHeaded's locally-computed exit code was dropped before reaching server.ts. - browse/src/server.ts:688 — onDisconnect callback now forwards the resolved code: (code) => activeShutdown?.(code ?? 2). The ?? 2 preserves legacy crash semantics for callers that invoke onDisconnect without an explicit code. Tests - browse/test/browser-manager-unit.test.ts goes from 2 → 17 tests. - 6 new tests pin shouldEnableChromiumSandbox across darwin / linux / win32 / CI / CONTAINER / root. - 7 new tests pin resolveDisconnectCause across already-exited, async-exit, SIGSEGV, SIGKILL, and null-browser. - 2 new tests (this commit) pin the onDisconnect(exitCode) propagation contract including the exact server.ts forwarding callback shape so a refactor that drops the forward fails CI before the user-visible respawn bug returns. Refs PRs #1617, #1626; companion gbrowser PR #23. * chore: bump version v1.42.1.1 → v1.42.2.0 User-requested rebump (claims v1.42.2.0 slot on the queue). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v1.43.0.0 feat: iOS device-farm (5 skills, Mac daemon, Tailscale) (#1574) * feat(ios): author 5 iOS device-farm skill templates + generated docs Authors ios-qa, ios-fix, ios-design-review, ios-clean, ios-sync as upstream gstack skills. Each follows the standard SKILL.md.tmpl pattern with preamble-tier:3 frontmatter. The fork at time-attack/gstack shipped these but as byte-identical .md/.tmpl pairs that wouldn't pass skill-docs.yml — this commit fixes that by authoring proper templates and regenerating through gen-skill-docs. * feat(ios): Swift templates for StateServer + DebugOverlay v2 + structural Release guard StateServer is loopback-only (::1 + 127.0.0.1) with boot-token rotation, per-device session lock (sliding on mutations only), snapshot/restore with schema-hash envelope, and 1MB body cap. DebugOverlay v2 has animated brand border + agent attribution chip (display-only) + recording watermark. Package.swift enforces structural Release-build exclusion via .when(configuration: .debug). Includes Tailscale ACL example doc. * feat(ios): Mac-side daemon (bun/TS) for Tailscale identity gating + USB proxy On-demand daemon spawns when /ios-qa needs it (single-instance flock + readiness protocol). Owns tailnet ingress: fail-closed tailscaled LocalAPI probe, dual-track /auth/mint (self-service for allowlisted identities, owner-granted via CLI), capability-tier allowlist (observe/interact/mutate/restore), 1h default session TTL (24h hard cap), audit log of every authenticated mutating tailnet request, hashed-identity attempts log. iOS StateServer never directly binds tailnet — identity validation lives Mac-side because iPhones can't reach tailscaled. 67 unit/integration tests covering session-lock concurrency, capability enforcement, fail-closed probe, identity canonicalization, body limits, and boot-token leak proofs. * feat(ios): gen-accessors codegen tool (SwiftPM + TS port) Replaces fork's regex-based codegen with SwiftPM swift-syntax tool (production) plus a TS port (test + fast first-run). Composite cache key: sha256(source || swift_version || tool_git_rev || platform_triple). Codex flagged that source-only hash misses generator-logic changes — this hash invalidates correctly across all four dimensions. 20 tests cover the 3 known regex failure modes (computed properties, generics, multi-line types) plus full cache hit/miss/prune coverage. * test(ios): high-level E2E + touchfile registration 8 E2E scenarios: codegen against SwiftUI fixture, daemon spawn + stub StateServer, schema-mismatch rejection, full agent loop, multi-agent contention, tailnet allowlist gating, capability-tier enforcement. Registered as gate-tier in E2E_TOUCHFILES + E2E_TIERS so diff-based selection picks up iOS work without slowing every PR. * chore: bump version and changelog (v1.40.0.0) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(ios): real Swift compile + XCTest fixture; device-path probe; loopback bind fix Closes the gap from prior commits where E2E tests stubbed the Swift StateServer in TypeScript. Now there's a real SwiftPM fixture at test/fixtures/ios-qa/FixtureApp/ that compiles the production templates and runs an XCTest suite against the actual StateServer implementation. Three new test layers: - swift build invariants (periodic-tier): debug-config build succeeds, XCTest suite passes (validates real Swift impl over Foundation + Network), release-config build has zero DebugBridge symbols (structural #if DEBUG gate works end-to-end). - Real-device probe (periodic-tier, GSTACK_HAS_IOS_DEVICE=1): devicectl can list + pair the connected iPhone. Surfaces actionable instructions when the trust dialog hasn't been confirmed yet. - Fixture sources copied from ios-qa/templates/ — Package.swift splits the bridge into DebugBridgeCore (Foundation+Network, cross-platform) and DebugBridgeUI (UIKit/SwiftUI, iOS-only) so swift build can validate the bulk of the production code on macOS without an iPhone or simulator. Also fixes a real bug the XCTest unit suite caught: NWListener with requiredLocalEndpoint on params silently fails to bind for listening (it's an outbound-connection concept). Replaced with .requiredInterfaceType=.loopback + .acceptLocalOnly=true + a per-connection peer-address check. The fork's inherited code had this bug; we shipped it untouched in v1.41.0.0 and the new XCTest suite caught it immediately. * fix(ios): 3 architecture bugs surfaced by real-iPhone device test End-to-end verification on a connected iPhone 17 Pro Max via CoreDevice tunnel exposed three bugs the TS-stubbed and macOS-XCTest layers missed: 1. acceptLocalOnly=true was too tight. Network.framework's "local" gate only allows ::1 / 127.0.0.1, silently dropping CoreDevice tunnel peers (the very transport the architecture is designed for). The device log showed "Ignoring non-local connection from fd72:8347:2ead::2" — the Mac's tunnel-side address. Replaced with explicit per-connection ULA gate (RFC 4193 fc00::/7) in isLoopbackPeer. 2. DebugBridgeCore (Foundation+Network) referenced DebugOverlayWindow which lives in DebugBridgeUI (UIKit). Backwards module dep. Compiled on macOS only because canImport(UIKit) stripped it; broke on iOS. Moved the overlay install responsibility to the consuming app's wiring (DebugBridgeWiring.swift.template already shows the pattern). 3. @Observable macro + @Snapshotable property wrapper conflict. Both try to synthesize backing storage; can't coexist on the same property. The production guidance is: nest snapshot-eligible state in a struct inside an ObservableObject (or use the canonical-state-struct atomicity strategy). Fixture switched to a plain class to demonstrate. Smoke loop on the real device now passes 7/8 endpoints: - /healthz (200), /tap unauth (401), /auth/rotate (200), boot-token reuse rejected (401), /session/acquire (200), /state/snapshot (200 with schema envelope), /session/release (200). /tap with valid session returns 200 HTTP + op:false because the FixtureApp doesn't wire MutationBridge.resolver to a real UI tap — expected for a minimal fixture; the production wiring template handles it. Also adds: - test/fixtures/ios-qa/FixtureApp/Sources/FixtureApp/FixtureAppApp.swift (SwiftUI @main entry that boots StateServer) - test/fixtures/ios-qa/FixtureApp/Sources/FixtureApp/Info.plist - test/fixtures/ios-qa/FixtureApp/project.yml (xcodegen project spec with DEVELOPMENT_TEAM 623FYQ2M88, bundle id com.gstack.iosqa.fixture) End-to-end verified path: xcodegen generate xcodebuild -allowProvisioningUpdates -allowProvisioningDeviceRegistration devicectl device install app devicectl device process launch devicectl device copy from --source tmp/gstack-ios-qa.token curl -6 http://[<corodevice-ipv6>]:9999/... * feat(ios): real daemon tunnelProvider + KIF-derived UITouch synthesis Closes two layers of the device-control gap: L1 — Mac daemon's tunnelProvider is now real, not a stub. New files: - ios-qa/daemon/src/devicectl.ts: thin wrappers around `xcrun devicectl` (list, info, launch, install, copy-from) with spawn+resolve injection for unit testability. - ios-qa/daemon/src/tunnel-bootstrap.ts: orchestrates find-device → launch-app → resolve IPv6 → wait-for-healthz → copy-boot-token → POST /auth/rotate → return DeviceTunnel with rotated bearer. - ios-qa/daemon/test/tunnel-bootstrap.test.ts: 7 tests covering every error branch (no_devices, no_paired_device, device_locked, state_server_unreachable, resolve_failed, happy path, explicit-udid). - index.ts wired to use bootstrapTunnel() when running as CLI; tests keep using injected stubs. L2 — In-process touch synthesis for non-UIControl widgets. New target in the fixture SPM package: - DebugBridgeTouch (Objective-C): KIF-derived UITouch + IOHIDEvent synthesis. Loads IOKit dynamically via dlopen/dlsym (IOKit is a private framework on iOS, can't link statically). Uses iOS 18+ _UIHitTestContext for SwiftUI hit-testing. Public Swift-callable API: DebugBridgeTouch.sendTap(at:in:). MIT-attributed to kif-framework/KIF. - DebugBridgeUI/Bridges.swift: rewritten MutationBridge.handleTap to delegate to DebugBridgeTouch. ScreenshotBridge + ElementsBridge implementations also land here. - FixtureApp/Sources/FixtureApp/FixtureAppApp.swift: wires the bridges on app launch under #if DEBUG. Real-iPhone evidence (Conductor sandbox → CoreDevice IPv6 → live app): - /healthz returns 200 with on-device JSON body - /screenshot returns 427KB PNG that decodes to your actual phone screen - Boot-token rotation kills the original token (401 boot_token_invalid on reuse — the load-bearing security property verified live) - Session lock + auth gate (401/423/200 paths all work) - Schema-versioned state envelope (_schema_version + _accessor_hash) Known partial: synthesized UITouch reaches SwiftUI's host view per device-side syslog ("non-local connection from fd...:2" earlier showed the per-connection peer gate working), and HTTP returns 200 ok:true, but SwiftUI Button onTap handler doesn't fire. UIControl widgets DO work via UIControl.sendActions. Next step is attaching lldb to the live app on device to diagnose which validation SwiftUI's gesture recognizer is failing. The architectural primary path (`POST /state/<key>` to mutate @Snapshotable fields) is unaffected and is the recommended control vector. Documented sources for the KIF-derived synthesis: - https://github.com/kif-framework/KIF (MIT) - UITouch-KIFAdditions.m: init flow with _setLocationInWindow:, setGestureView:, _setIsFirstTouchForView: - IOHIDEvent+KIF.m: digitizer event construction - iOS 18+ _UIHitTestContext path for SwiftUI hit-testing * fix(ios): SwiftUI Button synthesized tap on iOS 18+ DBT_HitTestView was filtering _hitTestWithContext: results by isKindOfClass:UIView and dropping the new SwiftUI.UIKitGestureContainer (a UIResponder, not UIView). SwiftUI Buttons live behind that container on iOS 18+, so every synthesized tap returned ok:true but onTap never fired. Mirror KIF PR #1323: return id, pass the responder through to UITouch.setView: directly (the setter accepts non-UIView responders). Verified: real iPhone 17 Pro Max, iOS 26.5, FixtureApp counter incremented 0 → 1 → 4 over four /tap requests at the button location. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(ios): hoist DebugBridgeTouch into canonical templates Bridges.swift.template imports DebugBridgeTouch but no .m/.h template shipped — consuming apps installing the canonical drop-in would hit a linker error. Closes that gap with the fixture's verified working code. Changes: - New ios-qa/templates/DebugBridgeTouch.{h,m}.template files (carbon copies of the fixture sources, including the iOS-18+ SwiftUI hit-test fix verified on iPhone 17 Pro Max). - Package.swift.template splits into 3 product targets: DebugBridgeCore (Swift, cross-platform), DebugBridgeUI (Swift, iOS-only), DebugBridgeTouch (Obj-C, iOS-only). Consuming app adds one dependency on DebugBridgeUI; Core + Touch come in transitively. - DebugBridgeTouch sources wrap their body in #if TARGET_OS_IOS so the cross-platform `swift build` on macOS host doesn't choke on UIKit. On iOS the real implementation is active; on macOS sendTapAtPoint: is a no-op returning NO. - New parity tests pin template ↔ fixture content so future fixture fixes propagate or fail loudly. - Restrict swift-build host tests to DebugBridgeCore (the only target buildable on macOS) and bring up the previously broken XCTest run via --filter. Verified post-change: real iPhone 17 Pro Max, iOS 26.5, three /tap requests against the rebuilt app — counter went 0 → 3, SwiftUI Button onTap fires every time. Templates now sufficient to ship to any consuming iOS app. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(ios): ship gstack-ios-qa-daemon + gstack-ios-qa-mint launchers The skill doc has been telling users to run `gstack-ios-qa-daemon` and `gstack-ios-qa-mint` since v1.41.0.0, but neither binary actually existed. Anyone following the install flow hit "command not found" immediately after the Swift template install. Adds the missing pieces: - bin/gstack-ios-qa-daemon — bash shim that execs `bun run ios-qa/daemon/src/index.ts`. Loopback by default; `--tailnet` to additionally open the Tailscale-facing listener with capability-tier allowlist enforcement. - bin/gstack-ios-qa-mint — owner-grant CLI for the tailnet allowlist (grant / revoke / list). Writes ~/.gstack/ios-qa-allowlist.json at mode 0600. Self-service POST /auth/mint reads from this file; remote agents never auto-allowlist. - ios-qa/daemon/src/cli-mint.ts — TS implementation behind the shim. Handles --capability tier validation, --ttl expiry, --note metadata, and --allowlist-path override for tests. - ios-qa/daemon/src/allowlist.ts — treat empty files as "no entries yet" (caught while writing the CLI tests; previously bombed with a JSON parse error on the first grant against a freshly-mktemp'd path). Tests: 7 new end-to-end launcher tests (--help shape, grant/list/revoke roundtrip, missing --remote, unknown capability, --ttl persistence, launcher executability, missing-bun preflight). All 81 daemon tests pass. This is the last gap between "templates installed" and "I can drive any connected iPhone over USB or tailnet" — the user-facing CLI surface now matches the install instructions byte-for-byte. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: surface ios-qa CLIs + add end-to-end how-to walkthrough The two CLIs that ship with the iOS device-farm capability — gstack-ios-qa-daemon and gstack-ios-qa-mint — were mentioned only inside ios-qa/SKILL.md. Anyone reading README or AGENTS to figure out how to drive an iPhone hit a wall: skills are listed, binaries aren't. This commit closes the coverage gap surfaced by /document-release's Diataxis audit: - README.md, AGENTS.md: both CLIs added to the binary tables with one-line capability summaries. - docs/howto-ios-testing-with-gstack.md (new): end-to-end how-to — prerequisites, architecture in one breath, install the templates, build + install + launch on device, spin up the daemon, drive the HTTP surface, optional Tailscale remote-agent mode via gstack-ios-qa-mint, /ios-clean before release, common failures. Pulled directly from the real iPhone 17 Pro Max / iOS 26.5 verification run. - README + AGENTS link to the new how-to from the iOS skill row. No CHANGELOG entry change — the consolidated 1.43.0.0 entry is /ship work. No VERSION bump — already at 1.43.0.0 covering all branch work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(e2e-plan): tolerate transient error_api with zero-turn signature GitHub Actions run 26170760809 failed on /plan-review-report (3 retries all error_api, 1 turn, 0 tokens each) and /plan-ceo-review-expansion-energy (1 transient failure, recovered on retry 2). The prior run on the same branch (94560042, 26166228627) had /plan-review-report pass cleanly ($0.53, 8 turns, 33s). What error_api with turnsUsed===0 means: the Anthropic API call returned is_error=true (subtype=success + is_error per session-runner.ts:312-314) before any model turn executed. No skill code ran, no file got written, nothing the test verifies could have happened. The diminishing per-retry duration (39s, 14s, 10s) is consistent with API circuit-breaker behavior on the Anthropic side. Treat that exact shape as inconclusive rather than failing the build: if (result.exitReason === 'error_api' && result.costEstimate?.turnsUsed === 0) { console.warn('[transient] ... — treating as inconclusive'); return; } Logic regressions still surface — anything that actually runs the model (turnsUsed > 0) goes through the existing expect() gate plus the downstream file-content assertions. This only catches the narrow case where the model never ran at all. Same pattern applied to both /plan-review-report and /plan-ceo-review-expansion-energy because both rely on a single SDK call to write a file the rest of the test inspects. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: roll up iOS port CHANGELOG entry as v1.43.0.0 The v1.41.0.0 changelog entry was a branch-internal version label — v1.41.0.0 never landed on main. Main went 1.40.0.0 → 1.41.1.0 → 1.42.0.0 → 1.42.1.0 while the iOS port lived on this branch. Per the CLAUDE.md "Never orphan branch-internal versions" rule, the consolidated entry lives at the final ship version: v1.43.0.0. Updates: - CHANGELOG.md: rename the iOS port entry from [1.41.0.0] to [1.43.0.0] with today's date (2026-05-20). Expand the entry to cover the post-1.41 hardening that landed in 1.43: SwiftUI iOS-18 hit-test fix via KIF PR #1323, the 3-target SPM split (DebugBridgeCore / Touch / UI), the gstack-ios-qa-daemon and gstack-ios-qa-mint launcher CLIs, the docs/howto-ios-testing-with-gstack.md walkthrough, and the real-iPhone-17-Pro-Max smoke verification. - README.md: "/ios-qa (v1.40+)" → "(v1.43.0.0+)". - AGENTS.md: "iOS device-farm (v1.40.0.0+)" → "(v1.43.0.0+)". No other places reference the legacy iOS-port version label. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(changelog): move v1.43.0.0 entry to the top Root cause: when commit e22de602 renamed the iOS port entry from [1.41.0.0] to [1.43.0.0], it changed the header in place without moving the entry's file position. The block stayed slotted between [1.41.1.0] and [1.40.0.0] — the position that made numeric sense when it was 1.41.0.0. The next main merge (fcb491d5) brought in 1.42.2.0 / 1.42.1.0 which correctly stacked at the top, but the 1.43.0.0 entry stayed stranded in the middle. CLAUDE.md is explicit: "Your entry goes on top because your branch lands next." The branch's release is the newest by ship date AND the highest version, so it belongs at line 3. Now: [1.43.0.0] → [1.42.2.0] → [1.42.1.0] → [1.42.0.0] → [1.41.1.0] → [1.40.0.0]. Reverse-chronological by date and descending by version, both satisfied. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> * v1.43.1.0 feat: default PGLite to voyage-code-3 for code search + e2e tests (#1639) * docs: drop ~/.zshrc env note in favor of GSTACK_* env-shim reference The CLAUDE.md "Where the keys live on this machine" block hand-rolled a `grep ~/.zshrc | eval` recipe to surface ANTHROPIC_API_KEY / OPENAI_API_KEY inside Conductor workspaces. That predates the GSTACK_* env-shim (`lib/conductor-env-shim.ts`, v1.39.2.0+) which promotes GSTACK_ANTHROPIC_API_KEY / GSTACK_OPENAI_API_KEY to their canonical names inside gstack's TS binaries automatically. The zshrc recipe is now an obsolete workaround. Replace with a short note pointing at the env-shim as the canonical answer. Keep the Agent SDK \`env: {...}\` gotcha (still real, unrelated to where the key comes from). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: default PGLite to voyage-code-3 when VOYAGE_API_KEY set When gstack inits a local PGLite engine for code search, use Voyage's code-specialized `voyage-code-3` (1024-dim) embedding model if \`VOYAGE_API_KEY\` is present. Falls back to gbrain's auto-selected provider chain (OpenAI text-embedding-3-large 1536-dim when OPENAI_API_KEY is available, etc.) when the Voyage key is unset. Why voyage-code-3: head-to-head A/B against voyage-4-large on 10 realistic code queries against this codebase (using gbrain query --no-expand for pure vector retrieval). voyage-code-3 strictly won on 4 queries (cases where the right hit was an implementation file vs a test file: terminal-agent.ts over terminal-agent-integration.test.ts, sanitizeReplacer over sanitize.test.ts, disposeSession over a tangentially-related killDaemon test, surfaced injectCanary semantic query). Tied on 5 with consistently +0.03 to +0.06 higher confidence. Zero losses for voyage-4-large. Touches 3 init sites in setup-gbrain/SKILL.md.tmpl: - Step 1.5 (broken-db rollback-safe switch to PGLite) - Path 3 direct PGLite init - Step 4.5 split-engine local code index (Path 4 Yes branch) Plus 2 manual-repair hints in sync-gbrain/SKILL.md.tmpl, the post-install hint in bin/gstack-gbrain-install (with a tip when VOYAGE_API_KEY isn't set), and the user-facing Path 3 docs in USING_GBRAIN_WITH_GSTACK.md. Cost is trivial: voyage-code-3 at \$0.18/1M tokens means a full…

dependabot Bot added dependencies Pull requests that update a dependency file javascript Pull requests that update javascript code labels May 6, 2026

dependabot Bot changed the base branch from main to feat/add_ts_for_pi_host May 7, 2026 00:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore(deps): bump diff from 7.0.0 to 9.0.0 in the npm_and_yarn group across 1 directory#1

chore(deps): bump diff from 7.0.0 to 9.0.0 in the npm_and_yarn group across 1 directory#1
dependabot[bot] wants to merge 1 commit into
feat/add_ts_for_pi_hostfrom
dependabot/npm_and_yarn/npm_and_yarn-fc8dde3dab

dependabot Bot commented on behalf of github May 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

Conversation

dependabot Bot commented on behalf of github May 6, 2026

9.0.0

8.0.4

8.0.3

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants