Skip to content

v1.55.1.0 fix: telemetry consent accuracy + gstack-slug cache sanitization#1848

Merged
garrytan merged 6 commits into
mainfrom
garrytan/design-shotgun-security-audit
Jun 3, 2026
Merged

v1.55.1.0 fix: telemetry consent accuracy + gstack-slug cache sanitization#1848
garrytan merged 6 commits into
mainfrom
garrytan/design-shotgun-security-audit

Conversation

@garrytan

@garrytan garrytan commented Jun 3, 2026

Copy link
Copy Markdown
Owner

Summary

A small security + privacy correctness fix, surfaced by an external "responsible disclosure" review of the design-shotgun skill. Five of six flagged items were working-as-intended with guards the report missed; two led to real hardening here.

Security — slug helper

  • bin/gstack-slug now sanitizes its output to [a-zA-Z0-9._-] on every path, including values read from its on-disk cache. The compute/fallback paths already filtered, but a cached value was echoed into eval "$(gstack-slug)" unsanitized, so a tampered ~/.gstack/slug-cache file could inject shell. The filter now runs unconditionally before the value is emitted, and a poisoned cache is healed on the next write.

Privacy — telemetry

  • The telemetry consent copy now states what actually happens: "No code or file paths. Your repo name is recorded locally only and stripped before any upload" (was "No code, file paths, or repo names").
  • The telemetry preamble sanitizes the repo basename before building its local analytics JSON, so an unusual repo directory name can't malform the record (and can't leak a fragment past the regex stripper).

Repo identity (repo, _repo_slug, _branch) was already stripped before any remote upload; this PR keeps the behavior and locks it with a test.

Test Coverage

[~] bin/gstack-slug — re-sanitize cached SLUG before echo→eval  [★★★ test/gstack-slug-sanitize.test.ts]
[+] telemetry repo-identity egress (repo/_repo_slug/_branch)    [★★★ test/telemetry-repo-strip.test.ts]
[~] generate-preamble-bash.ts basename sanitize                 [covered-by-construction: tr -cd can't emit JSON-breaking chars]
[~] generate-telemetry-prompt.ts consent copy                   [string, not a code path — llm-judge ✓]

Both new tests verified RED when their fix is removed. Tests: +2 files.

Pre-Landing Review

No issues found. Meaningful surface is ~12 lines of sanitization logic; reviewed inline (no SQL, no LLM trust boundary, no auth). Security-positive.

Design Review

No frontend files changed — design review skipped.

Eval Results

  • LLM-judge (diff-selected): 24 pass, 1 skip, 0 fail (21s). Validates the regenerated SKILL.md docs.
  • E2E: skipped this run by maintainer decision. The diff touches the preamble resolvers (global touchfiles), so the selector flagged 135/173 E2E tests, but none assert on the telemetry consent copy or repo basename — the change is behaviorally neutral for skill execution. Free bun test suite (skill validation + gen-skill-docs quality + golden regression + both new invariant tests) passes: exit 0, 0 failures.

Adversarial Review

Claude + Codex, both: ship/land. Injection verified closed (newline payloads, self-heal, BRANCH path all checked); generated JSON well-formed; repo fields stripped before upload; bash -n clean. One low non-exploitable finding (a metachar-only cache yields an empty SLUG for the session, handled downstream by ${SLUG:-unknown}) — noted as a low-priority follow-up, not blocking.

Scope Drift

Scope Check: CLEAN. Intent: fix the disclosed issues. Delivered: exactly that + tests + generated-doc regen.

Plan Completion

All code + test items DONE (gstack-slug sanitize, telemetry copy + basename sanitize, both invariant/regression tests). One non-code item deferred: sending the reply to the external reviewers (a communication, not a code deliverable).

Verification Results

Skipped — CLI/skill repo, no dev server to drive.

TODOS

No TODO items completed in this PR.

Test plan

  • Free suite bun test passes (exit 0, 0 failures), run on merged state
  • LLM-judge evals pass (24/25, 1 diff-deselected)
  • Both new tests proven to fail when their fix is reverted

🤖 Generated with Claude Code

garrytan and others added 6 commits June 2, 2026 21:32
The compute and fallback paths filter slug output to [a-zA-Z0-9._-], but a
value read straight from ~/.gstack/slug-cache was echoed into eval output
unsanitized. A locally-planted cache file could inject shell into
eval "$(gstack-slug)". Re-sanitize on every path so the invariant the file
header promises actually holds, and heal a poisoned cache on the next write.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The telemetry consent prompt promised "no repo names" while the preamble
epilogue records the repo basename in the local skill-usage.jsonl. It is
already stripped before any remote upload, so it never left the machine, but
the copy was unqualified. Reword it to state repo name is local-only and
stripped before upload.

Also sanitize the basename to [a-zA-Z0-9._-] before it goes into the
hand-built JSON, so a repo directory name containing quotes or newlines can
neither break the JSON nor leak a fragment past the regex stripper.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Generated output of the preceding resolver change: the corrected consent copy
and sanitized repo basename now appear in every skill preamble. Golden ship
fixtures refreshed to match.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pins the contract that repo/branch identity in the synced skill-usage.jsonl is
stripped before the remote POST. Three checks: a floor (the three known fields),
coverage (every repo/branch field a producer writes into skill-usage.jsonl is
stripped, so a future producer rename can't silently leak), and behavior (runs
the actual sed strip expressions over a sample event). Scoped to the synced
file, so the local-only timeline branch field is correctly excluded.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Proves a poisoned ~/.gstack/slug-cache file cannot inject shell metacharacters
into gstack-slug output (the value consumed by eval). Verified red when the
cache-read sanitization is removed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@trunk-io

trunk-io Bot commented Jun 3, 2026

Copy link
Copy Markdown

Merging to main in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here

@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown

E2E Evals: ✅ PASS

58/58 tests passed | $9.41 total cost | 12 parallel runners

Suite Result Status Cost
e2e-browse 4/4 $0.18
e2e-deploy 6/6 $1.36
e2e-design 4/4 $0.75
e2e-plan 8/8 $2.74
e2e-qa-workflow 3/3 $1.34
e2e-review 6/6 $2.04
e2e-workflow 3/3 $0.52
llm-judge 24/24 $0.48

12x ubicloud-standard-8 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite

@garrytan garrytan merged commit c43c850 into main Jun 3, 2026
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant