Skip to content

test(#528): drain quarantine — pin-escape audit + handover rewrite#533

Merged
atlas-apex merged 1 commit into
devfrom
test/GH-528-drain-quarantine
Jun 6, 2026
Merged

test(#528): drain quarantine — pin-escape audit + handover rewrite#533
atlas-apex merged 1 commit into
devfrom
test/GH-528-drain-quarantine

Conversation

@atlas-apex

Copy link
Copy Markdown
Collaborator

Summary

Completes #528 — drains the test-runner quarantine to empty. (Clean re-do of two earlier attempts that GitHub auto-closed when the stacked base branch was deleted on merge; rebuilt here on a fresh dev base.)

(a) Pin-escape audit + fix. Many hooks resolve ops-root via _lib-ops-root.sh, which inside a live Claude Code session honours the session pin → sandbox tests escape onto the real fork (wrong results; real-file mutation for writing hooks). 21 hooks use the pin; only 4 tests guarded.

  • Systemic: bin/run-hook-tests.sh now export APEXYARD_OPS_DISABLE_PIN=1 — the whole suite resolves by walk-up to its sandbox (no-op in headless CI; test_resolve_ops_root_pin sets the var per-case, unaffected). Turned the local run from 7 spurious failures → green.
  • Belt-and-suspenders: test_link_custom_skills (symlinks into .claude/skills/) gets its own guard.

(b) Rewrote test_handover_clone_prompt. Old assertions pinned a removed [y/n/later] clone prompt; the SKILL now clones-by-default at step 1.5-clone (skip-if-.git, --no-clone) + a step-8 follow-up offer. New spec-asserts + a clone-by-default runtime simulator → 18/18.

Result: QUARANTINE is empty — the gate enforces all 65 tests.

Testing

  1. bash bin/run-hook-tests.shPASS=65, FAIL=0, SKIP=0 (was 7 failing before the pin guard).
  2. bash .claude/hooks/tests/test_handover_clone_prompt.sh → 18/18.
  3. test_resolve_ops_root_pin + test_link_custom_skills pass under the suite default; no real-repo mutation after runs.
  4. shellcheck -S error clean on all three changed files.

Closes #528


Glossary

Term Definition
Session pin _lib-ops-root.sh resolving the ops root from $APEXYARD_OPS_PIN_DIR/ops-root-$CLAUDE_CODE_SESSION_ID inside a live session.
APEXYARD_OPS_DISABLE_PIN Env flag that disables the pin so resolution walks up to the sandbox — the correct neutralizer for hermetic tests.
Quarantine The runner's documented skip-list; now empty — every discovered test is enforced.
Spec-pin test A test asserting literal substrings in a SKILL doc; rewritten here to the current clone-by-default design.

(a) Pin-escape audit: hooks resolve ops-root via _lib-ops-root.sh, which honours
the session pin inside a live Claude Code session → sandbox tests escape onto the
real fork (wrong results; real-file mutation for writing hooks). Fixes:
- bin/run-hook-tests.sh exports APEXYARD_OPS_DISABLE_PIN=1 (whole suite resolves
  to its sandbox; no-op in headless CI; the pin-test sets the var per-case).
- test_link_custom_skills (writes symlinks into .claude/skills/) gets its own guard.

(b) Rewrote test_handover_clone_prompt against the current /handover spec
(clone-by-default at step 1.5-clone + step-8 follow-up offer), replacing the
removed [y/n/later]-prompt assertions. 18/18.

QUARANTINE is now empty — the gate enforces all 65 tests.

Closes #528

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@atlas-apex atlas-apex left a comment

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: PR #533

Commit: aa6e2b15fbe5a3263c19a16520ee372eae9fa82a

Verdict is APPROVED. --approve rejected (cannot approve own PR), so this is posted as a comment; the Rex approval marker is written for the merge gate.

Summary

Drains the test-runner quarantine to empty (completes #528). Clean dev-based rebuild of the #531/#532 content — same 3-file delta, new SHA. (a) Systemic pin-escape fix: bin/run-hook-tests.sh exports APEXYARD_OPS_DISABLE_PIN=1 so sandbox tests resolve by walk-up instead of escaping onto the real fork via the session pin; test_link_custom_skills.sh gets a belt-and-suspenders guard. (b) test_handover_clone_prompt.sh rewritten from the removed [y/n/later] clone-prompt design to the current clone-by-default-at-step-1.5 spec. QUARANTINE is now empty, so the gate enforces all 65 tests.

Checklist Results

  • ✅ Architecture & Design: Pass (test-only; no layering concerns)
  • ✅ Code Quality: Pass (shellcheck -S error clean on all 3 files)
  • ✅ Testing: Pass (the change IS test maintenance)
  • ✅ Security: Pass (mocked git, no network, no secrets)
  • ✅ Performance: Pass (N/A)
  • ✅ PR Description & Glossary: Pass (4-term glossary present)
  • ✅ Summary Bullet Narrative: Pass (bullets explain what + why)
  • ✅ Technical Decisions (AgDR): N/A (test maintenance; the env flag is a pre-existing escape hatch in _lib-ops-root.sh, not a new decision)
  • ✅ Adopter Handbooks: N/A (shell-only diff; migration-safety blocking handbook does not apply — "draining quarantine" is test-runner terminology, not a schema migration)

Verification performed

  1. Pin-disable correct + safe. _lib-ops-root.sh:131 gates the pin lookup on [ -z "${APEXYARD_OPS_DISABLE_PIN:-}" ], so the suite-level export forces walk-up. test_resolve_ops_root_pin.sh manages the var per-case (sets it for Case 3; unsets it in the pin-exercising cases at lines 89/115/173/202/231), so the suite default does not interfere. Ran it in the PR tree: 8/8 pass.
  2. Handover rewrite matches the spec. All 12 spec-assert needles verified present in the PR tree's .claude/skills/handover/SKILL.md (step-1.5-clone heading, default-yes line, WORKSPACE_DIR=$(portfolio_workspace_dir), .git-skip guard, clone command, CLONE_STATUS markers, step-8 follow-up offer + options, cost facts, step 7/9/10 order). The runtime simulator's three branches (--no-clone → declined, .git exists → preserved, else → clone with git clone <url> "$WORKSPACE_DIR/<name>") faithfully mirror the documented step-1.5-clone branch. Ran it: 18/18 pass.
  3. Empty QUARANTINE justified. Ran the full suite against the PR tree: PASS=65, FAIL=0, SKIP=0. (Sanity: running the OLD test against the current new SKILL fails 8 spec-asserts — confirming the rewrite was necessary.)
  4. PR hygiene. Glossary present; narrative bullets; PR title well-formed; Closes #528; commit SHA recorded. All CI checks green, including the hook test suite job (enforces 65) and shellcheck .claude/hooks.

Issues Found

None.

Suggestions

None blocking. The rewritten test's runtime simulator is a hand-translation of the SKILL's bash branch — the in-file comment already flags that divergence is a triage signal, which is the right discipline for this spec-pin pattern.

Verdict

APPROVED


🤖 Reviewed by Rex (Code Reviewer Agent)
📌 Reviewed commit: aa6e2b15fbe5a3263c19a16520ee372eae9fa82a

@atlas-apex atlas-apex merged commit afbb26c into dev Jun 6, 2026
7 checks passed
@atlas-apex atlas-apex deleted the test/GH-528-drain-quarantine branch June 6, 2026 16:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants