Skip to content

test(#528): drain quarantine — pin-escape audit + handover rewrite#532

Closed
atlas-apex wants to merge 2 commits into
devfrom
test/GH-528-audit-and-handover
Closed

test(#528): drain quarantine — pin-escape audit + handover rewrite#532
atlas-apex wants to merge 2 commits into
devfrom
test/GH-528-audit-and-handover

Conversation

@atlas-apex

Copy link
Copy Markdown
Collaborator

Summary

Completes #528 — drains the test-runner quarantine to empty. (Originally PR #531, which GitHub auto-closed when #530's branch was deleted on merge; re-opened here against dev with the same already-Rex-reviewed HEAD 5d2f3b7.)

(a) Pin-escape audit + fix. Many hooks resolve ops-root via _lib-ops-root.sh, which inside a live Claude Code session honours the session pin → sandbox tests escape onto the real fork (wrong results; real-file mutation for writing hooks). 21 hooks use the pin; only 4 tests guarded.

  • Systemic: bin/run-hook-tests.sh now export APEXYARD_OPS_DISABLE_PIN=1 — the whole suite resolves by walk-up to its sandbox (no-op in headless CI; test_resolve_ops_root_pin sets the var per-case, unaffected). Turned the local run from 7 spurious failures → green.
  • Belt-and-suspenders: test_link_custom_skills (symlinks into .claude/skills/) gets its own guard.

(b) Rewrote test_handover_clone_prompt. Old assertions pinned a removed [y/n/later] clone prompt; the SKILL now clones-by-default at step 1.5-clone (skip-if-.git, --no-clone) + a step-8 follow-up offer. New spec-asserts + a clone-by-default runtime simulator → 18/18.

Result: QUARANTINE is empty — the gate enforces all 65 tests. Closes #528.

Testing

  1. bash bin/run-hook-tests.shPASS=65, FAIL=0, SKIP=0 (was 7 failing before the pin guard).
  2. bash .claude/hooks/tests/test_handover_clone_prompt.sh → 18/18.
  3. test_resolve_ops_root_pin + test_link_custom_skills pass under the suite default; no real-repo mutation after runs.
  4. shellcheck -S error clean on all three changed files.
  5. Linux tests gate is green enforcing 65.

Closes #528


Glossary

Term Definition
Session pin _lib-ops-root.sh resolving the ops root from $APEXYARD_OPS_PIN_DIR/ops-root-$CLAUDE_CODE_SESSION_ID inside a live session.
APEXYARD_OPS_DISABLE_PIN Env flag that disables the pin so resolution walks up to the sandbox — the correct neutralizer for hermetic tests.
Quarantine The runner's documented skip-list; now empty — every discovered test is enforced.
Spec-pin test A test asserting literal substrings in a SKILL doc; rewritten here to the current clone-by-default design.

me2resh and others added 2 commits June 6, 2026 16:53
Root cause was test isolation, not a config regression: apply-agent-routing.sh
resolves the ops root via _lib-ops-root.sh, which inside a Claude Code session
honours the session pin ($APEXYARD_OPS_PIN_DIR/ops-root-$CLAUDE_CODE_SESSION_ID)
and pointed at the REAL fork instead of the mktemp sandbox — so the hook rewrote
the real .claude/agents/*.md (+ a stray snapshot) and every sandbox assertion
failed. Two fixes:
- export APEXYARD_OPS_DISABLE_PIN=1 at the top of the test so ops-root resolves
  by walk-up to the sandbox (no-op in headless CI, which has no pin). This alone
  took the suite from 4/14 to 13/14 and stopped the real-repo mutation.
- case 4 (idempotency) declared endpoint http://localhost:11434 but, unlike
  cases 9-13, never mocked curl — so the unreachable endpoint was correctly
  filtered and the endpoint_count assertion tested the environment. Hoisted
  make_mock_curl above case 4 and mocked the endpoint reachable → 14/14.

Un-quarantined in bin/run-hook-tests.sh. Only test_handover_clone_prompt remains
quarantined (it spec-pins a removed clone-prompt design; needs a rewrite). #528

Refs #528

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(a) Pin-escape audit: many hooks resolve ops-root via _lib-ops-root.sh, which
honours the session pin inside a live Claude Code session → sandbox tests escape
onto the REAL fork (wrong results; real-file mutation for writing hooks).
- Systemic fix: bin/run-hook-tests.sh now exports APEXYARD_OPS_DISABLE_PIN=1, so
  the whole suite resolves by walk-up to its sandbox (no-op in headless CI; the
  pin-exercising test_resolve_ops_root_pin sets the var per-case, unaffected).
  This alone turned the local run from 7 spurious failures to green.
- Belt-and-suspenders: test_link_custom_skills (writes symlinks into
  <ops-root>/.claude/skills/) gets its own guard so a standalone in-session run
  can't symlink into the real repo.

(b) Rewrote test_handover_clone_prompt against the CURRENT /handover spec. The
old assertions pinned a removed [y/n/later] clone prompt; the SKILL now clones
by default at step 1.5-clone (skip-if-.git, --no-clone to decline) with a step-8
follow-up-skills offer. New spec-asserts + a runtime simulator for the
clone-by-default branch → 18/18.

QUARANTINE is now EMPTY — the gate enforces all 65 tests. Closes #528.

Closes #528

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@atlas-apex

Copy link
Copy Markdown
Collaborator Author

Closing — branch carried #530's pre-squash commits and conflicts with dev after #530 merged. Re-opening from a clean dev-based branch with the identical 3-file delta.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants