test(#528): drain quarantine — pin-escape audit + handover rewrite#532
Closed
atlas-apex wants to merge 2 commits into
Conversation
Root cause was test isolation, not a config regression. apply-agent-routing.sh resolves the ops root via _lib-ops-root.sh, which inside a Claude Code session honours the session pin ($APEXYARD_OPS_PIN_DIR/ops-root-$CLAUDE_CODE_SESSION_ID) and pointed at the REAL fork instead of the mktemp sandbox, so the hook rewrote the real .claude/agents/*.md (+ a stray snapshot) and every sandbox assertion failed.

Two fixes:

- export APEXYARD_OPS_DISABLE_PIN=1 at the top of the test so ops-root resolves by walk-up to the sandbox (no-op in headless CI, which has no pin). This alone took the suite from 4/14 to 13/14 and stopped the real-repo mutation.
- case 4 (idempotency) declared endpoint http://localhost:11434 but, unlike cases 9-13, never mocked curl, so the unreachable endpoint was correctly filtered and the endpoint_count assertion tested the environment. Hoisted make_mock_curl above case 4 and mocked the endpoint reachable → 14/14.

Un-quarantined in bin/run-hook-tests.sh. Only test_handover_clone_prompt remains quarantined (it spec-pins a removed clone-prompt design; needs a rewrite).

Refs #528

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
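The curl-mocking fix described in this commit message can be sketched roughly as below. This is a minimal illustration, not the PR's code: the body of `make_mock_curl`, the `endpoint_reachable` probe, and the `SANDBOX`/`MOCKBIN` variables are all hypothetical stand-ins; only the names `make_mock_curl`, `APEXYARD_OPS_DISABLE_PIN`, and the endpoint URL come from the commit message.

```shell
# Hypothetical sketch; the real make_mock_curl is not shown in this thread.

# Keep ops-root resolution inside the sandbox (variable name taken from the PR).
export APEXYARD_OPS_DISABLE_PIN=1

SANDBOX="$(mktemp -d)"
MOCKBIN="$SANDBOX/mockbin"
mkdir -p "$MOCKBIN"

# Put the mock directory first on PATH so `curl` resolves to the fake binary.
PATH="$MOCKBIN:$PATH"
export PATH

# make_mock_curl <exit-code>: install a fake curl that ignores its arguments
# and exits with the given status (0 = reachable, non-zero = unreachable).
make_mock_curl() {
  printf '#!/bin/sh\nexit %s\n' "$1" > "$MOCKBIN/curl"
  chmod +x "$MOCKBIN/curl"
}

# endpoint_reachable <url>: assumed stand-in for the hook's reachability probe.
endpoint_reachable() {
  curl --silent --max-time 2 --output /dev/null "$1"
}

# Mock the endpoint as reachable, as the fix does for case 4.
make_mock_curl 0
if endpoint_reachable "http://localhost:11434"; then mocked_up=yes; else mocked_up=no; fi

# Mock it as unreachable (7 is curl's "failed to connect" exit status).
make_mock_curl 7
if endpoint_reachable "http://localhost:11434"; then mocked_down=yes; else mocked_down=no; fi

echo "mocked reachable: $mocked_up, mocked unreachable: $mocked_down"
```

With the mock installed before the assertion runs, the outcome depends on the test's own setup rather than on whether anything is listening on the developer's machine.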
(a) Pin-escape audit: many hooks resolve ops-root via _lib-ops-root.sh, which honours the session pin inside a live Claude Code session → sandbox tests escape onto the REAL fork (wrong results; real-file mutation for writing hooks).

- Systemic fix: bin/run-hook-tests.sh now exports APEXYARD_OPS_DISABLE_PIN=1, so the whole suite resolves by walk-up to its sandbox (no-op in headless CI; the pin-exercising test_resolve_ops_root_pin sets the var per-case, unaffected). This alone turned the local run from 7 spurious failures to green.
- Belt-and-suspenders: test_link_custom_skills (writes symlinks into <ops-root>/.claude/skills/) gets its own guard so a standalone in-session run can't symlink into the real repo.

(b) Rewrote test_handover_clone_prompt against the CURRENT /handover spec. The old assertions pinned a removed [y/n/later] clone prompt; the SKILL now clones by default at step 1.5-clone (skip-if-.git, --no-clone to decline) with a step-8 follow-up-skills offer. New spec-asserts + a runtime simulator for the clone-by-default branch → 18/18.

QUARANTINE is now EMPTY, so the gate enforces all 65 tests.

Closes #528.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
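The clone-by-default branch from (b) might be simulated along these lines. This is a hedged sketch only: `decide_clone` and its three output labels are hypothetical, and the order in which `--no-clone` and the existing-`.git` check are evaluated is an assumption (both outcomes mean no clone happens, so the order only affects the label).

```shell
# Hypothetical simulator for the clone-by-default decision; not the PR's code.

# decide_clone <target-dir> [flags...]
# Prints one of: declined | skip-existing | clone
decide_clone() {
  target="$1"
  shift
  # --no-clone declines the default clone (flag name from the commit message).
  for arg in "$@"; do
    if [ "$arg" = "--no-clone" ]; then
      echo "declined"
      return 0
    fi
  done
  # skip-if-.git: a target that already has a .git entry is left alone.
  if [ -e "$target/.git" ]; then
    echo "skip-existing"
    return 0
  fi
  # Default branch: no flag and no existing checkout, so clone.
  echo "clone"
}
```

A test can then build a fresh directory and one containing `.git` under `mktemp -d` and assert the label printed for each, without ever running `git clone`.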
Summary
Completes #528: drains the test-runner quarantine to empty. (Originally PR #531, which GitHub auto-closed when #530's branch was deleted on merge; re-opened here against `dev` with the same already-Rex-reviewed HEAD `5d2f3b7`.)

(a) Pin-escape audit + fix. Many hooks resolve ops-root via `_lib-ops-root.sh`, which inside a live Claude Code session honours the session pin → sandbox tests escape onto the real fork (wrong results; real-file mutation for writing hooks). 21 hooks use the pin; only 4 tests guarded.

- `bin/run-hook-tests.sh` now `export APEXYARD_OPS_DISABLE_PIN=1`, so the whole suite resolves by walk-up to its sandbox (no-op in headless CI; `test_resolve_ops_root_pin` sets the var per-case, unaffected). Turned the local run from 7 spurious failures → green.
- `test_link_custom_skills` (symlinks into `.claude/skills/`) gets its own guard.

(b) Rewrote `test_handover_clone_prompt`. Old assertions pinned a removed `[y/n/later]` clone prompt; the SKILL now clones-by-default at step 1.5-clone (skip-if-`.git`, `--no-clone`) + a step-8 follow-up offer. New spec-asserts + a clone-by-default runtime simulator → 18/18.

Result: `QUARANTINE` is empty, so the gate enforces all 65 tests.

Closes #528.

Testing

- `bash bin/run-hook-tests.sh` → PASS=65, FAIL=0, SKIP=0 (was 7 failing before the pin guard).
- `bash .claude/hooks/tests/test_handover_clone_prompt.sh` → 18/18.
- `test_resolve_ops_root_pin` + `test_link_custom_skills` pass under the suite default; no real-repo mutation after runs.
- `shellcheck -S error` clean on all three changed files.
- `tests` gate is green enforcing 65.
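To make the pin escape concrete, here is a rough sketch of how a resolver could honour the session pin and how `APEXYARD_OPS_DISABLE_PIN=1` forces the walk-up path instead. The function `resolve_ops_root` is a hypothetical stand-in for `_lib-ops-root.sh`, whose source is not shown here; it assumes the pin file holds the ops-root path as plain text and that a `.claude` directory marks the ops root during walk-up.

```shell
# Hypothetical stand-in for _lib-ops-root.sh; pin-file contents and the
# .claude walk-up marker are assumptions, not taken from the PR.

# resolve_ops_root <absolute-start-dir>: print the resolved ops root.
resolve_ops_root() {
  dir="$1"
  pin="${APEXYARD_OPS_PIN_DIR:-}/ops-root-${CLAUDE_CODE_SESSION_ID:-}"

  # Honour the session pin only when it is not disabled, both variables are
  # set, and the pin file exists. This is the branch that lets a sandboxed
  # test escape onto the real fork.
  if [ "${APEXYARD_OPS_DISABLE_PIN:-0}" != "1" ] \
     && [ -n "${APEXYARD_OPS_PIN_DIR:-}" ] \
     && [ -n "${CLAUDE_CODE_SESSION_ID:-}" ] \
     && [ -f "$pin" ]; then
    cat "$pin"
    return 0
  fi

  # Walk up from the start directory until a .claude directory is found.
  while [ -n "$dir" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
    if [ -d "$dir/.claude" ]; then
      echo "$dir"
      return 0
    fi
    dir="$(dirname "$dir")"
  done
  return 1
}
```

Under these assumptions, a test running inside a live session with a pin file present resolves to the pinned path, while the same call with `APEXYARD_OPS_DISABLE_PIN=1` resolves to the nearest ancestor of the start directory that contains `.claude`, which is the sandbox.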
Glossary
`_lib-ops-root.sh` resolving the ops root from `$APEXYARD_OPS_PIN_DIR/ops-root-$CLAUDE_CODE_SESSION_ID` inside a live session.
`APEXYARD_OPS_DISABLE_PIN`