agents.md: ban adding test skips without explicit user request #21232
Merged
Conversation
Adds an explicit rule that agents must never add t.Skip / SkipLoad / any equivalent skip mechanism unless the user explicitly asks for it, and must never offer skips as suggestions in user-facing option menus.

Skipping a failing test hides the bug without fixing it. Removing the skip later surfaces the bug as a CI failure with no context about when or why the skip was added. Concrete case: #21153 removed a t.Skip on TestGeneratedTraceApiCollision that had documented a known parallel-exec SD/CREATE2-reincarnation bug; the underlying bug had never been fixed, so removing the skip surfaced CI failures across downstream PRs (#21017) with no easy attribution.

The rule covers all forms of test muting (t.Skip, SkipLoad, bt.SkipLoad, build-tag exclusions, runner-matrix removal without tracking, etc.) and states the one acceptable workflow: explicit user request in the current turn, comment linking a tracking issue, and revert as soon as the bug is fixed.
Rewords the skip-policy section to:
- Apply project-wide (every contributor + every automated agent), not just to agents reading agents.md.
- Spell out the two valid skip reasons (external test suites we haven't implemented; flakes with low tolerance + tracking issue + local-repro investigation).
- Add an explicit, strict rule for automated agents: never add a skip, period. Not even with a 'user can review it' framing. Not as an AskUserQuestion option. Not as a tactical unblock.

The end state we want: humans add skips only for the two narrow reasons with tracking issues; automated agents always go through investigate -> reproduce -> fix, never -> skip.
AskAlexSharov approved these changes on May 17, 2026
Summary
Adds a project-level policy in agents.md for test skips. The rule applies to every contributor and to every automated agent (LLM coding assistants etc.) working in the repo.
Three-part policy
- Why skips are dangerous. Concrete case: #21153 (execution: parallel-exec correctness + perf fixes; gate residual EXEC3_PARALLEL flakes (#21136)) removed a t.Skip documenting a known parallel-exec SD/CREATE2 bug; the underlying bug was never actually fixed, so removing the skip suddenly turned CI red across downstream PRs.
- Two valid skip reasons (humans only, with tracking issue): external test suites we haven't implemented, and flakes with low tolerance plus a local-repro investigation.
- Strict ban for automated agents: agents must never add a skip, period. Not as a tactical unblock, not behind a flag, not as an AskUserQuestion option. The default trajectory for a failing test is investigate → reproduce → fix → verify. If an agent genuinely can't fix the test in-session, it escalates with its investigation findings and never silently mutes the test.
Why
Skips convert a loud "this is broken" CI signal into silence, then back into surprise when the skip is later removed. The end state we want: humans add skips only for the two narrow reasons with tracking issues; automated agents always go through investigate → reproduce → fix, never → skip.
Test plan
- agents.md; no code paths affected.
- agents.md/CLAUDE.md at session start.