test(synthetic): add 8 Kubernetes RCA scenarios #661
Greptile Summary

Adds 8 synthetic Kubernetes RCA scenarios (OOMKilled, ImagePullBackOff, pending/unschedulable, liveness-probe killing, resource-quota exceeded, DNS failure, node-not-ready, stuck rollout) to the existing test suite from #583. Each scenario follows the base-override fixture model and includes alert, evidence, and graded

Prior review concerns (contradictory deployment counters in 002/008, missing

Confidence Score: 5/5

Safe to merge; all remaining findings are P2 quality improvements to scoring metadata, not runtime or data-correctness issues.

All P0/P1 issues from previous review rounds (contradictory ready/unavailable counts, missing available_evidence, root-cause tokens in ruling_out_keywords for 004 and 008) are resolved. The only remaining finding is a P2 concern about weak ruling_out_keywords in three answer.yml files. This is a scoring-signal quality issue that doesn't affect test correctness or the agent's pass/fail outcome.

Files needing attention: tests/synthetic/eks/003-pending-insufficient-resources/answer.yml, tests/synthetic/eks/005-resource-quota-exceeded/answer.yml, tests/synthetic/eks/006-dns-resolution-failure/answer.yml. In each, ruling_out_keywords should use negative-framing tokens.
Prompt To Fix All With AI

This is a comment left during a code review.
Path: tests/synthetic/eks/003-pending-insufficient-resources/answer.yml
Line: 17-19
Comment:
**`ruling_out_keywords` are positive-evidence tokens, not negative-framing**
`nodes` and `Ready` will appear trivially in any correct diagnosis of this scenario (the answer naturally mentions "both nodes are Ready=True but their allocatable CPU is nearly zero"). They carry no signal that the agent specifically ruled out an alternative hypothesis (e.g. node failure, OOM, image pull).
Compare the fixed usage in scenarios 004 and 008: `"not OOM"`, `"exit code 0"`, `"not quota"`. For scenario 003 the intended ruling-out claim is that the nodes themselves are healthy and the block is CPU capacity, so tokens like `"not node failure"` or `"not OOM"` would actually test that conclusion. The same pattern applies to scenario 005 (`existing pods`, `Ready`) and scenario 006 (`Ready`, `restart`).
How can I resolve this? If you propose a fix, please make it concise.

Reviews (4): Last reviewed commit: "fix(synthetic): make available_evidence ..."
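To illustrate the review comment's point, here is a minimal sketch of why `nodes` and `Ready` carry no ruling-out signal. It assumes the grader does a case-insensitive substring match of each keyword against the agent's answer; that matching rule, the `matched_keywords` helper, and the sample diagnoses are hypothetical and not taken from the actual test harness.

```python
def matched_keywords(answer: str, keywords: list[str]) -> list[str]:
    """Return the keywords found in the answer (case-insensitive substring match).

    Assumption: this mirrors how ruling_out_keywords might be scored;
    the real grader may work differently.
    """
    text = answer.lower()
    return [kw for kw in keywords if kw.lower() in text]


# Hypothetical agent answers for scenario 003 (pending, insufficient resources).
diagnosis_only = (
    "Both nodes are Ready=True but their allocatable CPU is nearly zero, "
    "so the pod stays Pending."
)
diagnosis_with_ruling_out = diagnosis_only + " This is not node failure and not OOM."

weak_keywords = ["nodes", "Ready"]                 # positive-evidence tokens
strong_keywords = ["not node failure", "not OOM"]  # negative-framing tokens

# Weak tokens match even when nothing was explicitly ruled out.
weak_hits = matched_keywords(diagnosis_only, weak_keywords)

# Strong tokens match only when the answer states what was ruled out.
strong_hits_plain = matched_keywords(diagnosis_only, strong_keywords)
strong_hits_explicit = matched_keywords(diagnosis_with_ruling_out, strong_keywords)

print(weak_hits, strong_hits_plain, strong_hits_explicit)
```

Under this assumed matching rule, the weak tokens cannot distinguish an answer that ruled out alternatives from one that did not, while the negative-framing tokens can.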
Summary
Why this matters
One thing to fix later
Two scenarios (quota and rollout-stuck) needed a small workaround because EKS tool output is currently dropped before reaching the agent (known issue from #583). For now I mirrored the key signals into Datadog logs so the agent can see them. Once the EKS evidence wiring is fixed in a follow-up, the workaround can be removed.
Test plan