test(synthetic): add 5 EKS noise scenarios #666
Greptile Summary
Adds 5 EKS noise/false-positive scenarios (009–013) to the synthetic test suite, bringing EKS coverage in line with RDS Postgres. All 33 new files are test data only (JSON evidence fixtures, YAML scenario metadata and answer keys) with no changes to production code.
Confidence Score: 4/5
Safe to merge with one scenario (012) containing a provably unsupportable timestamp claim in its expected model response.
One P1 finding remains: scenario 012's tests/synthetic/eks/012-pending-recovered/answer.yml (15-minute claim), tests/synthetic/eks/011-recovered-rollout/eks_events.json and tests/synthetic/eks/012-pending-recovered/eks_events.json (navigational hints in event messages)
Important Files Changed
Summary
The 5 scenarios
Real-LLM scoring (mock backends)
5/5 pass on the latest evidence state.
Note: 011 and 013 were initially flaky: the agent returned unknown/infrastructure on some runs. Mirroring critical recovery context into datadog_logs (the diagnose step weights Datadog evidence more heavily than EKS events for confident classification) got both to pass consistently.
The underlying behaviour is still worth tightening: the diagnose prompt in app/nodes/root_cause_diagnosis/prompt_builder.py could weight current-state signals above historical Warning events more aggressively. Happy to follow up in a separate PR.
What's in each scenario folder
scenario.yml - base inheritance from 000-healthy + adversarial signal metadata
alert.json - symptom-level alert (e.g. KubernetesPodCrashLooping)
answer.yml - expected category healthy, with negative-evidence rule-outs (e.g. "not OOM", "not insufficient")
000-healthy)
eks_events.json is kept strictly Warning-typed; recovery context lives in datadog_logs
Test plan
pytest tests/synthetic/eks/ - all 14 scenarios load
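The folder conventions described above (three metadata files per scenario, and eks_events.json kept strictly Warning-typed) could be guarded by a small check along these lines. This is only a sketch, not the suite's actual loader: `check_scenario` and `non_warning_events` are hypothetical helpers, and the fixture shape (a JSON list of event objects, or an `{"items": [...]}` wrapper, each with a Kubernetes-style `type` field) is an assumption that may not match the real files.

```python
import json
from pathlib import Path

# Files the PR description lists for each scenario folder.
REQUIRED_FILES = ("scenario.yml", "alert.json", "answer.yml")


def non_warning_events(events):
    """Return the events whose "type" is not "Warning".

    Assumes each event is a dict with a Kubernetes-style "type" field;
    the real fixture schema may differ.
    """
    return [e for e in events if e.get("type") != "Warning"]


def check_scenario(folder):
    """Collect human-readable problems for one scenario folder (hypothetical helper)."""
    folder = Path(folder)
    problems = [
        f"missing {name}" for name in REQUIRED_FILES if not (folder / name).is_file()
    ]

    # Scenarios inherit from 000-healthy, so the events file may be absent.
    events_path = folder / "eks_events.json"
    if events_path.is_file():
        data = json.loads(events_path.read_text())
        # Assumption: either a bare list of events or an {"items": [...]} wrapper.
        events = data.get("items", []) if isinstance(data, dict) else data
        for event in non_warning_events(events):
            problems.append(
                f"non-Warning event in eks_events.json: {event.get('type')!r}"
            )
    return problems
```

Calling `check_scenario` on a scenario folder returns an empty list when the folder follows the conventions, so it could be wrapped in a parametrized pytest case if the real fixture format turns out to match these assumptions.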