Bug
When Copilot CLI hits a rate limit AFTER successfully completing all work (build, test, create-issue), it exits with code 1. The compiled workflow propagates this as a job failure, even though the agent's work was fully successful.
Root Cause
Two issues in compiled workflow YAML:
- Main agent step (copilot_engine_execution.go): No continue-on-error. Copilot CLI exits 1 on rate limit even after successful completion. Safe outputs are already persisted via MCP before the rate limit hits.
- Threat detection step (threat_detection.go): Also lacks continue-on-error. When this step fails (rate limit), it fails the agent job even though "Set detection conclusion" handles failures gracefully.
Suggested Fix
- Check if safeoutputs.jsonl has content despite non-zero exit; if the agent produced outputs, treat as success
- Add continue-on-error: true to threat detection step (conclusion step already handles failures)
- Or: have Copilot CLI distinguish rate-limit exits (exit 0 + warning) from real failures (exit 1)
Impact
Inflates the failure count in batch experiment runs. At least 3 repos were miscounted as failures in v6.