Description
What happens
push_repo_memory.cjs performs a single pull-then-push sequence to update the repo-memory branch. If the push fails — typically because another concurrent workflow pushed to the same branch between the pull and push — the script calls core.setFailed() and exits immediately. There is no retry loop or exponential backoff.
In a parallel pipeline (multiple agents dispatched simultaneously), all agents that complete their work around the same time will race on the repo-memory push. The first push wins; every subsequent push fails with a non-fast-forward error and the workflow step is marked as failed.
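A failure of this kind can be told apart from other push failures (bad credentials, network errors) by the rejection reason git prints. The sketch below is illustrative only: it assumes the push output is available as a string, which is not the case today because the script runs git with stdio: "inherit", and isConcurrentUpdateRejection is a hypothetical name rather than an existing helper.

```javascript
// Hypothetical helper: not part of push_repo_memory.cjs.
// git reports a stale push as "(non-fast-forward)" or, when the remote has
// commits the local clone has never fetched, as "(fetch first)".
const STALE_PUSH_MARKERS = ["non-fast-forward", "fetch first"];

/**
 * Returns true when captured `git push` output looks like a rejection caused
 * by a concurrent update, i.e. one that pulling again could resolve.
 */
function isConcurrentUpdateRejection(pushOutput) {
  const text = String(pushOutput).toLowerCase();
  return STALE_PUSH_MARKERS.some(marker => text.includes(marker));
}
```

With this check, output such as "! [rejected] HEAD -> memory (fetch first)" would be classified as a concurrent update, while "fatal: Authentication failed" would not.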
What should happen
The push should retry with exponential backoff on non-fast-forward failures. The pattern is: pull (with merge strategy), push, and if push fails due to a concurrent update, pull again and retry. This is a standard optimistic concurrency pattern for git-based shared state.
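The pattern can be sketched independently of git. The helper below is a minimal illustration, not the script's code: pushWithRetry and its pull, push, and sleep parameters are hypothetical names, injected so the loop can be exercised without a repository.

```javascript
// Hypothetical helper illustrating the pull / push / retry pattern.
// Unlike the real script, a failing pull propagates instead of logging a warning.
async function pushWithRetry({ pull, push, sleep, maxRetries = 3, baseDelayMs = 1000 }) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    // Merge whatever other writers pushed since the previous attempt.
    await pull();
    try {
      await push();
      return { ok: true, attempts: attempt + 1 };
    } catch (error) {
      if (attempt === maxRetries) {
        // Out of retries: report the last failure to the caller.
        return { ok: false, attempts: attempt + 1, error };
      }
      // The delay doubles each round: 1000, 2000, 4000 ms with the defaults.
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

Passing sleep as a parameter keeps the backoff schedule observable in tests; in the workflow it could be a plain setTimeout wrapped in a Promise.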
Where in the code
All references are to main at 99b2107.
Single pull:
push_repo_memory.cjs:372-380:

    try {
      const repoUrl = `https://x-access-token:${ghToken}@${serverHost}/${targetRepo}.git`;
      execGitSync(["pull", "--no-rebase", "-X", "ours", repoUrl, branchName], { stdio: "inherit" });
    } catch (error) {
      core.warning(`Pull failed (this may be expected): ${getErrorMessage(error)}`);
    }
Single push with hard failure:
push_repo_memory.cjs:382-390:

    try {
      const repoUrl = `https://x-access-token:${ghToken}@${serverHost}/${targetRepo}.git`;
      execGitSync(["push", repoUrl, `HEAD:${branchName}`], { stdio: "inherit" });
      core.info(`Successfully pushed changes to ${branchName} branch`);
    } catch (error) {
      core.setFailed(`Failed to push changes: ${getErrorMessage(error)}`);
      return;
    }
No retry logic: No loop, no backoff, no re-pull on push failure anywhere in the script.
Evidence
Source-level verification (2026-03-03):
- Searched the entire push_repo_memory.cjs for retry, backoff, loop, or re-attempt logic: none found
- The catch block on push (lines 388-390) calls core.setFailed() and returns immediately
- The pull step's catch block (lines 378-380) logs a warning but doesn't trigger a retry flow
Race condition analysis:
- In a 4-agent parallel pipeline, all agents may finish within seconds of each other
- Each agent's safe-outputs job runs push_repo_memory.cjs independently
- All agents pull the same state, commit their changes locally, then push
- First push succeeds; subsequent pushes get rejected (non-fast-forward) because their local branch is behind
- Without retry, those pushes hard-fail and repo-memory updates are lost
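The sequence above can be reproduced with a toy model. This is a sketch under simplifying assumptions, not the real workflow: createRemote stands in for the repo-memory branch as a plain counter, a push is accepted only if it was built on the current head, and agents run one after another rather than truly in parallel. All names are hypothetical.

```javascript
// Toy in-memory stand-in for a git remote: a push is accepted only when it
// was built on the current head, mirroring git's fast-forward check.
function createRemote() {
  let head = 0;
  return {
    pull: () => head,
    push(basedOn) {
      if (basedOn !== head) return false; // rejected (non-fast-forward)
      head += 1;
      return true;
    },
  };
}

// Every agent pulls before anyone pushes, then each pushes exactly once.
// Returns one boolean per agent: true if its push was accepted.
function raceWithoutRetry(agentCount) {
  const remote = createRemote();
  const bases = Array.from({ length: agentCount }, () => remote.pull());
  return bases.map(base => remote.push(base));
}

// Same start, but a rejected agent pulls again and retries until accepted.
// Returns the number of push attempts each agent needed.
function raceWithRetry(agentCount) {
  const remote = createRemote();
  const bases = Array.from({ length: agentCount }, () => remote.pull());
  return bases.map(base => {
    let attempts = 1;
    while (!remote.push(base)) {
      base = remote.pull();
      attempts += 1;
    }
    return attempts;
  });
}
```

In this model raceWithoutRetry(4) yields [true, false, false, false], matching the first-push-wins behavior, while raceWithRetry(4) yields [1, 2, 2, 2]: every update lands, at the cost of one extra pull and push for each agent that lost the initial race.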
Proposed fix
Wrap the pull-push sequence in a retry loop with exponential backoff:
    const MAX_RETRIES = 3;
    const BASE_DELAY_MS = 1000;
    const repoUrl = `https://x-access-token:${ghToken}@${serverHost}/${targetRepo}.git`;

    for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
      // Merge in whatever other agents pushed since the last attempt.
      try {
        execGitSync(["pull", "--no-rebase", "-X", "ours", repoUrl, branchName], { stdio: "inherit" });
      } catch (error) {
        core.warning(`Pull failed (this may be expected): ${getErrorMessage(error)}`);
      }

      try {
        execGitSync(["push", repoUrl, `HEAD:${branchName}`], { stdio: "inherit" });
        core.info(`Successfully pushed changes to ${branchName} branch`);
        return; // success
      } catch (error) {
        if (attempt < MAX_RETRIES) {
          const delay = BASE_DELAY_MS * Math.pow(2, attempt); // 1000, 2000, 4000 ms
          core.warning(`Push failed (attempt ${attempt + 1}/${MAX_RETRIES + 1}), retrying in ${delay}ms...`);
          // Assumes the enclosing function is async.
          await new Promise(resolve => setTimeout(resolve, delay));
        } else {
          core.setFailed(`Failed to push changes after ${MAX_RETRIES + 1} attempts: ${getErrorMessage(error)}`);
          return;
        }
      }
    }
Impact
Frequency: Every parallel pipeline dispatch. In our 4-agent runs, at least 1-2 repo-memory pushes fail per batch.
Cost: Moderate — repo-memory updates are lost for the failing agents. The step failure also adds noise to the run summary. The data isn't critical (repo-memory is advisory), but the step failure can be confusing for operators triaging run results.