Harden Replicated provisioning retries #355

Merged
AnaBerg merged 6 commits into main from codex/replicated-provisioning-reliability on Jun 7, 2026
@AnaBerg AnaBerg commented Jun 4, 2026

Collaborator

Summary

  • Add bounded retry/status handling for Replicated post-bootstrap SSH steps.
  • Treat workspace file staging, workspace verification, and GitHub credential setup as required before logging bootstrap completion.
  • Harden remote SSH file writes with quoted paths, parent directory creation, and base64 payload transfer.
  • Add tests for retry behavior, path quoting, readiness command generation, and remote write command generation.
  • Add the implementation spec for issue #271 (more reliability in claw provisioning).
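
The remote write hardening listed above (quoted paths, parent directory creation, base64 payload transfer) can be illustrated with a minimal sketch. This is not the code from this PR: `shellSingleQuote` and `remoteWriteCommand` are hypothetical names, and the sketch uses single quoting for brevity even though the test name `TestShellDoubleQuote` suggests the real helper double-quotes. It also assumes the remote host has a `base64` binary that accepts `-d`.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"path"
	"strings"
)

// shellSingleQuote wraps s in single quotes for a POSIX shell and rewrites
// each embedded single quote as '\'' so the value stays one literal word.
// Hypothetical helper, not taken from the PR.
func shellSingleQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// remoteWriteCommand builds a shell command that creates the parent
// directory of remotePath and writes content to it. The payload travels as
// base64 so newlines, quotes, and binary bytes cannot break the command.
// Hypothetical helper; assumes `base64 -d` is available on the remote host.
func remoteWriteCommand(remotePath string, content []byte) string {
	payload := base64.StdEncoding.EncodeToString(content)
	dir := shellSingleQuote(path.Dir(remotePath))
	file := shellSingleQuote(remotePath)
	return fmt.Sprintf("mkdir -p -- %s && printf %%s %s | base64 -d > %s",
		dir, shellSingleQuote(payload), file)
}

func main() {
	// A path with a space and an apostrophe exercises the quoting.
	fmt.Println(remoteWriteCommand("/home/claw/my workspace/it's.md", []byte("hello\n")))
}
```

Because the standard base64 alphabet contains no shell metacharacters once quoted, only the path needs careful escaping; the file content never appears in the command in raw form.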

Root Cause

Replicated claws could hit transient SSH connection refusals after the bridge bootstrap completed. The previous flow only logged warnings when template file staging or GitHub credential helper setup failed, then still logged "Bootstrap complete", leaving GitHub issue factory claws with incomplete workspaces or missing credentials.

Validation

  • go test ./pkg/hub -run 'TestRetryReplicatedBootstrapStep|TestReplicatedWorkspaceReadinessCommand|TestShellDoubleQuote|TestRemoteWriteFileCommand|TestBootstrapScript' -count=1
  • go test ./pkg/hub
  • go test ./pkg/provider/replicated ./pkg/provider/daytona ./pkg/provider/exedev
  • go test ./...
  • cr --base feat/docker-web-env, rerun until CodeRabbit reported no findings

@AnaBerg AnaBerg changed the title from "[codex] Harden Replicated provisioning retries" to "Harden Replicated provisioning retries" Jun 4, 2026
@AnaBerg AnaBerg self-assigned this Jun 4, 2026
Base automatically changed from feat/docker-web-env to main June 5, 2026 16:10
@AnaBerg AnaBerg force-pushed the codex/replicated-provisioning-reliability branch from e66c8fa to bc164b0 Compare June 5, 2026 16:34

@AnaBerg AnaBerg left a comment

Collaborator Author

This PR is scoped to Replicated provisioning reliability, but it also changes AI config selection, external webhook trigger handling, Docker provisioning behavior, script command validation, and a UI title edge case. That broadens the risk and makes review/rollback harder. Please split these unrelated changes into separate PRs or explicitly justify why they are required for issue #271.

Comment thread docs/superpowers/plans/2026-06-04-replicated-provisioning-reliability.md Outdated
@AnaBerg AnaBerg marked this pull request as ready for review June 5, 2026 20:07

greptile-apps Bot commented Jun 5, 2026

Contributor

Reviews (1): Last reviewed commit: "chore: keep replicated PR scoped" | Re-trigger Greptile

Comment thread pkg/hub/server.go Outdated
Comment thread pkg/hub/server.go Outdated

greptile-apps Bot commented Jun 5, 2026

Contributor

Reviews (2): Last reviewed commit: "fix: address replicated review comments" | Re-trigger Greptile

@AnaBerg AnaBerg linked an issue Jun 5, 2026 that may be closed by this pull request
…isioning-reliability

# Conflicts:
#	pkg/hub/server.go

greptile-apps Bot commented Jun 7, 2026

Contributor

Reviews (3): Last reviewed commit: "Merge remote-tracking branch 'origin/mai..." | Re-trigger Greptile

@AnaBerg AnaBerg merged commit 58d9c0f into main Jun 7, 2026
11 checks passed
@AnaBerg AnaBerg deleted the codex/replicated-provisioning-reliability branch June 7, 2026 21:19
Development

Successfully merging this pull request may close these issues.

more reliability in claw provisioning

1 participant