… and model timeout failures
This commit addresses issue #331 by making Daytona-backed workflow runs
fail fast or recover cleanly instead of leaving stuck agent sessions.
Changes:
1. Workspace readiness gate (bootstrapDaytona):
- Before setting bootstrap_ok=1 or starting the bridge, verify every
configured repository is present at the expected path with a .git dir.
- If repos are missing, mark the claw as failed with a sanitized,
actionable bootstrap error (bootstrap_diagnostic).
- Uses setBootstrapStatusWithDiagnostic to persist both status and
diagnostic for UI display.
2. Tool-loop detection (detectToolLoop):
- Extended to catch repeated exec/elevated/tool-policy failures:
  - 'exec failed:'
  - 'elevated is not available'
  - 'tool-policy'
- Injects a corrective hub message after 3+ occurrences, similar to
existing edit/write/read loop detection.
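
A minimal sketch of this kind of substring-based loop detection follows. It is not the actual detectToolLoop: the names failurePatterns, countToolFailures and isToolLoop are hypothetical, and the sketch assumes occurrences are counted across all three patterns combined over a slice of recent messages, which the change description does not specify.

```go
package main

import (
	"fmt"
	"strings"
)

// failurePatterns holds the substrings listed in the change description.
var failurePatterns = []string{
	"exec failed:",
	"elevated is not available",
	"tool-policy",
}

// loopThreshold mirrors the "3+ occurrences" rule from the description.
const loopThreshold = 3

// countToolFailures (hypothetical) counts messages that contain at least one
// failure pattern. A message matching several patterns is counted once.
func countToolFailures(messages []string) int {
	count := 0
	for _, msg := range messages {
		for _, pattern := range failurePatterns {
			if strings.Contains(msg, pattern) {
				count++
				break
			}
		}
	}
	return count
}

// isToolLoop (hypothetical) reports whether enough failures were seen to
// justify injecting a corrective message.
func isToolLoop(messages []string) bool {
	return countToolFailures(messages) >= loopThreshold
}

func main() {
	recent := []string{
		"exec failed: permission denied",
		"elevated is not available in this sandbox",
		"blocked by tool-policy",
	}
	fmt.Println(isToolLoop(recent))     // three matching messages
	fmt.Println(isToolLoop(recent[:1])) // a single failure
}
```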
3. Model timeout classification (heartbeat handler):
- When the gateway is unhealthy for 8+ consecutive checks (~4 minutes) while
streaming, classify the failure as 'model timeout'.
- Persists diagnostic in bootstrap_diagnostic column and broadcasts it
to the UI as a retryable failure.
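
The consecutive-check rule can be sketched as a small counter, shown below. This is an illustration only, not the heartbeat handler's code: the type gatewayMonitor and its observe method are hypothetical, and resetting the counter whenever the gateway is healthy or the session is not streaming is an assumption. Eight checks in about four minutes suggests roughly 30 seconds between checks.

```go
package main

import "fmt"

// unhealthyThreshold mirrors the "8+ consecutive checks" rule.
const unhealthyThreshold = 8

// gatewayMonitor (hypothetical) tracks consecutive unhealthy heartbeat checks.
type gatewayMonitor struct {
	consecutiveUnhealthy int
}

// observe records one heartbeat check and reports whether the run should now
// be classified as a model timeout. The counter resets when the gateway is
// healthy or the session is not streaming (an assumption of this sketch).
func (m *gatewayMonitor) observe(healthy, streaming bool) bool {
	if healthy || !streaming {
		m.consecutiveUnhealthy = 0
		return false
	}
	m.consecutiveUnhealthy++
	return m.consecutiveUnhealthy >= unhealthyThreshold
}

func main() {
	m := &gatewayMonitor{}
	for check := 1; check <= unhealthyThreshold; check++ {
		if m.observe(false, true) {
			fmt.Printf("check %d: classify as model timeout\n", check)
		}
	}
}
```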
4. Bootstrap diagnostic persistence:
- Added bootstrap_diagnostic column to claws table (schema + migration).
- Added BootstrapDiagnostic field to types.Claw.
- stopAgentWithReason now persists sanitized diagnostic to DB and
includes it in the broadcast payload.
- setBootstrapStatusWithDiagnostic helper added for gating failures.
5. Tests:
- Added daytona_resilience_test.go with tests for:
  - elevated failure loop detection
  - tool-policy failure loop detection
  - single failure not triggering loop
  - mixed failure patterns
  - bootstrap output sanitization for workspace diagnostics
Verification:
- go build ./... passes
- go test ./pkg/hub/... passes (all 100+ tests)
Closes #331