Add OpenClaw orchestration support#173
Conversation
…est suite Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cleanup - Fix command injection via unvalidated $host with is_valid_hostname() - Add JSON validation guard on step output in turn loop - Restrict preexisting-branch check to local refs with gitLocalBranchExists() - Consolidate duplicate if-gf.JSON blocks and add human-mode session_exited - Add test for stateErr != nil capture fallthrough path - Extract handlePromptSendError helper to deduplicate Codex retry - Shell-quote agent/workspace IDs in suggested commands - Add upper bound (50 attempts) to watch initial capture loop - Fix SKILL.md text: remove tautology, prefix legacy script paths - Document needs_input heuristic limitations on broad question markers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The workflow tests were appearing to hang (~16s each) because openclaw-present.sh's 250-line jq transform was invoked 3 times per workflow dual run (~5s per invocation). These tests verify workflow logic, not presentation augmentation, so bypass the present script via OPENCLAW_PRESENT_SCRIPT=/nonexistent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ry, fix script robustness - Narrow question-needs-input markers to prefix matches to reduce false positives - Add captureWaitBaselineWithRetry for transient tmux capture failures - Preserve hysteresis state for known sessions skipped by activity gate - Add assistant mismatch error on workspace reuse - Use branchExists probe before falling back to error-string parsing - Fix openclaw-step.sh exit codes to propagate errors to orchestrator - Use mktemp for temp files and shell_quote for prompt escaping in DX scripts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…, jq compat - Add missing fallback append for stale-tag sessions in activity tracking - Add "which do you" marker and last-sentence extraction for mid-line questions - Remove extra branchExists() call that broke mock expectations in CreateWorkspace - Replace jq 1.7-only rindex with rindex_compat polyfill using indices (jq 1.5+) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…atch gate, tmux error handling - Remove erroneous fallback append for known stale sessions without recent activity, which was causing unnecessary capture-pane work (P1) - Gate workspace assistant mismatch error on explicit --assistant flag so default-filled values don't trigger false conflicts (P1) - Return early on bulk tmux query failure instead of using empty map fallback, preventing false "stopped" tab notifications (P2) - Quote $port and $host in cmd_terminal_preset launch commands - Add ghs_ and glpat- secret redaction patterns to all three scripts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve conflicts in activity logic: use HasRecentWindowActivity (ShouldFallbackForStaleTag was a broken reference). Keep main's new KnownFreshTagCaptureFailurePreservesActivity test (moved to state_test.go to stay under 500-line limit). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
`label` is a reserved keyword in jq 1.6 (CI's version). jq 1.7 relaxed this, so tests passed locally but failed in CI with exit 3. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| preSendContent, _ := tmuxCapturePaneTail(sessionName, 80, tmuxOpts) | ||
| preSendHash := tmux.ContentHash(preSendContent) | ||
|
|
||
| if err := tmuxSendKeys(sessionName, prompt, true, tmuxOpts); err != nil { | ||
| if killErr := tmuxKillSession(sessionName, tmuxOpts); killErr != nil { | ||
| slog.Debug("best-effort session kill failed", "session", sessionName, "error", killErr) | ||
| } | ||
| if gf.JSON { | ||
| return returnJSONErrorMaybeIdempotent( | ||
| w, wErr, gf, version, "agent.run", idempotencyKey, | ||
| ExitInternalError, "prompt_send_failed", err.Error(), map[string]any{ | ||
| "session_name": sessionName, | ||
| }, | ||
| ) | ||
| return handlePromptSendError(w, wErr, gf, version, idempotencyKey, sessionName, tmuxOpts, err, "send") | ||
| } | ||
|
|
||
| // Codex startup can still occasionally drop the very first prompt even when | ||
| // a cursor is visible. If pane output does not change after send, retry once. | ||
| if strings.EqualFold(strings.TrimSpace(assistantName), "codex") && | ||
| !waitForPromptDelivery(sessionName, preSendHash, tmuxOpts) { | ||
| waitForPaneOutput(sessionName, assistantName, tmuxOpts) | ||
| if err := tmuxSendKeys(sessionName, prompt, true, tmuxOpts); err != nil { | ||
| return handlePromptSendError(w, wErr, gf, version, idempotencyKey, sessionName, tmuxOpts, err, "retry") | ||
| } |
There was a problem hiding this comment.
🔴 Codex prompt retry can be skipped because baseline capture success is ignored
The initial prompt send path computes a baseline pane hash without checking whether the baseline capture actually succeeded. If the baseline capture fails (returns ok=false), preSendContent becomes the empty string and preSendHash becomes the hash of empty content. Subsequent delivery polling will almost always report “delivered” as soon as a later capture succeeds (because any non-empty pane hash differs from the empty baseline), even if the prompt was actually dropped.
Root Cause
In sendAgentRunPromptIfRequested, baseline capture ignores the ok flag:
preSendContent, _ := tmuxCapturePaneTail(...)andpreSendHash := tmux.ContentHash(preSendContent)internal/cli/cmd_agent_run_liveness.go:94-96
Then Codex retry is gated on waitForPromptDelivery(..., preSendHash, ...):
!waitForPromptDelivery(sessionName, preSendHash, tmuxOpts)internal/cli/cmd_agent_run_liveness.go:103-108
If the baseline capture failed, the retry decision becomes unreliable (false positives for “delivered”).
Actual behavior: Codex may drop the first prompt; the code may incorrectly decide delivery happened and skip the one allowed retry.
Expected behavior: When baseline capture fails, either (a) re-attempt baseline capture, or (b) treat delivery as unknown and allow the retry path (or use a different delivery confirmation signal).
Impact: Intermittent “prompt dropped” failures can persist under tmux load/unresponsiveness even though the retry mechanism exists, leading to agent run --prompt occasionally doing nothing until the user manually retries.
Prompt for agents
In internal/cli/cmd_agent_run_liveness.go within sendAgentRunPromptIfRequested (around lines 94-109), incorporate the ok result from tmuxCapturePaneTail when computing the pre-send baseline used by waitForPromptDelivery.
Specifically:
1) Capture baseline with (content, ok). If ok is false, do not use the empty-string hash as the baseline for delivery detection.
2) Choose one of these fixes:
a) Re-capture the baseline a small bounded number of times (with short sleep) until ok=true or attempts exhausted; if exhausted, treat delivery as unknown and allow the single Codex retry.
b) Gate the current delivery check on ok==true; if ok==false, always run the single retry path (after another readiness wait) because delivery could not be verified.
Keep behavior unchanged for non-Codex assistants. Ensure the retry still happens at most once.
Was this helpful? React with 👍 or 👎 to provide feedback.
| initialAttempts++ | ||
| if initialAttempts > watchInitialCaptureMaxAttempts { | ||
| if !emitEvent(enc, watchEvent{ | ||
| Type: "exited", | ||
| Timestamp: now(), | ||
| }) { | ||
| return ExitOK | ||
| } | ||
| return ExitOK | ||
| } |
There was a problem hiding this comment.
🟡 agent watch can emit "exited" after initial capture retries even when session still exists
The initial snapshot loop in agent watch has a hard cap (watchInitialCaptureMaxAttempts). When that cap is exceeded, it emits an exited event unconditionally, without verifying that the tmux session is actually gone.
Detailed Explanation
In runWatchLoopWith, the initial capture loop does state-aware handling of capture misses via watchShouldEmitExited(...). However, even if watchShouldEmitExited repeatedly determines the session still exists (or tmux state checks are erroring), the loop still increments initialAttempts and will eventually hit the hard limit and emit exited:
initialAttempts++and the unconditionalif initialAttempts > watchInitialCaptureMaxAttempts { emit exited }internal/cli/cmd_agent_watch.go:154-163
Actual behavior: under prolonged tmux slowness/unresponsiveness (capture fails but session exists), agent watch can incorrectly report type: "exited".
Expected behavior: exited should only be emitted when the session is confirmed not to exist; otherwise the loop should either keep waiting (bounded by context) or emit a distinct error/timeout event.
Impact: Downstream consumers may treat the agent as stopped and stop monitoring prematurely, even though the session is still alive.
Prompt for agents
In internal/cli/cmd_agent_watch.go within runWatchLoopWith’s initial snapshot capture loop (around lines 154-163), avoid emitting a misleading exited event solely because watchInitialCaptureMaxAttempts was reached.
Adjust the limit handling to confirm the session is truly gone before emitting exited:
- On exceeding watchInitialCaptureMaxAttempts, call tmuxSessionStateFor(cfg.SessionName, opts). If state.Exists is false (and no error), emit exited.
- If state.Exists is true or the state check errors, either (a) continue waiting until ctx.Done(), or (b) emit a new event type (e.g., "error" or "unavailable") indicating capture failure/tmux unresponsive, but do not claim the session exited.
Keep the existing state-aware logic (watchShouldEmitExited) for normal miss handling.
Was this helpful? React with 👍 or 👎 to provide feedback.
…est suite
Summary
Describe the change and intended behavior.
Quality Checklist
make devchecklocally.make lint-strict-newlocally for changed code.make harness-presets.go test ./internal/tmux ./internal/e2e.