Skip to content

Add OpenClaw orchestration support#173

Merged
andyrewlee merged 10 commits intomainfrom
openclaw
Feb 20, 2026
Merged

Add OpenClaw orchestration support#173
andyrewlee merged 10 commits intomainfrom
openclaw

Conversation

@andyrewlee
Copy link
Copy Markdown
Owner

@andyrewlee andyrewlee commented Feb 19, 2026

…est suite

Summary

Describe the change and intended behavior.

Quality Checklist

  • Ran make devcheck locally.
  • Ran make lint-strict-new locally for changed code.
  • If UI/rendering changed, ran make harness-presets.
  • If tmux/e2e changed, ran go test ./internal/tmux ./internal/e2e.

Open with Devin

…est suite

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 8 additional findings.

Open in Devin Review

andyrewlee and others added 4 commits February 19, 2026 03:09
…cleanup

- Fix command injection via unvalidated $host with is_valid_hostname()
- Add JSON validation guard on step output in turn loop
- Restrict preexisting-branch check to local refs with gitLocalBranchExists()
- Consolidate duplicate if-gf.JSON blocks and add human-mode session_exited
- Add test for stateErr != nil capture fallthrough path
- Extract handlePromptSendError helper to deduplicate Codex retry
- Shell-quote agent/workspace IDs in suggested commands
- Add upper bound (50 attempts) to watch initial capture loop
- Fix SKILL.md text: remove tautology, prefix legacy script paths
- Document needs_input heuristic limitations on broad question markers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The workflow tests were appearing to hang (~16s each) because
openclaw-present.sh's 250-line jq transform was invoked 3 times
per workflow dual run (~5s per invocation). These tests verify
workflow logic, not presentation augmentation, so bypass the
present script via OPENCLAW_PRESENT_SCRIPT=/nonexistent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ry, fix script robustness

- Narrow question-needs-input markers to prefix matches to reduce false positives
- Add captureWaitBaselineWithRetry for transient tmux capture failures
- Preserve hysteresis state for known sessions skipped by activity gate
- Add assistant mismatch error on workspace reuse
- Use branchExists probe before falling back to error-string parsing
- Fix openclaw-step.sh exit codes to propagate errors to orchestrator
- Use mktemp for temp files and shell_quote for prompt escaping in DX scripts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
devin-ai-integration[bot]

This comment was marked as resolved.

…, jq compat

- Add missing fallback append for stale-tag sessions in activity tracking
- Add "which do you" marker and last-sentence extraction for mid-line questions
- Remove extra branchExists() call that broke mock expectations in CreateWorkspace
- Replace jq 1.7-only rindex with rindex_compat polyfill using indices (jq 1.5+)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
devin-ai-integration[bot]

This comment was marked as resolved.

andyrewlee and others added 4 commits February 19, 2026 13:21
…atch gate, tmux error handling

- Remove erroneous fallback append for known stale sessions without recent
  activity, which was causing unnecessary capture-pane work (P1)
- Gate workspace assistant mismatch error on explicit --assistant flag so
  default-filled values don't trigger false conflicts (P1)
- Return early on bulk tmux query failure instead of using empty map fallback,
  preventing false "stopped" tab notifications (P2)
- Quote $port and $host in cmd_terminal_preset launch commands
- Add ghs_ and glpat- secret redaction patterns to all three scripts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve conflicts in activity logic: use HasRecentWindowActivity
(ShouldFallbackForStaleTag was a broken reference). Keep main's new
KnownFreshTagCaptureFailurePreservesActivity test (moved to state_test.go
to stay under 500-line limit).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
`label` is a reserved keyword in jq 1.6 (CI's version). jq 1.7
relaxed this, so tests passed locally but failed in CI with exit 3.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@andyrewlee andyrewlee merged commit 46e839e into main Feb 20, 2026
3 checks passed
@andyrewlee andyrewlee deleted the openclaw branch February 20, 2026 05:33
Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 new potential issues.

View 25 additional findings in Devin Review.

Open in Devin Review

Comment on lines +94 to 108
preSendContent, _ := tmuxCapturePaneTail(sessionName, 80, tmuxOpts)
preSendHash := tmux.ContentHash(preSendContent)

if err := tmuxSendKeys(sessionName, prompt, true, tmuxOpts); err != nil {
if killErr := tmuxKillSession(sessionName, tmuxOpts); killErr != nil {
slog.Debug("best-effort session kill failed", "session", sessionName, "error", killErr)
}
if gf.JSON {
return returnJSONErrorMaybeIdempotent(
w, wErr, gf, version, "agent.run", idempotencyKey,
ExitInternalError, "prompt_send_failed", err.Error(), map[string]any{
"session_name": sessionName,
},
)
return handlePromptSendError(w, wErr, gf, version, idempotencyKey, sessionName, tmuxOpts, err, "send")
}

// Codex startup can still occasionally drop the very first prompt even when
// a cursor is visible. If pane output does not change after send, retry once.
if strings.EqualFold(strings.TrimSpace(assistantName), "codex") &&
!waitForPromptDelivery(sessionName, preSendHash, tmuxOpts) {
waitForPaneOutput(sessionName, assistantName, tmuxOpts)
if err := tmuxSendKeys(sessionName, prompt, true, tmuxOpts); err != nil {
return handlePromptSendError(w, wErr, gf, version, idempotencyKey, sessionName, tmuxOpts, err, "retry")
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Codex prompt retry can be skipped because baseline capture success is ignored

The initial prompt send path computes a baseline pane hash without checking whether the baseline capture actually succeeded. If the baseline capture fails (returns ok=false), preSendContent becomes the empty string and preSendHash becomes the hash of empty content. Subsequent delivery polling will almost always report “delivered” as soon as a later capture succeeds (because any non-empty pane hash differs from the empty baseline), even if the prompt was actually dropped.

Root Cause

In sendAgentRunPromptIfRequested, baseline capture ignores the ok flag:

  • preSendContent, _ := tmuxCapturePaneTail(...) and preSendHash := tmux.ContentHash(preSendContent) internal/cli/cmd_agent_run_liveness.go:94-96

Then Codex retry is gated on waitForPromptDelivery(..., preSendHash, ...):

  • !waitForPromptDelivery(sessionName, preSendHash, tmuxOpts) internal/cli/cmd_agent_run_liveness.go:103-108

If the baseline capture failed, the retry decision becomes unreliable (false positives for “delivered”).

Actual behavior: Codex may drop the first prompt; the code may incorrectly decide delivery happened and skip the one allowed retry.

Expected behavior: When baseline capture fails, either (a) re-attempt baseline capture, or (b) treat delivery as unknown and allow the retry path (or use a different delivery confirmation signal).

Impact: Intermittent “prompt dropped” failures can persist under tmux load/unresponsiveness even though the retry mechanism exists, leading to agent run --prompt occasionally doing nothing until the user manually retries.

Prompt for agents
In internal/cli/cmd_agent_run_liveness.go within sendAgentRunPromptIfRequested (around lines 94-109), incorporate the ok result from tmuxCapturePaneTail when computing the pre-send baseline used by waitForPromptDelivery.

Specifically:
1) Capture baseline with (content, ok). If ok is false, do not use the empty-string hash as the baseline for delivery detection.
2) Choose one of these fixes:
   a) Re-capture the baseline a small bounded number of times (with short sleep) until ok=true or attempts exhausted; if exhausted, treat delivery as unknown and allow the single Codex retry.
   b) Gate the current delivery check on ok==true; if ok==false, always run the single retry path (after another readiness wait) because delivery could not be verified.

Keep behavior unchanged for non-Codex assistants. Ensure the retry still happens at most once.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +154 to +163
initialAttempts++
if initialAttempts > watchInitialCaptureMaxAttempts {
if !emitEvent(enc, watchEvent{
Type: "exited",
Timestamp: now(),
}) {
return ExitOK
}
return ExitOK
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 agent watch can emit "exited" after initial capture retries even when session still exists

The initial snapshot loop in agent watch has a hard cap (watchInitialCaptureMaxAttempts). When that cap is exceeded, it emits an exited event unconditionally, without verifying that the tmux session is actually gone.

Detailed Explanation

In runWatchLoopWith, the initial capture loop does state-aware handling of capture misses via watchShouldEmitExited(...). However, even if watchShouldEmitExited repeatedly determines the session still exists (or tmux state checks are erroring), the loop still increments initialAttempts and will eventually hit the hard limit and emit exited:

  • initialAttempts++ and the unconditional if initialAttempts > watchInitialCaptureMaxAttempts { emit exited } internal/cli/cmd_agent_watch.go:154-163

Actual behavior: under prolonged tmux slowness/unresponsiveness (capture fails but session exists), agent watch can incorrectly report type: "exited".

Expected behavior: exited should only be emitted when the session is confirmed not to exist; otherwise the loop should either keep waiting (bounded by context) or emit a distinct error/timeout event.

Impact: Downstream consumers may treat the agent as stopped and stop monitoring prematurely, even though the session is still alive.

Prompt for agents
In internal/cli/cmd_agent_watch.go within runWatchLoopWith’s initial snapshot capture loop (around lines 154-163), avoid emitting a misleading exited event solely because watchInitialCaptureMaxAttempts was reached.

Adjust the limit handling to confirm the session is truly gone before emitting exited:
- On exceeding watchInitialCaptureMaxAttempts, call tmuxSessionStateFor(cfg.SessionName, opts). If state.Exists is false (and no error), emit exited.
- If state.Exists is true or the state check errors, either (a) continue waiting until ctx.Done(), or (b) emit a new event type (e.g., "error" or "unavailable") indicating capture failure/tmux unresponsive, but do not claim the session exited.

Keep the existing state-aware logic (watchShouldEmitExited) for normal miss handling.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant