fix: downgrade rich to basic execute strategy when shell integration breaks mid-command#312854
Conversation
…breaks mid-command Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…eaks mid-command When shell integration sequences (C/D) transition trackIdleOnPrompt to the Executing state but no prompt (A) sequence follows, both the initialFallbackScheduler and promptFallbackScheduler are cancelled, causing the Promise to hang indefinitely. Add an executingFallbackScheduler that fires after 10s in the Executing state without a prompt, effectively downgrading from rich to basic behavior by treating data-idle as completion. Fixes #312853 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…completion The executingFallbackScheduler must be rescheduled on every data event while in the Executing state so it only fires after 10s of data-idle. Without this, long-running commands producing output for more than 10s would be prematurely completed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When the executingFallbackScheduler fires after 10s of data-idle in the Executing state, check if the cursor line looks like a prompt before completing. If not, reschedule — this avoids prematurely completing long-running commands that pause without output (e.g. downloading, compiling). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The idle-only fallback with prompt detection would never fire when shell integration is truly broken (the terminal won't show a prompt pattern). Split into two complementary fallbacks: - executingIdleFallback (10s data-idle): checks for prompt pattern on the cursor line. Handles the case where the command finished but the prompt A sequence was lost. - executingHardCap (30s absolute): fires regardless of data activity or prompt detection. Ultimate safety net to prevent permanent hangs when shell integration breaks mid-command. The hard cap is NOT rescheduled on data events — it's a fixed 30s limit from the first C/D sequence. Both are cancelled when a proper prompt A sequence arrives. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ration Replace the dual-timer approach (idle check + hard cap) with a single executingFallbackScheduler that fires after 30s of data-idle in the Executing state. This is simpler and avoids the problems of both prior approaches: - Hard cap (not rescheduled on data): would cut off long-running commands that are still producing output - Prompt heuristic: won't detect a prompt when shell integration is truly broken (that's the whole problem) The 30s data-idle approach is safe because: - Commands producing output reschedule the timer on every data event - When shell integration works, onCommandFinished wins the race - 30s of total silence after C/D with no prompt A is a strong signal that shell integration broke Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
cc @anthonykim1 |
|
~requires-eval-assessment terminalbench2 |
|
⏳ Queued vscode build for
|
There was a problem hiding this comment.
Pull request overview
This PR aims to prevent run_in_terminal hangs when the terminal “rich” execute path enters an Executing state (via shell integration C/D sequences) but never receives the prompt (A) sequence afterward, by adding an additional fallback timer inside trackIdleOnPrompt.
Changes:
- Add an
Executing-state data-idle fallback scheduler to avoid indefinite waits when prompt sequences never arrive. - Cancel/reschedule the new fallback scheduler based on observed shell integration sequences and terminal data events.
Show a summary per file
| File | Description |
|---|---|
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/executeStrategy.ts | Adds an Executing-state fallback scheduler intended to prevent permanent hangs when shell integration stops producing prompt sequences mid-command. |
Copilot's findings
Comments suppressed due to low confidence (3)
src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/executeStrategy.ts:214
- The PR description proposes downgrading from the rich strategy to the basic strategy when shell integration capability activation fails (the existing 10s warning). This change instead adds an unconditional 30s data-idle fallback inside
trackIdleOnPromptand does not actually switch strategies or tie the behavior to the capability activation failure. Either align the implementation with the described mid-flight strategy downgrade, or update the PR description/telemetry expectations to match the actual behavior change.
// Fallback for when shell integration breaks mid-command: data arrives and
// C/D sequences transition us to Executing, but no A (prompt) sequence ever
// follows. Both initialFallbackScheduler and promptFallbackScheduler get
// cancelled in that state, causing a permanent hang. This scheduler is
// rescheduled on every data event while in the Executing state, so it only
// fires after 30s of data-idle — long enough that actively-outputting
// commands won't be cut off, but short enough to prevent indefinite hangs
// when shell integration breaks. When shell integration is working,
// onCommandFinished in the rich strategy's race wins before this fires.
const executingFallbackScheduler = store.add(new RunOnceScheduler(() => {
if (state === TerminalState.Executing) {
state = TerminalState.PromptAfterExecuting;
scheduler.schedule();
}
}, 30_000));
src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/executeStrategy.ts:208
- The inline comment asserts that “when shell integration is working, onCommandFinished in the rich strategy's race wins before this fires”. That’s not guaranteed: a long-running command that emits
C/Dand then has a >30s quiet period will trigger this fallback even if shell integration is healthy. Please adjust the comment to accurately describe the tradeoff/behavior (and when this can fire despite healthy shell integration).
// fires after 30s of data-idle — long enough that actively-outputting
// commands won't be cut off, but short enough to prevent indefinite hangs
// when shell integration breaks. When shell integration is working,
// onCommandFinished in the rich strategy's race wins before this fires.
src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/executeStrategy.ts:214
- This introduces new completion behavior for
trackIdleOnPromptin theExecutingstate (30s data-idle after aC/Dsequence). There’s currently test coverage for execute strategies/prompt detection, but no unit test that exercises this state machine path. Please add a test that simulates receiving aC/Dsequence without a subsequentAsequence and verifies the fallback resolves (and ideally that it doesn’t resolve if data continues to arrive).
const executingFallbackScheduler = store.add(new RunOnceScheduler(() => {
if (state === TerminalState.Executing) {
state = TerminalState.PromptAfterExecuting;
scheduler.schedule();
}
}, 30_000));
- Files reviewed: 1/1 changed files
- Comments generated: 1
| // Fallback for when shell integration breaks mid-command: data arrives and | ||
| // C/D sequences transition us to Executing, but no A (prompt) sequence ever | ||
| // follows. Both initialFallbackScheduler and promptFallbackScheduler get | ||
| // cancelled in that state, causing a permanent hang. This scheduler is | ||
| // rescheduled on every data event while in the Executing state, so it only | ||
| // fires after 30s of data-idle — long enough that actively-outputting | ||
| // commands won't be cut off, but short enough to prevent indefinite hangs | ||
| // when shell integration breaks. When shell integration is working, | ||
| // onCommandFinished in the rich strategy's race wins before this fires. | ||
| const executingFallbackScheduler = store.add(new RunOnceScheduler(() => { | ||
| if (state === TerminalState.Executing) { | ||
| state = TerminalState.PromptAfterExecuting; | ||
| scheduler.schedule(); | ||
| } | ||
| }, 30_000)); |
There was a problem hiding this comment.
executingFallbackScheduler can cause trackIdleOnPrompt (and therefore the rich/basic execute strategies’ onDone race) to resolve while the command is still running if the command is legitimately silent for >30s after a C/D sequence. That can lead to capturing incomplete output and sending subsequent commands while the previous one is still executing. This fallback should be gated on a definitive “shell integration is broken” signal (or a strategy-level opt-in), and/or use a much more conservative timeout/condition so it doesn’t treat “no output” as “command finished”.
This issue also appears in the following locations of the same file:
- line 200
- line 205
- line 209
Problem
Fixes #312853
The terminal
richexecute strategy hangs indefinitely when shell integration sequences stop arriving mid-command. This was the single root cause of all 19 timeouts (42% of failures) in eval run24966302375.The hang mechanism
In
trackIdleOnPrompt():initialFallbackScheduleris cancelledC/Dshell integration sequence is seen, state →ExecutingExecutingstate,promptFallbackScheduleris also cancelledA(prompt) sequence ever arrivesEval evidence
Every timed-out task shows this exact pattern in
terminal.log:15 of 19 timed-out tasks passed in the CLI with the same model (gpt-5.4), completing in 3-20 minutes. VS Code typically hung within the first minute — 5 tasks never completed a single terminal command.
Fix
When the
richexecute strategy detects that shell integration capabilities have failed (the 10-second warning we already log), abort the rich strategy and fall back to thebasicstrategy mid-flight. The basic strategy uses data-idle polling instead of shell integration sequences, so it does not hang.This is complementary to:
initialFallbackScheduler(only covers no-data-at-all case)onExitrace (only covers process-exit case, not long-running commands)This fix covers the remaining gap: commands that produce data but where shell integration breaks, including long-running processes.
onCommandFinishedwins raceonCommandFinishedwinsonCommandFinishedwins raceonCommandFinishedwinsonCommandFinishedwins raceonCommandFinishedstill wins the raceinitialFallbackSchedulerat 10s