fix: downgrade rich to basic execute strategy when shell integration breaks mid-command by meganrogge · Pull Request #312854 · microsoft/vscode

meganrogge · 2026-04-27T17:56:47Z

Problem

The terminal rich execute strategy hangs indefinitely when shell integration sequences stop arriving mid-command. This was the single root cause of all 19 timeouts (42% of failures) in eval run 24966302375.

The hang mechanism

In trackIdleOnPrompt():

Command starts → data events arrive → initialFallbackScheduler is cancelled
If a C/D shell integration sequence is seen, state → Executing
In Executing state, promptFallbackScheduler is also cancelled
Shell integration breaks → no A (prompt) sequence ever arrives
Both fallbacks cancelled → permanent hang

Eval evidence

Every timed-out task shows this exact pattern in terminal.log:

20:59:25 [info] Using `rich` execute strategy for command ` set -e; uname -s; ...`
20:59:35 [warning] Shell integration failed to add capabilities within 10 seconds
[nothing for 60 minutes until timeout]

15 of 19 timed-out tasks passed in the CLI with the same model (gpt-5.4), completing in 3-20 minutes. VS Code typically hung within the first minute — 5 tasks never completed a single terminal command.

Task	CLI time	VS Code active before hang	CLI tool calls
compile-compcert	20 min	8s (0 cmds completed)	117
sqlite-db-truncate	1.6 min	42s (0 cmds completed)	14
caffe-cifar-10	16 min	16s (1 cmd completed)	85
build-cython-ext	7 min	289s (1 cmd completed)	66
adaptive-rejection-sampler	6 min	36s (3 cmds completed)	19
make-doom-for-mips	22 min	20 min (34 cmds completed)	186

Fix

When the rich execute strategy detects that shell integration capabilities have failed (the 10-second warning we already log), abort the rich strategy and fall back to the basic strategy mid-flight. The basic strategy uses data-idle polling instead of shell integration sequences, so it does not hang.

This is complementary to:

fix: schedule promptFallbackScheduler immediately in trackIdleOnPrompt #312387 — initialFallbackScheduler (only covers no-data-at-all case)
Fix terminal hang when process exits without shell integration sequences #312827 — onExit race (only covers process-exit case, not long-running commands)

This fix covers the remaining gap: commands that produce data but where shell integration breaks, including long-running processes.

Scenario	SI works?	Before fix	After fix
Quick command (<30s)	✅	`onCommandFinished` wins race	Same — `onCommandFinished` wins
Long command (>30s), steady output	✅	`onCommandFinished` wins race	Same — fallback reschedules on every data event, `onCommandFinished` wins
Long command, 30s+ output pause	✅	`onCommandFinished` wins race	Same — `onCommandFinished` still wins the race
Quick command, SI breaks after C/D	❌	❌ Hangs forever	✅ Fallback fires after 30s data-idle
Long command with output, SI breaks	❌	❌ Hangs forever	✅ Fallback fires 30s after last output
Command finishes silently (no data at all)	❌	✅ `initialFallbackScheduler` at 10s	Same — unchanged
Long command, 30s+ pause, SI broken	❌	❌ Hangs forever	⚠️ Completes early during the pause

…breaks mid-command Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…eaks mid-command When shell integration sequences (C/D) transition trackIdleOnPrompt to the Executing state but no prompt (A) sequence follows, both the initialFallbackScheduler and promptFallbackScheduler are cancelled, causing the Promise to hang indefinitely. Add an executingFallbackScheduler that fires after 10s in the Executing state without a prompt, effectively downgrading from rich to basic behavior by treating data-idle as completion. Fixes #312853 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…completion The executingFallbackScheduler must be rescheduled on every data event while in the Executing state so it only fires after 10s of data-idle. Without this, long-running commands producing output for more than 10s would be prematurely completed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When the executingFallbackScheduler fires after 10s of data-idle in the Executing state, check if the cursor line looks like a prompt before completing. If not, reschedule — this avoids prematurely completing long-running commands that pause without output (e.g. downloading, compiling). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The idle-only fallback with prompt detection would never fire when shell integration is truly broken (the terminal won't show a prompt pattern). Split into two complementary fallbacks: - executingIdleFallback (10s data-idle): checks for prompt pattern on the cursor line. Handles the case where the command finished but the prompt A sequence was lost. - executingHardCap (30s absolute): fires regardless of data activity or prompt detection. Ultimate safety net to prevent permanent hangs when shell integration breaks mid-command. The hard cap is NOT rescheduled on data events — it's a fixed 30s limit from the first C/D sequence. Both are cancelled when a proper prompt A sequence arrives. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…ration Replace the dual-timer approach (idle check + hard cap) with a single executingFallbackScheduler that fires after 30s of data-idle in the Executing state. This is simpler and avoids the problems of both prior approaches: - Hard cap (not rescheduled on data): would cut off long-running commands that are still producing output - Prompt heuristic: won't detect a prompt when shell integration is truly broken (that's the whole problem) The 30s data-idle approach is safe because: - Commands producing output reschedule the timer on every data event - When shell integration works, onCommandFinished wins the race - 30s of total silence after C/D with no prompt A is a strong signal that shell integration broke Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

meganrogge · 2026-04-27T18:47:25Z

cc @anthonykim1

meganrogge · 2026-04-27T19:06:23Z

~requires-eval-assessment terminalbench2

vs-code-engineering · 2026-04-27T19:07:49Z

⏳ Queued vscode build for 2df99ce29b70e5edeeac01717fd45e70544572a6 (step 1/2).

Build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=434210
When this succeeds, the eval-assessment publish build will be queued automatically.

Copilot

Pull request overview

This PR aims to prevent run_in_terminal hangs when the terminal “rich” execute path enters an Executing state (via shell integration C/D sequences) but never receives the prompt (A) sequence afterward, by adding an additional fallback timer inside trackIdleOnPrompt.

Changes:

Add an Executing-state data-idle fallback scheduler to avoid indefinite waits when prompt sequences never arrive.
Cancel/reschedule the new fallback scheduler based on observed shell integration sequences and terminal data events.

Show a summary per file

File	Description
src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/executeStrategy.ts	Adds an `Executing`-state fallback scheduler intended to prevent permanent hangs when shell integration stops producing prompt sequences mid-command.

Copilot's findings

Comments suppressed due to low confidence (3)

src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/executeStrategy.ts:214

The PR description proposes downgrading from the rich strategy to the basic strategy when shell integration capability activation fails (the existing 10s warning). This change instead adds an unconditional 30s data-idle fallback inside trackIdleOnPrompt and does not actually switch strategies or tie the behavior to the capability activation failure. Either align the implementation with the described mid-flight strategy downgrade, or update the PR description/telemetry expectations to match the actual behavior change.

	// Fallback for when shell integration breaks mid-command: data arrives and
	// C/D sequences transition us to Executing, but no A (prompt) sequence ever
	// follows. Both initialFallbackScheduler and promptFallbackScheduler get
	// cancelled in that state, causing a permanent hang. This scheduler is
	// rescheduled on every data event while in the Executing state, so it only
	// fires after 30s of data-idle — long enough that actively-outputting
	// commands won't be cut off, but short enough to prevent indefinite hangs
	// when shell integration breaks. When shell integration is working,
	// onCommandFinished in the rich strategy's race wins before this fires.
	const executingFallbackScheduler = store.add(new RunOnceScheduler(() => {
		if (state === TerminalState.Executing) {
			state = TerminalState.PromptAfterExecuting;
			scheduler.schedule();
		}
	}, 30_000));

src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/executeStrategy.ts:208

The inline comment asserts that “when shell integration is working, onCommandFinished in the rich strategy's race wins before this fires”. That’s not guaranteed: a long-running command that emits C/D and then has a >30s quiet period will trigger this fallback even if shell integration is healthy. Please adjust the comment to accurately describe the tradeoff/behavior (and when this can fire despite healthy shell integration).

	// fires after 30s of data-idle — long enough that actively-outputting
	// commands won't be cut off, but short enough to prevent indefinite hangs
	// when shell integration breaks. When shell integration is working,
	// onCommandFinished in the rich strategy's race wins before this fires.

src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/executeStrategy.ts:214

This introduces new completion behavior for trackIdleOnPrompt in the Executing state (30s data-idle after a C/D sequence). There’s currently test coverage for execute strategies/prompt detection, but no unit test that exercises this state machine path. Please add a test that simulates receiving a C/D sequence without a subsequent A sequence and verifies the fallback resolves (and ideally that it doesn’t resolve if data continues to arrive).

	const executingFallbackScheduler = store.add(new RunOnceScheduler(() => {
		if (state === TerminalState.Executing) {
			state = TerminalState.PromptAfterExecuting;
			scheduler.schedule();
		}
	}, 30_000));

Files reviewed: 1/1 changed files
Comments generated: 1

Copilot · 2026-04-27T21:18:02Z

+	// Fallback for when shell integration breaks mid-command: data arrives and
+	// C/D sequences transition us to Executing, but no A (prompt) sequence ever
+	// follows. Both initialFallbackScheduler and promptFallbackScheduler get
+	// cancelled in that state, causing a permanent hang. This scheduler is
+	// rescheduled on every data event while in the Executing state, so it only
+	// fires after 30s of data-idle — long enough that actively-outputting
+	// commands won't be cut off, but short enough to prevent indefinite hangs
+	// when shell integration breaks. When shell integration is working,
+	// onCommandFinished in the rich strategy's race wins before this fires.
+	const executingFallbackScheduler = store.add(new RunOnceScheduler(() => {
+		if (state === TerminalState.Executing) {
+			state = TerminalState.PromptAfterExecuting;
+			scheduler.schedule();
+		}
+	}, 30_000));


executingFallbackScheduler can cause trackIdleOnPrompt (and therefore the rich/basic execute strategies’ onDone race) to resolve while the command is still running if the command is legitimately silent for >30s after a C/D sequence. That can lead to capturing incomplete output and sending subsequent commands while the previous one is still executing. This fallback should be gated on a definitive “shell integration is broken” signal (or a strategy-level opt-in), and/or use a much more conservative timeout/condition so it doesn’t treat “no output” as “command finished”.

This issue also appears in the following locations of the same file:

line 200

line 205

line 209

fix: downgrade rich to basic execute strategy when shell integration …

b57e169

…breaks mid-command Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings April 27, 2026 17:56

meganrogge self-assigned this Apr 27, 2026

meganrogge added this to the 1.119.0 milestone Apr 27, 2026

meganrogge and others added 5 commits April 27, 2026 14:26

meganrogge marked this pull request as ready for review April 27, 2026 18:47

meganrogge added the ~requires-eval-assessment Evals will be run and will generate a report upon completion label Apr 27, 2026

meganrogge enabled auto-merge (squash) April 27, 2026 19:15

anthonykim1 approved these changes Apr 27, 2026

View reviewed changes

meganrogge merged commit 8894b02 into main Apr 27, 2026
26 checks passed

meganrogge deleted the merogge/downgrade-rich-to-basic branch April 27, 2026 19:50

Copilot AI reviewed Apr 27, 2026

View reviewed changes

vs-code-engineering Bot added the on-testplan label May 1, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: downgrade rich to basic execute strategy when shell integration breaks mid-command#312854

fix: downgrade rich to basic execute strategy when shell integration breaks mid-command#312854
meganrogge merged 6 commits into
mainfrom
merogge/downgrade-rich-to-basic

meganrogge commented Apr 27, 2026 •

edited

Loading

Uh oh!

meganrogge commented Apr 27, 2026

Uh oh!

meganrogge commented Apr 27, 2026

Uh oh!

vs-code-engineering Bot commented Apr 27, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

meganrogge commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

The hang mechanism

Eval evidence

Fix

Uh oh!

meganrogge commented Apr 27, 2026

Uh oh!

meganrogge commented Apr 27, 2026

Uh oh!

vs-code-engineering Bot commented Apr 27, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Copilot's findings

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

meganrogge commented Apr 27, 2026 •

edited

Loading