Bug type
Regression (worked before, now fails)
Beta release blocker
No
Summary
The analyzeInterpreterHeuristicsFromUnquoted() function in pi-embedded-helpers contains two regex patterns with catastrophic backtracking (ReDoS). When the exec tool processes a long shell command containing VAR=value assignments followed by extensive text (e.g. HTML heredocs), the regex engine enters an infinite loop, consuming 100% CPU and permanently blocking the Node.js event loop. The gateway becomes completely unresponsive — no HTTP requests can be served, no new WebSocket connections can be established.
Root cause regex (line ~3084 in pi-embedded-BYdcxQ5A.js):
/(?:^|\s|(?<!\\)[|&;()])(?:[A-Za-z_][A-Za-z0-9_]*=.*\s+)*python/i
/(?:^|\s|(?<!\\)[|&;()])(?:[A-Za-z_][A-Za-z0-9_]*=.*\s+)*node/i
The inner .*\s+ is the problem: both .* and \s+ can match whitespace characters. When surrounded by the ()* outer quantifier, this creates exponentially many ways to partition whitespace, causing catastrophic backtracking on inputs with spaces.
Steps to reproduce
- Configure an agent with
qwen/glm-5 or qwen/qwen3.5-plus model via dashscope API
- Ask the agent to generate a shell command that writes HTML content using a heredoc, e.g.:
ACCESS_TOKEN=$(curl -s "https://api.example.com/token" | jq -r '.access_token')
cat > /tmp/content.html << 'HTMLEOF'
<section style="padding: 30px 20px; font-family: ...">
... (4000+ chars of HTML with many spaces) ...
</section>
HTMLEOF
- The exec tool receives this command and calls
analyzeInterpreterHeuristicsFromUnquoted(raw)
- The python/node detection regex runs
.test() on the full command string
- Gateway process hangs at 100% CPU indefinitely
Minimal reproduction:
const re = /(?:^|\s|(?<!\\)[|&;()])(?:[A-Za-z_][A-Za-z0-9_]*=.*\s+)*python(?:3)?(?=$|[\s|&;()])/i;
const cmd = fs.readFileSync("command_with_html_heredoc.txt", "utf8"); // ~5KB shell script
re.test(cmd); // HANGS FOREVER
Expected behavior
The regex should complete in O(n) time regardless of input. The analyzeInterpreterHeuristicsFromUnquoted function should return quickly even for long shell commands with embedded HTML/text content.
Actual behavior
The gateway process enters an infinite CPU loop inside RegExpPrototypeTest (confirmed via macOS sample profiling — 800/800 samples in regex engine). The process:
- Consumes 100% CPU permanently
- Cannot serve any HTTP requests (all connections hang)
- Cannot accept new WebSocket connections
- Cannot be stopped with SIGTERM (requires SIGKILL)
- LLM API responses time out because the event loop is blocked
The gateway must be force-killed (kill -9) and restarted.
OpenClaw version
2026.4.2
Operating system
macOS 26.3.1 (Darwin 25.3.0, arm64)
Install method
npm global
Model
qwen/glm-5, qwen/qwen3.5-plus
Provider / routing chain
dashscope (Alibaba Cloud)
Additional provider/model setup details
N/A — the bug is model-independent. It triggers whenever the LLM generates a long shell command with VAR=value assignments followed by text containing many whitespace characters (e.g. HTML heredocs, embedded JSON).
Logs, screenshots, and evidence
**Process sample** (macOS `sample` command on the stuck gateway):
800/800 samples (100%) in:
node::crypto::TLSWrap::OnStreamRead
→ v8::internal::MicrotaskQueue::RunMicrotasks
→ Builtins_AsyncFunctionAwaitResolveClosure
→ Builtins_RegExpPrototypeTest ← stuck here
**Error log** shows repeated LLM timeouts caused by blocked event loop:
[diagnostic] lane task error: lane=session:agent:main:main error="FailoverError: LLM request timed out."
**Process state**: `R` (running), 100% CPU, 105+ minutes accumulated CPU time, unresponsive to SIGTERM.
Impact and severity
This is a critical bug:
- Severity: Gateway becomes permanently unresponsive, requiring force-kill
- Affected: Any user whose agent generates shell commands with heredocs or long text content containing whitespace
- Impact: Complete denial of service — all channels (webchat, Feishu, etc.) stop working
- Trigger: Normal agent usage (not adversarial input) — the triggering command was a WeChat article publishing script with embedded HTML
Additional information
Suggested fix: Replace .* with \S* in both regex patterns to eliminate the overlap between .* and \s+:
- /(?:[A-Za-z_][A-Za-z0-9_]*=.*\s+)*python/i
+ /(?:[A-Za-z_][A-Za-z0-9_]*=\S*\s+)*python/i
- /(?:[A-Za-z_][A-Za-z0-9_]*=.*\s+)*node/i
+ /(?:[A-Za-z_][A-Za-z0-9_]*=\S*\s+)*node/i
This fix:
- Eliminates catastrophic backtracking (0.15ms vs infinite hang on the same input)
- Preserves correct detection of
VAR=value python patterns
- Is semantically more accurate: env var values in shell assignments cannot contain unescaped spaces
Source file: src/agents/pi-embedded-helpers/failover-matches.ts (or equivalent source for analyzeInterpreterHeuristicsFromUnquoted)
Bug type
Regression (worked before, now fails)
Beta release blocker
No
Summary
The
analyzeInterpreterHeuristicsFromUnquoted()function inpi-embedded-helperscontains two regex patterns with catastrophic backtracking (ReDoS). When the exec tool processes a long shell command containingVAR=valueassignments followed by extensive text (e.g. HTML heredocs), the regex engine enters an infinite loop, consuming 100% CPU and permanently blocking the Node.js event loop. The gateway becomes completely unresponsive — no HTTP requests can be served, no new WebSocket connections can be established.Root cause regex (line ~3084 in
pi-embedded-BYdcxQ5A.js):The inner
.*\s+is the problem: both.*and\s+can match whitespace characters. When surrounded by the()*outer quantifier, this creates exponentially many ways to partition whitespace, causing catastrophic backtracking on inputs with spaces.Steps to reproduce
qwen/glm-5orqwen/qwen3.5-plusmodel via dashscope APIanalyzeInterpreterHeuristicsFromUnquoted(raw).test()on the full command stringMinimal reproduction:
Expected behavior
The regex should complete in O(n) time regardless of input. The
analyzeInterpreterHeuristicsFromUnquotedfunction should return quickly even for long shell commands with embedded HTML/text content.Actual behavior
The gateway process enters an infinite CPU loop inside
RegExpPrototypeTest(confirmed via macOSsampleprofiling — 800/800 samples in regex engine). The process:The gateway must be force-killed (
kill -9) and restarted.OpenClaw version
2026.4.2
Operating system
macOS 26.3.1 (Darwin 25.3.0, arm64)
Install method
npm global
Model
qwen/glm-5, qwen/qwen3.5-plus
Provider / routing chain
dashscope (Alibaba Cloud)
Additional provider/model setup details
N/A — the bug is model-independent. It triggers whenever the LLM generates a long shell command with
VAR=valueassignments followed by text containing many whitespace characters (e.g. HTML heredocs, embedded JSON).Logs, screenshots, and evidence
Impact and severity
This is a critical bug:
Additional information
Suggested fix: Replace
.*with\S*in both regex patterns to eliminate the overlap between.*and\s+:This fix:
VAR=value pythonpatternsSource file:
src/agents/pi-embedded-helpers/failover-matches.ts(or equivalent source foranalyzeInterpreterHeuristicsFromUnquoted)