Skip to content

Commit 42435d1

Browse files
hclhclsyssteipete
authored
fix(browser): derive Chrome launch readiness from a single CDP diagnostic (#82904) (#82986)
* fix(browser): derive Chrome launch readiness from a single CDP diagnostic (#82904) The pre-fix launch path used `isChromeReachable` (a lightweight HTTP `/json/version` probe) to decide failure, then called the stronger `diagnoseChromeCdp` only to format the thrown error. On macOS cold starts where the HTTP probe transiently fails *between* the polling loop and the diagnostic call, the runtime would throw "Failed to start Chrome CDP on port ... { ok: true, wsUrl: ... }" — a self-contradicting error containing a successful diagnostic result. Per #82904 this is the actual user-visible bug. Capture `diagnoseChromeCdp` ONCE after the polling loop and use it for both the decision and the error text. The diagnostic helper already includes the lightweight reachability check and adds a websocket `Browser.getVersion` health command, so it is strictly stronger than the HTTP probe; if `diagnoseChromeCdp` returns ok the launch genuinely succeeded. The existing `withMockChromeCdpServer` success test in chrome.internal.test.ts still exercises this code path end-to-end (real HTTP server + real websocket handshake), so the regression-safety case is covered. The asymmetric `probe-fails-but-diagnostic-succeeds` scenario is hard to mock without restructuring the existing test harness; this commit ships the fix and relies on the upstream ClawSweeper review criteria (manual managed-Chrome cold-start proof) plus the standalone real-behavior probe in the PR body. * fix(browser): import ChromeCdpDiagnostic type from chrome.diagnostics The annotation `let finalDiagnostic: ChromeCdpDiagnostic | null` referenced a type that was only re-exported (not imported) inside chrome.ts, causing oxlint/tsc to read it as the implicit `error` type and fail check-lint, check-prod-types, check-test-types, etc. Add the type to the existing chrome.diagnostics.js import block. * fix(browser): preserve Chrome launch diagnostic fallback * test(browser): satisfy launch diagnostic lint * fix(browser): keep Chrome launch readiness scoped * test(browser): answer CDP launch mock probe --------- Co-authored-by: hclsys <hclsys@users.noreply.github.com> Co-authored-by: Peter Steinberger <steipete@gmail.com>
1 parent bf51933 commit 42435d1

3 files changed

Lines changed: 233 additions & 54 deletions

File tree

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -77,6 +77,7 @@ Docs: https://docs.openclaw.ai
7777
- Process/diagnostics: report active lane blockers in lane wait warnings so `queueAhead=0` no longer hides commands waiting behind active work. Fixes #82791. (#82792) Thanks @galiniliev.
7878
- Process/diagnostics: stop counting the active processing turn as queued backlog in liveness warnings so transient max-only event-loop spikes do not surface as gateway warnings.
7979
- Agents/replies: classify provider conversation-state rejections and return a clear message-channel error instead of auto-resetting or falling back to a generic runner failure. (#82616) Thanks @dutifulbob.
80+
- Browser plugin: trust managed Chrome CDP diagnostics when launch HTTP probes race cold-start readiness, avoiding false startup failures. Fixes #82904. (#82986) Thanks @kmanan and @hclsys.
8081

8182
## 2026.5.17
8283

extensions/browser/src/browser/chrome.internal.test.ts

Lines changed: 166 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -153,7 +153,29 @@ async function withMockChromeCdpServer(params: {
153153
wss.emit("connection", ws, req);
154154
});
155155
});
156-
params.onConnection?.(wss);
156+
if (params.onConnection) {
157+
params.onConnection(wss);
158+
} else {
159+
wss.on("connection", (ws) => {
160+
ws.on("message", (raw) => {
161+
const message = JSON.parse(rawDataToString(raw)) as {
162+
id?: unknown;
163+
method?: unknown;
164+
};
165+
if (message.method === "Browser.getVersion" && typeof message.id === "number") {
166+
ws.send(
167+
JSON.stringify({
168+
id: message.id,
169+
result: {
170+
product: "Chrome/Mock",
171+
userAgent: "OpenClawTest",
172+
},
173+
}),
174+
);
175+
}
176+
});
177+
});
178+
}
157179
await new Promise<void>((resolve, reject) => {
158180
server.listen(0, "127.0.0.1", () => resolve());
159181
server.once("error", reject);
@@ -484,6 +506,119 @@ describe("chrome.ts internal", () => {
484506
});
485507
});
486508

509+
it("accepts a ready CDP diagnostic after the launch HTTP probe expires", async () => {
510+
vi.spyOn(fs, "existsSync").mockImplementation((p) => {
511+
const s = String(p);
512+
if (
513+
s.includes("Google Chrome") ||
514+
s.includes("google-chrome") ||
515+
s.includes("/usr/bin/chromium")
516+
) {
517+
return true;
518+
}
519+
if (s.endsWith("Local State") || s.endsWith("Preferences")) {
520+
return true;
521+
}
522+
return false;
523+
});
524+
spawnMock.mockImplementation(() => makeFakeProc());
525+
526+
const originalFetch = globalThis.fetch;
527+
let now = 1_000_000;
528+
vi.spyOn(Date, "now").mockImplementation(() => now);
529+
let discoveryCalls = 0;
530+
vi.stubGlobal(
531+
"fetch",
532+
vi.fn(async (input: RequestInfo | URL, init?: RequestInit) => {
533+
const url =
534+
typeof input === "string" ? input : input instanceof URL ? input.href : input.url;
535+
if (url.includes("/json/version")) {
536+
discoveryCalls += 1;
537+
if (discoveryCalls === 1) {
538+
now += 2;
539+
throw new Error("ECONNREFUSED");
540+
}
541+
}
542+
return await originalFetch(input, init);
543+
}),
544+
);
545+
546+
await withMockChromeCdpServer({
547+
wsPath: "/devtools/browser/COLD_START",
548+
run: async (baseUrl) => {
549+
const port = new URL(baseUrl).port;
550+
const profile = makeProfile(Number(port));
551+
const running = await launchOpenClawChrome(
552+
makeResolved({ localLaunchTimeoutMs: 1 }),
553+
profile,
554+
);
555+
expect(running.pid).toBe(4242);
556+
expect(discoveryCalls).toBeGreaterThan(1);
557+
running.proc.kill?.("SIGTERM");
558+
},
559+
});
560+
});
561+
562+
it("keeps the launched process when fallback diagnostic sees HTTP before WS readiness", async () => {
563+
vi.spyOn(fs, "existsSync").mockImplementation((p) => {
564+
const s = String(p);
565+
if (
566+
s.includes("Google Chrome") ||
567+
s.includes("google-chrome") ||
568+
s.includes("/usr/bin/chromium")
569+
) {
570+
return true;
571+
}
572+
if (s.endsWith("Local State") || s.endsWith("Preferences")) {
573+
return true;
574+
}
575+
return false;
576+
});
577+
const fakeProc = makeFakeProc();
578+
spawnMock.mockImplementation(() => fakeProc);
579+
580+
const originalFetch = globalThis.fetch;
581+
let now = 1_000_000;
582+
vi.spyOn(Date, "now").mockImplementation(() => now);
583+
let discoveryCalls = 0;
584+
vi.stubGlobal(
585+
"fetch",
586+
vi.fn(async (input: RequestInfo | URL, init?: RequestInit) => {
587+
const url =
588+
typeof input === "string" ? input : input instanceof URL ? input.href : input.url;
589+
if (url.includes("/json/version")) {
590+
discoveryCalls += 1;
591+
if (discoveryCalls === 1) {
592+
now += 2;
593+
throw new Error("ECONNREFUSED");
594+
}
595+
}
596+
return await originalFetch(input, init);
597+
}),
598+
);
599+
600+
await withMockChromeCdpServer({
601+
wsPath: "/devtools/browser/WS_WARMING",
602+
onConnection: (wss) => {
603+
wss.on("connection", () => {
604+
// HTTP discovery is enough for launch; caller owns WS readiness.
605+
});
606+
},
607+
run: async (baseUrl) => {
608+
const port = new URL(baseUrl).port;
609+
const profile = makeProfile(Number(port));
610+
const running = await launchOpenClawChrome(
611+
makeResolved({ localLaunchTimeoutMs: 1 }),
612+
profile,
613+
);
614+
expect(running.pid).toBe(4242);
615+
expect(discoveryCalls).toBeGreaterThan(1);
616+
expect(fakeProc.kill).not.toHaveBeenCalledWith("SIGKILL");
617+
running.proc.kill?.("SIGTERM");
618+
},
619+
});
620+
});
621+
487622
it("uses profile executablePath over global executablePath when launching", async () => {
488623
const originalPlatform = process.platform;
489624
vi.spyOn(fs, "existsSync").mockImplementation((p) => {
@@ -528,16 +663,14 @@ describe("chrome.ts internal", () => {
528663
);
529664
vi.stubEnv("OPENCLAW_CONFIG_PATH", configPath);
530665
let cdpReachable = false;
666+
const originalFetch = globalThis.fetch;
531667
vi.stubGlobal(
532668
"fetch",
533-
vi.fn(async () => {
669+
vi.fn(async (input: RequestInfo | URL, init?: RequestInit) => {
534670
if (!cdpReachable) {
535671
throw new Error("ECONNREFUSED");
536672
}
537-
return {
538-
ok: true,
539-
json: async () => ({ webSocketDebuggerUrl: "ws://127.0.0.1/devtools" }),
540-
} as unknown as Response;
673+
return await originalFetch(input, init);
541674
}),
542675
);
543676
vi.spyOn(fs, "existsSync").mockImplementation((p) => {
@@ -567,27 +700,33 @@ describe("chrome.ts internal", () => {
567700
return secondProc;
568701
});
569702

570-
const profile = { ...makeProfile(18888), executablePath: "/tmp/profile-chrome" };
571-
const userDataDir = resolveOpenClawUserDataDir(profile.name);
572-
await fsp.mkdir(userDataDir, { recursive: true });
573-
await fsp.writeFile(path.join(userDataDir, "SingletonCookie"), "cookie");
574-
await fsp.writeFile(path.join(userDataDir, "SingletonSocket"), "socket");
575-
await fsp.symlink("remote-host-535", path.join(userDataDir, "SingletonLock"));
576-
577-
try {
578-
const running = await launchOpenClawChrome(
579-
makeResolved({ localLaunchTimeoutMs: 20 }),
580-
profile,
581-
);
582-
expect(running.proc).toBe(secondProc);
583-
expect(firstProc.kill).toHaveBeenCalledWith("SIGKILL");
584-
expect(spawnCalls).toBe(2);
585-
expect(fs.existsSync(path.join(userDataDir, "SingletonLock"))).toBe(false);
586-
expect(fs.existsSync(path.join(userDataDir, "SingletonSocket"))).toBe(false);
587-
running.proc.kill?.("SIGTERM");
588-
} finally {
589-
await fsp.rm(userDataDir, { recursive: true, force: true });
590-
}
703+
await withMockChromeCdpServer({
704+
wsPath: "/devtools/browser/SINGLETON_RETRY",
705+
run: async (baseUrl) => {
706+
const port = Number(new URL(baseUrl).port);
707+
const profile = { ...makeProfile(port), executablePath: "/tmp/profile-chrome" };
708+
const userDataDir = resolveOpenClawUserDataDir(profile.name);
709+
await fsp.mkdir(userDataDir, { recursive: true });
710+
await fsp.writeFile(path.join(userDataDir, "SingletonCookie"), "cookie");
711+
await fsp.writeFile(path.join(userDataDir, "SingletonSocket"), "socket");
712+
await fsp.symlink("remote-host-535", path.join(userDataDir, "SingletonLock"));
713+
714+
try {
715+
const running = await launchOpenClawChrome(
716+
makeResolved({ localLaunchTimeoutMs: 20 }),
717+
profile,
718+
);
719+
expect(running.proc).toBe(secondProc);
720+
expect(firstProc.kill).toHaveBeenCalledWith("SIGKILL");
721+
expect(spawnCalls).toBe(2);
722+
expect(fs.existsSync(path.join(userDataDir, "SingletonLock"))).toBe(false);
723+
expect(fs.existsSync(path.join(userDataDir, "SingletonSocket"))).toBe(false);
724+
running.proc.kill?.("SIGTERM");
725+
} finally {
726+
await fsp.rm(userDataDir, { recursive: true, force: true });
727+
}
728+
},
729+
});
591730
});
592731

593732
it("throws with stderr hint + sandbox hint when CDP never becomes reachable", async () => {

extensions/browser/src/browser/chrome.ts

Lines changed: 66 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,7 @@ import {
3333
} from "./cdp.helpers.js";
3434
import { normalizeCdpWsUrl } from "./cdp.js";
3535
import {
36+
type ChromeCdpDiagnostic,
3637
diagnoseChromeCdp,
3738
formatChromeCdpDiagnostic,
3839
type ChromeVersion,
@@ -72,6 +73,12 @@ const CHROME_SINGLETON_LOCK_PATHS = [
7273
] as const;
7374
const CHROME_SINGLETON_IN_USE_PATTERN = /profile appears to be in use by another chromium process/i;
7475
const CHROME_MISSING_DISPLAY_PATTERN = /missing x server|\$DISPLAY/i;
76+
const CHROME_HTTP_DISCOVERY_FAILURE_CODES = new Set([
77+
"ssrf_blocked",
78+
"http_unreachable",
79+
"http_status_failed",
80+
"invalid_json",
81+
]);
7582

7683
export type { BrowserExecutable } from "./chrome.executables.js";
7784
export {
@@ -100,6 +107,16 @@ function exists(filePath: string) {
100107
}
101108
}
102109

110+
function diagnosticShowsChromeHttpDiscovery(diagnostic: ChromeCdpDiagnostic | null): boolean {
111+
if (!diagnostic) {
112+
return false;
113+
}
114+
if (diagnostic.ok) {
115+
return true;
116+
}
117+
return !CHROME_HTTP_DISCOVERY_FAILURE_CODES.has(diagnostic.code);
118+
}
119+
103120
function processExists(pid: number): boolean {
104121
if (!Number.isInteger(pid) || pid <= 0) {
105122
return false;
@@ -531,43 +548,65 @@ export async function launchOpenClawChrome(
531548
try {
532549
const readyDeadline =
533550
Date.now() + (resolved.localLaunchTimeoutMs ?? CHROME_LAUNCH_READY_WINDOW_MS);
551+
let launchHttpReachable = false;
552+
// Full CDP WebSocket readiness is handled by the caller's
553+
// waitForCdpReadyAfterLaunch() budget; launch only owns process discovery.
534554
while (Date.now() < readyDeadline) {
535555
if (await isChromeReachable(profile.cdpUrl)) {
556+
launchHttpReachable = true;
536557
break;
537558
}
538559
await new Promise((r) => setTimeout(r, CHROME_LAUNCH_READY_POLL_MS));
539560
}
540561

541-
if (!(await isChromeReachable(profile.cdpUrl))) {
542-
const diagnosticText = await diagnoseChromeCdp(profile.cdpUrl)
543-
.then(formatChromeCdpDiagnostic)
544-
.catch((err) => `CDP diagnostic failed: ${safeChromeCdpErrorMessage(err)}.`);
545-
const stderrOutput =
546-
normalizeOptionalString(Buffer.concat(stderrChunks).toString("utf8")) ?? "";
547-
const redactedStderrOutput = redactToolPayloadText(stderrOutput);
548-
if (
549-
allowSingletonRecovery &&
550-
CHROME_SINGLETON_IN_USE_PATTERN.test(stderrOutput) &&
551-
clearStaleChromeSingletonLocks(userDataDir)
552-
) {
553-
log.warn(
554-
`Removed stale Chromium Singleton* locks for profile "${profile.name}" and retrying launch.`,
562+
if (!launchHttpReachable) {
563+
let finalDiagnostic: ChromeCdpDiagnostic | null = null;
564+
let diagnosticErrorText: string | null = null;
565+
try {
566+
finalDiagnostic = await diagnoseChromeCdp(
567+
profile.cdpUrl,
568+
CHROME_REACHABILITY_TIMEOUT_MS,
569+
CHROME_WS_READY_TIMEOUT_MS,
555570
);
556-
await terminateChromeForRetry(proc, userDataDir);
557-
return await launchOnceAndWait(false);
571+
} catch (err) {
572+
diagnosticErrorText = `CDP diagnostic failed: ${safeChromeCdpErrorMessage(err)}.`;
558573
}
559-
const stderrHint = redactedStderrOutput
560-
? `\nChrome stderr:\n${redactedStderrOutput.slice(0, CHROME_STDERR_HINT_MAX_CHARS)}`
561-
: "";
562-
const launchHints = chromeLaunchHints({ stderrOutput, resolved, profile, launchOptions });
563-
try {
564-
proc.kill("SIGKILL");
565-
} catch {
566-
// ignore
574+
if (diagnosticShowsChromeHttpDiscovery(finalDiagnostic)) {
575+
launchHttpReachable = true;
576+
}
577+
const diagnosticText = finalDiagnostic
578+
? formatChromeCdpDiagnostic(finalDiagnostic)
579+
: (diagnosticErrorText ?? "CDP diagnostic failed.");
580+
if (launchHttpReachable) {
581+
log.debug(diagnosticText);
582+
} else {
583+
const stderrOutput =
584+
normalizeOptionalString(Buffer.concat(stderrChunks).toString("utf8")) ?? "";
585+
const redactedStderrOutput = redactToolPayloadText(stderrOutput);
586+
if (
587+
allowSingletonRecovery &&
588+
CHROME_SINGLETON_IN_USE_PATTERN.test(stderrOutput) &&
589+
clearStaleChromeSingletonLocks(userDataDir)
590+
) {
591+
log.warn(
592+
`Removed stale Chromium Singleton* locks for profile "${profile.name}" and retrying launch.`,
593+
);
594+
await terminateChromeForRetry(proc, userDataDir);
595+
return await launchOnceAndWait(false);
596+
}
597+
const stderrHint = redactedStderrOutput
598+
? `\nChrome stderr:\n${redactedStderrOutput.slice(0, CHROME_STDERR_HINT_MAX_CHARS)}`
599+
: "";
600+
const launchHints = chromeLaunchHints({ stderrOutput, resolved, profile, launchOptions });
601+
try {
602+
proc.kill("SIGKILL");
603+
} catch {
604+
// ignore
605+
}
606+
throw new Error(
607+
`Failed to start Chrome CDP on port ${profile.cdpPort} for profile "${profile.name}". ${diagnosticText}${launchHints}${stderrHint}`,
608+
);
567609
}
568-
throw new Error(
569-
`Failed to start Chrome CDP on port ${profile.cdpPort} for profile "${profile.name}". ${diagnosticText}${launchHints}${stderrHint}`,
570-
);
571610
}
572611

573612
const pid = proc.pid ?? -1;

0 commit comments

Comments
 (0)