fix: cherry-pick B1-B10 audit real gaps (#2406 batch close) #2442

Merged
alexey-pelykh merged 10 commits into main from b10-cleanup-real-gaps
Apr 20, 2026

@alexey-pelykh

Summary

Cherry-picks the confirmed-real gaps from the #2406 B1-B10 audit.
False positives were filtered out in the HQ B1-B10 cleanup pass (hq@fbdc14b, hq@b2e4633, hq@74ef389)
before this PR. Tier 2 ambiguous cases were investigated and dispositioned
(EXCLUDE-GUT / EXTRACT) in a separate HQ commit.

Production (2)

Tests (7 commits)

Follow-up (1)

  • 2f6dfcb3be — fix(config): regenerate base schema + add label after deferralTimeoutMs port (required after a2b6f35419 to keep schema snapshot + label parity tests green)

Rebrand adaptations

Several cherry-picks required the openclaw → remoteclaw rebrand in new files or fixtures:

  • path-policy.test.ts: C:\Users\User\OpenClaw → RemoteClaw
  • reply-delivery.ts + test: openclaw/plugin-sdk/mattermost imports, OpenClawConfig, temp dir prefix
  • redact-snapshot.schema.test.ts: types.openclaw → types.remoteclaw, OpenClawSchema → RemoteClawSchema, OpenClawConfig → RemoteClawConfig, ~/.openclaw → ~/.remoteclaw
  • logger.browser-import.test.ts: resolvePreferredOpenClawTmpDir → resolvePreferredRemoteClawTmpDir, /tmp/openclaw → /tmp/remoteclaw, openclaw.log → remoteclaw.log

Skipped / reclassified (handled separately on HQ)

11 entries were reclassified as EXCLUDE-GUT (fork-gutted area) or EXTRACT (fork-divergent, needs separate WI). See HQ disposition commit for details. Highlights:

  • 98e30dc2a3 cron-model-override.test.ts → EXCLUDE-GUT (imports fork-gutted model-fallback/skills/model-catalog)
  • 81b93b9ce0 subagent-announce.capture-completion-reply.test.ts → EXCLUDE-GUT (captureSubagentCompletionReply fn not in fork)
  • 18f15850e6 browser/proxy-files.test.ts → EXCLUDE-GUT (parent Playwright-gutted by 1aae7da)
  • 8d805a02fd zalouser (zca-constants + test-mocks) → EXTRACT (fork uses runtime wrappers instead)
  • aeb2adf240 redact-snapshot.restore.test.ts → EXTRACT (fork has richer inline tests)
  • 59bcac472e mattermost/index.test.ts → EXTRACT (registrationMode: "setup-only" feature not in fork)
  • 86a3149b2e .npmignore → SKIP (fork uses whitelist files field, not blacklist)
  • d06cc77f38 server-context.ensure-browser-available → EXCLUDE-GUT (Playwright-gutted)
  • Plus 5 more Tier 2 EXTRACT classifications for fork-divergent extension internals

Test plan

  • pnpm check passes (format + tsgo + lint + custom gates)
  • pnpm test passes (7010 passed, 3 skipped)
  • -x attribution visible on each cherry-picked commit
  • Rebrand leakage gate (scripts/check-no-remoteclaw-ai.mjs) passes

Closes #2406 (all 32 flagged entries now dispositioned: 9 landed, 5 already FP'd in prior HQ steps, 18 classified on HQ).

kevinWangSheng and others added 10 commits April 19, 2026 21:28
…aw#30766)

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
(cherry picked from commit ef89b48)
When HTTP_PROXY / HTTPS_PROXY / ALL_PROXY environment variables are set,
CDP connections to localhost/127.0.0.1 can be incorrectly routed through
the proxy (e.g. via global-agent or undici proxy dispatcher), causing
browser control to fail.

Fix:
- New cdp-proxy-bypass module with utilities for direct localhost connections
- WebSocket (ws) CDP connections: pass explicit http.Agent to bypass any
  global proxy agent patching
- fetch-based CDP probes: wrap in withNoProxyForLocalhost() to temporarily
  set NO_PROXY for the duration of the call
- Playwright connectOverCDP: wrap in withNoProxyForLocalhost() since
  Playwright reads env vars internally
- 13 new tests covering getDirectAgentForCdp, hasProxyEnv, and
  withNoProxyForLocalhost (env save/restore, error recovery)

(cherry picked from commit c96234b)
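
The env save/restore behaviour described for withNoProxyForLocalhost() can be illustrated with a minimal sketch. This is not the actual cdp-proxy-bypass module: the `Env` type, the explicit `env` parameter (used here so the sketch does not mutate the real process environment), the handling of lower-case variable names, and the appended host list are all assumptions made for illustration.

```typescript
// Hypothetical sketch; the real cdp-proxy-bypass module is not shown in this PR.
type Env = Record<string, string | undefined>;

const PROXY_KEYS = ["HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY"];
const LOCAL_HOSTS = "localhost,127.0.0.1";

// True when any proxy variable is set. Checking the lower-case spelling
// as well is an assumption of this sketch.
function hasProxyEnv(env: Env): boolean {
  return PROXY_KEYS.some((key) => Boolean(env[key] || env[key.toLowerCase()]));
}

// Runs fn with NO_PROXY extended to cover localhost, then restores the
// previous value even if fn throws or rejects.
async function withNoProxyForLocalhost<T>(env: Env, fn: () => Promise<T>): Promise<T> {
  if (!hasProxyEnv(env)) {
    return fn(); // no proxy configured, nothing to bypass
  }
  const saved = env.NO_PROXY;
  env.NO_PROXY = saved ? `${saved},${LOCAL_HOSTS}` : LOCAL_HOSTS;
  try {
    return await fn();
  } finally {
    if (saved === undefined) {
      delete env.NO_PROXY;
    } else {
      env.NO_PROXY = saved;
    }
  }
}
```

A caller would pass `process.env` and wrap the fetch-based probe or the connectOverCDP call in the callback. Because the environment is process-global, overlapping calls could restore a stale value; this sketch does not attempt to handle that case.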
…w#44021)

Merged via squash.

Prepared head SHA: 856f11f
Co-authored-by: LyleLiu666 <31182860+LyleLiu666@users.noreply.github.com>
Co-authored-by: mukhtharcm <56378562+mukhtharcm@users.noreply.github.com>
Reviewed-by: @mukhtharcm

(cherry picked from commit c965049)
Extract applyMSTeamsWebhookTimeouts + constants from monitor.ts into
standalone webhook-timeouts.ts module and update test import. Subset
of upstream dbd26e4 — fork takes only the msteams-local changes.

(cherry picked from commit dbd26e4)
Closes openclaw#47711

After a SIGUSR1 gateway reload aborts in-flight subagent LLM calls, the gateway now scans for orphaned sessions and sends a synthetic resume message to restart their work. Also makes the deferral timeout configurable via gateway.reload.deferralTimeoutMs (default: 5 minutes, up from 90s).

(cherry picked from commit 304703f)
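
As a rough illustration of the new setting, the sketch below resolves gateway.reload.deferralTimeoutMs with the 5-minute default stated above. The `GatewayConfigSketch` shape, the `resolveDeferralTimeoutMs` helper, and the rule that invalid values fall back to the default are hypothetical; only the field path and the default come from the commit message.

```typescript
// Hypothetical config shape; only the field path gateway.reload.deferralTimeoutMs
// and the 5-minute default are taken from the commit message.
interface GatewayConfigSketch {
  gateway?: {
    reload?: {
      deferralTimeoutMs?: number;
    };
  };
}

const DEFAULT_DEFERRAL_TIMEOUT_MS = 5 * 60 * 1000; // 5 minutes (previously 90s)

// Returns the configured timeout, or the default when the field is unset.
// Falling back on negative or non-finite values is an assumption of this sketch.
function resolveDeferralTimeoutMs(config: GatewayConfigSketch): number {
  const value = config.gateway?.reload?.deferralTimeoutMs;
  if (typeof value === "number" && Number.isFinite(value) && value >= 0) {
    return value;
  }
  return DEFAULT_DEFERRAL_TIMEOUT_MS;
}
```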
…ess (openclaw#43385) thanks @Huntterxx

Merged after review.

Small, scoped fix: treat 0-byte Edge TTS output as failure so provider fallback can continue.

(cherry picked from commit 946c24d)
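
The idea of treating empty output as a failure can be sketched as a fallback loop. The `TtsProvider` interface, `synthesizeWithFallback`, and the error messages below are hypothetical and only illustrate the pattern; the fork's actual TTS code is not shown in this PR.

```typescript
// Hypothetical provider shape; the real TTS provider interface is not shown here.
interface TtsProvider {
  id: string;
  synthesize(text: string): Promise<Uint8Array>;
}

interface TtsResult {
  providerId: string;
  audio: Uint8Array;
}

// Tries each provider in order. A 0-byte result is converted into an error
// so the loop moves on to the next provider instead of returning silence.
async function synthesizeWithFallback(text: string, providers: TtsProvider[]): Promise<TtsResult> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      const audio = await provider.synthesize(text);
      if (audio.byteLength === 0) {
        throw new Error("empty audio output");
      }
      return { providerId: provider.id, audio };
    } catch (err) {
      failures.push(`${provider.id}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all TTS providers failed (${failures.join("; ")})`);
}
```

Without the byteLength check, the first provider's empty buffer would count as success and later providers would never be tried.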
…Ms port

Follow-up to cherry-pick 304703f (fix: resume orphaned subagent
sessions after SIGUSR1 reload):

- Regenerate src/config/schema.base.generated.ts (picks up new
  gateway.reload.deferralTimeoutMs field + help text)
- Add FIELD_LABELS entry for gateway.reload.deferralTimeoutMs
  (schema.help.quality test enforces label parity)
- Update infra-runtime timeout test to advance 5m instead of 90s
  (default raised by the fix)
alexey-pelykh merged commit 028566c into main Apr 20, 2026
13 checks passed
alexey-pelykh deleted the b10-cleanup-real-gaps branch April 20, 2026 06:21
alexey-pelykh added a commit that referenced this pull request Apr 22, 2026
…d-turn termination (#2345) (#2460)

`server-reload-handlers.ts::getActiveCounts()` and
`server.impl.ts::setPreRestartDeferralCheck()` both computed gateway
capacity / restart-deferral as `queueSize + pendingReplies`, silently
omitting active CLI agent subprocess runs. Result: a config reload or
SIGUSR1-triggered restart would fire while CLI agents were mid-turn,
killing live subprocess runs.

Fix: include `getActiveSessionRunCount()` from
`src/agents/session-run-registry.ts` in both sums. The registry is
already populated by `ChannelBridge` (register at `:160`, unregister
at `:349`) and was designed as the replacement for the
`getActiveEmbeddedRunCount()` stub that was gutted with the Pi-embedded
execution engine.

Changes:
- `src/gateway/server-reload-handlers.ts`: import
  `getActiveSessionRunCount`; add `activeCliRuns` field to
  `getActiveCounts()` return; fold into `totalActive` sum; extend
  `formatActiveDetails()` with an `activeCliRuns > 0` branch so the
  deferral log reports "N active CLI run(s)" alongside operations
  and replies. Field name `activeCliRuns` (not `activeRuns`) to
  disambiguate from the per-channel `activeRuns` concept used in
  `channels/run-state-machine.ts`, `gateway/channel-health-policy.ts`,
  and related modules.
- `src/gateway/server.impl.ts`: import `getActiveSessionRunCount`;
  add `+ getActiveSessionRunCount()` to the `setPreRestartDeferralCheck`
  callback arrow.
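
The change to the capacity sum can be sketched roughly as follows. The counter names `getTotalQueueSize`, `getTotalPendingReplies`, and `getActiveSessionRunCount` and the field `activeCliRuns` come from this commit message; the `CountSources` parameter, the other field names, and the wording of the operations and replies labels are assumptions made so the example is self-contained.

```typescript
// Hypothetical stand-in for the module-level counter functions.
interface CountSources {
  getTotalQueueSize(): number;
  getTotalPendingReplies(): number;
  getActiveSessionRunCount(): number;
}

interface ActiveCounts {
  queueSize: number;
  pendingReplies: number;
  activeCliRuns: number;
  totalActive: number;
}

// Before the fix the total was queueSize + pendingReplies only, so a gateway
// with nothing but live CLI runs looked idle and a restart could proceed.
function getActiveCounts(sources: CountSources): ActiveCounts {
  const queueSize = sources.getTotalQueueSize();
  const pendingReplies = sources.getTotalPendingReplies();
  const activeCliRuns = sources.getActiveSessionRunCount();
  return {
    queueSize,
    pendingReplies,
    activeCliRuns,
    totalActive: queueSize + pendingReplies + activeCliRuns,
  };
}

// Builds the detail string for the deferral log; zero counters are omitted.
function formatActiveDetails(counts: ActiveCounts): string {
  const details: string[] = [];
  if (counts.queueSize > 0) details.push(`${counts.queueSize} operation(s)`);
  if (counts.pendingReplies > 0) details.push(`${counts.pendingReplies} reply(ies)`);
  if (counts.activeCliRuns > 0) details.push(`${counts.activeCliRuns} active CLI run(s)`);
  return details.join(", ");
}
```

With only one CLI run in flight, `totalActive` is 1 rather than 0, which is the condition the restart-deferral check relies on.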

Note on stale issue body: #2345 prescribes editing a hardcoded
`embeddedRuns = 0` in `server-reload-handlers.ts:150-158` and
replacing a `getActiveEmbeddedRunCount()` import + call at
`server.impl.ts:302`. Neither exists in the current tree — both were
removed during the pi-embedded-runner gut (`f749ed3fb6`, #2146/#2273)
and the subsequent cherry-pick cleanup (`028566c42b`, #2442). The
semantic bug the issue names (capacity sums miscount active CLI runs
as zero) is still present as an *omission* rather than a hardcoded
literal, and this PR fixes it.

Verification:
- `pnpm check` (format + tsgo + lint + project-specific lints) → exit 0
- `pnpm vitest run --config vitest.unit.config.ts src/infra/restart
  src/infra/infra-runtime src/agents/session-run-registry
  src/gateway/server.impl` → 5 files, 71 tests, all passed
- Rescan: `git grep "getTotalQueueSize() + getTotalPendingReplies"`
  returns only the one site I updated; no other sum call sites in the
  codebase need the same fix
- Adversarial validation (fresh-context subclaude): CLEAN verdict on
  8 AC + 11 adversarial checks — confirms `getActiveSessionRunCount`
  is LIVE (not a stub), registry is actively populated by ChannelBridge,
  no import cycle possible (session-run-registry has zero imports),
  `formatActiveDetails` correctly handles multi-counter output

Closes #2345

Refs: #2089

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Audit: upstream fixes silently lost in sync (v2026.2.25 → v2026.3.22, 32 gaps)

9 participants