feat(daemon): add POST /workspace/reload-env for hot-reloading env vars and session auth#4924
Conversation
qwen-code-ci-bot
left a comment
There was a problem hiding this comment.
Thanks for the PR, @doudouOUC!
The PR body doesn't follow the pull request template. Several required sections are missing or renamed:
| Template requires | PR has |
|---|---|
## What this PR does |
## Summary |
## Why it's needed |
(missing) |
## Reviewer Test Plan (+ ### How to verify, ### Evidence, ### Tested on) |
## Test plan (different format) |
## Risk & Scope |
(missing) |
## Linked Issues |
(missing) |
<details> Chinese translation |
(missing) |
Please update the PR description to match the template headings. The "Reviewer Test Plan" section is especially important — it's how reviewers confirm the change works, and without it review tends to stall.
中文说明
感谢 PR,@doudouOUC!
PR 描述没有按照 PR 模板 的格式来写,部分必填章节缺失或改名了:
| 模板要求 | PR 实际 |
|---|---|
## What this PR does |
## Summary |
## Why it's needed |
(缺失) |
## Reviewer Test Plan(含 ### How to verify、### Evidence、### Tested on) |
## Test plan(格式不同) |
## Risk & Scope |
(缺失) |
## Linked Issues |
(缺失) |
<details> 中文翻译 |
(缺失) |
请按照模板标题更新 PR 描述。Reviewer Test Plan 部分尤为重要——它是 reviewer 确认改动是否生效的依据,缺少的话 review 容易卡住。
wenshao
left a comment
There was a problem hiding this comment.
[Critical] server.test.ts:1129 — [typecheck] TS2322: workspaceReloadEnv mock 返回类型不匹配,导致类型分配错误。
[Critical] server.test.ts:1157 — [typecheck] TS2322: reloadEnv SDK mock 返回类型不匹配,导致类型分配错误。
(以上两条因引用行未在 PR diff 中变更,无法作为 inline comment,特此以 body 方式提供。)
[Critical] settings.ts reloadEnvironment 零测试覆盖 — 新增约 130 行核心逻辑(RELOAD_EXCLUDED_KEYS 过滤、删除语义、跟踪状态变更、CLOUD_SHELL 模式)无任何测试。
[Critical] acpAgent.ts workspaceReloadEnv handler 零测试覆盖 — 级联到 ACP 子进程的 handler(约 35 行)无测试。
Review Response SummaryFix commit: 1f0cf29
4 fixed / 4 pushed back / 2 false positive / 2 deferred |
…rs and session auth Add a new daemon endpoint that reloads environment variables from .env files and settings.env without restarting the daemon, and refreshes auth on all idle sessions so both new and existing sessions immediately use updated credentials (e.g. API keys). Core changes: - settings.ts: reloadEnvironment() with file-snapshot-based deletion tracking (lastReloadSnapshot seeded at boot), RELOAD_EXCLUDED_KEYS safety list, and force-write semantics for explicit reload - Session.ts: isIdle() method with cancel-race protection via pendingPromptCompletion null-reset - acpAgent.ts: workspaceReloadEnv extMethod handler with Promise.allSettled session refresh, modelProviders reload, and skipLoadEnvironment to preserve diff accuracy - workspace-service: EnvReloadResult/Response types, facade with 30s timeout and best-effort child forwarding - server.ts: POST /workspace/reload-env route behind strict mutation gate - capabilities.ts: conditional workspace_reload_env capability - SDK: env_reloaded event type, DaemonClient.reloadEnv(), barrel exports 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
1. SessionNotFoundError now reported as childError instead of silently swallowed, so callers can distinguish "child not running" from "child reloaded 0 sessions" 2. Remove duplicate EnvReloadResult — types.ts re-exports from settings.ts 3. Move pendingPrompt=null to finally block — prevents isIdle() from returning false permanently if #executePrompt throws 4. Skip deletion pass when .env file read fails — transient I/O failure should not wipe all tracked env vars 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
1f0cf29 to
5e3c0ca
Compare
1. EnvReloadResult: export type re-export doesn't create local binding; add import type before re-exporting 2. dotEnvReadFailed: variable declaration lost during rebase; restore 3. pendingPrompt: clear in try block before drain calls (drains check pendingPrompt and early-return if set), keep in finally for error path 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
LD_AUDIT provides the same code-execution primitive as LD_PRELOAD via the dynamic linker's audit interface. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
|
Qwen Code review did not complete successfully: Qwen review timed out after 55 minutes. See workflow logs. |
… read failure 1. isIdle() now checks notificationProcessing and notificationAbortController to prevent refreshAuth during background notification model turns 2. When .env file read fails, preserve dotEnvSourcedKeys and lastReloadSnapshot so the next successful reload can still detect key deletions 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
…owing on read failure 1. Add BASH_ENV and ENV to RELOAD_EXCLUDED_KEYS — shell-interpreter injection vectors analogous to LD_PRELOAD for Bash/POSIX sh 2. When .env file read fails, use lastReloadSnapshot as the shadow set for settings.env to prevent keys normally shadowed by .env from overwriting the still-live .env values in process.env 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
wenshao
left a comment
There was a problem hiding this comment.
No issues found. LGTM! ✅ — qwen3.7-plus via Qwen Code /review
| } | ||
| // settings.env is always readable (from settings.json, not a file), | ||
| // so its tracking set is always updated. | ||
| settingsEnvSourcedKeys.clear(); |
There was a problem hiding this comment.
[Suggestion] settingsEnvSourcedKeys is unconditionally cleared and repopulated from newSettingsEnvKeys, even when dotEnvReadFailed is true. When .env read fails, the shadow guard (line 1010) prevents settings.env keys previously shadowed by .env from entering newSettingsEnvKeys. This causes settingsEnvSourcedKeys to lose provenance tracking for those keys. On the next successful reload, the deletion pass cannot detect removal of those settings.env keys.
Impact: If a user removes a key from settings.env during a transient .env I/O failure window, the stale value persists in process.env indefinitely.
| settingsEnvSourcedKeys.clear(); | |
| if (!dotEnvReadFailed) { | |
| settingsEnvSourcedKeys.clear(); | |
| for (const key of newSettingsEnvKeys.keys()) { | |
| settingsEnvSourcedKeys.add(key); | |
| } | |
| } |
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Thanks — won't take this one. The scenario requires a transient .env I/O failure coinciding with a settings.env key deletion — an extremely unlikely double-fault. Adding the guard would mean settings.env tracking also goes stale on .env read failure, which creates its own edge cases.
chiga0
left a comment
There was a problem hiding this comment.
Code Review Overview (AI Generated)
PR: #4924 feat(daemon): add POST /workspace/reload-env for hot-reloading env vars and session auth
Type: New Feature
Author: @doudouOUC
Change size: +446/-17 across 17 files, 6 commits
HEAD: 4075f287 (commit 6: "add BASH_ENV/ENV to exclusions and fix settings.env shadowing on read failure")
Independent Review Result (Phase 1 — Blind): 0 findings
Phase 1 blind review of all 17 file diffs at HEAD found no issues. The feature is well-designed and thoroughly implemented:
reloadEnvironment()correctly handles.envread failure (skip deletion, preserve snapshot), settings.env shadowing (snapshot as shadow set), andRELOAD_EXCLUDED_KEYSfilteringisIdle()checks all 6 busy-state fields (pendingPrompt, pendingPromptCompletion, cronProcessing, cronAbortController, notificationProcessing, notificationAbortController)pendingPrompt = nullproperly infinallyblock — no leak on exception- Settings lock serializes concurrent reload calls
Promise.allSettledisolates per-session auth refresh failures- SDK types, events, and client method all correctly aligned with server response
Cross-Validation
wenshao's findings (4 rounds of CR, APPROVED at HEAD):
Round 1 (commit 1) — 12 items:
| # | Severity | Finding | Resolution | Status |
|---|---|---|---|---|
| W1 | Critical | Snapshot seeds ALL keys → first reload deletes shell-exported vars | Designed — intentional for ACP child inheritance | ✅ |
| W2 | Critical | Force-write overrides shell-exported vars | Designed — explicit reload contract | ✅ |
| W3 | Critical | SessionNotFoundError silently swallowed | Fixed commit 2: childError = 'ACP child not running' |
✅ |
| W4 | Critical | Non-atomic config refresh (modelProviders + auth) | Designed — rollback scope too large, Promise.allSettled isolates | ✅ |
| W5 | Critical | TS2367 rate_limit test array | False positive — verified by npx tsc --noEmit |
✅ |
| W6 | Critical | TS2367 workspace_reload_env test array | False positive | ✅ |
| W7 | Suggestion | findEnvFile cwd inconsistency (process.cwd vs workspaceCwd) | Deferred — pre-existing, child cwd matches | ✅ |
| W8 | Suggestion | EnvReloadResult type duplication | Fixed commit 2: import type + export type |
✅ |
| W9 | Suggestion | Weak type guard for env_reloaded event | Designed — follows settings_changed pattern |
✅ |
| W10 | Suggestion | pendingPrompt leak on exception | Fixed commit 2: moved to finally |
✅ |
| W11 | Suggestion | SSE event missing sessionsSkipped/childError | Designed — SSE best-effort, HTTP authoritative | ✅ |
| W12 | Suggestion | Deletion on .env read failure wipes keys | Fixed commit 2: skip deletion when read fails | ✅ |
Round 2 (commit 2) — 3 items:
| # | Severity | Finding | Resolution | Status |
|---|---|---|---|---|
| W13 | Critical | dotEnvReadFailed undeclared |
Fixed commit 3 (pre-existing) | ✅ |
| W14 | Critical | export type doesn't create local binding |
Fixed commit 3 | ✅ |
| W15 | Suggestion | Missing LD_AUDIT in exclusions | Fixed commit 4 | ✅ |
Round 3 (commit 5) — 4 items:
| # | Severity | Finding | Resolution | Status |
|---|---|---|---|---|
| W16 | Critical | Missing BASH_ENV/ENV shell injection vectors | Fixed commit 6 | ✅ |
| W17 | Critical | settings.env shadowing on .env read failure | Fixed commit 6: snapshot as shadow set | ✅ |
| W18 | Suggestion | Silent error swallowing on .env read failure | Deferred — matches loadEnvironment behavior | ✅ |
| W19 | Suggestion | refreshAuth rejection reason discarded | Deferred — { id, reason } structure for follow-up |
✅ |
Round 4 (HEAD) — 1 item:
| # | Severity | Finding | Resolution | Status |
|---|---|---|---|---|
| W20 | Suggestion | Missing behavioral tests for new code paths | Deferred — current PR has capability + mock tests | ✅ |
CI bot finding (at HEAD):
| # | Severity | Finding | Resolution | Status |
|---|---|---|---|---|
| B1 | Suggestion | settingsEnvSourcedKeys unconditionally cleared on .env read failure |
Author rejected — extremely unlikely double-fault scenario (transient I/O failure + concurrent key deletion), guard would make settings.env tracking stale too | ✅ |
Additional Audit Coverage
Areas I independently checked that go beyond existing findings:
-
Deletion logic correctness (Round 5.5 — Data Provenance Tracing):
previouslyKnownunion oflastReloadSnapshot.keys(),dotEnvSourcedKeys,settingsEnvSourcedKeyscorrectly covers all provenance sources. Deletion skipsRELOAD_EXCLUDED_KEYS. Verified: if.envremoves a key that was previously set,process.env[key]is deleted; if a key was never tracked, it's untouched. -
settings.env shadowing on read failure (Round 5.5 — State Field Initialization): When
.envread fails,lastReloadSnapshotacts as the shadow set for settings.env keys. Keys previously shadowed by.env(in snapshot) are excluded fromnewSettingsEnvKeys, preventing overwrite of still-live.envvalues. On next successful reload,.envkeys are correctly re-parsed. Verified correct. -
Concurrency model (Round 5.5 — Handler Parallelism):
withSettingsLockinrunQwenServe.tsserializes concurrentreloadDaemonEnvcalls.Promise.allSettledinacpAgent.tshandler isolates per-session failures.isIdle()correctly returns false during post-prompt drain phase (pendingPromptstill set beforefinally). -
Settings.lock scope (Round 5.5): The lock wraps both
loadSettings(skipLoadEnvironment)andreloadEnvironment()atomically, preventing interleaved read-modify-write onprocess.env. -
loadSettingsbackward compatibility (Round 5.5 — Data Structure Blast Radius): NewLoadSettingsOptionsoverload preserves existingbooleancallers.opts.consumeCorruptionEnvVars ?? trueandopts.skipLoadEnvironmentboth correctly gated. -
SSE event design (Round 6):
env_reloadedis a workspace-level control event, not per-session.reduceDaemonSessionEventreturnsbase(passthrough).asKnownDaemonEventvalidates data is object. Event published regardless of child forwarding outcome — clients always notified of daemon reload. -
RELOAD_EXCLUDED_KEYS completeness (Round 6 — security): Covers dynamic linker injection (LD_PRELOAD, LD_AUDIT, LD_LIBRARY_PATH, DYLD_INSERT_LIBRARIES, DYLD_LIBRARY_PATH), shell injection (BASH_ENV, ENV), Node.js vectors (NODE_OPTIONS, NODE_PATH, NODE_TLS_REJECT_UNAUTHORIZED), path variables (PATH, HOME, TMPDIR, TMP, TEMP), and project-level exclusions.
This review was generated by QoderWork AI
already have 2 approved.3ks.
| * total has reached `clientBudget`. Caller should free a slot | ||
| * (disconnect another server) before retrying. | ||
| */ | ||
| export interface DaemonEnvReloadResponse { |
There was a problem hiding this comment.
[Suggestion] JSDoc placement error: the multi-line JSDoc comment above (lines 1101–1113) documents DaemonMcpRestartResult (skip reasons: 'in_flight', 'disabled', 'budget_would_exceed'), but DaemonEnvReloadResponse was inserted between that JSDoc and the type it describes. IDE tooling will associate the wrong documentation with each type.
| export interface DaemonEnvReloadResponse { | |
| export interface DaemonEnvReloadResponse { | |
| updatedKeys: string[]; | |
| removedKeys: string[]; | |
| childReloaded: boolean; | |
| sessionsRefreshed?: string[]; | |
| sessionsSkipped?: string[]; | |
| childError?: string; | |
| } |
Move DaemonEnvReloadResponse above the JSDoc block so the comment correctly attaches to DaemonMcpRestartResult.
— qwen3.7-max via Qwen Code /review
| // Runtime MCP server mutation ext-methods | ||
| workspaceMcpRuntimeAdd: 'qwen/control/workspace/mcp/runtime-add', | ||
| workspaceMcpRuntimeRemove: 'qwen/control/workspace/mcp/runtime-remove', | ||
| workspaceReloadEnv: 'qwen/control/workspace/reload_env', |
There was a problem hiding this comment.
[Suggestion] reload_env uses snake_case while every other multi-word ext-method in this object uses kebab-case (runtime-add, runtime-remove, mcp/restart). The HTTP route (/workspace/reload-env) correctly uses kebab-case, making the two identifiers inconsistent.
| workspaceReloadEnv: 'qwen/control/workspace/reload_env', | |
| workspaceReloadEnv: 'qwen/control/workspace/reload-env', |
— qwen3.7-max via Qwen Code /review
| if (RELOAD_EXCLUDED_KEYS.has(key)) continue; | ||
| if ( | ||
| !isHomeScopedEnvFile && | ||
| PROJECT_ENV_HARDCODED_EXCLUSIONS.includes(key) |
There was a problem hiding this comment.
[Suggestion] PROJECT_ENV_HARDCODED_EXCLUSIONS.includes(key) is dead code here. RELOAD_EXCLUDED_KEYS is constructed by spreading PROJECT_ENV_HARDCODED_EXCLUSIONS (line 95), so any key in that array is already filtered by the preceding RELOAD_EXCLUDED_KEYS.has(key) check. This .includes() call can never trigger its continue — it's a wasted O(n) array scan on every parsed env key.
Remove both redundant PROJECT_ENV_HARDCODED_EXCLUSIONS.includes(key) blocks (here and at line ~1004).
— qwen3.7-max via Qwen Code /review
| * shell-exported variables are never touched. | ||
| * Fully synchronous — no TOCTOU window between delete and re-add. | ||
| */ | ||
| export function reloadEnvironment( |
There was a problem hiding this comment.
[Suggestion] reloadEnvironment() duplicates ~30 lines of .env parsing, exclusion filtering, and settings.env fallback logic from loadEnvironment() (lines ~870–940): same findEnvFile call, same dotenv.parse, same excludedVars/isHomeScopedEnvFile/isQwenScopedEnvFile branching. If the filtering logic drifts (e.g., a new exclusion category added to one but not the other), reload and boot will produce different environment states.
Consider extracting the shared parsing + filtering into a private helper (e.g., parseEnvSources(settings, workspaceCwd)) that both functions call, with each caller applying its own write semantics.
Additionally, this function — the most stateful in the PR, managing 3 module-level tracking sets, a snapshot map, 2 env-file sources, and exclusion lists — has zero diagnostic logging. Adding structured debug logging at key decision points would significantly ease future debugging.
— qwen3.7-max via Qwen Code /review
| } | ||
| } | ||
|
|
||
| publishWorkspaceEvent({ |
There was a problem hiding this comment.
[Suggestion] The child process returns its own updatedKeys and removedKeys from its reloadEnvironment call, but only sessionsRefreshed and sessionsSkipped are extracted from childResult. The child's env-change perspective is silently discarded — only the daemon's own keys appear in the HTTP response and SSE event.
The daemon and child can see different env deltas (e.g., the child may have divergent keys from tools set during session execution). Consider surfacing child keys in the response:
| publishWorkspaceEvent({ | |
| publishWorkspaceEvent({ | |
| type: 'env_reloaded', | |
| data: { ...result, childReloaded, sessionsRefreshed }, | |
| originatorClientId: ctx.originatorClientId, | |
| }); | |
| return { | |
| ...result, | |
| childReloaded, | |
| sessionsRefreshed, | |
| sessionsSkipped, | |
| childError, | |
| childUpdatedKeys: childResult?.updatedKeys, | |
| childRemovedKeys: childResult?.removedKeys, | |
| }; |
— qwen3.7-max via Qwen Code /review
|
[Suggestion] 当前 PR scope(env vars + modelProviders + auth)是合理的增量起点。注意到 当前各配置类型的热加载状态:
建议 follow-up 优先级:
This comment was generated by QoderWork AI |
Summary
POST /workspace/reload-envendpoint that reloads environment variables from.env/settings.envwithout restarting the daemonenv_reloadedSSE event type +DaemonClient.reloadEnv()convenience methodKey Design Decisions
lastReloadSnapshotseeded at boot captures all parsed file keys (not just written ones), enabling correct key deletion in ACP children that inherit daemon env without trackingRELOAD_EXCLUDED_KEYS: Safety list blocksQWEN_CLI_ENTRY,NODE_OPTIONS,LD_PRELOAD,PATH,HOMEetc. from being overwritten or deletedskipLoadEnvironment: PreventsloadSettings()→loadEnvironment()from writing new keys beforereloadEnvironment()computes the diffworkspace_reload_envonly advertised whenreloadDaemonEnvdependency is configuredisIdle()checkspendingPromptCompletion(reset to null in finally block) to prevent refreshAuth during prompt cleanupTest plan
npx vitest run packages/cli/src/serve/server.test.ts— capability registration + drift insurancenpx vitest run packages/cli/src/serve/workspace-service/__tests__/— facade mocksettings.json env.OPENAI_API_KEY→curl -X POST /workspace/reload-env→ verify old session uses new key🤖 Generated with Qwen Code