Merge/upstream safe commits #24
* ci: skip unnecessary release and SDK checks
* ci: guard release skip classification for non-pr events
* ci: harden release sync skip gate
* ci: refine release sync skip fallback

* test: stabilize main e2e flakes
* test: stabilize macos e2e assertions
… buffer; lazy message count (QwenLM#3897) * perf(core): drop full-file readline + pool tail buffer in listSessions `listSessions` previously called `countSessionMessages` per file, which streamed the entire JSONL through `readline` to count unique user/assistant UUIDs. For a project with N sessions averaging M bytes each, every /resume open paid O(N · M) wall time before showing the picker — by far the dominant cost once a project accumulated many multi-MB sessions. This change: - Drops the per-file count from listSessions / findSessionsByTitle. `SessionListItem.messageCount` is now optional. Callers that need a count call the new public `SessionService.countSessionMessages (sessionId)` lazily — typically only when a SessionPreview panel is about to display the badge. - Pools a single 64KB tail-read buffer across the per-file metadata reads in listSessions / findSessionsByTitle, mirroring the pattern in claude-code's `enrichLogs`. The two helpers in `sessionStorageUtils` (`readLastJsonStringFieldSync` and `readLastJsonStringFieldsSync`) accept an optional caller-owned scratch buffer; one-off callers (rename, single-session lookup) pass nothing and keep the original alloc behaviour. - Updates SessionPicker's row metadata to omit the "N messages" segment when `messageCount` isn't available, keeping the visual layout intact. Tests: - Pin that listSessions does not populate messageCount (regression guard against silently re-introducing the per-file scan). - Smoke test the buffer-pool plumbing — same caller-owned buffer handed to two reads of different file sizes returns both correct values without state bleed. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * perf(core): re-anchor custom_title; drop Phase-2 full-file scan Tightens the title-write / title-read invariant so the picker's metadata read is bounded to a fixed 2 × 64KB per file, regardless of session length, with no fallback that scales with file size. 
Writer (`ChatRecordingService`): - New `bytesSinceTitleAnchor` counter and `TITLE_REANCHOR_BYTES` threshold (32 KB, half of LITE_READ_BUF_SIZE). - `appendRecord` now updates the counter and, once it crosses the threshold while a title is set, re-appends a fresh `custom_title` record to EOF (`reanchorTitle`). The recursive append routes back through the same tracking path, which sees the title record and resets the counter to zero. - Sessions that never set a title pay zero overhead — the early return in `updateTitleAnchorTracking` short-circuits. Reader (`sessionStorageUtils`): - `readLastJsonStringFieldSync` / `readLastJsonStringFieldsSync` now read the file's last 64 KB, fall back to the first 64 KB on miss, and return `undefined` if neither contains the field. The previous Phase-2 streaming full-file scan (capped at 64 MB) is removed entirely. - The pooled scratch buffer (already optional) now backs both the tail and head reads — only one allocation per `listSessions` page even with the head fallback firing. - `MAX_FULL_SCAN_BYTES` constant deleted (unused). The two changes are coupled: the reader's tighter bound only works because the writer guarantees the title stays in tail. A title buried mid-file is now intentionally `undefined` (the picker falls back to `firstPrompt` for display) rather than triggering a full-file scan that would freeze the UI on long agent transcripts. Tests: - `chatRecordingService.customTitle.test.ts` — three scenarios pinning the re-anchor invariant: threshold trigger, no-spurious on no-title sessions, no-trigger on small write bursts. - `sessionStorageUtils.test.ts` — replaces Phase-2 tests with head-window fallback tests; pins the new "buried beyond both windows returns undefined" contract; updates buffer-pool reuse test to cover both tail and head reads with the same scratch. 
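The reader contract described above (tail window first, then head window, then `undefined`) can be sketched roughly as follows. This is a minimal illustration, not the code in `sessionStorageUtils`: `readFieldBounded` and `lastStringField` are hypothetical names, and parsing each line with `JSON.parse` is an assumption made for brevity; the real helpers may extract fields differently.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

const WINDOW_BYTES = 64 * 1024; // 64 KB window, as in the commit message

// Last complete JSONL record in `text` whose `field` is a string.
function lastStringField(text: string, field: string): string | undefined {
  let found: string | undefined;
  for (const line of text.split("\n")) {
    try {
      const value = (JSON.parse(line) as Record<string, unknown>)[field];
      if (typeof value === "string") found = value;
    } catch {
      // Blank, malformed, or cut off by the window edge: the line gets no vote.
    }
  }
  return found;
}

// Reads at most two windows (tail, then head); never scans the whole file.
function readFieldBounded(
  file: string,
  field: string,
  scratch: Buffer = Buffer.alloc(WINDOW_BYTES), // callers may pool this buffer
): string | undefined {
  const fd = fs.openSync(file, "r");
  try {
    const size = fs.fstatSync(fd).size;
    const len = Math.min(size, scratch.length);
    let n = fs.readSync(fd, scratch, 0, len, size - len);
    // Decode only the bytes actually read, so a pooled buffer cannot leak old data.
    const fromTail = lastStringField(scratch.toString("utf8", 0, n), field);
    if (fromTail !== undefined || size <= scratch.length) return fromTail;
    n = fs.readSync(fd, scratch, 0, len, 0);
    return lastStringField(scratch.toString("utf8", 0, n), field);
  } finally {
    fs.closeSync(fd);
  }
}

// Usage: one title near EOF, one at the start, one buried beyond both windows.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "bounded-read-"));
const title = JSON.stringify({ type: "custom_title", customTitle: "demo" }) + "\n";
const filler = (JSON.stringify({ type: "user", text: "x".repeat(1000) }) + "\n").repeat(100);
const write = (name: string, body: string): string => {
  const file = path.join(dir, name);
  fs.writeFileSync(file, body);
  return file;
};
const pooled = Buffer.alloc(WINDOW_BYTES);
const atTail = readFieldBounded(write("tail.jsonl", filler + title), "customTitle", pooled);
const atHead = readFieldBounded(write("head.jsonl", title + filler), "customTitle", pooled);
const buried = readFieldBounded(write("mid.jsonl", filler + title + filler), "customTitle", pooled);
console.log({ atTail, atHead, buried });
```

The same `pooled` buffer backs all three calls, which mirrors the one-allocation-per-page idea; the buried case comes back `undefined` rather than triggering a larger read.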
E2E coverage (35/35 scenarios on a separate harness against real fs): R1–R10 reader contract, M1–M4 multi-field, W1–W7 writer re-anchor, L1–L6 listing latency + regression pins, S1 stress (200 × 3 MB), C1 concurrency, I1–I5 writer/reader integration. Measured 2.6× speedup on listSessions(50) over 50 × 4 MB sessions vs the legacy `countSessionMessages` baseline; speedup scales linearly with average file size.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(core): cover CR review gaps for session-list perf path

- countSessionMessages: valid id counts unique user/assistant uuids and ignores other types/malformed lines; invalid id short-circuits without filesystem access; ENOENT degrades to 0 instead of bubbling.
- readLastJsonStringFieldsSync: mirror the single-field variant's scratch-buffer reuse test across a tail-hit then head-fallback to catch any decode that ignores bytesRead.
- ChatRecordingService re-anchor: legacy resumed session (source undefined) must omit titleSource on threshold-triggered re-anchor, not silently reclassify as 'manual'.

Co-authored-by: Qwen-Coder <noreply@alibabacloud.com>

* fix(core): count title-anchor bytes as utf-8 + reset on reanchor failure

`String.length` undercounts the on-disk size of multi-byte payloads (a CJK character is 1 UTF-16 unit but 3 UTF-8 bytes; most emoji are 2 UTF-16 units but 4 UTF-8 bytes), so a session full of CJK content could push as much as ~96KB of writes past the last anchor before the 32KB threshold thinks it has, silently drifting the title past the 64KB tail window the picker scans. Switch the byte counter to `Buffer.byteLength(..., 'utf8')` for parity with `jsonl.writeLine`.

Also reset `bytesSinceTitleAnchor` to 0 in the reanchor catch: without it, a failing reanchor pins the counter at the threshold and turns a single transient I/O fault into a per-record retry storm. One missed anchor is the right tradeoff: finalize() will re-emit on the next lifecycle event.
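The undercount is easy to reproduce in isolation. The sketch below is not the `ChatRecordingService` code; `appendsUntilReanchor` is a hypothetical helper that only compares how quickly a 32 KB threshold trips when the counter measures UTF-16 units versus UTF-8 bytes.

```typescript
import { Buffer } from "node:buffer";

const TITLE_REANCHOR_BYTES = 32 * 1024; // threshold named in the commit message

// How many appends of `line` it takes before the counter reports a re-anchor is due.
function appendsUntilReanchor(line: string, measure: (s: string) => number): number {
  let sinceAnchor = 0;
  let appends = 0;
  while (sinceAnchor < TITLE_REANCHOR_BYTES) {
    sinceAnchor += measure(line);
    appends++;
  }
  return appends;
}

// 1024 CJK characters plus a newline: 1025 UTF-16 units, 3073 UTF-8 bytes.
const line = "汉".repeat(1024) + "\n";
const bytesPerLine = Buffer.byteLength(line, "utf8");

const byUtf16 = appendsUntilReanchor(line, (s) => s.length);
const byUtf8 = appendsUntilReanchor(line, (s) => Buffer.byteLength(s, "utf8"));

console.log({
  bytesPerLine,
  bytesOnDiskBeforeUtf16Trigger: byUtf16 * bytesPerLine,
  bytesOnDiskBeforeUtf8Trigger: byUtf8 * bytesPerLine,
});
```

With the UTF-16 measure the threshold trips only after 32 appends (about 96 KB on disk, well past a 64 KB tail window); with the UTF-8 measure it trips after 11 appends (about 33 KB).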
Co-authored-by: Qwen-Coder <noreply@alibabacloud.com>

* fix(core): trim head-window to whole lines in lite metadata reads

The 64KB head-window fallback in `readLastJsonStringFieldSync` / `readLastJsonStringFieldsSync` reads a fixed slice and hands it straight to the extractor, so its trailing bytes can fall mid-record. A partial line whose `customTitle` value happens to close inside the buffer but whose body extends past 64KB would otherwise win the latest-match race and surface as the picker's title.

Drop everything past the last `\n` before extracting (only when the buffer is shorter than the file; a small file is necessarily whole-line). Honors the original Phase-2 contract that only complete lines get a vote, without paying for the deleted full-file scan.

Co-authored-by: Qwen-Coder <noreply@alibabacloud.com>

* fix(core,cli): preview lazy count, project-scope countSessionMessages, pin perf contract

Address QwenLM#3897 follow-up review findings:

- SessionPreview footer no longer drops the message count when listSessions omits it. The prop is the override path; the default falls back to a unique user/assistant uuid count derived from the already-loaded conversation, matching countSessionMessages semantics with zero extra disk I/O.
- countSessionMessages now scopes to the current project, mirroring the first-record cwd check in deleteSession/renameSession/loadSession. A valid sessionId from another project sharing the chats dir no longer bypasses project boundaries on lazy count.
- New regression tests: - SessionPicker row renders cleanly with messageCount === undefined (no "messages"/"undefined"/dangling separator) - findSessionsByTitle perf contract: matches have messageCount undefined and fs.createReadStream is never called - countSessionMessages: cross-project sessions return 0 without streaming, empty file returns 0 - Update corruption-recovery tests to call private countSessionMessagesFromPath (the streaming entry point) instead of the public sessionId-shaped API the merge from main pointed them at. Co-Authored-By: Qwen-Coder <noreply@qwen.com> * docs(core): correct stale 'full-file fallback' comments to head-window The session metadata reader uses two bounded 64KB windows (tail + head) since the perf rework on this branch — never a full-file scan. Two comments still described the prior behavior. Reported in MR review. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(cli): route ACP renameSession through live ChatRecordingService ACP renameSession constructed a fresh SessionService and wrote the custom_title record straight to disk. When the same session was live in this process, ChatRecordingService.currentCustomTitle stayed at the old value, so the next title re-anchor (every 32KB) or finalize() re-emitted the stale title at EOF and silently reverted the rename. Now we look up the live ChatRecordingService first and route through recordCustomTitle, which keeps the in-memory cache and the on-disk record in sync. The SessionService path remains for the non-live case (e.g., another client renaming a backgrounded session). Reported in MR review. 
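The stale-title revert is easier to see with a toy model. The sketch below is purely illustrative: `makeLiveRecorder`, `renameSession`, and the in-memory `log` array are hypothetical stand-ins for the live `ChatRecordingService`, the ACP handler, and the session's JSONL file.

```typescript
type TitleRecord = { type: "custom_title"; customTitle: string };

interface LiveRecorder {
  recordCustomTitle(title: string): void;
  reanchorTitle(): void;
}

// `log` stands in for the session's JSONL file.
function makeLiveRecorder(log: TitleRecord[], initialTitle: string): LiveRecorder {
  let currentCustomTitle = initialTitle;
  const append = (customTitle: string): void => {
    log.push({ type: "custom_title", customTitle });
  };
  append(initialTitle);
  return {
    recordCustomTitle(title: string): void {
      currentCustomTitle = title; // keep the in-memory cache in sync with disk
      append(title);
    },
    reanchorTitle(): void {
      append(currentCustomTitle); // re-emits whatever the cache holds
    },
  };
}

// Route through the live recorder when there is one; otherwise write directly.
function renameSession(
  sessionId: string,
  title: string,
  live: Map<string, LiveRecorder>,
  log: TitleRecord[],
): void {
  const recorder = live.get(sessionId);
  if (recorder) recorder.recordCustomTitle(title);
  else log.push({ type: "custom_title", customTitle: title });
}

const lastTitle = (log: TitleRecord[]): string | undefined => log[log.length - 1]?.customTitle;

// Disk-only rename while the session is live: the next re-anchor reverts it.
const staleLog: TitleRecord[] = [];
const staleRecorder = makeLiveRecorder(staleLog, "old");
staleLog.push({ type: "custom_title", customTitle: "new" });
staleRecorder.reanchorTitle();

// Routed rename: the cache is updated, so the re-anchor re-emits the new title.
const routedLog: TitleRecord[] = [];
const live = new Map<string, LiveRecorder>([["s1", makeLiveRecorder(routedLog, "old")]]);
renameSession("s1", "new", live, routedLog);
live.get("s1")?.reanchorTitle();

console.log({ disk: lastTitle(staleLog), routed: lastTitle(routedLog) });
```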
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(core): harden session title re-anchor invariants * docs(core): clarify bounded metadata race recovery cost * test(core): cover bounded multi-field title reads --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> Co-authored-by: Qwen-Coder <noreply@alibabacloud.com> Co-authored-by: Qwen-Coder <noreply@qwen.com> Co-authored-by: qqqys <266654365+qqqys@users.noreply.github.com>
…wenLM#3598)

* feat(cli): add --json-schema for structured output in headless mode

Registers a synthetic `structured_output` tool whose parameter schema IS the user-supplied JSON Schema. In headless mode (`qwen -p`), the first successful call terminates the session and exposes the validated payload via the result message's `structured_result` field. Invalid schemas are rejected at CLI parse time via a new strict Ajv compile helper so they can't silently no-op at runtime.

* fix(cli): honour "first structured_output call ends session" + reject non-object root schemas

Two review fixes for the `--json-schema` feature:

1. `runNonInteractive` now breaks out of the tool-call loop as soon as the first successful `structured_output` invocation is captured, rather than continuing to execute any trailing tool calls the model emitted in the same turn. This restores the documented single-shot contract and prevents side-effecting tools from running after the final answer has already been accepted.
2. `resolveJsonSchemaArg` rejects schemas whose root `type` is anything other than "object" (or a type array including "object"). Function-calling APIs require tool arguments to be JSON objects, so a schema like `{"type": "array"}` would have registered an unusable synthetic tool the model could never satisfy. Absent `type` and `type: "object"` remain accepted.

Adds tests for both paths and updates the existing Ajv-compile test to exercise that path without tripping the new root-type guard first.

* fix(cli): also reject root anyOf/oneOf schemas whose branches can't accept objects

Addresses a review follow-up: the previous root-object check only inspected the top-level `type` keyword, so a schema like `{"anyOf":[{"type":"array"},{"type":"string"}]}` slipped through even though none of its branches can ever validate the object-shaped arguments that function-calling APIs send.
Replace the single `type` check with `schemaRootAcceptsObject`, which recursively walks root-level anyOf/oneOf branches and requires at least one to accept objects. Absent `type`, `type: "object"`, `type: ["object", ...]`, and mixed anyOf branches where one accepts object all still pass. `allOf` is left to Ajv's runtime behaviour — guessing intent across contradictory allOf branches at parse time is fragile. * fix(cli): propagate exitCode from --json-schema failure path + tests Address two PR-3598 review findings: 1. gemini.tsx unconditionally called process.exit(0) after runNonInteractive/runNonInteractiveStreamJson, clobbering the process.exitCode = 1 set by nonInteractiveCli.ts when the model emits plain text instead of the structured_output tool. Switch both call sites to process.exit(process.exitCode ?? 0) so CI can detect the failure via the exit code. 2. nonInteractiveCli.test.ts: strengthen the structured-output success path to assert registry.abortAll() is called and that the stdout result envelope carries the JSON-stringified args under `result` plus the raw object under `structured_result`. Add a retry-path test that mocks executeToolCall to return an error on the first structured_output call, then verifies sendMessageStream is called a second time so the model can retry rather than the session terminating early. * fix(cli): suppress non-structured tool calls when structured_output is in the same turn When --json-schema is active and the model emits a batch like [write_file(...), structured_output(...)], the previous implementation ran the leading side-effecting tool before accepting the structured result, violating the "structured_output is the terminal contract" guarantee. The trailing-only break also let an invalid first structured_output fall through to subsequent tools before the retry turn. Pre-scan the batch: if a structured_output request is present, execute ONLY the first one and skip everything else (leading and trailing). 
This is consistent with the existing terminal-path semantics — the suppressed tool_use blocks lack a matching tool_result, the same way max-turns / cancellation leave the stream. Adds a test covering the reverse-order [side_effect, structured_output] case alongside the existing trailing-suppression and retry tests. * fix(cli): tighten --json-schema root validation per review feedback Three small holes flagged in the latest pass: 1. `schemaRootAcceptsObject` returned early when a root `type` keyword was present, ignoring sibling `anyOf`/`oneOf`. JSON Schema applies keywords at the same level conjunctively, so e.g. `{type:"object", anyOf:[{type:"string"}]}` is unsatisfiable for any value but used to pass. Now both `type` AND any sibling `anyOf`/`oneOf` must independently admit object. 2. The FatalConfigError text said "Every branch of a root anyOf/oneOf must be satisfiable by an object", but the actual logic only requires *at least one* branch (and tests still accept `anyOf:[object, string]`). Reworded to "at least one branch" so the message matches the behaviour. 3. `compileStrict` used `typeof schema !== 'object'` to gate input, which lets arrays through (`typeof [] === 'object'`). The contract says "schema must be a JSON object", so add an `Array.isArray` check so array input gets the intended error rather than a less helpful Ajv compile message. Tests cover the new rejection paths and the array case. * fix(cli): handle root $ref and allOf in --json-schema accept-object check `schemaRootAcceptsObject` previously only inspected `type`, `anyOf`, and `oneOf` at the root, so a couple of unsatisfiable shapes still slipped through: 1. `{"$ref":"#/$defs/Foo","$defs":{"Foo":{"type":"array"}}}` would be accepted because we don't follow $refs, but registers a synthetic tool whose params resolve to "array" — the model can never produce a valid object. Now reject any root $ref unless the user adds a sibling `type:"object"` as an explicit anchor. 2. 
`allOf` was deferred to Ajv runtime, but allOf is conjunctive at the same level as `type` / `anyOf` / `oneOf`, so an entry like `{"allOf":[{"type":"object"},{"type":"string"}]}` is unsatisfiable for any value. Walk it like the others, requiring every branch to admit object. Tests cover the new $ref-rejected / $ref+anchor-accepted paths and the allOf reject/accept paths. * fix(cli): explicit exit code from runNonInteractive + pair suppressed tool calls Three review threads on the structured-output flow: 1. The break that ends the for-loop on a successful structured_output call sat *before* the responseParts.push and modelOverride capture. SyntheticOutputTool currently returns neither, so it was safe today — but anyone wiring extra signals into the synthetic tool later would see them silently dropped. Move the break after both captures so the contract is explicit, not implicit. 2. The failure path used to set process.exitCode = 1 and return void, relying on global mutable state across an async boundary. Any cleanup task between runNonInteractive and process.exit could silently turn the structured-output failure into exit 0. Switch runNonInteractive to Promise<number>, return 0 / 1 directly from each function-level exit, and have gemini.tsx use the captured return value. 3. The pre-scan from the prior commit suppresses sibling tool calls when structured_output is in the same turn. On the retry path — when structured_output fails validation — the next-turn payload has tool_result for structured_output but no entry for the suppressed siblings, leaving the prior assistant turn's tool_use blocks unpaired. Anthropic and OpenAI both reject that batch shape, so the retry would surface as an opaque provider error. Synthesize a "skipped" functionResponse for every suppressed call so every tool_use in the prior assistant message has a matching tool_result. 
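The pairing requirement in item 3 can be sketched as below. The shapes are simplified and hypothetical (`ToolCall`, `FunctionResponse`, `planTurn`, and `synthesizeSkipped` are illustration-only names, and the "Skipped" wording is a placeholder); the sketch models the pre-scan as described at this point in the history, where only the first `structured_output` call in a batch is executed.

```typescript
interface ToolCall {
  callId: string;
  name: string;
}

interface FunctionResponse {
  callId: string;
  name: string;
  output: string;
}

const STRUCTURED_OUTPUT = "structured_output";

// Pre-scan: if the batch contains structured_output, run only the first one.
function planTurn(calls: ToolCall[]): { toExecute: ToolCall[]; suppressed: ToolCall[] } {
  const first = calls.find((c) => c.name === STRUCTURED_OUTPUT);
  if (!first) return { toExecute: calls, suppressed: [] };
  return { toExecute: [first], suppressed: calls.filter((c) => c !== first) };
}

// Give every suppressed tool_use a matching tool_result so the next request is well-formed.
function synthesizeSkipped(suppressed: ToolCall[]): FunctionResponse[] {
  return suppressed.map((c) => ({
    callId: c.callId,
    name: c.name,
    output: "Skipped: structured_output was requested in the same turn.", // placeholder wording
  }));
}

// Usage: a side-effecting call before and after the structured_output call.
const batch: ToolCall[] = [
  { callId: "c1", name: "write_file" },
  { callId: "c2", name: STRUCTURED_OUTPUT },
  { callId: "c3", name: "read_file" },
];
const plan = planTurn(batch);
const skipped = synthesizeSkipped(plan.suppressed);
const answered = new Set([...plan.toExecute, ...skipped].map((c) => c.callId));
console.log(plan.toExecute.map((c) => c.name), skipped.map((r) => r.callId));
```

Every `callId` in the batch ends up either executed or answered with a synthetic response, which is the invariant providers check when they reject unpaired tool_use blocks.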
Tests cover the new retry-pairing contract and update the existing plain-text-failure test to assert on the return value rather than process.exitCode. * fix: address Copilot follow-up review on --json-schema scaffolding Five small but real findings flagged on the latest pass: 1. core/src/index.ts re-exported `SyntheticOutputTool` via `export type`, but it's a runtime class — that erased it from the emitted JS and would break value imports. Split into a value `export { ... }` and a `export type { StructuredOutputParams }`. 2. The structured-output success path returned without flushing `localQueue` notifications or finalising one-shot monitors. If a background task had already emitted `task_started`, exiting here could drop its paired `task_notification` and leave SDK consumers with unpaired lifecycle events. Mirror the regular terminal path's `flushQueuedNotificationsToSdk` + `finalizeOneShotMonitors` calls before `emitResult`. 3. `schemaRootAcceptsObject` ignored the `not` keyword, so `{not:{type:"object"}}` (which forbids every object value) slipped through. Add a best-effort `not` check that rejects when `not.type` directly excludes object. Deeper negated patterns still fall through to Ajv at runtime. 4. `compileStrict`'s JSDoc claimed it errored on "Ajv versions we can't support", but the function doesn't actually check Ajv versions. Reword to "malformed or uses unsupported draft/features for our Ajv configuration" so the contract matches the implementation. 5. The pre-scan suppressed sibling tool calls but only synthesised tool_result events for them on the retry path — the success path left those tool_use blocks unpaired in the emitted JSONL/stream-json event log. Move the synthesis after the for loop so it runs for both the success break and the validation-failure fall-through; the event log is now consistent regardless of which path the run takes. 
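To make the conjunctive and disjunctive rules from these review rounds concrete, here is a rough sketch of a parse-time "can this schema admit an object?" check. `admitsObject` is a hypothetical name, not the project's `schemaRootAcceptsObject`: it covers only `type`, `anyOf`/`oneOf`, `allOf`, and a bare `not.type`, follows standard JSON Schema semantics for boolean subschemas and empty unions, and deliberately leaves out `$ref`, `const`, and `enum`.

```typescript
// True when a `type` keyword value names "object" (string or array form).
function namesObject(type: unknown): boolean {
  return Array.isArray(type) ? type.includes("object") : type === "object";
}

function admitsObject(schema: unknown): boolean {
  if (typeof schema === "boolean") return schema; // `true` accepts all, `false` accepts none
  if (typeof schema !== "object" || schema === null || Array.isArray(schema)) return false;
  const { type, anyOf, oneOf, allOf, not } = schema as Record<string, unknown>;

  // Keywords at the same level apply conjunctively: each one must admit object.
  if (type !== undefined && !namesObject(type)) return false;
  for (const branches of [anyOf, oneOf]) {
    // At least one branch must admit object; an empty union admits nothing.
    if (Array.isArray(branches) && !branches.some((b) => admitsObject(b))) return false;
  }
  if (Array.isArray(allOf) && !allOf.every((b) => admitsObject(b))) return false;

  // Best-effort `not`: only `true` or a bare `{ type }` naming object excludes every object.
  if (not === true) return false;
  if (typeof not === "object" && not !== null && !Array.isArray(not)) {
    const negated = not as Record<string, unknown>;
    if (Object.keys(negated).length === 1 && namesObject(negated["type"])) return false;
  }
  return true;
}

// Usage: a few of the shapes discussed in the commit messages.
const examples: Array<[unknown, boolean]> = [
  [{ type: "object" }, true],
  [{ type: "array" }, false],
  [{ anyOf: [{ type: "array" }, { type: "string" }] }, false],
  [{ type: "object", anyOf: [{ type: "string" }] }, false],
  [{ allOf: [{ type: "object" }, { type: "string" }] }, false],
  [{ not: { type: "object" } }, false],
];
for (const [schema, expected] of examples) {
  console.log(JSON.stringify(schema), admitsObject(schema) === expected ? "ok" : "mismatch");
}
```

For `oneOf` the check is only a necessary condition (exactly-one matching is left to the validator at runtime), which matches the "best-effort at parse time, Ajv at runtime" split the commits describe.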
Tests cover the new `not`-rejection paths and the success-path tool_result synthesis; the existing retry-pairing test still passes against the restructured emit ordering.

* fix(cli): tighten --json-schema parse-time gate per Copilot review

Two more shapes that used to slip through:

1. `schemaRootAcceptsObject` defaulted to true when no narrowing keyword was present, so root-value constraints like `{const: 1}` or `{enum: [1, 2]}` registered an unsatisfiable structured_output contract: the model could never produce a value matching the tool's parameter schema, and the run would loop on validation failures until max-turns. Reject `const` whose literal isn't an object, and `enum` whose members include no object.
2. The yargs check rejected `--json-schema` with `-i` and with no prompt, but not with `--input-format stream-json`. Stream-json keeps the process open waiting for protocol messages, so "terminate on the first valid structured_output call" silently drops everything queued after that point. Refuse the combination at parse time so the contradiction surfaces immediately.

Tests cover the new const/enum reject and accept paths.

* fix(cli): handle empty/boolean subschemas + allow stdin-only prompt

Three more shapes flagged on the latest review pass:

1. `schemaRootAcceptsObject` treated an empty root `anyOf`/`oneOf` as "no constraint" (skipped when length === 0), but per JSON Schema an empty union is unsatisfiable: no value can match a member of the empty set. Reject those at parse time so users get a clear parse error instead of an opaque runtime never-validates loop.
2. JSON Schema (draft-06+) allows boolean subschemas anywhere a schema is accepted: `true` matches every value, `false` matches nothing. The `anyOf`/`oneOf`/`allOf` walks were rejecting booleans via the typeof-object guard, which incorrectly rejected `{anyOf:[true]}` and `{allOf:[true,{type:"object"}]}` while letting `{anyOf:[false]}` slip through.
Replace the per-branch object guard with a `variantAcceptsObject` helper that treats `true` as accepting and `false` as rejecting, then recurses on object subschemas. 3. The yargs `.check` rejected `--json-schema` when no `-p` / positional prompt was given, but the headless CLI also reads the prompt from stdin (`cat prompt.txt | qwen --json-schema '...'`) — a legit usage pattern that was being blocked. Drop the parse-time no-prompt rejection; the existing runtime "No input provided via stdin..." error in gemini.tsx still catches genuinely empty input. Tests cover the empty-union, all-`false`, mixed-boolean accept, and `false`-in-allOf reject paths. Live-verified against the bundled CLI: `echo "..." | qwen --json-schema '...'` now reaches the model call, and the four schema edge cases all surface the expected error text or proceed past parse time. * docs(core): note SyntheticOutputTool as the value-export exception The block comment above the lazy-load type re-exports said tool classes "are now lazy-loaded and are not exported as values from the package root", but `SyntheticOutputTool` was just promoted to a runtime export in 6203852 so the CLI's `--json-schema` flow can construct it from the package root. Document that exception inline so downstream consumers reading the comment don't get told the wrong story. * fix(cli): try every structured_output in a same-turn batch in order The pre-scan used to pick only the FIRST structured_output call from a turn and suppress everything else, even other structured_output calls. That created two avoidable failure modes: 1. `[structured_output(bad), structured_output(good)]` would attempt only the bad one, fail validation, and force a full retry turn. The model already produced a valid structured payload — we should try it before asking again. 2. The trailing structured_output's tool_result was synthesised with the "Skipped: structured_output was also requested in this turn..." 
message, which is misleading because that call WAS the structured output we should have tried. Filter `requestsToExecute` to ALL structured_output calls (in original order) when --json-schema is active, and let the existing loop break on the first success. Track an `executedCallIds` set, then synthesise tool_result + retry parts after the loop for every tool_use the model emitted that we never actually executed — covering both non-structured siblings (always suppressed) and any structured_output left over after the success break (only one terminal contract per turn). Reworded the synthesised "skipped" output to "this turn's structured_output contract took precedence" so it reads correctly regardless of whether the suppressed call was structured or not. Tests cover the multi-structured retry-free success path; the existing single-structured retry and trailing/leading suppression tests still pass against the updated emit ordering. * fix: address gpt-5.5 review on --json-schema (privacy + $ref + core-tools) Three findings, three changes: 1. Reject every root `$ref` in --json-schema, even with a sibling `type: "object"` anchor. Ajv applies `$ref` conjunctively with sibling keywords, so the previous "accept when type:object is present" carve-out was unsound: `{type:"object",$ref:"#/$defs/Foo", $defs:{Foo:{type:"array"}}}` parsed fine but no object value can satisfy both at runtime, leaving the model to loop until maxTurns. Updated docstring + test cases (replaced the accept-with-anchor case with a reject case for both anchored and well-formed $ref shapes — users wanting composition should inline at the root). 2. Redact `function_args` for structured_output in ToolCallEvent. 
The args ARE the user's structured payload (already emitted via stdout `result` / `structured_result`); recording them again as ordinary tool-call function_args duplicates that data into OTLP exports, QwenLogger, ui-telemetry, and the chat-recording UI event mirror — surfaces that can leak off-device. Replace with a stable `__redacted` placeholder so consumers still see the call happened (duration, success, decision metrics preserved) but the payload itself doesn't ride along. Two new uiTelemetry tests cover the redacted vs non-redacted paths. 3. Document and test that structured_output bypasses the --core-tools allowlist (same as agent / skill / exit_plan_mode / ask_user_question etc.). The synthetic tool only exists when --json-schema is set, so adding it to CORE_TOOLS would let `--core-tools read_file --json-schema X` silently drop the terminal contract and loop the model until maxTurns — bypass is intentional. Expanded the CORE_TOOLS docstring to enumerate the synthetic-tool exclusions and added a permission-manager test mirroring the pattern used for agent / skill / exit_plan_mode. * fix(cli): apply structured_output terminal handling to drain turns The synthetic structured_output tool is registered for the entire headless session, so it can be invoked from EITHER the main assistant-turn loop OR from a drain turn (queued cron-job / notification reply). The drain path (drainOneItem) was treating it like any other tool: execute, append the response back into itemMessages, keep going. The submitted args were never captured and no structured_result envelope was emitted, so a run that legitimately satisfied --json-schema mid-drain ended up failing the contract with "Model produced plain text..." anyway. Apply the same terminal handling to drain turns: - Hoist `structuredSubmission` to session scope so both paths write to one variable. 
- In `drainOneItem`, run the same pre-scan: when --json-schema is active and structured_output is in the batch, execute every structured_output in original order until one succeeds; suppress every non-structured sibling. Synthesise tool_results for any unexecuted tool_use the model emitted, mirroring the main path. - On capture, return early from drainOneItem so the drained item's inner while loop stops. - `drainLocalQueue` short-circuits when a captured submission is in flight, so subsequent queued items don't run. - The cron `checkCronDone` watches the same flag and stops the scheduler immediately on capture, releasing the surrounding `await new Promise(...)`. - The final holdback loop bails out on capture so monitor lifecycle doesn't extend past the structured submission. - After the holdback, before the existing failure / regular-success emit, emit the structured success envelope and return 0. Adds a focused unit test that drives the drain path end-to-end via a synchronously-fired monitor notification: main turn produces plain text, the drain reply calls structured_output, and the test asserts exit 0 + structured_result populated + no "Model produced plain text..." error. * fix(cli): address gpt-5.5 review follow-ups on --json-schema scaffolding Six review findings, six small fixes: 1. **Nested $ref incorrectly rejected.** `schemaRootAcceptsObject` recurses into anyOf/oneOf/allOf branches and used to apply the root-only $ref rejection at every level, blocking common composition shapes like `{anyOf:[{$ref:"#/$defs/Foo"},{type:"string"}]}`. Add an `isRoot=true` parameter; non-root recursion treats `$ref` as opaque and defers to Ajv at runtime. Tests cover nested refs in anyOf / oneOf / allOf. 2. 
**Inaccurate package-root export comment.** `core/src/index.ts` claimed `SyntheticOutputTool` was exported as a runtime value for the CLI's --json-schema flow, but the only construction is inside `Config.registerLazy` via a relative dynamic import — no value consumer reaches into `@qwen-code/qwen-code-core`. Revert to a type-only re-export so `SyntheticOutputTool` lines up with every other lazy-loaded tool class. 3. **Unused constructor parameter.** `SyntheticOutputTool` took `(_config: Config, schema)` but never read `_config`. Drop the parameter (and the corresponding pass-through at the registration call site) so readers don't wonder why a Config is being threaded through. 4. **Tool description claimed "exactly once".** The retry path explicitly tolerates multiple calls until one validates, so "Call this tool exactly once" is misleading to a model that tried twice. Reword to "Call this tool to deliver the final result; the first call with valid arguments ends the session" so the description matches the actual contract. 5. **Asymmetric shutdown on the structured-output success path.** The regular terminal path waits in a holdback loop until `hasUnfinalizedTasks()` is false; the structured-output path used to call `abortAll()` and flush immediately, dropping the matching `task_notification` for any agent whose natural handler hadn't yet enqueued it. Add a bounded holdback (capped at 500ms via STRUCTURED_SHUTDOWN_HOLDBACK_MS) — long enough for typical abort callbacks to enqueue, short enough that a hung agent can't block exit. 6. **gemini.tsx exit-code asymmetry.** `runNonInteractive` returns an explicit exit code, but `runNonInteractiveStreamJson` still reads `process.exitCode` after `runExitCleanup`. Currently safe because the yargs `.check` rejects --json-schema with stream-json input, but a future stream-json equivalent of structured output would need to plumb the exit code through the return value too. 
Document this in a comment so the constraint is visible at the call site. Plus: strengthen `synthesises tool_result for suppressed sibling calls when structured_output fails validation` to assert the failed structured_output's `functionResponse.response` carries the actual validation error string ("args invalid"), not the synthesised "Skipped:" prose — a regression that overwrote it would otherwise slip past the existing pairing assertion. * fix(cli): close --json-schema gaps surfaced in self-audit + review Five fixes layered onto the same robustness pass over the `--json-schema` flow: 1. **bare-mode registration** (`packages/core/src/config/config.ts`): `qwen --bare --json-schema X -p "..."` previously skipped the synthetic `structured_output` registration entirely (the registration block lives below the bare-mode early-return), so the model had no way to terminate and the run looped to `maxSessionTurns`. Register the synthetic tool inside the bare branch too. 2. **TTY interactive rejection** (`packages/cli/src/gemini.tsx`): `qwen --json-schema X` on a TTY with no `-p` and no piped stdin routes to `isInteractive=true` (priority-3 fallback) and would launch the TUI, where `structured_output` is just an inert tool that prints "accepted" and lets the chat continue. Parse-time gating can't catch this (stdin isn't probed yet at parse time), so reject at runtime before the UI launches; runs `runExitCleanup` first so MCP subprocesses get torn down. 3. **drain-turn structured-success flush** (`packages/cli/src/nonInteractiveCli.ts`): when a drain turn captures `structured_output`, `drainLocalQueue` returns early, leaving any items the drain didn't process in `localQueue`. The prior emit path then ran `registry.abortAll()` + `emitResult` without flushing — stream-json consumers saw `task_started` events without paired `task_notification`. Add the same 500ms holdback + `flushQueuedNotificationsToSdk` the main-turn structured-success path uses, so the two paths agree. 4. 
**ACP mutual-exclusion** (`packages/cli/src/config/config.ts`): `--acp` runs an independent `runAcpAgent` turn loop that doesn't honour the synthetic-tool terminal contract, so `--acp --json-schema X` would register the tool but never terminate. Add a yargs `.check` rejection covering both `--acp` and the deprecated `--experimental-acp` alias.
5. **max-turns + Skipped wording** (review comments #3198579251/#3198579389/#3198579567 from yiliang114):
   - `handleMaxTurnsExceededError` now appends a `--json-schema`-specific hint pointing at the common stuck-run causes (structured_output denied by `permissions.deny` / `--exclude-tools`, unsatisfiable schema, prompt didn't instruct the model). Without this, three different failures all surfaced as the same generic "increase maxSessionTurns" line.
   - The synthesised "Skipped:" tool_result for suppressed sibling calls drops the trailing "Re-issue this call in a separate turn if needed." sentence on the success path, where the session terminates immediately and no consumer (model or SDK) can act on the advice. Retry path keeps the sentence: the model is about to receive these parts and may legitimately re-issue.

Tests cover each fix: bare-mode registration order, ACP / experimental-acp rejection (×2), `--json-schema` hint in both text and JSON max-turns output, and explicit Skipped-text assertions on the success and retry paths.

* fix: address 9 self-qreview comments on --json-schema PR

Folds the 9 Suggestion-level comments from the previous /qreview pass into code/test fixes. Each one is a real issue, but mostly defensive; none changes the user-visible happy path.

**Refactors (F4/F5/F6, code-quality)**
- F4 `nonInteractiveCli.ts`: extract `SUPPRESSED_OUTPUT_SUCCESS` / `SUPPRESSED_OUTPUT_RETRY` module-level constants and a `suppressedOutputBody(structuredCaptured)` helper. Both the main-turn and drain-turn synthesis sites previously had a 4-way duplicated ternary; future wording changes can no longer drift between them.
- F5 `nonInteractiveCli.ts`: extract `emitStructuredSuccess()` closure inside `runNonInteractive`. The "abortAll → bounded holdback → flush → finalize one-shot monitors → emitResult → return 0" terminal block is now defined once and called from both the main-turn and drain-turn success paths. `finalizeOneShotMonitors` is idempotent (`oneShotMonitorsFinalized` guard) so the unconditional invocation is safe even when the drain-turn already finalized monitors before reaching the helper. - F6 `core/config/config.ts`: extract `registerStructuredOutputIfRequested()` helper. The synthetic-tool registration block is no longer duplicated between the bare-mode early-return branch and the regular registration branch. **Tests (F7/F8/F9 — pin existing behaviour)** - F7 `nonInteractiveCli.test.ts`: new test "holds back for in-flight background tasks before emitting structured success" — flips `hasUnfinalizedTasks: true → false` mid-poll so the holdback `while` body actually executes; spies on `abortAll` and asserts ordering of `task_notification` (must precede the result envelope) and the bounded elapsed-time cap. None of the existing structured-output success tests entered this branch (they all pinned `hasUnfinalizedTasks: () => false`). - F8 `gemini.test.tsx`: new test "rejects --json-schema when running in interactive (TUI) mode" — pins the TUI guard at gemini.tsx:694, asserting the headless-only stderr message AND the exact ordering `writeStderrLine → runExitCleanup → process.exit(1)` so a future refactor can't swap any of those steps. - F9 `cli/config.test.ts`: pin the two previously-untested `--json-schema` mutual-exclusion branches: `-i`/`--prompt-interactive` and `--input-format stream-json`. The stream-json check is load-bearing — `gemini.tsx:768` explicitly relies on this rejection holding (the parse-time `process.exitCode ?? 0` plumbing in the stream-json branch is only safe because `--json-schema` can't reach it). 
**Behaviour fixes (F1/F2/F15 — privacy / security / correctness)** - F1 `core/core/geminiChat.ts`: redact `functionCall.args` for `structured_output` tool calls before passing them to `chatRecordingService.recordAssistantTurn`. Without this, the user's structured payload (already emitted on stdout via `result` / `structured_result`) was persisted verbatim to `<projectDir>/chats/<sessionId>.jsonl` and re-fed into model context on `--continue` / `--resume`, contradicting the privacy contract documented next to the existing `ToolCallEvent` redaction. Each validation-failure retry was also recorded. Now mirrors the same `__redacted` placeholder. Helper extracted as `redactStructuredOutputArgsForRecording` so it's unit-testable. - F2 `cli/config/config.ts`: `resolveJsonSchemaArg`'s `@path` reader now (a) `fs.statSync`s first to refuse non-regular files (FIFOs, character devices like `/dev/zero`, directories), (b) caps the schema file at 1 MiB so an attacker who can influence the path through a wrapping process can't OOM the run, and (c) on JSON parse failure for `@path` source emits a generic "content of <path> is not valid JSON" instead of echoing the SyntaxError — Node ≥18's SyntaxError embeds a ~10-char file-content prefix in its message, which would otherwise ride out on stderr through any wrapper that surfaces the error. Inline (non-`@path`) JSON keeps the SyntaxError detail because the user is the source. - F15 `core/tools/tool-registry.ts`: `registerTool` now also checks the lazy `factories` map for name collisions, not just the eager `tools` map. An MCP server registering a tool whose name shadows a built-in lazy factory (e.g. `structured_output`) now gets auto-qualified to `mcp__<server>__<name>`, instead of silently winning the resolution. The synthetic structured-output tool no longer needs renaming for the corner case to be safe. Targeted suite (13 changed-area test files): 883/886 pass — 3 pre-existing skips. Typecheck clean on both packages. 
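To make the F2 hardening concrete, here is a minimal sketch of the stat, cap, then parse order described above. The helper name `readSchemaFile` and the constant name `MAX_SCHEMA_BYTES` are assumptions for illustration; the PR's actual logic lives inside `resolveJsonSchemaArg` and may be structured differently.

```typescript
import * as fs from "node:fs";

// 1 MiB cap, as stated in the F2 description.
const MAX_SCHEMA_BYTES = 1024 * 1024;

// Hypothetical helper name; in the PR this logic sits inside `resolveJsonSchemaArg`.
export function readSchemaFile(filePath: string): unknown {
  // (a) stat first so FIFOs, character devices and directories are refused
  //     before any read is attempted.
  const stat = fs.statSync(filePath);
  if (!stat.isFile()) {
    throw new Error(`${filePath} is not a regular file`);
  }
  // (b) refuse oversized files instead of loading them into memory.
  if (stat.size > MAX_SCHEMA_BYTES) {
    throw new Error(`${filePath} is larger than ${MAX_SCHEMA_BYTES} bytes`);
  }
  const text = fs.readFileSync(filePath, "utf8");
  try {
    return JSON.parse(text);
  } catch {
    // (c) drop the SyntaxError detail so no file content reaches stderr.
    throw new Error(`content of ${filePath} is not valid JSON`);
  }
}
```

The sketch only covers the `@path` source; per the commit message, inline JSON keeps the original SyntaxError detail because the user typed the content themselves.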
* fix: address 3 deepseek-v4-pro qreview comments on --json-schema PR Three Suggestion-level comments from the latest /qreview pass. **N1 — `schemaRootAcceptsObject` skips `if/then/else`** (cli/config/config.ts): A schema like `{"if": true, "then": {"type": "string"}}` passed parse-time gating but is unsatisfiable for object-typed tool args at runtime — the model would loop until maxSessionTurns. Add a best-effort check for the two decidable shapes: - `if: true` → object MUST match `then`; if `then` excludes objects (boolean `false`, non-object `type`, etc.), reject at parse time. - `if: false` → object MUST match `else` (`true` if absent); same check. Object-schema `if` cases stay runtime-decidable and fall through to Ajv, matching the existing best-effort scope on `not`. 4 new test cases pin both reject and accept paths. **N2 — subagent registries register `structured_output` too** (core/config/config.ts, core/tools/agent/agent.ts, core/agents/backends/InProcessBackend.ts): `createApprovalModeOverride` and `buildSubagentContextOverride` rebuild the tool registry on a `Object.create(base)` config. `this.jsonSchema` propagates through the prototype chain, so `registerStructuredOutputIfRequested` was firing for every subagent registry rebuild — but only `runNonInteractive`'s main / drain loops detect a successful `structured_output` call as terminal. A subagent that called the tool would receive "Session will end now" and then keep running because its own loop has no terminator: wasted tokens, no structured payload on stdout. Add a `forSubAgent: true` option to `createToolRegistry` (alongside the existing `skipDiscovery`), and propagate it from both subagent rebuild sites. The structured-output registration helper short-circuits when the flag is set. Bare-mode init does NOT set the flag, preserving the F6 fix where `qwen --bare --json-schema X -p "..."` still gets the synthetic tool. 
New test asserts the registry rebuilt with `forSubAgent: true` registers READ_FILE / EDIT / SHELL but NOT STRUCTURED_OUTPUT. **N3 — TEXT-mode `structuredResult` not integration-tested** (nonInteractiveCli.test.ts): All 8 existing `--json-schema` tests pin `OutputFormat.JSON` or `STREAM_JSON`. TEXT (the default for `qwen -p ...`) has no integration coverage, so a regression in `BaseJsonOutputAdapter.buildResultMessage`'s `hasStructured ? JSON.stringify(structuredResult) : resultText` contract or in `JsonOutputAdapter.emitResult`'s text-mode `process.stdout.write(`${result}\n`)` path would only surface to plain `qwen -p` users. New test pins TEXT-mode behaviour: stdout is exactly `${JSON.stringify(structuredArgs)}\n` — no JSON envelope, no event log. Targeted suite (13 spec files): 945/948 pass — 3 pre-existing skips. Typecheck clean on both packages. * fix(cli): narrow `not` rejection in schemaRootAcceptsObject Address Critical review comment #3216123734. `schemaRootAcceptsObject`'s `not` handler previously rejected any schema whose `not.type` included `"object"`, regardless of what other constraints `not` had. That's a false positive for schemas where the extra constraints NARROW what `not` excludes: { "not": { "type": "object", "required": ["error"] } } excludes only objects with an `error` key — the value `{}` satisfies this schema fine, but the old check rejected it at parse time with "--json-schema root must accept object-typed values". Fix: only reject when `not` is exactly `{type: ...}` with no narrowing siblings (the unambiguous "every object is excluded" case). When other keywords are present (`required`, `properties`, `minProperties`, `enum`, etc.), defer to Ajv at runtime — same best-effort scope as the sibling `anyOf`/`oneOf`/`allOf` deep-content checks. 3 new test cases pin the fixed accept paths (`{not:{type:"object",required:[...]}}`, `{not:{type:"object",properties:...,required:[...]}}`, `{not:{type:"object",minProperties:1}}`). 
The existing reject test for bare `{not:{type:"object"}}` still passes. * refactor: dedupe structured_output handling per qreview C1/C2/C3 Three Suggestion-level review comments from the latest /qreview pass. **C1 — main-turn / drain-turn `structured_output` dispatch was duplicated ~120 lines** (`nonInteractiveCli.ts`) The two batch-handling sites had near-identical bodies (filter `structured_output` from the batch when `--json-schema` is active → iterate with `executeToolCall` → write to `structuredSubmission` on first valid call → synthesise tool_result events for suppressed siblings). The only meaningful difference was which `modelOverride` binding the loop wrote to (session-scoped `modelOverride` for the main turn vs per-drain-item `itemModelOverride`). Extracted `processToolCallBatch(batchRequests, setModelOverride)` defined inside `runNonInteractive`: - Closes over session-scoped state (`adapter`, `config`, `abortController`, `options`, `structuredSubmission`, `executeToolCall`, `handleToolError`, `suppressedOutputBody`, the progress-handler helpers). - Takes the `modelOverride` setter as the one call-site-specific parameter so the main turn binds to the session var and the drain binds to the per-item var. Main-turn body went from ~120 lines to a single call; drain-turn body likewise. Net file shrink ~80 lines, no behaviour change. All 42 existing structured-output tests still pass (including `stops executing remaining tool calls...`, `tries multiple structured_output calls in the same turn...`, `synthesises tool_result for suppressed sibling calls...`, `captures structured_output emitted from a drain-turn (queued notification)`). 
**C2 + C3 — `{__redacted: '…'}` placeholder duplicated in two files** (`telemetry/types.ts` + `core/geminiChat.ts`) The `ToolCallEvent` constructor (for telemetry surfaces — OTLP / QwenLogger / ui-telemetry / chat-recording UI event mirror) and `redactStructuredOutputArgsForRecording` (for the on-disk chat-recording JSONL) each had a verbatim copy of: { __redacted: 'structured_output payload (see stdout result)' } If the redaction wording (or the `__redacted` key, or the placeholder text) ever drifted between the two surfaces, the privacy contract would be subtly broken on one and not the other. Hoisted to `STRUCTURED_OUTPUT_REDACTED_ARGS` exported from `packages/core/src/tools/syntheticOutput.ts`, imported in both sites. The constant carries its rationale in a JSDoc block so future readers see both call sites at once. Targeted suite (13 spec files): 961/964 pass — 3 pre-existing skips. Typecheck clean on both packages. --------- Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>
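For readers following the `schemaRootAcceptsObject` change earlier in this commit series, the narrowed `not` rule can be sketched roughly as below. The function name `notExcludesEveryObject` is hypothetical, and the sketch only models object-valued `not` schemas; it is an illustration of the rule as the commit message states it, not the PR's code.

```typescript
type SchemaObject = Record<string, unknown>;

// Hypothetical helper name. Returns true only for the unambiguous case where
// `not` is exactly `{ type: ... }` and that type covers "object".
export function notExcludesEveryObject(not: SchemaObject): boolean {
  const keys = Object.keys(not);
  // Any sibling keyword (required, properties, minProperties, enum, ...)
  // narrows what `not` excludes, so the decision is left to runtime validation.
  if (keys.length !== 1 || keys[0] !== "type") return false;
  const declared = not["type"];
  const types: unknown[] = Array.isArray(declared) ? declared : [declared];
  return types.some((t) => t === "object");
}
```

With this shape, `{ type: "object" }` is still rejected at parse time, while `{ type: "object", required: ["error"] }` falls through to Ajv, which matches the accept and reject cases the commit says its tests pin.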
* feat(cli): Ctrl+B promote keybind — wire UI to PR-2's promoteAbortController (QwenLM#3831 PR-3 of 3) Final piece of the foreground → background promote feature. PR-1 (QwenLM#3842) landed the `signal.reason` foundation; PR-2 (QwenLM#3894) wired `shell.ts` to detect a `{ kind: 'background' }` abort, snapshot output, register a `BackgroundShellEntry`, and stash the promote `AbortController` on `TrackedExecutingToolCall`. This PR exposes the user-visible surface: pressing Ctrl+B during an in-flight foreground shell command transfers ownership to a background task the user can inspect via `/tasks` or stop via `task_stop`. - `keyBindings.ts`: new `Command.PROMOTE_SHELL_TO_BACKGROUND` bound to `Ctrl+B`. JSDoc explains the no-shell-running no-op semantics. - `useReactToolScheduler.ts`: project `promoteAbortController` from the core's `ExecutingToolCall` through `TrackedExecutingToolCall` so the React layer (AppContainer keypress handler) can find it by callId without re-plumbing through the scheduler. - `AppContainer.tsx`: `handleGlobalKeypress` gains a `PROMOTE_SHELL_TO_BACKGROUND` branch that walks `pendingToolCallsRef.current` (the ref, not the destructured array — keeps the deps list stable so the handler isn't re-bound on every tool-call status update), finds the executing tool call with a defined `promoteAbortController`, calls `.abort({ kind: 'background' })`, and returns early. No-op when no foreground shell is executing — Ctrl+B then falls through to the input layer's existing cursor-left binding. - `keyboard-shortcuts.md`: documents Ctrl+B with explicit fall-through behavior so the conflict with the prompt-area cursor-left binding is intentional + understandable. - `keyMatchers.test.ts` (+1): Ctrl+B positive / bare-b + meta+b + Ctrl+other negatives. - `AppContainer.test.tsx` (+2): - **Ctrl+B promotes** — pendingToolCalls includes an executing shell with a stubbed `AbortController` + spy; firing Ctrl+B asserts `abort({ kind: 'background' })` is called once. 
- **Ctrl+B no-op** — empty `pendingToolCalls` + Ctrl+B must NOT throw (pins the safety contract for the typing-mid-prompt case where the input layer's own Ctrl+B should still fire). - 37/37 keyMatchers + 58/58 AppContainer pass; tsc + ESLint clean. The unit / integration tests cover the keybind → abort wiring and the promote handler's downstream behavior (PR-2's tests). Real-PTY E2E is intentionally manual since headless test infrastructure doesn't drive a real shell child + Ctrl+B keystroke; documented in the PR description checklist. Closes the 3-PR sequence for QwenLM#3831 (Phase D part b of QwenLM#3634). * fix(cli): QwenLM#3969 review wave — broadcast comment + debug log + redundancy 5 QwenLM#3969 review threads addressed: - **AppContainer.tsx Ctrl+B handler**: documented the KeypressContext.broadcast caveat (after `return`, the same Ctrl+B is still dispatched to text-buffer cursor-left + DebugProfiler; visible cursor-left side effect is cosmetic) so future readers understand why the prompt cursor moves on a successful promote. Added `debugLogger.debug` calls on both branches (matched callId on success; streamingState + pendingToolCalls.length on no-op fall-through) so "Ctrl+B doesn't work" reports are debuggable. - **useReactToolScheduler.ts TrackedExecutingToolCall**: dropped the redundant `pid?` and `promoteAbortController?` declarations — both come through the `& ExecutingToolCall` intersection unchanged. Fixed the JSDoc that wrote `{ kind: 'background', shellId }`: callers don't generate `shellId` (it's optional on the abort-reason union and `handlePromotedForeground` produces it downstream). The corresponding executing branch in `toolCallsUpdateHandler` no longer projects pid / promoteAbortController explicitly — `...coreTc` already spreads them; the explicit-undefined clearing in the non-executing branch is also dropped (those fields aren't on coreTc when status !== 'executing', so `...coreTc` doesn't carry them). 
- **AppContainer.test.tsx**: replaced two `as unknown as Key` double-casts with direct `: Key` annotations on the literal — the object already conforms to the Key interface, double-cast was bypassing type safety needlessly. Tests: 37/37 keyMatchers + 58/58 AppContainer pass; tsc + ESLint clean. No behavior change beyond the new debug log lines. * fix(cli): QwenLM#3969 wave — tool-name guard + non-shell test + defensive clear 3 QwenLM#3969 review threads addressed; 1 deferred: - AppContainer.tsx: Ctrl+B `find()` predicate now also checks `tc.request.name === ToolNames.SHELL` before matching the executing tool call. Defense-in-depth — today only the shell tool wires `promoteAbortController`, but a future copy-paste / type confusion that adds the property to a non-shell tool would otherwise let Ctrl+B mistakenly fire `abort({kind:'background'})` on a tool whose service has no promote-handoff handler. - useReactToolScheduler.ts: re-added explicit `pid: undefined` and `promoteAbortController: undefined` to the non-executing return. Previously dropped on the assumption that `...coreTc` doesn't carry these fields when the status isn't `executing` — true today, but the explicit clearing is defense-in-depth against a future core change that adds either field to a non-executing status type (would surface as a stuck PID display or a Ctrl+B handler that matches a no-longer-executing tool call). - AppContainer.test.tsx: replaced the placeholder "no-op when no pending tool calls" framing on the empty-array case (it does exercise the `executing-status` predicate but NOT the tool-name guard) with TWO tests: 1. existing empty-array no-throw test (renamed for clarity) 2. NEW: executing non-shell tool with a hostile-shape `promoteAbortController` — asserts `abortSpy` is NOT called. This is the regression test for the new tool-name guard above. Tests: 61/61 AppContainer.test.tsx pass; tsc + ESLint clean. 
Deferred to follow-up (replied + tracked):
- `debugLogger.debug` is file-only; success-path "agent unblocks + next message says 'promoted to bg_xxx'" is the user-visible signal. Adding a synthetic history item or stderr line for the gap between keypress and agent message conflicts with Ink rendering and is better as a focused UX PR.

* test(cli): pin inheritance of pid + promoteAbortController via type assertions

An earlier review wave dropped the explicit `pid?: number` and `promoteAbortController?: AbortController` declarations from `TrackedExecutingToolCall`, relying on the `& ExecutingToolCall` intersection to inherit them. Current review flags the type-safety regression: if core renames or removes either field, the React-side build won't catch it locally, and the Ctrl+B handler silently breaks at runtime.

Compromise: keep the type minimal (no re-declaration noise the prior review flagged) but add compile-time `extends keyof ExecutingToolCall` assertions that fail loudly + locally if either field disappears. The assertions are evaluated at compile time and zero-cost at runtime; the dummy `const` pins them so they aren't dead code.

61/61 AppContainer tests pass; tsc clean.
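As an illustration of the compile-time pinning described in the last commit above, a sketch might look like the following. The `ExecutingToolCall` interface here is a reduced stand-in (the controller is modelled structurally so the snippet has no DOM or Node type dependency), and `AssertKey` plus the constant names are hypothetical, not taken from the PR.

```typescript
// Reduced stand-in for core's ExecutingToolCall; only the two fields under
// discussion are modelled, and the controller type is a structural placeholder.
interface ExecutingToolCall {
  status: "executing";
  pid?: number;
  promoteAbortController?: { abort(reason?: unknown): void };
}

// Hypothetical helper type: the constraint `K extends keyof T` makes the
// compiler reject any K that is not a property name of T.
type AssertKey<T, K extends keyof T> = K extends keyof T ? true : never;

// Dummy constants pin the assertions. Renaming or removing either field in
// ExecutingToolCall turns these lines into compile errors.
export const pidIsInherited: AssertKey<ExecutingToolCall, "pid"> = true;
export const promoteIsInherited: AssertKey<
  ExecutingToolCall,
  "promoteAbortController"
> = true;
```

The constants carry no runtime behaviour beyond a boolean value; the useful failure mode is `tsc` refusing to build when the core type drifts.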
* feat(core): write runtime.json sidecar for active sessions

Port kimi-cli PR QwenLM#2082 part 1 to qwen-code. On interactive session start, atomically write a small JSON sidecar at <projectDir>/chats/<sessionId>.runtime.json recording the (pid, session_id, work_dir, hostname, started_at, qwen_version) tuple. External tools (terminal multiplexers, IDE integrations, status daemons) can map a running PID to its session id and work dir without parsing argv.

Write is best-effort: a read-only filesystem must not block UI startup. OS process title (was QwenLM#3713) and dynamic OSC tab title (kimi QwenLM#2083) remain out of scope.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(core): refresh runtime.json on same-PID session swap

Config.startNewSession() reassigns this.sessionId in the same process, which is reached by /clear, /reset, /new and /resume. Previously the old <oldId>.runtime.json was left behind, falsely claiming the still-live PID for a session no longer being served, and no new sidecar was written for the incoming session.

Centralize the swap by clearing the old sidecar and writing a fresh one for the new session id from inside startNewSession itself, so all same-PID transitions are covered. The refresh runs as a fire-and-forget best-effort; failures must not block the session switch.

Mirrors the post-merge Codex P1 fix on kimi-cli PR QwenLM#2082 (the source of the runtime.json sidecar pattern this PR ports).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(core): only refresh runtime.json when this process owns it

Mirrors kimi-cli PR QwenLM#2082 commit e237951f (Codex P1 r3158754463): a short-lived non-interactive invocation (qwen --prompt, ACP, etc.) that runs `/clear` would otherwise call `Config.startNewSession()`, delete a concurrent shell's runtime.json sidecar (same outgoing session id), and never write a replacement, leaving the shell discoverable to nobody.

Add a `runtimeStatusEnabled` flag on Config, flipped on by the interactive UI bootstrap immediately after the first successful sidecar write, and gate the swap-time refresh in `startNewSession()` on it. Non-interactive entry points never reach the bootstrap, so they won't touch sibling sidecars.

Kimi later reverted the equivalent `write only from shell mode` guard (commit 7083975a) in favor of writing from every long-lived mode, but qwen's wire point is already interactive-only, so the narrower guard is the right shape here.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
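A best-effort atomic sidecar write of the kind these commits describe could be sketched as follows. The commit messages do not say how atomicity is achieved; writing to a temporary file and renaming it into place is an assumption here, as are the function name `writeRuntimeSidecar`, the temp-file naming, and the ISO 8601 format for `started_at`.

```typescript
import * as fs from "node:fs/promises";
import * as os from "node:os";
import * as path from "node:path";

// Field names follow the tuple listed in the commit message; value formats are assumed.
interface RuntimeStatus {
  pid: number;
  session_id: string;
  work_dir: string;
  hostname: string;
  started_at: string; // assumed ISO 8601
  qwen_version: string;
}

// Hypothetical function name. Resolves to false instead of throwing so that a
// read-only filesystem cannot block startup (the "best-effort" requirement).
export async function writeRuntimeSidecar(
  projectDir: string,
  sessionId: string,
  qwenVersion: string,
): Promise<boolean> {
  const chatsDir = path.join(projectDir, "chats");
  const target = path.join(chatsDir, `${sessionId}.runtime.json`);
  const tmp = `${target}.${process.pid}.tmp`;
  const status: RuntimeStatus = {
    pid: process.pid,
    session_id: sessionId,
    work_dir: process.cwd(),
    hostname: os.hostname(),
    started_at: new Date().toISOString(),
    qwen_version: qwenVersion,
  };
  try {
    await fs.mkdir(chatsDir, { recursive: true });
    // Write the full payload to a temp file, then rename it into place so
    // readers never observe a half-written sidecar (assumed technique).
    await fs.writeFile(tmp, JSON.stringify(status), "utf8");
    await fs.rename(tmp, target);
    return true;
  } catch {
    await fs.rm(tmp, { force: true }).catch(() => undefined);
    return false;
  }
}
```

A caller in the spirit of the third commit would flip an ownership flag only when this resolves to `true`, and skip the swap-time refresh otherwise.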
…/auth TUI dialog (QwenLM#3959)

The `qwen auth` CLI subcommand (with subcommands like qwen-oauth, coding-plan, api-key, openrouter, status) has been superseded by the richer /auth TUI dialog introduced in the provider-first auth registry (QwenLM#3864). Running `qwen auth` now prints a deprecation notice pointing users to the /auth TUI dialog (interactive), env vars (CI/headless), or /doctor (status check).

Changes:
- Replace auth.ts with a stub that prints a removal notice and exits
- Delete handler.ts (734 lines), interactiveSelector.ts, and their tests (interactiveSelector.test.ts, openrouter.test.ts, status.test.ts)
- Update /auth slash command to handle non-interactive/ACP modes gracefully
- Enrich /doctor auth check with provider-aware diagnostics using findProviderByCredentials
- Mark `auth` as a subcommand that handles its own exit in config.ts

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
* feat(installer): add standalone archive installation
* fix(installer): harden standalone archive installs
* fix(installer): address standalone review findings
* chore(installer): clarify review followups
* fix(installer): stabilize standalone script checks
* chore(installer): remove internal planning docs
* chore(installer): simplify standalone release review fixes
* test(installer): add Windows batch install smoke
* test(installer): fix Windows batch smoke quoting
* test(installer): preserve Windows cmd quotes
* fix(installer): use robust Windows checksum hashing
* ci: narrow installer debug matrix
* fix(installer): address standalone review hardening
* fix(installer): avoid Windows validation parse errors
* fix(installer): simplify Windows option validation
* fix(installer): harden standalone review fixes
* feat(i18n): expand built-in locale coverage
* feat(cli): add dynamic slash command translation
* test(cli): stabilize session picker assertions
* fix(core): close jsonl readers before cleanup
* fix: address i18n review regressions
* fix(cli): address dynamic i18n review findings
* fix(cli): address i18n review follow-ups
* fix(cli): address i18n review feedback
* test(cli): align i18n parity coverage with strict locales
* fix(cli): address i18n review findings
…LM#3673) * feat(memory): add autoSkill background project skill extraction * fix(test): add missing mock methods for autoSkill (getAutoSkillEnabled, recordCompletedToolCall, consumePendingMemoryTaskPromises) * fix(test): fix cross-platform path comparison in skillReviewNudge integration test * fix(autoSkill): address critical review comments - Fix merged_with_extract silent drop: remove broken merge optimization in scheduleSkillReview(). When an extract task is pending/running, skill review is now scheduled independently instead of recording metadata that no production code ever reads. - Fix SKILL_MANAGE blocked from skill review agent: prepareTools() now only enforces the recursion guard (AGENT tool) when the agent has an explicit tools list. Wildcard/inherit subagents still get the full EXCLUDED_TOOLS_FOR_SUBAGENTS filter, preventing task subagents from calling skill_manage. The dedicated skill review agent can now receive the skill_manage tool it requires. - Update manager.test.ts: replace merged_with_extract tests with concurrent-extract independent-scheduling tests. - Update skill-manage.test.ts: clarify test description to reflect wildcard-only exclusion semantics. * fix(autoSkill): reject symlink traversal in skill_manage path guard assertProjectSkillPath() uses path.resolve() which is purely lexical and does not dereference symlinks. If any path component inside .qwen/skills/ is a symlink pointing outside the project, fs.writeFile/readFile/rm would follow the link and mutate files outside the advertised write boundary. 
Add assertRealProjectSkillPath() (async) in skill-paths.ts that: - Resolves the real path of the skills root via fs.realpath() - Walks up from targetPath to find the nearest existing ancestor - Resolves that ancestor to its real filesystem path - Rejects if the real path falls outside the real skills root skill-manage.ts execute() now calls both the cheap lexical check (fast fail for obviously wrong paths) and the async real-path check before any fs.writeFile / fs.rm mutation. Add three symlink-specific tests in skill-paths.test.ts covering: - Legitimate path accepted - Symlinked directory pointing outside skills root rejected - Skills root itself being a symlink (safe target) accepted * refactor(autoSkill): remove skill_manage tool, use path-based skill write detection Address reviewer feedback: instead of keeping skill_manage as the sole write gate (which still had symlink bypass risk via generic tools), remove the dedicated tool entirely and replace with a two-layer protection: 1. skillsModifiedInSession (client.ts): detects writes to .qwen/skills/ by inspecting the file_path arg of every completed tool call, replacing the fragile historyCallsSkillManage() history scan. 2. hasAutoSkillSource + evaluateScopedDecision (skillReviewAgentPlanner.ts): the review agent's permission sandbox now verifies BOTH that the target path is inside the skills directory AND that the existing file already contains 'source: auto-skill' in its frontmatter before allowing edits, preventing the agent from overwriting user-managed skills. Changes: - Delete skill-manage.ts and skill-manage.test.ts - Remove SKILL_MANAGE from ToolNames, ToolDisplayNames, config registerLazy, agent-core EXCLUDED_TOOLS comment, and agent.ts comment - Replace historyCallsSkillManage() with skillsModified: boolean param in scheduleSkillReview; skip reason renamed skills_modified_in_session - recordCompletedToolCall(name, filePath?) 
detects .qwen/skills/ writes; CLI layers pass file_path arg from tool call request - Fix buildTaskPrompt frontmatter template to use top-level source: auto-skill - Update skill-paths.ts error messages to remove skill_manage references - Update all unit/integration tests accordingly * fix(autoSkill): deduplicate concurrent skill-review tasks per projectRoot scheduleSkillReview() was launching a new background task every time the threshold was reached for the same project, with no guard against multiple in-flight reviews running concurrently. Fix: add skillReviewInFlightByProject Map that tracks the taskId of any running review per projectRoot. A second call while one is in-flight returns { status: 'skipped', skippedReason: 'already_running', taskId: <existing> }. The map entry is cleared in a finally block inside runSkillReview() so the next session can schedule a fresh review after the current one completes. Also extend SkillReviewScheduleResult.skippedReason union to include 'already_running', and add a unit test covering the full lifecycle: first call schedules, second call is skipped with existing taskId, and a third call after completion schedules a new task. * fix(autoSkill): address all critical review comments 1. hasAutoSkillSource: narrow catch to ENOENT only (EISDIR/EACCES etc. return false to deny); tighten frontmatter regex to match opening block only. 2. evaluateScopedDecision: add explicit allow for READ_FILE and LS so they don't fall to 'default' which the base PermissionManager might widen; EDIT/WRITE_FILE now call assertRealProjectSkillPath() (async realpath guard) in addition to the lexical check, closing the symlink traversal hole. 3. isScopedTool / getScopedDenyRule: cover READ_FILE and LS so hasRelevantRules returns true and findMatchingDenyRule is correctly consulted for them. 4. 
recordCompletedToolCall (client.ts): broaden tool name set to match WRITE_TOOL_NAMES in manager.ts (write_file, edit, replace, create_file) and inspect all three arg keys (file_path, path, target_file). Signature changed from (name, filePath?) to (name, args?) to carry all args through. 5. client.ts hardcoded literals: replace threshold/maxTurns/timeoutMs with the named constants AUTO_SKILL_THRESHOLD / DEFAULT_AUTO_SKILL_MAX_TURNS / DEFAULT_AUTO_SKILL_TIMEOUT_MS imported from manager.ts and skillReviewAgentPlanner.ts. 6. toolCallCount / skillsModifiedInSession reset: only reset when skill review is actually scheduled (status === 'scheduled'), not every turn, so the counter correctly accumulates across turns within a session as per design doc. 7. runSkillReview (manager.ts): rethrow after marking record failed, consistent with runExtract behavior. 8. skillReviewNudge.integration.test.ts test 5: rewrite to reflect the in-flight dedup contract (second same-project call returns already_running with existing taskId; third call after completion gets a new task). Add vi.mock for runSkillReviewByAgent so the test does not need a full Config. 
* fix(autoSkill): address all review comments - skill-paths: detect dangling symlinks with lstat before treating ENOENT as safe - skill-paths: fix isProjectSkillPath relative path resolution to use projectRoot - skillReviewAgentPlanner: restrict READ_FILE/LS to project root only - skillReviewAgentPlanner: remove SHELL tool from review agent tool list - skillReviewAgentPlanner: add path import; remove unused shell imports - skillReviewAgentPlanner: add comment for buildAgentHistory trailing user message - client: fix runManagedAutoMemoryBackgroundTasks gate widening - client: fix skillsModifiedInSession deadlock - client: add .catch() to skill review promise - client: hoist SKILL_WRITE_TOOL_NAMES to module-level ReadonlySet - agent-core: use full EXCLUDED_TOOLS_FOR_SUBAGENTS for explicit tool list subagents - manager: extend notify() signature to accept 'skill-review' taskType - config: fix JSDoc default value comment (false, not true) * fix(autoSkill): address second round review comments - client: reset toolCallCount when scheduleSkillReview returns already_running and count >= threshold, preventing immediate cascade after in-flight review - client.test: add autoSkill branch tests (scheduled/already_running/skills_modified) - client.test: add full recordCompletedToolCall unit tests (skillsModifiedInSession, toolCallCount increment, skill path detection for write_file/edit/read_file) - client.test: add scheduleSkillReview mock to mockMemoryManager - nonInteractiveCli.test: add assertions for recordCompletedToolCall and consumePendingMemoryTaskPromises in tool-call integration test
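The real-path guard described for `assertRealProjectSkillPath()` in the autoSkill commits above (resolve the real skills root, walk up to the nearest existing ancestor, resolve it, compare) can be sketched like this. The function names below are hypothetical and the error wording is invented for illustration; only the sequence of steps is taken from the commit messages.

```typescript
import * as fs from "node:fs/promises";
import * as path from "node:path";

// Walk up from `target` until a path component exists. `lstat` does not follow
// a final symlink, so a dangling symlink still counts as "existing" here and
// the later `realpath` call fails on it instead of treating it as safe.
async function nearestExistingAncestor(target: string): Promise<string> {
  let current = path.resolve(target);
  for (;;) {
    try {
      await fs.lstat(current);
      return current;
    } catch (err) {
      if ((err as { code?: string }).code !== "ENOENT") throw err;
      const parent = path.dirname(current);
      if (parent === current) throw err; // reached the filesystem root
      current = parent;
    }
  }
}

// Hypothetical name; rejects when the target's real location is outside the
// real skills root, even if the lexical path looks contained.
export async function assertInsideRealRoot(
  skillsRoot: string,
  targetPath: string,
): Promise<void> {
  const realRoot = await fs.realpath(skillsRoot);
  const ancestor = await nearestExistingAncestor(targetPath);
  const realAncestor = await fs.realpath(ancestor);
  const rel = path.relative(realRoot, realAncestor);
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`${targetPath} resolves outside the skills root`);
  }
}
```

As in the commits, a cheap lexical check would normally run first for obviously wrong paths; this sketch covers only the symlink-aware second layer.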
…rompt cache scope (QwenLM#4020)
) The "fails fast at CLI parse time on invalid JSON Schema" integration test stopped exercising the Ajv strict-compile path once the `--json-schema` root-accepts-object precheck landed. The precheck rejects `{type: "this-is-not-a-real-type"}` before Ajv runs, so the CLI exits with the "root must accept object-typed values" error instead of the "is not a valid JSON Schema" error the test expects. Move the bogus `type` into a property so the root precheck passes and Ajv catches the unknown type, restoring the test's original intent.
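The difference between the old and new test inputs can be illustrated with a small sketch. The `rootAcceptsObject` function below is a simplified, hypothetical stand-in for the real precheck (whose exact rules are not shown in this PR), and the `Schema` type and property name `foo` are assumptions made for the example.

```typescript
// Simplified schema shape, for illustration only.
type Schema = {
  type?: string | string[];
  properties?: Record<string, Schema>;
};

// Hypothetical stand-in for the root precheck: the root passes when it has
// no `type` or when its `type` includes "object".
function rootAcceptsObject(schema: Schema): boolean {
  if (schema.type === undefined) return true;
  const types = Array.isArray(schema.type) ? schema.type : [schema.type];
  return types.includes('object');
}

// Old test input: the bogus type sits at the root, so the precheck rejects it
// before a strict schema compile would ever see it.
const rejectedByPrecheck: Schema = { type: 'this-is-not-a-real-type' };

// New test input: the root is a valid object schema, and the bogus type is
// moved into a property where the schema compiler is expected to catch it.
const reachesSchemaCompile: Schema = {
  type: 'object',
  properties: { foo: { type: 'this-is-not-a-real-type' } },
};
```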
…, case-sensitive import
Important: Review skipped. Too many files! This PR contains 299 files, which is 149 over the limit of 150. To get a review, narrow the scope.
Files ignored due to path filters: 1. Files selected for processing: 299.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d6275bf555
"node_modules/@adobe/css-tools": {
  "version": "4.4.4",
  "resolved": "https://registry.npmjs.org/@adobe/css-tools/-/css-tools-4.4.4.tgz",
  "integrity": "sha512-Elp+iwUx5rN5+Y8xLt5/GRoG20WGoDCQ/1Fb+1LiGtvwbDavuSk0jhD/eZdckHAuzcDzccnkv+rEjyWfRx18gg==",
  "dev": true,
  "license": "MIT"
Restore lockfile integrity metadata
This lockfile update drops the resolved/integrity fields from registry packages (starting here with @adobe/css-tools). The CI workflow still runs npm run check:lockfile after npm ci (.github/workflows/ci.yml lines 105-109), and running that command in this checkout fails with hundreds of packages reported as missing those fields, so every normal CI run will fail until the lockfile is regenerated with the required metadata.
if (isGitHubBlobUrl(url)) {
  url = convertGitHubBlobToRaw(url);
if (url.includes('github.com') && url.includes('/blob/')) {
  url = url
Code Coverage Summary
CLI Package - Full Text Report
Core Package - Full Text Report
For detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run.
* fix(core): log internal OpenAI JSON requests
* fix(core): avoid duplicate OpenAI log metadata
Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
)
* test(e2e): stabilize MCP tool message flow
* ci(e2e): cancel stale main E2E runs
* test(e2e): accept paired MCP tool results
* test(e2e): stabilize monitor tool check
* test(e2e): stabilize run_shell_command file-listing assertion
The model consistently picks list_directory over run_shell_command for file-listing prompts. Make the prompt explicit about which tool to use, matching the approach taken for the MCP tool flow test.
const sanitized = promptId
  .replace(/[^a-zA-Z0-9._-]+/g, '-')
  .replace(/^-+|-+$/g, '');
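To make the effect of the two `replace` calls concrete, here is a small sketch that wraps the same expressions in a function. The name `sanitizePromptId` and the sample inputs are hypothetical; only the two regular expressions come from the snippet under review.

```typescript
// Hypothetical wrapper around the two replace calls from the reviewed snippet.
function sanitizePromptId(promptId: string): string {
  return promptId
    // Collapse every run of characters outside [a-zA-Z0-9._-] into one '-'.
    .replace(/[^a-zA-Z0-9._-]+/g, '-')
    // Trim leading and trailing '-' runs left over from the first step.
    .replace(/^-+|-+$/g, '');
}

console.log(sanitizePromptId('session#1:turn 2')); // prints "session-1-turn-2"
console.log(sanitizePromptId('--a/b c--')); // prints "a-b-c"
```

Note that an input made up entirely of disallowed characters sanitizes to the empty string, so a caller that uses the result as a file name may want a fallback.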
… and CLI flags (QwenLM#4066)
* docs(telemetry): align config and docs semantics for target, outfile, and CLI flags
- Remove stale warning note "This feature requires corresponding code changes"; the OTLP implementation is now complete (QwenLM#3779, QwenLM#4061)
- Clarify that `target` is an informational destination label and does not control exporter routing; `otlpEndpoint` or `outfile` must be set to configure where data is sent
- Mark `--telemetry-target` CLI flag as deprecated in the configuration table to match the deprecateOption() call in cli/src/config/config.ts
- Fix `outfile` / `QWEN_TELEMETRY_OUTFILE` descriptions: remove the incorrect "when target is local" qualifier, since outfile overrides OTLP export regardless of the target value
- Simplify the file-based output example by removing the now-redundant `"target": "local"` and `"otlpEndpoint": ""` fields
Closes the "Align telemetry config and docs semantics for target, useCollector, otlpEndpoint, otlpProtocol, and outfile" checklist item in QwenLM#3731.
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* docs(telemetry): address Copilot review comments on outfile and target descriptions
- Fix outfile table row in telemetry.md: "overrides `otlpEndpoint`" → "overrides OTLP export" (outfile disables all OTLP exporting, not just the base endpoint)
- Use fully-qualified setting names (`telemetry.otlpEndpoint`, `telemetry.outfile`) in the target description in settings.md for consistency with the rest of the table
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
* docs(telemetry): update QWEN_TELEMETRY_TARGET env var description and add outfile note
- Align QWEN_TELEMETRY_TARGET env var description with the updated telemetry.target setting semantics (informational label, not routing)
- Add a note after the file-based output example clarifying that outfile automatically disables OTLP export
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
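The routing semantics these doc commits describe can be sketched as a small decision function. This is an illustration under stated assumptions, not the project's exporter code: `TelemetrySettings`, `ExportRoute`, and `resolveExportRoute` are hypothetical names, and treating "neither field set" as "no export" is an assumption drawn from the statement that one of the two must be set.

```typescript
// Hypothetical settings shape using the three field names from the docs.
interface TelemetrySettings {
  target?: string; // informational label only; never consulted for routing
  otlpEndpoint?: string;
  outfile?: string;
}

type ExportRoute =
  | { kind: 'file'; path: string }
  | { kind: 'otlp'; endpoint: string }
  | { kind: 'none' };

// Hypothetical resolver: outfile wins over OTLP regardless of `target`.
function resolveExportRoute(settings: TelemetrySettings): ExportRoute {
  if (settings.outfile) return { kind: 'file', path: settings.outfile };
  if (settings.otlpEndpoint) {
    return { kind: 'otlp', endpoint: settings.otlpEndpoint };
  }
  return { kind: 'none' };
}
```

The function never reads `target`, which mirrors the clarified documentation: the label describes the destination but does not select it.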
… workflow
- Add root-level one-off migration/rebrand .mjs scripts to eslint global ignores (_rebrand_*.mjs, rebrand_cherry*.mjs, resolve_*.mjs, fix_*.mjs, inspect_qwen*.mjs) so they no longer produce lint errors
- Fix hopcode-issue-followup-bot.yml: change repository condition from 'QwenLM/qwen-code' to 'TaimoorSiddiquiOfficial/HopCode' (functional blocker: the workflow would never run without this fix)
- Rebrand all Qwen/qwen-issue-bot markers to HopCode/hopcode-issue-bot throughout the workflow (description, step names, prompt text, HTML comment markers used in issue comments)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- storage: fix getRuntimeStatusPath to use chats/ dir and .runtime.json suffix
- askUserQuestion: set shouldDefer to false so tool loads immediately
- syntheticOutput: fix STRUCTURED_OUTPUT_REDACTED_ARGS key (__redacted) and value
- gitDiff.test: skip Windows symlink test with test.skipIf(process.platform === 'win32')
- AppContainer.test: fix Ctrl+B tests (pendingGeminiHistoryItems + missing mock fields)
- gemini.tsx: reject --json-schema in interactive (TUI) mode with clear error
- gemini.test: add getJsonSchema mock to kitty-protocol tests
- chatRecordingService: implement title re-anchor afterWrite() with UTF-8 byte counting, sidecar writes that don't update lastRecordUuid, and counter resets in recordCustomTitle/finalize
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- package-lock.json: update version references to 0.27.6
- auth.test.ts: rebrand qwen auth -> hopcode auth in removal notice test
- config.ts: improve schemaRootAcceptsObject type guard and error message
- settings.test.ts: import resetHomeEnvBootstrapForTesting, remove duplicate mock block
- nonInteractiveCli.test.ts: mock CommandService.fromCommands and fix YOLO->Izn mode message
- prompts.ts: move Quran guidance before memory suffix, use plain newline prefix
- prompts.test.ts.snap: update snapshots to match new prompt ordering
- hopCodeOAuth2.ts: export hopCodeOAuth2Events and clearHopcodeCredentials aliases
- subagent-manager.test.ts: update built-in agent counts (4 new built-in agents)
- vscode-ide-companion/NOTICES.txt: update third-party notices
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…wenLM#4647)
* fix(clipboard): use platform-native tools for image paste on Linux
Replace @teddyzhu/clipboard native module with wl-paste/xclip on Linux to fix image paste in WSL2+Wayland environments. The native module uses X11 protocol and cannot read clipboard images when the session uses Wayland (common in WSL2 with WSLg). This causes clipboardHasImage() to return false even when the clipboard contains an image.
Changes:
- Use wl-paste --list-types to detect images (Wayland)
- Use xclip -selection clipboard -t TARGETS -o to detect images (X11)
- Handle image/bmp format from Windows clipboard (WSL2 exposes BMP)
- Convert BMP to PNG using Python PIL when available
- Detect clipboard tool via WAYLAND_DISPLAY when XDG_SESSION_TYPE is unset
- Keep @teddyzhu/clipboard as fallback for macOS/Windows
Fixes QwenLM#3517
Fixes QwenLM#2885
* test: update clipboard tests for platform-native tools
The tests were mocking @teddyzhu/clipboard but the implementation now uses platform-native tools (wl-paste/xclip) on Linux. Update mocks to test the spawn-based implementation.
* fix: address critical review comments
1. Fix command injection in Python BMP-to-PNG conversion
   - Use sys.argv instead of string interpolation
   - Prevents path traversal via single-quote injection
2. Fix BMP fallback dead code
   - When PIL is not available, return BMP file path instead of deleting the only copy and returning false
   - Update saveClipboardImage to handle non-PNG return paths
* fix: address review suggestions for resource leaks and robustness
- #3: Add proper cleanup in saveFromCommand error paths (kill child, destroy stream)
- #4: Add 5s timeout for all spawned processes to prevent TUI hangs
- #7: Check exit code in checkClipboardForImage (code === 0)
- #8: Move fs.mkdir inside try/catch in saveClipboardImage
- #10: Merge checkWlPasteForImage/checkXclipForImage into checkClipboardForImage
* fix: address all remaining review comments
Source code fixes:
- #25: Add timeout to getWlPasteImageTypes (PROCESS_TIMEOUT_MS)
- #26: Add timeout to python3 spawn in BMP-to-PNG conversion
- #27: Wrap child.kill() in try-catch in timeout handlers
- #28: Replace dynamic import('node:fs/promises') with static statSync
- #30: Export resetLinuxClipboardTool() for testability
- Add try-catch around spawn in checkClipboardForImage
- Use stdio: ['ignore', 'ignore', 'ignore'] for python3 spawn
Test fixes:
- #24: Use vi.hoisted() for mock functions (avoids hoisting issue)
- #31: Stub process.platform = 'linux' in beforeEach
- Add default export to node:child_process mock
- Use EventEmitter-based mock child for async behavior
- All 7 tests passing
* perf: cache wl-paste --list-types result to avoid redundant calls
Avoid spawning wl-paste twice on the paste hot path:
1. clipboardHasImage calls wl-paste --list-types (check)
2. saveClipboardImage calls getWlPasteImageTypes (get types)
Now the result is cached after the first call and reused. Cache is reset via resetLinuxClipboardTool() for testing.
* fix: address remaining review suggestions
- #1: Add child.stdout error handler in saveFromCommand
- #2: Add macOS/Windows test coverage for @teddyzhu/clipboard fallback
- #3: Fix .replace('.png', '.bmp') to use regex /\.png$/ to prevent path corruption
* fix: address critical cache invalidation and other review feedback
- #1 Critical: Reset cachedWlPasteImageTypes at start of clipboardHasImage to prevent stale data between paste operations
- #1 Critical: Check exit code in getWlPasteImageTypes close handler, do not cache failed results
- #2: Replace statSync with async fs.stat to avoid blocking event loop
- #3: Remove async from close handler, use promise chain instead
- #4: Return false instead of bmpPath when PIL conversion fails, as downstream expects .png files
- #5: Capture stderr from spawned processes for diagnostics
* fix: address remaining code review issues
- #1: Narrow detection to only report supported formats (png/bmp)
- #2: Do not cache results on timeout or error
- #3: Use line-level matching instead of includes('image/')
- #4: Replace execSync with execFileSync to avoid shell injection
- #5: Upgrade BMP→PNG failure log to warn level with install hint
* fix: restore getClipboardModule import caching (regression fix)
The original Qwen Code cached the @teddyzhu/clipboard module import via getClipboardModule() with cachedClipboardModule and clipboardLoadAttempted. Our refactoring removed this caching, causing the module to be re-imported on every clipboardHasImage/saveClipboardImage call. Restored the original caching mechanism for macOS/Windows fallback path.
* test: add saveClipboardImage success path and cache behavior tests
- Add test for successful PNG save path
- Add test for cache invalidation between clipboardHasImage calls
- All 11 tests passing
* fix: revert execSync to fix WSL2 clipboard detection
execFileSync('command', ['-v', 'wl-paste']) fails because 'command' is a shell built-in, not an executable. execSync runs through a shell so it can find 'command'. Reverted to execSync to restore clipboard tool detection on WSL2. Also fixed TypeScript errors in tests by using (child as any) for mock event emitter properties.
* fix: address critical file leak and filter issues from review
- #1: Clean up bmpPath in catch block when PIL conversion fails
- #2: Narrow getWlPasteImageTypes filter to only image/png and image/bmp
- #3: Clean up empty PNG file when size guard fails
- #3b: Fix typo python3-pyl → python3-pil
* test: add xclip, BMP, error path test coverage; fix weak assertion
- Add xclip/X11 path tests (detection, no image, not found)
- Add BMP-to-PNG conversion tests (PIL failure, prefer PNG over BMP)
- Add saveFromCommand error path tests (timeout, spawn error, stdout error)
- Replace tautological 'successful PNG save' assertion with proper null-on-error tests
- Fix ESLint: add no-explicit-any suppressions, prefix unused setupWaylandEnv
Note: xclip save success path requires createWriteStream mock that vitest cannot fully support with ...actual spread. Detection and error paths verified. 19 tests passing.
* fix: remove unused _setupWaylandEnv function that breaks TS build
Fixes TS6133 error caused by noUnusedLocals: true in tsconfig.json. The function was generated by test agent but never called.
* fix: clean up tempFilePath on PIL conversion failure
When python3 PIL conversion fails mid-write, tempFilePath (the target .png) may have been partially written. Add fs.unlink(tempFilePath) in the catch block to prevent partial file leakage. Suggested by wenshao in PR review.
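The "line-level matching" and "prefer PNG over BMP" points from these commits can be illustrated with a pure function over the text that `wl-paste --list-types` prints (one MIME type per line). This is a sketch under assumptions: `pickSupportedImageType` is a hypothetical name, and the real implementation may structure the check differently.

```typescript
type SupportedImageType = 'image/png' | 'image/bmp';

// Hypothetical helper: choose a supported image type from list-types output.
// Each line is compared exactly, so a type such as "text/x-image/png-notes"
// does not count as an image the way a substring check would allow.
function pickSupportedImageType(
  listTypesOutput: string,
): SupportedImageType | null {
  const types = new Set(
    listTypesOutput.split(/\r?\n/).map((line) => line.trim()),
  );
  if (types.has('image/png')) return 'image/png'; // PNG preferred over BMP
  if (types.has('image/bmp')) return 'image/bmp'; // WSL2 may expose only BMP
  return null;
}
```

Keeping the parsing separate from the process spawn makes it testable without mocking `node:child_process`.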
* fix: address review feedback on file leaks and test coverage
- Add tempFilePath cleanup when python3 PIL conversion fails mid-write
- Restore image/bmp detection with clarifying comment (WSL2 Wayland)
- Fix stat mock syntax (remove debug console.log, simplify)
- Fix originalPlatform scope (was undefined in afterEach)
Co-authored-by: Shaojin Wen <shaojin.wensj@alibaba-inc.com>
19 tests passing, tsc + eslint clean.
* ci: retrigger tests
* fix: address review feedback on test coverage and defensive guard
- Replace tautological saveClipboardImage assertion with meaningful spawn-argument verification
- Wrap clipboardHasImage Linux branch in try/catch guard (preserve 'never throw, return false' contract)
- Fix node:fs/promises mock to use importOriginal for indirect deps
- Add readFile/writeFile/appendFile/access/copyFile/rename/rm/rmdir to mock (required by indirect deps like chatCompressionService)
- Remove node:fs root mock to avoid cross-test pollution
19 tests passing, tsc + eslint clean.
* fix: address review feedback on test coverage and defensive guard
- Replace tautological saveClipboardImage assertion with spawn-arg verification (prefer PNG over BMP test)
- Wrap clipboardHasImage Linux branch in try/catch guard
- Fix node:fs/promises mock to use importOriginal for indirect deps
- Add missing fs/promises methods (readFile etc.) required by deps
- Remove node:fs root mock entirely to avoid cross-test pollution
- Document xclip/BMP save success path: blocked by vitest built-in module mock limitation
19 tests passing, tsc + eslint clean.
* fix: secure clipboard temp filename with random UUID suffix
Add random UUID to temp filename to prevent predictable path symlink attacks (Critical review feedback). The UUID makes the path unguessable, eliminating the symlink attack vector.
19 tests passing, tsc + eslint clean.
* fix: add O_EXCL protection against symlink attacks in saveFromCommand
Use fs.open with O_EXCL flag (O_WRONLY|O_CREAT|O_EXCL) to atomically create the file, refusing to follow symlinks. Combined with the random UUID filename from the previous commit, this fully addresses the symlink attack vector identified in review. Also update 'prefer PNG over BMP' test: with O_EXCL, the save path fails when mkdir is mocked (directory doesn't exist), so the test now verifies format detection only rather than the full save pipeline.
19 tests passing, tsc + eslint clean.
* fix: capture python3 stderr for BMP conversion errors
Use stdio 'pipe' for stderr instead of 'ignore' so users see useful diagnostic messages (e.g. ModuleNotFoundError: No module named PIL) when python3 BMP-to-PNG conversion fails.
19 tests passing, tsc + eslint clean.
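The combination of an unguessable filename and exclusive creation can be sketched as follows. This is a minimal illustration rather than the saveFromCommand code: it uses the synchronous `fs.openSync` for brevity where the commit mentions `fs.open`, and the helper names, the `clipboard-` prefix, and the `0o600` mode are assumptions.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';
import { randomUUID } from 'node:crypto';

// Hypothetical naming scheme: a random UUID makes the path unpredictable.
function randomClipboardPath(dir: string = os.tmpdir()): string {
  return path.join(dir, `clipboard-${randomUUID()}.png`);
}

// Hypothetical helper: create the file atomically and return its descriptor.
// With O_CREAT | O_EXCL the call fails with EEXIST if anything already exists
// at the path, including a symlink planted by another process.
function createExclusiveFile(filePath: string): number {
  const flags =
    fs.constants.O_WRONLY | fs.constants.O_CREAT | fs.constants.O_EXCL;
  return fs.openSync(filePath, flags, 0o600); // mode is an assumption
}
```

A caller would write the clipboard bytes to the returned descriptor and close it; a second attempt to create the same path throws instead of overwriting.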
## Summary
This PR cherry-picks upstream improvements from QwenLM/qwen-code into HopCode, applying HopCode rebranding where necessary. It closes the gap between HopCode main and the upstream, covering 36 of 48 new upstream commits.
### Features included
- autoSkill memory: background project skill extraction (QwenLM#3673)
- Slash command discovery: improved slash command UI (QwenLM#3736)
- /diff command: git diff statistics utility (QwenLM#3491)
- /branch command: fork conversation to a new branch
- ToolSearch: on-demand loading of deferred tool schemas (QwenLM#3589, QwenLM#4022, QwenLM#4069)
- JSON schema output: --json-schema flag for structured headless output (QwenLM#3598)
- Ctrl+B keybind: promote message keybind (QwenLM#3969)
- i18n coverage: core built-in internationalization (QwenLM#3871)
- runtime.json sidecar: writes session metadata for active sessions (QwenLM#3714)
- HOPCODE_HOME env var: customize config directory (upstream: QWEN_HOME QwenLM#2953)
- Standalone archive install: binary archive distribution (QwenLM#3776)
- Codegraph skill: PR review risk analysis and conflict detection (QwenLM#3910)
- Anthropic proxy + prompt cache: global scope support (QwenLM#4020)
- Hierarchical session tracing: OTel span hierarchy (QwenLM#4071)
- DASHSCOPE_PROXY_BASE_URL: prompt cache via API gateway (QwenLM#3991)
- fdir → git ls-files: replace crawler with git ls-files + ripgrep fallback (QwenLM#3214)
- ask_user_question always-visible: surfaces clarification UX (QwenLM#4041)
- Session-list perf: head/tail 64KB bound, pooled buffer, lazy message count (QwenLM#3897)
- VSCode message edit/rewind: message metadata UI (QwenLM#3762)
### Bug fixes included
- preserve comments in settings.json migration write-back (QwenLM#3861)
- monitor notifications for subagents (QwenLM#3933)
- unfreeze Ctrl+O compact-mode on long conversations (QwenLM#3905)
- harden reactive compression follow-ups (QwenLM#3985)
- unify Edit/WriteFile prior-read with Claude Code (QwenLM#4002)
- repair stale --json-schema integration assertion (QwenLM#4075)
- log internal OpenAI JSON requests through debug logger (QwenLM#4081)
### Refactors included
- Remove legacy qwen auth CLI subcommand → /auth TUI dialog (QwenLM#3959)
- Route side-query LLM calls through runSideQuery chokepoint (QwenLM#3775)
- Remove dead useCollector setting and unreachable TelemetryTarget (QwenLM#4061)
- runtime.json sidecar follow-ups (QwenLM#4030)
### Telemetry
- Inject traceId/spanId into debug log files for OTel correlation (QwenLM#3847)
- Add hierarchical session tracing spans (QwenLM#4071)
npm run build && npm run typecheck
- Expected result: clean build, no type errors
- Observed result: ✅ clean build, no type errors
## Scope / Risk
- Main risk: cherry-pick conflicts resolved by applying HopCode rebranding (qwen→hopcode, QWEN→HOPCODE, QWEN_HOME→HOPCODE_HOME)
- refactor(cli): remove legacy qwen auth has been adapted; auth commands now redirect to the TUI dialog
- ink dependency was upgraded and then reverted (7.0.2 → 6.x) due to Static-remount regression; net change: no ink version bump
- Not covered: 8 upstream bug/fix/CI commits listed above
## Testing Matrix
| | 🍏 | 🪟 | 🐧 |
| -------- | --- | --- | --- |
| npm run | ⚠️ | ⚠️ | ⚠️ |
| npx | ⚠️ | ⚠️ | ⚠️ |
| Docker | ⚠️ | ⚠️ | ⚠️ |
## Linked Issues / Bugs
Tracks upstream: main...QwenLM:qwen-code:main