feat(tools): keep ask_user_question always-visible to surface clarification UX#4041
Merged
Conversation
…cation UX Previously ask_user_question was deferred behind ToolSearch (shouldDefer: true), so the model only saw it as a name in the deferred-tools section and had to discover it via tool_search before invoking. In practice the model rarely went through that discovery step and instead asked clarifying questions in plain prose, losing the structured multiple-choice UX the tool exists to provide. Flip shouldDefer to false so the schema lives in the initial tool list. Add a unit-test guard so a future refactor doesn't silently put it back behind ToolSearch.
Collaborator
Author
E2E test reportPre-implementation pass against
Notes:
Full plan and raw observations: |
Contributor
Code Coverage Summary
CLI Package - Full Text ReportCore Package - Full Text ReportFor detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run. |
yiliang114
approved these changes
May 11, 2026
yiliang114
left a comment
Collaborator
There was a problem hiding this comment.
Reviewed the change. This is narrowly scoped and safe: it only keeps ask_user_question visible in the initial tool declarations by setting shouldDefer to false, and does not change the ToolSearch/deferred-tool flow for any other tools. The regression guard is sufficient for this PR's scope, and CI is green. Approving.
wenshao
pushed a commit
that referenced
this pull request
May 17, 2026
…cation UX (#4041) Previously ask_user_question was deferred behind ToolSearch (shouldDefer: true), so the model only saw it as a name in the deferred-tools section and had to discover it via tool_search before invoking. In practice the model rarely went through that discovery step and instead asked clarifying questions in plain prose, losing the structured multiple-choice UX the tool exists to provide. Flip shouldDefer to false so the schema lives in the initial tool list. Add a unit-test guard so a future refactor doesn't silently put it back behind ToolSearch.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
ask_user_questionis no longer deferred behindToolSearch— its schema lives in the initial tool list sent to the model every turn.tool_searchdiscovery before invoking it, and instead asked clarifying questions in plain prose. That loses the structured multiple-choice UX the tool exists to provide.shouldDefer === falseso future refactors can't silently regress this. Other deferred tools (cron_create,cron_list,lsp, …) are unchanged. The deferred-tool reveal flow itself is untouched.Validation
--openai-logging), an ambiguous prompt to verify the model now reaches forask_user_questiondirectly, a trivial prompt to verify it doesn't over-invoke, and a regression check that theToolSearchreveal flow still works forcron_list.ask_user_questionappears in the initial tool declarations; the model invokes it directly for ambiguous prompts; other deferred tools stay deferred.Scope / Risk
ask_user_questionis now sent every API request. The schema is moderate-sized; the cost is bounded and the UX gain (avoided rework from missed clarifications) is the trade we want.--exclude-tools ask_user_questionflag is unchanged in spirit (the registry filter still applies before the new always-loaded flag matters). Token-budget instrumentation hasn't been re-baselined.Testing Matrix
Testing matrix notes:
npm run build && npm run typecheck && npm run bundleplus E2E runs againstnode dist/cli.js. No platform-specific code paths touched.