feat(core): experimental in-progress steering hints #18973
This is a rebase / refactor of: #18783
Summary of Changes

Hello @joshualitt, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces an experimental feature that lets users steer the model's in-progress tasks in real time from the Gemini CLI. It provides the foundational infrastructure for capturing and processing user hints, so the model can adapt its plan based on immediate feedback. The feature is supported by new testing utilities and dedicated model configurations.
Code Review
This pull request introduces an experimental feature for in-progress model steering, allowing users to provide hints to guide the model's behavior during a task. The implementation is comprehensive, touching the core agent execution logic, the UI container, and the main streaming hook to correctly capture and inject these hints. A significant and impressive part of this change is the new testing infrastructure, including the AppRig utility and extensive behavioral evaluation tests, which ensures the new interactive capabilities are robust. The code quality is high, with notable improvements to existing components like making Config.initialize() idempotent. I have reviewed the changes and found no high or critical issues. This is a well-executed and well-tested feature addition.
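For readers who have not opened the diff, the idempotency point can be illustrated with a minimal sketch. `ExampleConfig` and its members are hypothetical stand-ins, not the actual `Config` class from this PR; the sketch only shows one common way to make an `initialize()` method safe to call more than once, by caching the promise from the first call.

```typescript
// Hypothetical stand-in for the real Config class; only the pattern is shown.
class ExampleConfig {
  private initPromise: Promise<void> | undefined;
  // Counts how many times the real setup work ran (for illustration only).
  initCount = 0;

  // Repeated or concurrent calls reuse the promise created by the first call,
  // so the setup work in doInitialize() runs at most once.
  initialize(): Promise<void> {
    if (!this.initPromise) {
      this.initPromise = this.doInitialize();
    }
    return this.initPromise;
  }

  private async doInitialize(): Promise<void> {
    this.initCount += 1; // expensive one-time setup would go here
  }
}
```

Caching the promise rather than a boolean flag means a second caller that arrives while initialization is still in flight waits for the same work instead of starting it again.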
export const DEFAULT_FLASH_LITE_MAX_OUTPUT_CHARS = 180;
const INPUT_TRUNCATION_SUFFIX = '\n...[truncated]';

export const USER_STEERING_INSTRUCTION =

nit: the file is called flash-lite helper but this prompt seems rather model specific. Perhaps we should either rename the file or bifurcate this into flash specific and task specific files?
    return responseText;
  } catch (error) {
    debugLogger.debug(
      `[FlashLiteHelper] Generation failed: ${error instanceof Error ? error.message : String(error)}`,

Should we use getErrorMessage() here?
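To make the suggestion concrete, here is a minimal sketch of what such a helper typically does. `getErrorMessageSketch` is a hypothetical name, and the assumption that the repository's `getErrorMessage()` behaves like the inline expression in the diff has not been checked against its actual implementation.

```typescript
// Hypothetical sketch; the real getErrorMessage() in the repo may differ.
function getErrorMessageSketch(error: unknown): string {
  if (error instanceof Error) {
    return error.message;
  }
  try {
    return String(error);
  } catch {
    // String() can throw for exotic values (e.g. objects with a throwing toString).
    return 'Failed to get error details';
  }
}
```

With a helper like this, the log call would shrink to a single template expression such as `${getErrorMessageSketch(error)}`, keeping the `instanceof` check in one place.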
  'Classify it as ADD_TASK, MODIFY_TASK, CANCEL_TASK, or EXTRA_CONTEXT. ' +
  'Apply minimal-diff changes only to affected tasks and keep unaffected tasks active. ' +
  'Do not cancel/skip tasks unless the user explicitly cancels them. ' +
  'Acknowledge the steering briefly and state the course correction.';

Is this purely for driving the UX of the feature? Have we tried or considered prompting the agent to do this in its thoughts instead?
The one thing I'd worry about is that the flash model might ACK the message and then the real user model ignores it, leading to a confusing experience where Gemini CLI says one thing and does another.
- **Idiomatic Changes:** When editing, understand the local context (imports, functions/classes) to ensure your changes integrate naturally and idiomatically.
- **Comments:** Add code comments sparingly. Focus on *why* something is done, especially for complex logic, rather than *what* is done. Only add high-value comments if necessary for clarity or if requested by the user. Do not edit comments that are separate from the code you are changing. *NEVER* talk to the user or describe your changes through comments.
- **Proactiveness:** Fulfill the user's request thoroughly. When adding features or fixing bugs, this includes adding tests to ensure quality. Consider all created files, especially tests, to be permanent artifacts unless the user says otherwise.${mandateConflictResolution(options.hasHierarchicalMemory)}
- **User Hints:** During execution, the user may provide real-time hints (marked as "User hint:" or "User hints:"). Treat these as high-priority but scope-preserving course corrections: apply the minimal plan change needed, keep unaffected user tasks active, and never cancel/skip tasks unless cancellation is explicit for those tasks. Hints may add new tasks, modify one or more tasks, cancel specific tasks, or provide extra context only. If scope is ambiguous, ask for clarification before dropping work.

meta q: do we ever decide features like this are not coming to older models?
modelConfig: {
  model: 'gemini-2.5-flash-lite',
  generateContentConfig: {
    temperature: 0.2,

How did we choose the temperature value?
  color: 'gray',
  marginBottom: 1,
  text: ackText,
} as Omit<HistoryItem, 'id'>);

Is there an alternative to using this cast?
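One possible alternative, sketched under assumptions: if `HistoryItem` is a union of item shapes, the built-in `Omit` collapses the union, which is a common reason such casts appear. A distributive `Omit` plus an explicit return type lets the compiler check the literal instead. Every type and function below (`ExampleHistoryItem`, `DistributiveOmit`, `buildAckItem`) is hypothetical and not taken from the PR.

```typescript
// Applies Omit to each member of a union separately instead of to the union as a whole.
type DistributiveOmit<T, K extends PropertyKey> = T extends unknown
  ? Omit<T, K>
  : never;

// Hypothetical stand-in for the real HistoryItem union.
type ExampleHistoryItem =
  | { id: number; type: 'info'; text: string; color?: string; marginBottom?: number }
  | { id: number; type: 'error'; text: string };

type ExampleHistoryItemWithoutId = DistributiveOmit<ExampleHistoryItem, 'id'>;

// The return type annotation replaces the `as` cast: a misspelled or
// mistyped field now fails to compile instead of being silently accepted.
function buildAckItem(ackText: string): ExampleHistoryItemWithoutId {
  return {
    type: 'info',
    color: 'gray',
    marginBottom: 1,
    text: ackText,
  };
}
```

If the codebase already exports a type for history items without an `id`, annotating the object with that type would achieve the same check without a new helper type.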
Fixes #18782