fix(cli): insert voice transcription at cursor position instead of ap… #26287
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request improves the user experience for voice transcription in the CLI by respecting the cursor position during text input. By capturing the cursor offset at the start of a recording session and maintaining a baseline, the system now splices transcribed text into the existing buffer at that offset rather than always appending it to the end. Users can therefore insert voice-dictated text into the middle of existing content.
Code Review
This pull request updates the voice mode transcription logic to support inserting text at the current cursor position rather than always appending to the end of the buffer. It introduces a new ref to track the cursor offset at the start of a turn and refactors the text construction logic to preserve any text following the cursor. Corresponding tests were updated to verify correct cursor placement and text insertion. Feedback was provided regarding the need for a separator space between the transcription and the trailing text, as well as suggestions for sanitizing the transcribed text to prevent prompt injection.
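The separator-space feedback can be illustrated with a small sketch. It is not the PR's code: `needsSpace` and `spliceWithSeparators` are hypothetical names, and the rule of adding a space only where neither side already has whitespace is an assumption about what the reviewer had in mind. Offsets here are plain JavaScript string indices (UTF-16 code units), which may differ from how the CLI's text buffer counts positions.

```typescript
// Hypothetical helpers for illustration only; not taken from the PR.

// A separator is needed only when both sides are non-empty and neither
// side already has whitespace at the boundary.
function needsSpace(left: string, right: string): boolean {
  return (
    left.length > 0 &&
    right.length > 0 &&
    !/\s$/.test(left) &&
    !/^\s/.test(right)
  );
}

// Insert `transcript` into `text` at `offset`, keeping the text after the
// cursor and adding a single space on either side where one is missing.
function spliceWithSeparators(
  text: string,
  offset: number,
  transcript: string,
): string {
  if (transcript.length === 0) return text;
  // Clamp so an out-of-range offset degrades to prepend or append.
  const clamped = Math.max(0, Math.min(offset, text.length));
  const before = text.slice(0, clamped);
  const after = text.slice(clamped);
  const lead = needsSpace(before, transcript) ? ' ' : '';
  const trail = needsSpace(transcript, after) ? ' ' : '';
  return before + lead + transcript + trail + after;
}
```

With the cursor directly after `hello` in `hello world`, inserting `big` under these assumptions yields `hello big world`, and the same result is produced when the cursor sits after the existing space.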
Note: Security Review is unavailable for this PR.
devr0306 left a comment
The code changes look excellent. You correctly isolated the cursor offset and updated the text-splicing logic instead of blindly appending. I've run the related InputPrompt test suite and it passes locally, and I see you've included a specific new test for this behavior. Approved!
- BerriAI/litellm#26969: tool-permission guardrail tightening (merge-after-nits)
- BerriAI/litellm#26967: VCR Redis observability (merge-as-is)
- google-gemini/gemini-cli#26303: brain/critique role split + iteration (needs-discussion)
- google-gemini/gemini-cli#26287: voice transcription cursor-position insert (merge-after-nits)
- google-gemini/gemini-cli#26274: ssh:// extension install scheme (merge-as-is)
google-gemini#26287) Co-authored-by: Zheyuan <zlin252@emory.edu>
Summary
Voice transcription text was always appended to the end of the input buffer, ignoring the cursor position.
Details
- Capture the cursor offset (via getOffset()) when recording starts, alongside the existing text baseline snapshot.
- On each transcription event, splice the text at that fixed offset instead of appending.
- On turnComplete, advance the baseline so subsequent speech turns chain correctly after the previous one.
Related Issues
Fixes #25494
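The three steps listed under Details can be modelled roughly as below. This is a minimal sketch, not the PR's implementation: `VoiceSession`, `startRecording`, `render`, and `completeTurn` are hypothetical names, the state is held in a plain object rather than in refs, and it assumes each transcription event carries the full text of the current turn so far, so the buffer is rebuilt from the baseline rather than appended to.

```typescript
// Hypothetical model of a recording session; names are illustrative only.
interface VoiceSession {
  baseline: string; // buffer text captured when the current turn began
  offset: number; // insertion point within `baseline`
}

// Step 1: snapshot the text and the cursor offset when recording starts.
function startRecording(text: string, cursorOffset: number): VoiceSession {
  const offset = Math.max(0, Math.min(cursorOffset, text.length));
  return { baseline: text, offset };
}

// Step 2: rebuild the buffer from the fixed baseline on every event, so a
// newer transcript replaces the previous one instead of accumulating.
function render(
  session: VoiceSession,
  transcript: string,
): { text: string; cursor: number } {
  const before = session.baseline.slice(0, session.offset);
  const after = session.baseline.slice(session.offset);
  return {
    text: before + transcript + after,
    cursor: session.offset + transcript.length,
  };
}

// Step 3: on turn completion, fold the final transcript into the baseline
// and move the offset past it so the next turn lands after this one.
function completeTurn(
  session: VoiceSession,
  finalTranscript: string,
): VoiceSession {
  const { text, cursor } = render(session, finalTranscript);
  return { baseline: text, offset: cursor };
}
```

Starting from `Hello world` with the cursor at offset 6, a turn that ends with `big ` gives `Hello big world`; a second turn of `blue ` then gives `Hello big blue world`, because the baseline and offset were advanced after the first turn.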
Pre-Merge Checklist