fix: make Gemini max-tokens optional and clarify max-token errors #81
Merged
Gemini agentic tool runs were failing with opaque "max_tokens limit" errors when the configured max output tokens were too small for multi-step tool calls. Lift Gemini's effective output budget to a practical floor and surface clear guidance in both the core agent and the QGIS chat dock when a model output cap is hit.
🚀 Deployed on https://69fd41a9d49eee949eda0752--opengeos.netlify.app
Pull request overview
This PR improves the Gemini provider experience by (1) ensuring Gemini has a practical minimum output-token budget for agentic/tool workflows, and (2) translating “max tokens reached” failures into actionable guidance in both the core library and the QGIS chat UI, with accompanying unit tests.
Changes:
- Enforce an 8192 minimum `max_output_tokens` for Gemini while preserving larger user-configured limits.
- Add max-token stop detection and clearer remediation guidance in `geoagent` and the QGIS chat dock.
- Add unit tests covering the Gemini token floor and the QGIS error-formatting branch.
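The token floor described in the first change can be illustrated with a minimal sketch. The function name `apply_gemini_output_floor` and the constant name are hypothetical and are not taken from `geoagent/core/model.py`; only the 8192 value and the "preserve larger limits" rule come from the overview above.

```python
# Hypothetical sketch of a Gemini output-token floor; names are illustrative.
GEMINI_MIN_OUTPUT_TOKENS = 8192  # floor value stated in the review overview


def apply_gemini_output_floor(max_output_tokens: int) -> int:
    """Raise small limits to the floor and keep larger explicit limits unchanged."""
    return max(max_output_tokens, GEMINI_MIN_OUTPUT_TOKENS)


if __name__ == "__main__":
    print(apply_gemini_output_floor(1024))
    print(apply_gemini_output_floor(16384))
```

Using `max` keeps the rule to a single expression: a user who configures a limit above the floor never has it reduced.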
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `geoagent/core/model.py` | Adds Gemini output-token floor helper and applies it during model resolution. |
| `geoagent/core/agent.py` | Detects max-token stop errors and formats them into user-facing guidance. |
| `qgis_geoagent/open_geoagent/dialogs/chat_dock.py` | Adds UI-side formatting for max-token stop-related errors (incl. Gemini-specific advice). |
| `tests/test_model_providers.py` | Tests Gemini token-floor behavior and preservation of larger explicit limits. |
| `qgis_geoagent/tests/test_chat_tool_inputs.py` | Tests QGIS chat-dock error formatter for Gemini max-token-stop messages. |
- Streaming path now treats `stop_reason=max_tokens` as a failure and routes the error through `_format_chat_worker_error`, so users see the same actionable max-token guidance during streamed runs instead of a truncated answer.
- Replace the duplicated max-token marker list in `_format_chat_worker_error` with the shared `_looks_like_max_tokens_reached` helper from `geoagent.core.agent`, keeping the QGIS UI and core agent in sync (now also catches `stop reason: max_tokens`).
- Add a test covering both `stop_reason=max_tokens` and `stop reason: max_tokens` phrasings so the two paths cannot drift again.
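As a rough illustration of the shared-detection idea in this commit, the sketch below uses one marker list for both the detector and the formatter. The function names, the marker tuple, and the guidance wording are assumptions for illustration; they are not the actual bodies of `_looks_like_max_tokens_reached` or `_format_chat_worker_error`. Only the two `max_tokens` phrasings and the "max_tokens limit" wording come from this PR.

```python
# Hypothetical sketch: one marker list shared by detection and formatting.
MAX_TOKEN_MARKERS = (
    "stop_reason=max_tokens",
    "stop reason: max_tokens",
    "max_tokens limit",
)


def looks_like_max_tokens(message: str) -> bool:
    """Return True when the message matches any known max-token marker."""
    text = message.lower()
    return any(marker in text for marker in MAX_TOKEN_MARKERS)


def format_worker_error(message: str) -> str:
    """Swap max-token failures for guidance; pass other errors through unchanged."""
    if looks_like_max_tokens(message):
        # Guidance wording is illustrative, not the text shipped in the PR.
        return (
            "The model stopped because it reached its output-token limit. "
            "Increase the max tokens setting and try again."
        )
    return message
```

Because both the core agent and the UI would call the same predicate, adding a new phrasing to the tuple updates both paths at once.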
for more information, see https://pre-commit.ci
- Clamp positive max-token values below 256 up to 256 in `_max_tokens_to_setting`. The settings spinbox now uses 0 as the Auto sentinel, which lets users type 1-255 that the previous min=256 spinbox rejected. Saving normalizes those into the historical 256-token floor so the saved config never holds a value the spinbox previously disallowed.
- Add a unit test that pins the clamp behavior and the Auto sentinel.
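The clamp described in this commit might look roughly like the sketch below. It assumes that the Auto sentinel (0) is stored as `None` and that negative values are also treated as Auto; the commit message does not state either detail. The function name is a stand-in for `_max_tokens_to_setting`, not its real body.

```python
# Hypothetical sketch of normalizing a spinbox value before saving it.
from typing import Optional

MIN_SAVED_MAX_TOKENS = 256  # historical spinbox minimum from the commit message


def max_tokens_to_setting(spinbox_value: int) -> Optional[int]:
    """Map 0 (Auto) to None (assumed) and clamp small positive values up to 256."""
    if spinbox_value <= 0:
        return None  # Assumption: Auto is saved as "no explicit limit".
    return max(spinbox_value, MIN_SAVED_MAX_TOKENS)
```

With this shape, typing any value from 1 to 255 saves 256, while 0 keeps its meaning as Auto.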
Summary
- Make `max_tokens` optional across providers. For Gemini in particular, omit `max_output_tokens` when no explicit limit is set so the provider's native output budget is used. Explicit user values are preserved as-is.
- Route max-token stop errors through `_format_chat_worker_error` so users see actionable guidance instead of a truncated answer. Detection markers are now shared between the core helper and the UI.

Test plan
- `pre-commit run --all-files`
- `pytest tests/test_model_providers.py qgis_geoagent/tests/test_chat_tool_inputs.py -q`
- Trigger a run that stops on `max_tokens` and confirm the new guidance is shown.
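The "omit when unset" behavior from the Summary can be sketched as building provider keyword arguments conditionally. The helper name `build_gemini_kwargs`, the `model` key, and the model identifier in the usage lines are hypothetical; only the `max_output_tokens` parameter name comes from this PR.

```python
# Hypothetical sketch: include max_output_tokens only when the user set one.
from typing import Any, Dict, Optional


def build_gemini_kwargs(model_id: str, max_tokens: Optional[int] = None) -> Dict[str, Any]:
    """Build keyword arguments, leaving the output cap to the provider when unset."""
    kwargs: Dict[str, Any] = {"model": model_id}
    if max_tokens is not None:
        # Explicit user values are passed through unchanged.
        kwargs["max_output_tokens"] = max_tokens
    return kwargs


if __name__ == "__main__":
    print(build_gemini_kwargs("example-gemini-model"))
    print(build_gemini_kwargs("example-gemini-model", max_tokens=2048))
```

Leaving the key out entirely, rather than passing `None`, avoids depending on how a given client library interprets an explicit null limit.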