feat(bot): enforce evaluation role, multi-iteration feedback loop, and diagnostic rigor by gundermanc · Pull Request #26303 · google-gemini/gemini-cli

gundermanc · 2026-04-30T23:55:37Z

Summary

This PR updates the Gemini CLI Bot's system prompts so that the bot explicitly identifies and resolves architectural conflicts. It also restricts the critique agent to an evaluation-only role, implements a multi-iteration feedback loop, and strengthens the bot's diagnostic rigor and context awareness.

Details

metrics.md (The Brain): Added a new rule instructing the Brain to actively search the repository for overlapping systems before optimizing and to verify their intent (contradictory vs. complementary).
critique.md (The Reviewer):
- Redefined the critique agent as an "evaluator ONLY" that must NOT apply fixes or modify the code itself.
- Added an Architectural Conflict check to the Logical & Workflow Integrity checklist.
- Added a Systemic Simulation requirement, forcing the agent to explicitly write out a timeline simulation for time-based logic.
- Added a validation step instructing the critique agent to ensure changes pass the build, tests, and linter.
common.md:
- Clarified the "Surgical Changes" rule to explicitly state that deleting duplicated, conflicting, or obsolete workflows is considered the ultimate "surgical" fix.
- Added Empirical Log Verification rule to the Defensive Scripting section, mandating the use of gh run view --log-failed for diagnosing CI/workflow failures instead of relying on static code analysis.
interactive.md: Added an exception to the "Ignore Pending Tasks" rule so that if the user's comment is on a PR authored by the bot, the bot can engage the UNBLOCKING PROTOCOL and inspect the PR's true technical state (like CI failures) instead of being stuck in a restricted context loop.
gemini-cli-bot-brain.yml: Implemented a new bash loop to replace the static Brain -> Critique pipeline. The pipeline now supports an investigate -> critique -> investigate -> critique loop. If the Critique agent rejects the Brain's changes, it passes the feedback via critique_feedback.md, resets the repository state, and allows the Brain a second attempt to implement a fix.
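The investigate -> critique loop described for gemini-cli-bot-brain.yml can be sketched as follows. This is a minimal Python illustration, not the bash loop from the workflow: the `Verdict` type, the `run_feedback_loop` function, and the three callables are hypothetical names, and the choice to reset the repository after every rejection is an assumption. Only the critique_feedback.md hand-off and the second attempt come from the description above.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class Verdict:
    """Hypothetical result of one critique pass."""
    approved: bool
    feedback: str = ""


def run_feedback_loop(
    investigate: Callable[[Optional[str]], None],
    critique: Callable[[], Verdict],
    reset_repo: Callable[[], None],
    feedback_path: Path,
    max_iterations: int = 2,
) -> bool:
    """Return True if the critique agent approves within max_iterations."""
    feedback: Optional[str] = None
    for _ in range(max_iterations):
        # The Brain sees no feedback on the first pass and the
        # previous rejection on later passes.
        investigate(feedback)
        verdict = critique()
        if verdict.approved:
            return True
        # Persist the rejection so the next attempt can read it,
        # then discard the rejected changes (assumed behavior).
        feedback = verdict.feedback
        feedback_path.write_text(feedback)
        reset_repo()
    return False
```

Keeping the critique step as a function that only returns a verdict mirrors the "evaluator ONLY" rule: the loop, not the reviewer, decides what happens to the working tree.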

Related Issues

Resolves a systemic issue discovered while reviewing PR #26302 and PR #26304, where the bot either ignored conflicting workflows or introduced structural bugs due to generative flaws in the critique phase. Also addresses the root cause of the bot's inability to diagnose CI failures on its own PRs as seen in #26513.

How to Validate

Review the prompt changes to ensure they are clear, nuanced, and enforce the desired behavior for log verification, unblocking protocol engagement, and architectural conflict resolution.
Review the workflow changes to confirm the while loop logic successfully captures and passes feedback between the agents.
Trigger the workflow manually and observe the run logs. If the critique agent rejects the first attempt, you should see Iteration 2 start, receive the feedback, and attempt a new solution.

Pre-Merge Checklist

github-actions · 2026-04-30T23:59:07Z

Size Change: -4 B (0%)

Total Size: 34 MB

Filename	Size	Change
`./bundle/chunk-2KV5MEJA.js`	0 B	-3.43 kB (removed)	🏆
`./bundle/chunk-57KM6YND.js`	0 B	-2.78 MB (removed)	🏆
`./bundle/chunk-ASRU6KWB.js`	0 B	-658 kB (removed)	🏆
`./bundle/chunk-H2G2F75F.js`	0 B	-14.7 MB (removed)	🏆
`./bundle/chunk-KIXZ2FCH.js`	0 B	-3.8 kB (removed)	🏆
`./bundle/chunk-NKU7TUNE.js`	0 B	-19.5 kB (removed)	🏆
`./bundle/chunk-VF2E5OO3.js`	0 B	-49.2 kB (removed)	🏆
`./bundle/chunk-VYNU6FEB.js`	0 B	-12.5 kB (removed)	🏆
`./bundle/core-SM75YMGM.js`	0 B	-48.8 kB (removed)	🏆
`./bundle/devtoolsService-3FA3XYXG.js`	0 B	-28 kB (removed)	🏆
`./bundle/gemini-N5PF7HCL.js`	0 B	-583 kB (removed)	🏆
`./bundle/interactiveCli-AKPXIBTN.js`	0 B	-1.29 MB (removed)	🏆
`./bundle/liteRtServerManager-3USJIXPG.js`	0 B	-2.11 kB (removed)	🏆
`./bundle/oauth2-provider-WTI6KIQJ.js`	0 B	-9.16 kB (removed)	🏆
`./bundle/chunk-4TCIBMU6.js`	19.5 kB	+19.5 kB (new file)	🆕
`./bundle/chunk-73DCHPYR.js`	12.5 kB	+12.5 kB (new file)	🆕
`./bundle/chunk-C42BOO2A.js`	3.8 kB	+3.8 kB (new file)	🆕
`./bundle/chunk-FXK3LICV.js`	49.2 kB	+49.2 kB (new file)	🆕
`./bundle/chunk-JLQRYP72.js`	658 kB	+658 kB (new file)	🆕
`./bundle/chunk-O7FCEU4Z.js`	2.78 MB	+2.78 MB (new file)	🆕
`./bundle/chunk-VZ6CZHM7.js`	14.7 MB	+14.7 MB (new file)	🆕
`./bundle/chunk-XWGESCQP.js`	3.43 kB	+3.43 kB (new file)	🆕
`./bundle/core-FPNOC3JH.js`	48.8 kB	+48.8 kB (new file)	🆕
`./bundle/devtoolsService-G2UYLTUH.js`	28 kB	+28 kB (new file)	🆕
`./bundle/gemini-D5A7NLRF.js`	583 kB	+583 kB (new file)	🆕
`./bundle/interactiveCli-QPMYW6QD.js`	1.29 MB	+1.29 MB (new file)	🆕
`./bundle/liteRtServerManager-LSMIXHLQ.js`	2.11 kB	+2.11 kB (new file)	🆕
`./bundle/oauth2-provider-RPZNJDAR.js`	9.16 kB	+9.16 kB (new file)	🆕

ℹ️ View Unchanged

Filename	Size	Change
`./bundle/bundled/third_party/index.js`	8 MB	0 B
`./bundle/chunk-34MYV7JD.js`	2.45 kB	0 B
`./bundle/chunk-5AUYMPVF.js`	858 B	0 B
`./bundle/chunk-5PS3AYFU.js`	1.18 kB	0 B
`./bundle/chunk-664ZODQF.js`	124 kB	0 B
`./bundle/chunk-DAHVX5MI.js`	206 kB	0 B
`./bundle/chunk-IUUIT4SU.js`	56.5 kB	0 B
`./bundle/chunk-RJTRUG2J.js`	39.8 kB	0 B
`./bundle/chunk-VJSUVOZ4.js`	1.97 MB	0 B
`./bundle/cleanup-O7FPPK6O.js`	0 B	-932 B (removed)	🏆
`./bundle/devtools-36NN55EP.js`	696 kB	0 B
`./bundle/dist-T73EYRDX.js`	356 B	0 B
`./bundle/events-XB7DADIJ.js`	418 B	0 B
`./bundle/examples/hooks/scripts/on-start.js`	188 B	0 B
`./bundle/examples/mcp-server/example.js`	1.43 kB	0 B
`./bundle/gemini.js`	5.1 kB	0 B
`./bundle/getMachineId-bsd-TXG52NKR.js`	1.55 kB	0 B
`./bundle/getMachineId-darwin-7OE4DDZ6.js`	1.55 kB	0 B
`./bundle/getMachineId-linux-SHIFKOOX.js`	1.34 kB	0 B
`./bundle/getMachineId-unsupported-5U5DOEYY.js`	1.06 kB	0 B
`./bundle/getMachineId-win-6KLLGOI4.js`	1.72 kB	0 B
`./bundle/memoryDiscovery-NGHTMHWQ.js`	980 B	0 B
`./bundle/multipart-parser-KPBZEGQU.js`	11.7 kB	0 B
`./bundle/node_modules/@google/gemini-cli-devtools/dist/client/main.js`	222 kB	0 B
`./bundle/node_modules/@google/gemini-cli-devtools/dist/src/_client-assets.js`	229 kB	0 B
`./bundle/node_modules/@google/gemini-cli-devtools/dist/src/index.js`	13.4 kB	0 B
`./bundle/node_modules/@google/gemini-cli-devtools/dist/src/types.js`	132 B	0 B
`./bundle/sandbox-macos-permissive-open.sb`	890 B	0 B
`./bundle/sandbox-macos-permissive-proxied.sb`	1.31 kB	0 B
`./bundle/sandbox-macos-restrictive-open.sb`	3.36 kB	0 B
`./bundle/sandbox-macos-restrictive-proxied.sb`	3.56 kB	0 B
`./bundle/sandbox-macos-strict-open.sb`	4.82 kB	0 B
`./bundle/sandbox-macos-strict-proxied.sb`	5.02 kB	0 B
`./bundle/src-QVCVGIUX.js`	47 kB	0 B
`./bundle/start-ZKA2MUE3.js`	0 B	-652 B (removed)	🏆
`./bundle/tree-sitter-7U6MW5PS.js`	274 kB	0 B
`./bundle/tree-sitter-bash-34ZGLXVX.js`	1.84 MB	0 B
`./bundle/cleanup-LKHVZU2K.js`	932 B	+932 B (new file)	🆕
`./bundle/start-BNGOVM37.js`	652 B	+652 B (new file)	🆕

compressed-size-action

gemini-code-assist · 2026-05-01T00:00:14Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the Gemini CLI Bot's decision-making capabilities by refining its system prompts. The changes focus on improving how the bot identifies and handles architectural conflicts within the repository, ensuring that it differentiates between redundant workflows that should be consolidated and complementary systems that should be preserved. This update aims to prevent the bot from making superficial optimizations that ignore underlying systemic conflicts.

Highlights

Architectural Conflict Detection: Updated system prompts to require the bot to actively search for and evaluate overlapping systems before proposing optimizations.
Nuanced Resolution Strategy: Instructed the bot to distinguish between contradictory and complementary workflows, favoring consolidation for conflicts rather than naive adjustments.
Reviewer Checklist Update: Added an explicit 'Architectural Conflict' check to the critique process to prevent the bot from treating symptoms of deeper structural issues.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, and Terms of Service, and learn how to configure Gemini Code Assist in GitHub. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request updates the bot's internal guidelines across common.md, critique.md, and metrics.md to better address architectural conflicts and redundant systems. Key changes include defining the removal of obsolete workflows as a surgical fix, adding a critique rule to reject symptom-only fixes for architectural issues, and requiring the identification of overlapping systems during policy evaluation. Feedback was provided to refine the language in common.md, suggesting that workflows should be deleted only when redundant or in direct conflict with other systems, rather than based on vague 'standard practices', to avoid the risk of deleting valid legacy code.

Note: Security Review has been skipped due to the limited scope of the PR.

gemini-code-assist · 2026-05-01T00:03:22Z

+    delete files or workflows if your evidence shows they are conflicting with
+    standard practices.


The phrase "conflicting with standard practices" is vague and potentially risky for an automated agent. It could lead to the deletion of valid workflows that simply use non-standard patterns or legacy styles that are still intended. Given the PR's focus on architectural conflicts, it is safer to instruct the bot to delete files only when they are redundant or in direct conflict with other systems, as specified in the other prompt updates in this PR. Maintaining consistency in documentation is crucial to avoid contradictions across the repository.

Suggested change

-    delete files or workflows if your evidence shows they are conflicting with
-    standard practices.
+    delete files or workflows if your evidence shows they are redundant or
+    conflicting with other systems.

References

Maintain consistency in documentation. When information about a feature is present in multiple documents, ensure all instances are updated or removed together to avoid contradictions.

- BerriAI/litellm#26969: tool-permission guardrail tightening (merge-after-nits)
- BerriAI/litellm#26967: VCR Redis observability (merge-as-is)
- google-gemini/gemini-cli#26303: brain/critique role split + iteration (needs-discussion)
- google-gemini/gemini-cli#26287: voice transcription cursor-position insert (merge-after-nits)
- google-gemini/gemini-cli#26274: ssh:// extension install scheme (merge-as-is)

Fixes the throughput metrics script and introduces new visibility into backlog bottlenecks and priority distribution.

Changes

- Throughput Fixes: Resolved a `ReferenceError` where `isMaintainer` was not correctly scoped, fixed a malformed license header, and added a new metric for `issue_arrival_rate_per_day` to enable growth-vs-closure analysis.
- Backlog Bottlenecks: Introduced `bottlenecks.ts` to identify "Zombie" issues (no activity > 30 days) and "Hot" issues (high activity).
- Priority Distribution: Introduced `priority_distribution.ts` to track the count of open issues by priority level (P0-P3).

Impact

These metrics will provide the necessary data to confirm if the repository is experiencing systemic backlog growth (Arrival Rate > Throughput) and help identify which segments of the backlog require urgent triage.
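The two checks named in this commit message, the "Zombie" threshold and the Arrival Rate > Throughput comparison, can be illustrated with a short sketch. It is written in Python for brevity and is not the code in `bottlenecks.ts`; the function names and parameters are hypothetical, and the only values taken from the text are the 30-day inactivity threshold and the direction of the comparison.

```python
from datetime import datetime, timedelta

# From the commit message: no activity for more than 30 days.
ZOMBIE_THRESHOLD = timedelta(days=30)


def is_zombie(last_activity: datetime, now: datetime) -> bool:
    """True when an issue has been inactive for more than 30 days."""
    return now - last_activity > ZOMBIE_THRESHOLD


def backlog_is_growing(opened: int, closed: int, window_days: int) -> bool:
    """True when issues arrive faster than they are closed."""
    arrival_rate = opened / window_days  # issues opened per day
    throughput = closed / window_days    # issues closed per day
    return arrival_rate > throughput
```

Both rates are divided by the same window so they are directly comparable; how the real scripts choose the window is not stated in the source.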

…ardcoded main

…edback loops

This update hardens the bot's reasoning and validation layers to stop thrashing and ensure technical quality:
- Mandates local validation (lint, build, test) in Brain and Critique prompts.
- Uncaps bottleneck metrics (zombie issues, priority distribution) to 1000 items.
- Enhances PR awareness to handle multiple bot identities and exclude release PRs.
- Formally defines closed (unmerged) PRs as explicit user rejection signals.
- Strengthens domain rotation and anti-pigeonholing enforcement.

…g CI

This update resolves the bot's persistent focus on already-completed tasks:
- Moves and syncs lessons-learned.md to tools/gemini-cli-bot/ to ensure persistent memory.
- Marks metrics fixes, prompt hardenings, and user rejection signals as DONE in the ledger.
- Implements the CI matrix optimization (Node 20.x for PRs) the bot was re-attempting.
- This forces the bot to rotate to a new domain in the next run by satisfying its current goals.

- Mandate the use of `gh run view` for empirical log verification rather than static code inspection.
- Update interactive mode prompt to allow the agent to retain task context and run the unblocking protocol when following up on its own PRs.
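As an illustration of empirical log verification, the sketch below wraps `gh run view <run-id> --log-failed`, which prints the logs of the failed steps in a workflow run. The helper names are hypothetical, and the wrapper assumes the GitHub CLI is installed and authenticated; it is one possible way to script the check, not the bot's implementation.

```python
import subprocess
from typing import List


def failed_log_command(run_id: str) -> List[str]:
    """Build the gh invocation that prints logs for failed steps only."""
    return ["gh", "run", "view", run_id, "--log-failed"]


def fetch_failed_logs(run_id: str) -> str:
    """Run the command and return its output; raises if gh exits non-zero."""
    result = subprocess.run(
        failed_log_command(run_id),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Separating command construction from execution lets the command be inspected or logged without calling the GitHub API.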

gundermanc force-pushed the bot-prompt-improvements branch from db278fa to 0b87ff1 Compare April 30, 2026 23:59

gundermanc changed the title ~~feat(bot): improve conflict detection in system prompts~~ feat(bot): improve nuanced conflict detection in system prompts Apr 30, 2026

gundermanc marked this pull request as ready for review May 1, 2026 00:00

gundermanc requested a review from a team as a code owner May 1, 2026 00:00

gemini-code-assist Bot reviewed May 1, 2026

gundermanc force-pushed the bot-prompt-improvements branch from 0b87ff1 to d6add1d Compare May 1, 2026 00:04

gundermanc changed the title ~~feat(bot): improve nuanced conflict detection in system prompts~~ feat(bot): improve nuanced conflict detection and validation in system prompts May 1, 2026

github-actions Bot mentioned this pull request May 1, 2026

📊 AI CLI Tools Community Daily Digest 2026-05-01 gsscsd/big_model_radar#275

Open

gundermanc force-pushed the bot-prompt-improvements branch from d6add1d to be65d2f Compare May 1, 2026 03:47

gundermanc requested a review from a team as a code owner May 1, 2026 03:47

gundermanc changed the title ~~feat(bot): improve nuanced conflict detection and validation in system prompts~~ feat(bot): enforce evaluation role and multi-iteration feedback loop May 1, 2026

feat(bot): enforce evaluation role and multi-iteration feedback loop

c6121d5

gundermanc force-pushed the bot-prompt-improvements branch from be65d2f to c6121d5 Compare May 1, 2026 03:51

gundermanc and others added 2 commits May 1, 2026 10:22

fix(bot): prevent publish job from creating PRs for rejected changes

b266912

gundermanc force-pushed the bot-prompt-improvements branch from b93f573 to 381aae2 Compare May 1, 2026 20:52

gundermanc added 4 commits May 1, 2026 14:12

fix(bot): forbid metrics changes and require policy changes only

c18ae0c

fix(bot): enforce defensive scripting and preservation of exemptions

393d72a

feat(bot): increase loop iterations and enforce memory of failures

39de958

test(critique): improve prompt robustness for scale and rate limits

1b021bd

This was referenced May 2, 2026

📊 AI CLI Tools Community Daily Digest 2026-05-02 borq168/big_model_radar#104

Open

📊 AI CLI Tools Digest 2026-05-02 borq168/big_model_radar#107

Open

docs(bot): prevent pigeonholing and thrashing in metrics agent

cd8bfce

github-actions Bot mentioned this pull request May 5, 2026

📊 AI CLI Tools Community Daily Digest 2026-05-05 gsscsd/big_model_radar#295

Open

gundermanc added 3 commits May 4, 2026 19:27

fix(bot): dynamically checkout github.ref in publish job instead of h…

128bba3

…ardcoded main

Merge branch 'main' into bot-prompt-improvements

76c97bf

docs(bot): mandate strict domain rotation to stop workflow thrashing

ec786ae

gundermanc added 4 commits May 5, 2026 08:34

fix(bot): implement pagination for metric scripts to avoid GraphQL limit

8572e60

gundermanc changed the title ~~feat(bot): enforce evaluation role and multi-iteration feedback loop~~ feat(bot): enforce evaluation role, multi-iteration feedback loop, and diagnostic rigor May 5, 2026

github-actions Bot mentioned this pull request May 6, 2026

📊 AI CLI Tools Community Daily Digest 2026-05-06 gsscsd/big_model_radar#300

Open

gundermanc marked this pull request as draft May 6, 2026 20:57

github-actions Bot mentioned this pull request May 7, 2026

📊 AI CLI Tools Community Daily Digest 2026-05-07 gsscsd/big_model_radar#305

Open

gundermanc wants to merge 15 commits into main from bot-prompt-improvements
