[feat] add grammar sessions #9156
- Introduced CreateGrammarReqInput and CreateGrammarReqOutput data classes for grammar creation requests and responses.
- Added DeleteGrammarReqInput data class for grammar deletion requests.
- Implemented create_grammar and delete_grammar methods in the Scheduler and TokenizerManager classes to handle grammar operations.
- Updated SamplingParams to include grammar_id for better request handling.
- Enhanced grammar management with error handling and logging for existing grammar IDs and initialization checks.

- Implemented /create_grammar and /delete_grammar API routes for managing grammars.
- Integrated CreateGrammarReqInput and DeleteGrammarReqInput for request handling.
- Enhanced error handling for grammar creation and deletion processes.

- Introduced a new test suite for grammar management, covering creation, deletion, and error handling for grammars.
- Implemented tests for creating grammars with JSON schema and EBNF notation, including scenarios for duplicate IDs and missing content.
- Added tests for generating content using grammars and handling sessions for incremental generation.
- Ensured proper cleanup of grammars and sessions after tests to maintain isolation.

- Updated the grammar dictionary to include an owner request for better session handling.
- Added checks to prevent reusing grammar IDs while an owner request is still active.
- Adjusted the grammar assignment logic to accommodate the new structure.
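The owner-request bookkeeping described in the last commit can be pictured with a small in-memory registry. This is only a sketch under stated assumptions: `GrammarRegistry`, `GrammarEntry`, `owner_rid`, and the method names are hypothetical and are not taken from the PR's Scheduler code; the sketch only illustrates rejecting duplicate IDs and refusing reuse while an owner request is active.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class GrammarEntry:
    """Hypothetical record: a compiled grammar plus the request that owns it."""
    grammar: object                   # opaque compiled grammar object
    owner_rid: Optional[str] = None   # id of the request currently using it


class GrammarRegistry:
    """Hypothetical stand-in for the scheduler's grammar dictionary."""

    def __init__(self) -> None:
        self._entries: Dict[str, GrammarEntry] = {}

    def create(self, grammar_id: str, grammar: object) -> bool:
        # Reject duplicate IDs instead of overwriting an existing grammar.
        if grammar_id in self._entries:
            return False
        self._entries[grammar_id] = GrammarEntry(grammar)
        return True

    def acquire(self, grammar_id: str, rid: str) -> object:
        # Hand the grammar to a request only if no other request owns it.
        entry = self._entries.get(grammar_id)
        if entry is None:
            raise KeyError(f"unknown grammar_id: {grammar_id}")
        if entry.owner_rid is not None:
            raise RuntimeError(f"grammar {grammar_id} is in use by {entry.owner_rid}")
        entry.owner_rid = rid
        return entry.grammar

    def release(self, grammar_id: str, rid: str) -> None:
        # Clear ownership once the owning request has finished.
        entry = self._entries.get(grammar_id)
        if entry is not None and entry.owner_rid == rid:
            entry.owner_rid = None

    def delete(self, grammar_id: str) -> bool:
        # Return False when the ID was never registered.
        return self._entries.pop(grammar_id, None) is not None
```

In this sketch a second request that names the same grammar_id fails fast rather than sharing matcher state with the first; whether the PR rejects, queues, or aborts such a request is not stated here.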
Summary of Changes
This pull request introduces a new feature that enables persistent grammar sessions for multi-turn conversations. This allows users to define and reuse grammars (JSON schema, regex, EBNF) across multiple API requests, ensuring consistent structured output. This is particularly useful for agentic workflows requiring strict output formatting.
Highlights
- New API Endpoints: Added /create_grammar and /delete_grammar HTTP endpoints for managing the lifecycle of grammar sessions.
- Persistent Grammar IDs: Introduced a grammar_id parameter in sampling settings, allowing users to reference and apply previously defined grammars during text generation.
- Backend Grammar Management: Implemented core logic within the scheduler and tokenizer manager to store, retrieve, and validate grammar objects, ensuring efficient reuse and proper error handling for grammar sessions.
- Comprehensive Testing: Included a new dedicated test suite to thoroughly validate the functionality of grammar session creation, usage, deletion, and error scenarios.
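A client-side view of the lifecycle might look like the sketch below. Only the route names /create_grammar and /delete_grammar and the grammar_id sampling parameter come from this PR; the JSON field names (`grammar_id`, `json_schema`, `ebnf`, `regex` in the create body) and both helper functions are assumptions for illustration, so check CreateGrammarReqInput for the actual fields.

```python
import json
from typing import Any, Dict, Optional, Union


def build_create_grammar_payload(
    grammar_id: str,
    *,
    json_schema: Optional[Union[str, Dict[str, Any]]] = None,
    ebnf: Optional[str] = None,
    regex: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a hypothetical body for POST /create_grammar (field names assumed)."""
    if isinstance(json_schema, dict):
        # Serialize dict schemas so the body carries a plain string.
        json_schema = json.dumps(json_schema)
    candidates = {"json_schema": json_schema, "ebnf": ebnf, "regex": regex}
    provided = {k: v for k, v in candidates.items() if v is not None}
    if len(provided) != 1:
        # Mirrors the "missing content" error case mentioned in the tests.
        raise ValueError("provide exactly one of json_schema, ebnf, regex")
    return {"grammar_id": grammar_id, **provided}


def build_generate_payload(
    text: str, grammar_id: str, max_new_tokens: int = 64
) -> Dict[str, Any]:
    """Build a generation body that references a stored grammar by ID."""
    return {
        "text": text,
        "sampling_params": {
            "grammar_id": grammar_id,      # parameter added by this PR
            "max_new_tokens": max_new_tokens,
        },
    }
```

With any HTTP client, the first payload would be posted to /create_grammar once, the second reused for each turn of the conversation, and a body carrying the same grammar_id sent to /delete_grammar during cleanup.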
Code Review
This pull request introduces grammar sessions, allowing for persistent grammars in multi-turn conversations. It adds /create_grammar and /delete_grammar endpoints and a grammar_id sampling parameter. The implementation involves changes in the HTTP server, scheduler, and tokenizer manager to handle the lifecycle of grammars. New tests are added to cover this functionality.
My review found a high-severity bug in the grammar creation logic that could lead to a server crash, and a minor typo in a variable name. I've provided suggestions to fix these issues. Overall, the changes are well-structured and the feature is a valuable addition.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Motivation
Enables persistent grammars for multi-turn conversations in which the same grammar must be enforced across multiple requests. This supports agentic workflows that require consistently formatted structured output.
Modifications
Accuracy Tests
Benchmarking and Profiling
Checklist