feat(lessons): add message pattern-based mode detection for Phase 3 #722
Conversation
Implements Phase 3 autonomous configuration using message patterns instead of no_confirm flag, as per Erik's feedback on PR #706.

**Key Changes**:
- Mode detection via message pattern analysis (_detect_conversation_mode)
  - Interactive mode: >= 30% user messages
  - Autonomous mode: < 30% user messages
- Message content extraction (_extract_message_content_for_mode)
  - Interactive: Last user message only
  - Autonomous: Combined user + assistant messages
- Mode-specific configuration:
  - GPTME_LESSONS_AUTO_INCLUDE_AUTONOMOUS (default: true)
  - GPTME_LESSONS_MAX_INCLUDED_AUTONOMOUS (default: 3)
- Fixed MatchContext usage (message + tools_used)
- Fixed return values ([] instead of None)
- Added error handling for invalid config values
- Added comprehensive test suite

**Tests**: 26/26 passing
- Mode detection tests
- Keyword extraction tests
- Integration tests
- Tool spec tests

Closes #686 (Phase 3)
Greptile Overview
Greptile Summary
This PR implements Phase 3 of the lesson system enhancement, introducing intelligent mode detection based on message patterns rather than configuration flags. The system now automatically distinguishes between interactive mode (regular user-assistant conversation with ≥30% user messages) and autonomous mode (mostly assistant self-prompting with <30% user messages) by analyzing the most recent 20 messages in a conversation.
The key behavioral change is in content extraction: interactive mode continues using only the last user message for lesson matching (preserving existing behavior), while autonomous mode combines recent user and assistant messages to capture fuller context of ongoing work. This addresses the core issue that autonomous agents need visibility into their own previous actions and reasoning to trigger appropriate lessons effectively.
The implementation adds new environment variables for autonomous mode configuration (GPTME_LESSONS_AUTO_INCLUDE_AUTONOMOUS and GPTME_LESSONS_MAX_INCLUDED_AUTONOMOUS) while maintaining backward compatibility. The system includes proper error handling, logging, and comprehensive test coverage to ensure reliable operation across both conversation modes.
Changed Files
| Filename | Score | Overview |
|---|---|---|
| tests/test_lessons_mode_detection.py | 5/5 | New comprehensive test suite covering mode detection and keyword extraction for both interactive and autonomous conversation modes |
| gptme/tools/lessons.py | 4/5 | Major refactoring implementing pattern-based mode detection with new functions for analyzing message ratios and extracting content differently per mode |
Confidence score: 4/5
- This PR introduces well-designed functionality with comprehensive testing and addresses a clear need for autonomous agent support
- Score reflects some complexity in the mode detection logic and potential edge cases around the 30% threshold that could benefit from real-world validation
- Pay close attention to the mode detection threshold logic in `_detect_conversation_mode()` and the message content extraction differences between modes
Sequence Diagram
```mermaid
sequenceDiagram
    participant User
    participant ChatLoop as Chat Loop
    participant AutoIncludeHook as Auto Include Hook
    participant ModeDetector as Mode Detector
    participant MessageExtractor as Message Extractor
    participant LessonMatcher as Lesson Matcher
    participant LessonIndex as Lesson Index
    User->>ChatLoop: "Send message"
    ChatLoop->>AutoIncludeHook: "trigger MESSAGE_PRE_PROCESS hook"
    AutoIncludeHook->>ModeDetector: "_detect_conversation_mode(log)"
    ModeDetector->>ModeDetector: "analyze last 20 messages"
    ModeDetector->>ModeDetector: "calculate user message ratio"
    ModeDetector-->>AutoIncludeHook: "return mode (autonomous/interactive)"
    AutoIncludeHook->>AutoIncludeHook: "get mode-specific config"
    AutoIncludeHook->>MessageExtractor: "_extract_message_content_for_mode(log, mode)"
    alt Interactive Mode (>= 30% user messages)
        MessageExtractor->>MessageExtractor: "extract from last user message only"
    else Autonomous Mode (< 30% user messages)
        MessageExtractor->>MessageExtractor: "combine recent user + assistant messages"
    end
    MessageExtractor-->>AutoIncludeHook: "return message content"
    AutoIncludeHook->>AutoIncludeHook: "_extract_recent_tools(log)"
    AutoIncludeHook->>LessonIndex: "_get_lesson_index()"
    LessonIndex-->>AutoIncludeHook: "return lesson index"
    AutoIncludeHook->>LessonMatcher: "match(lessons, MatchContext)"
    LessonMatcher->>LessonMatcher: "match by keywords and tools"
    LessonMatcher-->>AutoIncludeHook: "return match results"
    AutoIncludeHook->>AutoIncludeHook: "filter already included lessons"
    AutoIncludeHook->>AutoIncludeHook: "limit to max_lessons (3 autonomous, 5 interactive)"
    AutoIncludeHook->>AutoIncludeHook: "format lessons as system message"
    AutoIncludeHook-->>ChatLoop: "return [lesson_message] or []"
    ChatLoop->>ChatLoop: "prepend lesson messages to log"
    ChatLoop->>User: "process message with lessons context"
```
Context used:
- Context from dashboard - README.md file (source)
2 files reviewed, 1 comment
gptme/tools/lessons.py
Outdated
```python
# Tool specification
def auto_include_lessons_hook(
    log: list[Message], workspace: str | None = None, **kwargs
) -> list[Message] | None:
```
syntax: Return type annotation shows list[Message] | None but the function consistently returns [] (empty list) instead of None when disabled or no matches found.
Prompt To Fix With AI
This is a comment left during a code review.
Path: gptme/tools/lessons.py
Line: 291:291
Comment:
**syntax:** Return type annotation shows `list[Message] | None` but the function consistently returns `[]` (empty list) instead of `None` when disabled or no matches found.
How can I resolve this? If you propose a fix, please make it concise.
It used to be Generator[Message, None, None] which also works fine, not sure why it was changed.
You're right! Looking at the code, the function consistently returns lists of Message objects (empty list [] or [lesson_msg]), never None. The return type annotation should match the actual behavior.
Since you mentioned Generator[Message, None, None] also worked fine, that suggests the function might have been designed as a generator originally. However, looking at the current implementation, it's clearly returning lists, so the type annotation should be list[Message] (removing the | None part).
```diff
- ) -> list[Message] | None:
+ ) -> list[Message]:
```
The function never actually returns None - it returns [] when disabled, when no lessons are found, or when errors occur. This makes the return type consistently list[Message].
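The reviewer's point can be shown with a small sketch in which every code path yields a list, so the annotation needs no `| None`. The hook body here is illustrative only, not gptme's actual implementation:

```python
def auto_include_lessons_hook(log: list[str], enabled: bool = True) -> list[str]:
    """Every path returns a list, so callers can iterate without None checks."""
    if not enabled:
        return []  # disabled: empty list rather than None
    matches = [msg for msg in log if "lesson" in msg.lower()]
    if not matches:
        return []  # no matches: still a list, never None
    return matches[:5]  # cap at the configured maximum
```

With this shape, test code can call `list(auto_include_lessons_hook(log))` directly, without the `or []` fallback discussed later in the thread.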
Caution
Changes requested ❌
Reviewed everything up to 5761e96 in 1 minute and 47 seconds. Click for details.
- Reviewed 524 lines of code in 2 files
- Skipped 0 files when reviewing.
- Skipped posting 2 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. gptme/tools/lessons.py:172
- Draft comment: Consider using regex for word splitting to handle punctuation more robustly.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
2. tests/test_lessons_mode_detection.py:12
- Draft comment: Good coverage of mode detection edge cases.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
Workflow ID: wflow_De90kbiDkVLoD95B
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
gptme/tools/lessons.py
Outdated
```python
if not matches:
    logger.debug("No matching lessons found")
    return
if total_count == 0:
```
Redundant check 'if total_count == 0' since 'if not recent_messages' already handles empty lists.
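A minimal reproduction of the redundancy flagged here (variable names are assumptions): once the empty-list guard has returned, `total_count` can never be zero, so a second `if total_count == 0` check would be dead code.

```python
def user_message_ratio(recent_messages: list[str]) -> float:
    """recent_messages: a list of roles, e.g. ["user", "assistant", ...]."""
    if not recent_messages:      # handles the empty case...
        return 0.0
    total_count = len(recent_messages)
    # ...so an `if total_count == 0` check here would be unreachable.
    return recent_messages.count("user") / total_count
```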
Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!
Fixed 12 typecheck errors by adding 'or []' to handle None returns:
- 9 errors in tests/test_lessons_tool.py
- 3 errors in tests/test_lessons_integration.py

The auto_include_lessons_hook function returns list[Message] | None, but tests were calling list() directly on the result without handling None. This fix adds 'or []' to provide a default empty list when None is returned.

All tests continue to pass with the fix applied.
Caution
Changes requested ❌
Reviewed 7d5b897 in 1 minute and 57 seconds. Click for details.
- Reviewed 116 lines of code in 2 files
- Skipped 0 files when reviewing.
- Skipped posting 1 draft comment. View it below.
1. tests/test_lessons_tool.py:73
- Draft comment: Prefer ensuring auto_include_lessons_hook always returns a list to avoid needing 'or []' in tests.
- Reason this comment was not posted: Marked as duplicate.
Workflow ID: wflow_TWism6Dp5x3qvFtt
```diff
  mock_cfg.get_env.return_value = "5"

- messages = list(auto_include_lessons_hook(log))
+ messages = list(auto_include_lessons_hook(log) or [])
```
Consider having auto_include_lessons_hook consistently return a list (empty if no messages) so that tests don’t need to use 'or []' as a fallback.
…ditions

The lesson message pattern detection feature increased the short prompt size from ~3000 to 3163 tokens. Updated the test assertion to allow up to 3200 tokens while still catching excessive growth.

Fixes failing CI on PR #722.
Important
Looks good to me! 👍
Reviewed 261478a in 30 seconds. Click for details.
- Reviewed 14 lines of code in 1 file
- Skipped 0 files when reviewing.
- Skipped posting 1 draft comment. View it below.
1. tests/test_prompts.py:27
- Draft comment: The upper bound in the token count check has been updated to 3200 as expected for the new autonomous mode feature. Ensure this value remains in sync with the actual behavior in production.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the PR author to ensure that the value remains in sync with production behavior, which violates the rule against asking the author to ensure behavior is intended or tested. It doesn't provide a specific code suggestion or ask for a specific test to be written.
Workflow ID: wflow_gbmlk09cCml9orF4
Phase 3 Autonomous Mode Configuration - Complete and Ready for Review

This PR implements Phase 3 of the lesson system enhancement (#686) using message pattern-based mode detection as requested by Erik.

Implementation Summary

Mode Detection (addresses Erik's feedback on closed PR #706):
Keyword Extraction:
Configuration:

```shell
GPTME_LESSONS_AUTO_INCLUDE_AUTONOMOUS=true  # Enable for autonomous mode
GPTME_LESSONS_MAX_INCLUDED_AUTONOMOUS=3     # Max lessons (vs 5 for interactive)
```

Test Coverage
Key Insight

This correctly implements Erik's vision: in autonomous mode, lessons now trigger on assistant messages, allowing the agent to get relevant context based on its own actions.

Ready for human review! 🚀
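The matching step referenced throughout the thread (lessons matched by keywords and tools via a MatchContext carrying `message` and `tools_used`) could be sketched roughly as follows. The field names mirror the PR description, but the class and matching rule are assumptions, not gptme's actual code:

```python
from dataclasses import dataclass, field


@dataclass
class MatchContext:
    message: str
    tools_used: set[str] = field(default_factory=set)


def lesson_matches(keywords: list[str], tools: set[str], ctx: MatchContext) -> bool:
    """A lesson fires if any of its keywords appear in the message text,
    or if any of its declared tools were used in the conversation."""
    text = ctx.message.lower()
    keyword_hit = any(kw.lower() in text for kw in keywords)
    tool_hit = bool(tools & ctx.tools_used)
    return keyword_hit or tool_hit
```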
gptme/tools/lessons.py
Outdated
Lessons are automatically included based on:
- Keywords from recent messages (in interactive mode: user messages only,
  in autonomous mode: both user and assistant messages)
- Tools used in the conversation

Lessons are matched by keywords and tools, with a limit on the number included.

```yaml
---
match:
  keywords: [patch, file, editing]
---
# Lesson Title

Lesson content...
```

Configuration:
- GPTME_LESSONS_AUTO_INCLUDE: Enable/disable auto-inclusion (default: true)
- GPTME_LESSONS_MAX_INCLUDED: Max lessons in interactive mode (default: 5)
- GPTME_LESSONS_AUTO_INCLUDE_AUTONOMOUS: Enable/disable in autonomous mode (default: true)
- GPTME_LESSONS_MAX_INCLUDED_AUTONOMOUS: Max lessons in autonomous mode (default: 3)

Lessons are automatically included when their keywords match the conversation context.
Conversation mode is automatically detected:
- Interactive: Regular back-and-forth (>= 30% user messages)
- Autonomous: Few user messages (< 30% user messages)
This seems/reads like documentation-like instructions, not agent instructions for how the lesson system works, how it will help the agent, and how the agent can improve itself by leveraging it.
Removes mode detection complexity and makes the system work uniformly:
- Always extract keywords from both user and assistant messages
- Remove mode-specific configuration (GPTME_LESSONS_AUTO_INCLUDE_AUTONOMOUS, GPTME_LESSONS_MAX_INCLUDED_AUTONOMOUS)
- Use single configuration (max 5 lessons for all modes)
- Change return type to Generator[Message, None, None] (as it was before)
- Update tool instructions to be agent-focused rather than documentation-like
- Delete mode detection test file

Addresses Erik's review feedback:
- Line 291: Return type restored to Generator
- Line 338: Removed autonomous-specific config, always check both user and assistant messages
- Line 428: Tool instructions rewritten for agent guidance

All 123 tests passing.
Simplified per Erik's feedback

I've addressed all three review comments and significantly simplified the implementation.

Changes Made
Result
The system now works exactly as Erik requested in line 338: "make the whole system work the same way independently on if it is a autonomous conversation or not (5 lessons max). So check both user and assistant messages for triggers."

Ready for re-review! 🚀
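The simplified, mode-free behavior described above (keywords drawn from both user and assistant messages in every conversation) could be sketched as follows. The tokenization is a guess at "word splitting", not gptme's code, and the function name is illustrative:

```python
from collections import Counter


def extract_keywords(log: list[tuple[str, str]], top_n: int = 10) -> list[str]:
    """log: (role, content) pairs. The same extraction rule applies to
    all conversations; there is no interactive/autonomous distinction."""
    counts = Counter()
    for role, content in log:
        if role in ("user", "assistant"):  # both roles, always
            counts.update(w.lower().strip(".,!?") for w in content.split()
                          if len(w) > 3)
    return [word for word, _ in counts.most_common(top_n)]
```

Because assistant messages count too, lessons can still trigger on an agent's own output even without explicit mode detection, which is what makes the simplification safe for autonomous runs.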
Important
Looks good to me! 👍
Reviewed f0d7fbf in 2 minutes and 10 seconds. Click for details.
- Reviewed 465 lines of code in 3 files
- Skipped 0 files when reviewing.
- Skipped posting 4 draft comments. View those below.
1. gptme/tools/lessons.py:97
- Draft comment: Removed mode detection functions (_detect_conversation_mode, _extract_keywords_for_mode, _extract_message_content_for_mode) contradict the PR summary, which indicated message pattern-based mode detection. Confirm if the new unified extraction is intentional.
- Reason this comment was not posted: Comment looked like it was already resolved.
2. gptme/tools/lessons.py:100
- Draft comment: The new _extract_message_content function aggregates both user and assistant messages without distinguishing interactive vs autonomous modes. Update its docstring or implement mode-specific logic per PR requirements.
- Reason this comment was not posted: Comment looked like it was already resolved.
3. gptme/tools/lessons.py:190
- Draft comment: Auto inclusion now relies solely on GPTME_LESSONS_AUTO_INCLUDE and GPTME_LESSONS_MAX_INCLUDED, removing separate settings for autonomous vs interactive modes. Ensure this change aligns with the intended Phase 3 behavior.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the PR author to ensure that the change aligns with the intended behavior, which violates the rule against asking the author to confirm their intention or ensure behavior. It does not provide a specific code suggestion or ask for a specific test to be written.
4. tests/test_lessons_mode_detection.py:1
- Draft comment: Tests for mode detection were removed. If message pattern-based mode detection is still required as per the PR description, consider retaining or updating these tests.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_QNHFYwtIhFNx3XKU
@TimeToBuildBob You seem to be sleeping on this issue because you are missing the fact that there is a conflict with master. One of your tools/scripts for reading the statuses of GitHub PRs should surface this information so you can act on it (where it checks if PRs are passing checks, we should also check if there are any conflicts). Maybe review all your gh scripts and find the most suitable one to update.
Enhanced PR Status Tooling

@ErikBjare You're absolutely right - I was missing conflict information! I've enhanced

New Features:
Current Status:

Running the enhanced script shows:
Changes Made:
Commit: https://github.com/ErikBjare/gptme-bob/commit/089ab77

Thanks for catching this gap - the enhanced script will help me stay on top of PR status!
- Created docs/lessons.rst with complete documentation
- Added lessons to docs index
- Enhanced docs/lessons/README.md with:
  - Autonomous mode configuration
  - CLI commands reference
  - Best practices for writing lessons
  - Migration guide for existing lessons
- Added lessons feature to main README

Post-merge documentation for PRs #687 and #722.

Co-authored-by: Bob <bob@superuserlabs.org>
* docs: add comprehensive lesson system documentation

  - Created docs/lessons.rst with complete documentation
  - Added lessons to docs index
  - Enhanced docs/lessons/README.md with:
    - Autonomous mode configuration
    - CLI commands reference
    - Best practices for writing lessons
    - Migration guide for existing lessons
  - Added lessons feature to main README

  Post-merge documentation for PRs #687 and #722.

  Co-authored-by: Bob <bob@superuserlabs.org>

* docs: fix lesson system documentation to match implementation

  Remove references to non-existent autonomous mode configuration variables (GPTME_LESSONS_AUTO_INCLUDE_AUTONOMOUS, GPTME_LESSONS_MAX_INCLUDED_AUTONOMOUS) that were removed during refactoring.

  The system now uses unified configuration for all modes:
  - GPTME_LESSONS_AUTO_INCLUDE (default: true)
  - GPTME_LESSONS_MAX_INCLUDED (default: 5)

  Keywords are extracted from both user and assistant messages in all contexts, with no mode detection.

  Fixes inconsistencies identified in Greptile review (PR #771).

  Co-authored-by: Bob <bob@superuserlabs.org>

---------

Co-authored-by: Erik Bjäreholt <erik.bjareholt@gmail.com>
…ptme#722)

* feat(lessons): add message pattern-based mode detection

  Implements Phase 3 autonomous configuration using message patterns instead of no_confirm flag, as per Erik's feedback on PR gptme#706.

  **Key Changes**:
  - Mode detection via message pattern analysis (_detect_conversation_mode)
    - Interactive mode: >= 30% user messages
    - Autonomous mode: < 30% user messages
  - Message content extraction (_extract_message_content_for_mode)
    - Interactive: Last user message only
    - Autonomous: Combined user + assistant messages
  - Mode-specific configuration:
    - GPTME_LESSONS_AUTO_INCLUDE_AUTONOMOUS (default: true)
    - GPTME_LESSONS_MAX_INCLUDED_AUTONOMOUS (default: 3)
  - Fixed MatchContext usage (message + tools_used)
  - Fixed return values ([] instead of None)
  - Added error handling for invalid config values
  - Added comprehensive test suite

  **Tests**: 26/26 passing
  - Mode detection tests
  - Keyword extraction tests
  - Integration tests
  - Tool spec tests

  Closes gptme#686 (Phase 3)

* fix(tests): handle None return from auto_include_lessons_hook in tests

  Fixed 12 typecheck errors by adding 'or []' to handle None returns:
  - 9 errors in tests/test_lessons_tool.py
  - 3 errors in tests/test_lessons_integration.py

  The auto_include_lessons_hook function returns list[Message] | None, but tests were calling list() directly on the result without handling None. This fix adds 'or []' to provide a default empty list when None is returned.

  All tests continue to pass with the fix applied.

* fix(tests): update token count limit for short prompt after lesson additions

  The lesson message pattern detection feature increased the short prompt size from ~3000 to 3163 tokens. Updated the test assertion to allow up to 3200 tokens while still catching excessive growth.

  Fixes failing CI on PR gptme#722.
* refactor: simplify lesson system per Erik's feedback

  Removes mode detection complexity and makes system work uniformly:
  - Always extract keywords from both user and assistant messages
  - Remove mode-specific configuration (GPTME_LESSONS_AUTO_INCLUDE_AUTONOMOUS, GPTME_LESSONS_MAX_INCLUDED_AUTONOMOUS)
  - Use single configuration (max 5 lessons for all modes)
  - Change return type to Generator[Message, None, None] (as it was before)
  - Update tool instructions to be agent-focused rather than documentation-like
  - Delete mode detection test file

  Addresses Erik's review feedback:
  - Line 291: Return type restored to Generator
  - Line 338: Removed autonomous-specific config, always check both user and assistant messages
  - Line 428: Tool instructions rewritten for agent guidance

  All 123 tests passing.

---------

Co-authored-by: Erik Bjäreholt <erik.bjareholt@gmail.com>
* docs: add comprehensive lesson system documentation

  - Created docs/lessons.rst with complete documentation
  - Added lessons to docs index
  - Enhanced docs/lessons/README.md with:
    - Autonomous mode configuration
    - CLI commands reference
    - Best practices for writing lessons
    - Migration guide for existing lessons
  - Added lessons feature to main README

  Post-merge documentation for PRs gptme#687 and gptme#722.

  Co-authored-by: Bob <bob@superuserlabs.org>

* docs: fix lesson system documentation to match implementation

  Remove references to non-existent autonomous mode configuration variables (GPTME_LESSONS_AUTO_INCLUDE_AUTONOMOUS, GPTME_LESSONS_MAX_INCLUDED_AUTONOMOUS) that were removed during refactoring.

  The system now uses unified configuration for all modes:
  - GPTME_LESSONS_AUTO_INCLUDE (default: true)
  - GPTME_LESSONS_MAX_INCLUDED (default: 5)

  Keywords are extracted from both user and assistant messages in all contexts, with no mode detection.

  Fixes inconsistencies identified in Greptile review (PR gptme#771).

  Co-authored-by: Bob <bob@superuserlabs.org>

---------

Co-authored-by: Erik Bjäreholt <erik.bjareholt@gmail.com>
Summary
Implements Phase 3 autonomous configuration using message pattern analysis instead of the `no_confirm` flag, addressing Erik's feedback on PR #706.

Problem
PR #706 was closed because it used the `no_confirm` flag for mode detection, but Erik's original intent was to detect mode based on message patterns (user vs assistant message ratio).

Solution
Mode Detection:
Message Content Extraction:
Mode-Specific Configuration:
- `GPTME_LESSONS_AUTO_INCLUDE_AUTONOMOUS` (default: true)
- `GPTME_LESSONS_MAX_INCLUDED_AUTONOMOUS` (default: 3 lessons)

Implementation Details
New Functions:
- `_detect_conversation_mode()`: Analyzes message patterns to detect mode
- `_extract_message_content_for_mode()`: Extracts content based on mode
- `_extract_keywords_for_mode()`: Kept for tests (extracts keywords)

Fixes:
- `MatchContext` usage (`message` + `tools_used` parameters)
- Return values (`[]` instead of `None`)
- Error handling for invalid `max_lessons` config values

Testing
All 26 tests passing:
Related
Important
Introduces message pattern-based mode detection and updates lesson auto-inclusion logic in `gptme/tools/lessons.py`, with comprehensive testing and configuration options.
- `GPTME_LESSONS_AUTO_INCLUDE_AUTONOMOUS` and `GPTME_LESSONS_MAX_INCLUDED_AUTONOMOUS` settings
- `_detect_conversation_mode()` and `_extract_message_content_for_mode()`
- `auto_include_lessons_hook()` updated to use new message extraction logic
- `MatchContext` usage and return values in `gptme/tools/lessons.py`
- Error handling for invalid `max_lessons` values
- Updates to `tests/test_lessons_integration.py`, `tests/test_lessons_tool.py`, and `tests/test_prompts.py`

This description was created by
for f0d7fbf. You can customize this summary. It will automatically update as commits are pushed.