feat(anthropic): add native web search support #723
Greptile Overview
Greptile Summary
This PR implements Anthropic Claude's native web search functionality to address broken web search capabilities in gptme. The traditional browser-based search through Google and DuckDuckGo has been failing due to bot detection (issue #492), so this change introduces Claude's built-in web search as an alternative solution.
The implementation adds a new _create_web_search_tool() function in gptme/llm/llm_anthropic.py that creates an Anthropic web search tool definition, and modifies _prepare_messages_for_api() to conditionally include this tool based on environment variables. The feature is controlled by GPTME_ANTHROPIC_WEB_SEARCH (enable/disable) and GPTME_ANTHROPIC_WEB_SEARCH_MAX_USES (max search cycles, default: 5).
This change integrates seamlessly with gptme's existing architecture by leveraging Anthropic's native API capabilities rather than adding new tool implementations. When enabled, Claude automatically decides when to use web search based on user queries, providing more reliable search results than the current scraping-based approach.
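For readers skimming the summary, the gating described above can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code: `create_web_search_tool` and `with_web_search` are hypothetical names, the `env` parameter exists only to make the sketch testable, and the accepted truthy values other than "yes" are an assumption (only "yes" is visible in the reviewed diff).

```python
import os
from typing import Any, Dict, List, Mapping, Optional

# Assumption: only "yes" is visible in the reviewed diff; "1" and "true" are guesses.
TRUTHY = ("1", "true", "yes")


def create_web_search_tool(max_uses: int = 5) -> Dict[str, Any]:
    """Return a web search tool definition using the type/name/max_uses shape."""
    return {"type": "web_search_20250305", "name": "web_search", "max_uses": max_uses}


def with_web_search(
    tools: Optional[List[Dict[str, Any]]],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[List[Dict[str, Any]]]:
    """Append the web search tool when GPTME_ANTHROPIC_WEB_SEARCH is truthy."""
    env = os.environ if env is None else env
    if env.get("GPTME_ANTHROPIC_WEB_SEARCH", "").lower() not in TRUTHY:
        return tools  # feature is opt-in: leave the tool list untouched
    max_uses = int(env.get("GPTME_ANTHROPIC_WEB_SEARCH_MAX_USES", "5"))
    result = list(tools) if tools is not None else []
    result.append(create_web_search_tool(max_uses=max_uses))
    return result
```

Because the tool is executed on the API side, nothing else needs to be registered locally; the definition is simply passed along with the other tools.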
Important Files Changed
Changed Files
| Filename | Score | Overview |
|---|---|---|
| gptme/llm/llm_anthropic.py | 4/5 | Implements core native web search functionality with environment variable configuration |
| tests/test_llm_anthropic.py | 4/5 | Adds comprehensive test coverage for web search feature including enabled, disabled, and integration scenarios |
| gptme/tools/browser.py | 5/5 | Documentation-only update explaining the new provider native search capability with usage examples |
Confidence score: 4/5
- This PR is safe to merge with low risk as it provides a valuable alternative to broken search functionality
- Score reflects well-structured implementation with good test coverage, but minor concerns about external API dependency and environment variable validation could be improved
- Pay close attention to gptme/llm/llm_anthropic.py for potential error handling around environment variables and API integration
Sequence Diagram
sequenceDiagram
participant User
participant CLI as gptme CLI
participant Anthropic as llm_anthropic.py
participant API as Anthropic API
User->>CLI: "gptme -m anthropic/claude-sonnet-4-5"
CLI->>Anthropic: init(config)
Anthropic->>Anthropic: Check GPTME_ANTHROPIC_WEB_SEARCH env var
User->>CLI: "What's the current weather in Tokyo?"
CLI->>Anthropic: chat(messages, model, tools)
Anthropic->>Anthropic: _prepare_messages_for_api(messages, tools)
Anthropic->>Anthropic: Check GPTME_ANTHROPIC_WEB_SEARCH="true"
Anthropic->>Anthropic: _create_web_search_tool(max_uses=5)
Anthropic->>Anthropic: Add web_search tool to tools_dict
Anthropic->>Anthropic: Log "Anthropic native web search enabled"
Anthropic->>API: messages.create() with web_search tool
API->>API: Claude decides to use web search
API->>API: Perform web search for Tokyo weather
API-->>Anthropic: Response with search results
Anthropic->>Anthropic: Parse response blocks (text, tool_use)
Anthropic-->>CLI: Formatted response with weather data
CLI-->>User: "Current weather in Tokyo is..."
Context used:
- Context from dashboard: README.md file
3 files reviewed, 2 comments
        "yes",
    )
    if web_search_enabled:
        max_uses = int(os.environ.get("GPTME_ANTHROPIC_WEB_SEARCH_MAX_USES", "5"))
style: Consider adding input validation to ensure max_uses is within reasonable bounds (e.g., 1-20) to prevent potential API rate limiting issues.
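One way the suggested bounds check could look is sketched below. The helper name `clamp_max_uses` is hypothetical, and the 1-20 range is taken from the example in the comment above, not from any documented API limit.

```python
def clamp_max_uses(value: int, low: int = 1, high: int = 20) -> int:
    """Clamp max_uses into [low, high]; the 1-20 default range is illustrative only."""
    return max(low, min(high, value))
```

Clamping silently keeps the feature usable with an out-of-range setting; raising an error instead would be the stricter alternative.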
        web_search_tool = _create_web_search_tool(max_uses=max_uses)
        if tools_dict is None:
            tools_dict = []
        tools_dict.append(web_search_tool)  # type: ignore
style: The type ignore comment suggests a type mismatch. Consider using explicit typing to ensure web_search_tool matches the expected anthropic.types.ToolParam type.
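A possible shape for the explicit typing suggested here is sketched below. `WebSearchToolParam`, `make_web_search_tool`, and `append_tool` are hypothetical names introduced for illustration; the fields mirror the type/name/max_uses structure discussed in this PR. If the installed SDK version ships its own param type for this tool, using that would likely be preferable.

```python
from typing import Any, Dict, List, Literal, TypedDict, Union


class WebSearchToolParam(TypedDict):
    """Hypothetical local type for the web search tool definition."""

    type: Literal["web_search_20250305"]
    name: Literal["web_search"]
    max_uses: int


# A tool list may mix ordinary tool dicts with the web search definition.
AnyTool = Union[Dict[str, Any], WebSearchToolParam]


def make_web_search_tool(max_uses: int = 5) -> WebSearchToolParam:
    return {"type": "web_search_20250305", "name": "web_search", "max_uses": max_uses}


def append_tool(tools: List[AnyTool], tool: AnyTool) -> List[AnyTool]:
    """Return a new list with the tool appended, without needing a type: ignore."""
    return [*tools, tool]
```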
Important
Looks good to me! 👍
Reviewed everything up to 5038e89 in 49 seconds.
- Reviewed 181 lines of code in 3 files
- Skipped 0 files when reviewing.
- Skipped posting 2 draft comments.
1. gptme/llm/llm_anthropic.py:548
- Draft comment: Consider adding error handling around converting GPTME_ANTHROPIC_WEB_SEARCH_MAX_USES to int. If the environment variable contains a non-integer value, a graceful error message would improve robustness.
- Reason this comment was not posted: Confidence changes required: 50% <= threshold 50%
2. gptme/llm/llm_anthropic.py:552
- Draft comment: Consider checking if a web search tool (type 'web_search_20250305') is already present in tools_dict before appending it. This would prevent potential duplication if the tool is passed manually.
- Reason this comment was not posted: Confidence changes required: 40% <= threshold 50%
Workflow ID: wflow_bByrdO8PHNl3Gnxf
Codecov Report
✅ All modified and coverable lines are covered by tests.
Important
Looks good to me! 👍
Reviewed 389fca3 in 43 seconds.
- Reviewed 24 lines of code in 1 file
- Skipped 0 files when reviewing.
- Skipped posting 3 draft comments.
1. tests/test_llm_anthropic.py:276
- Draft comment: Consider replacing the type ignore with an explicit cast to the expected type (e.g., using typing.cast) for increased type safety.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
2. tests/test_llm_anthropic.py:278
- Draft comment: Instead of using '# type: ignore[typeddict-item]' inline, consider using a cast to properly annotate the dict type.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
3. tests/test_llm_anthropic.py:344
- Draft comment: The type ignore comment on the 'max_uses' field is acceptable, but consider using a cast if feasible for clarity.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
Workflow ID: wflow_KxWJA8nigc7axphQ
…search
- Add try/except for invalid GPTME_ANTHROPIC_WEB_SEARCH_MAX_USES values
- Add duplicate tool check to prevent multiple web search tools
- Add tests for both improvements
- Addresses automated review feedback from PR #723
Changes:
- Error handling: Falls back to default max_uses=5 if env var is invalid
- Duplicate prevention: Checks tools_dict for existing web_search_20250305
- Test coverage: Added test_web_search_invalid_max_uses and test_web_search_no_duplicate_tools
- All 5 web search tests passing
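The two changes in this commit message can be sketched as below. This is an illustration under assumptions, not the committed code: `parse_max_uses` and `add_web_search_once` are hypothetical helpers, the `env` parameter is added only for testability, and the warning text is invented.

```python
import logging
import os
from typing import Any, Dict, List, Mapping, Optional

logger = logging.getLogger(__name__)

DEFAULT_MAX_USES = 5
WEB_SEARCH_TYPE = "web_search_20250305"


def parse_max_uses(env: Optional[Mapping[str, str]] = None) -> int:
    """Read GPTME_ANTHROPIC_WEB_SEARCH_MAX_USES, falling back to the default if invalid."""
    env = os.environ if env is None else env
    raw = env.get("GPTME_ANTHROPIC_WEB_SEARCH_MAX_USES", str(DEFAULT_MAX_USES))
    try:
        return int(raw)
    except ValueError:
        logger.warning("Invalid max_uses value %r, using %d", raw, DEFAULT_MAX_USES)
        return DEFAULT_MAX_USES


def add_web_search_once(
    tools: Optional[List[Dict[str, Any]]], max_uses: int
) -> List[Dict[str, Any]]:
    """Append the web search tool unless one with the same type is already present."""
    result = list(tools or [])
    if not any(tool.get("type") == WEB_SEARCH_TYPE for tool in result):
        result.append({"type": WEB_SEARCH_TYPE, "name": "web_search", "max_uses": max_uses})
    return result
```

Comparing on the `type` field keeps the check cheap and leaves a manually supplied web search tool, including its own max_uses, untouched.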
Improvements Added (2025-10-17)
Addressed the automated review feedback from Ellipsis:
Changes
Testing
Added two new test cases:
All 5 web search tests passing:
Ready for Review
This PR is now ready for human review with:
Important
Looks good to me! 👍
Reviewed de6bc88 in 36 seconds.
- Reviewed 136 lines of code in 2 files
- Skipped 0 files when reviewing.
- Skipped posting 3 draft comments.
1. gptme/llm/llm_anthropic.py:549
- Draft comment: Good error handling for parsing 'max_uses'. Consider extracting the default value (5) into a constant for easier maintenance and consistency across the codebase.
- Reason this comment was not posted: Confidence changes required: 20% <= threshold 50%
2. gptme/llm/llm_anthropic.py:561
- Draft comment: The duplicate check for the web search tool is implemented by comparing the 'type' field. It might help to add a comment explaining why checking just 'type' is sufficient and that no other tool will share this identifier.
- Reason this comment was not posted: Confidence changes required: 10% <= threshold 50%
3. tests/test_llm_anthropic.py:253
- Draft comment: The new tests for enabling/disabling the web search tool, invalid max_uses values, and duplicate tool prevention are comprehensive. Nice work ensuring environment variables are cleaned up using try/finally.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
Workflow ID: wflow_RLmadhQcn7GgF2Ne
de6bc88 to 9335df9
loftybuilder left a comment
Review Summary
Approve ✅ - Clean implementation of Anthropic native web search support.
What's Good
- Clean architecture: _create_web_search_tool() function is well-documented and focused
- Opt-in by default: Feature is disabled unless explicitly enabled via env vars
- Good documentation: Browser tool docs updated with configuration examples
- Solid test coverage: Three test cases cover enabled, disabled, and combined tool scenarios
- CI passing: All checks green
Minor Observations
Looking at the current code:
    max_uses = int(os.environ.get("GPTME_ANTHROPIC_WEB_SEARCH_MAX_USES", "5"))
This could raise ValueError if someone sets an invalid value like GPTME_ANTHROPIC_WEB_SEARCH_MAX_USES=abc. Your PR description mentions adding error handling for this. I see Ellipsis reviewed it positively, but the current diff doesn't show the try/except. Not a blocker since users setting invalid values would see an obvious error, but worth verifying.
Decision
APPROVED - The core implementation is solid and addresses a real pain point (broken browser search). The feature is properly opt-in and well-tested. This provides a valuable workaround for issue #492.
Ready for maintainer merge.
Implements Anthropic Claude native web search to address issue #492 (broken search due to bot detection on Google/DuckDuckGo).
Features:
- Environment variable configuration (GPTME_ANTHROPIC_WEB_SEARCH)
- Configurable max_uses parameter (GPTME_ANTHROPIC_WEB_SEARCH_MAX_USES)
- Seamless integration with existing tools
- Automatic web search: Claude decides when to use it
- Backward compatible (disabled by default)
Implementation:
- Added _create_web_search_tool() to create tool definition
- Modified _prepare_messages_for_api() to conditionally include web search
- Updated browser tool documentation
- Added comprehensive test suite (3 new tests, all passing)
Usage:
    export GPTME_ANTHROPIC_WEB_SEARCH=true
    gptme -m anthropic/claude-sonnet-4-5
Addresses: #492
The web search tool uses Anthropic's API structure (type, name, max_uses) which differs from the SDK's ToolParam TypedDict (name, description, input_schema). This is expected since web search is a newer feature where SDK types may lag API. Fixes typecheck errors at lines 276, 278, and 344.
9335df9 to 27d94fa
CI Investigation Report
Issue: The anthropic test job ran for 2h3m despite a 15-minute timeout, suggesting a test got stuck. The OpenAI test hasn't started yet (showing 'pending' at 0s).
Local Testing: All web search unit tests pass locally:
Analysis: This appears to be a CI infrastructure issue rather than a code problem. The timeout mechanism (…
Recommendation: Re-run CI once the current stuck run completes or is cancelled.
Seems to work, but it fails to handle the response and displays weirdly. When I try it I get this:
When using Anthropic's native web search tool, the streaming API may return TextDelta, ThinkingDelta, or InputJSONDelta with None values. This caused 'NoneType object is not iterable' errors when iterating over the stream. Add null checks before yielding delta values to prevent the error. Fixes issue reported by @ErikBjare in PR review.
✅ Fix Applied
@ErikBjare I've identified and fixed the issue you reported.
Root Cause
When using the native web search tool, the Anthropic streaming API was returning deltas with None values.
Fix
Added null checks before yielding delta values in the streaming code:
    if isinstance(delta, anthropic.types.TextDelta):
        if delta.text is not None:  # Added guard
            yield delta.text
This prevents the 'NoneType object is not iterable' error.
Testing
All web search unit tests pass locally.
Commit: e9ee71f
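To show the guard applied across all three delta kinds named in the commit message, here is a self-contained sketch. The dataclasses are stand-ins that only mimic the relevant attributes of the SDK's delta types, and `iter_delta_text` is a hypothetical helper, so this is not the streaming loop from the PR.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Union


# Stand-ins for the SDK delta types; only the attribute read below is modeled.
@dataclass
class TextDelta:
    text: Optional[str]


@dataclass
class ThinkingDelta:
    thinking: Optional[str]


@dataclass
class InputJSONDelta:
    partial_json: Optional[str]


Delta = Union[TextDelta, ThinkingDelta, InputJSONDelta]


def iter_delta_text(deltas: Iterable[Delta]) -> Iterator[str]:
    """Yield the payload of each delta, skipping any whose value is None."""
    for delta in deltas:
        if isinstance(delta, TextDelta):
            value = delta.text
        elif isinstance(delta, ThinkingDelta):
            value = delta.thinking
        elif isinstance(delta, InputJSONDelta):
            value = delta.partial_json
        else:
            continue  # unknown delta kinds are ignored in this sketch
        if value is not None:
            yield value
```

Collapsing the three checks into a single `value is not None` test is one way to follow the reviewer's hint about refactoring if similar checks are needed elsewhere.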
Important
Looks good to me! 👍
Reviewed e9ee71f in 56 seconds.
- Reviewed 22 lines of code in 1 file
- Skipped 0 files when reviewing.
- Skipped posting 3 draft comments.
1. gptme/llm/llm_anthropic.py:509
- Draft comment: Good addition: checking if delta.text is not None before yielding prevents None outputs. Consider refactoring if similar checks are needed elsewhere.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
2. gptme/llm/llm_anthropic.py:512
- Draft comment: Null check for delta.thinking is properly added to avoid yielding None values.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
3. gptme/llm/llm_anthropic.py:515
- Draft comment: Ensuring delta.partial_json is not None before yielding is a correct safeguard.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
Workflow ID: wflow_5VXpP8nxvnDUiOGK
It works now! Still a bit weird with the query being seen as raw JSON in the output, but I can fix that later.
* feat(anthropic): add web search block handling
  Add handling for server-side tool blocks that occur during Anthropic's native web search:
  - ServerToolUseBlock in content_block_start: Show searching indicator
  - WebSearchToolResultBlock: Handle results block
  - CitationsDelta: Display source URLs from search results
  - ServerToolUseBlock in content_block_stop: Show completion with input
  These blocks are returned by the Anthropic API when using the web_search tool (GPTME_ANTHROPIC_WEB_SEARCH=true) and were previously causing 'Unknown block type' warnings.
  Follow-up to #723, incorporates remaining features from #542.
* fix(anthropic): remove non-existent type checks for web search blocks
  The types ServerToolUseBlock and WebSearchToolResultBlock don't exist in the anthropic Python SDK. Server-side tool use (like web search) comes through as regular ToolUseBlock with specific tool names, not special types.
  This fixes the typecheck error:
  - Module has no attribute 'ServerToolUseBlock'
* fix(anthropic): add null check for citation attribute in CitationsDelta handling
  Addresses potential AttributeError if delta.citation exists but is None. Thanks @greptile-apps for catching this!
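The citation null check from the last commit might look roughly like this. The `Citation` and `CitationsDelta` dataclasses are stand-ins, the `url` and `title` attribute names are assumptions about what a web search citation carries, and the output format of the hypothetical `format_citation` helper is invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


# Stand-ins; attribute names are assumed, not taken from the SDK.
@dataclass
class Citation:
    url: Optional[str] = None
    title: Optional[str] = None


@dataclass
class CitationsDelta:
    citation: Optional[Citation] = None


def format_citation(delta: Any) -> Optional[str]:
    """Return a printable source line, or None when no usable citation is attached."""
    citation = getattr(delta, "citation", None)
    if citation is None:  # attribute missing, or present but None
        return None
    url = getattr(citation, "url", None)
    if not url:
        return None
    title = getattr(citation, "title", None)
    return f"[{title}]({url})" if title else url
```

Using `getattr` with a default covers both failure modes at once: a delta without a `citation` attribute and one where the attribute exists but is None.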
Summary
Implements Anthropic Claude native web search to address issue #492 (broken search due to bot detection on Google/DuckDuckGo).
Fixes #492
Changes
Features
- Environment variable configuration (GPTME_ANTHROPIC_WEB_SEARCH)
- Configurable max_uses parameter (GPTME_ANTHROPIC_WEB_SEARCH_MAX_USES, default: 5)
Implementation Details
Modified Files:
- gptme/llm/llm_anthropic.py:
  - Added _create_web_search_tool() function to create Anthropic web search tool definition
  - Modified _prepare_messages_for_api() to conditionally include web search tool
- gptme/tools/browser.py:
- tests/test_llm_anthropic.py:
Usage
Claude will automatically use its native web search capability to find current information.
Testing
All tests pass:
Future Work
This PR focuses on Anthropic's native search. Future enhancements could include:
Notes
Important
Adds native web search support for Anthropic Claude models, configurable via environment variables, with seamless integration and comprehensive testing.
- GPTME_ANTHROPIC_WEB_SEARCH and GPTME_ANTHROPIC_WEB_SEARCH_MAX_USES.
- gptme/llm/llm_anthropic.py: Adds _create_web_search_tool() and modifies _prepare_messages_for_api() to include web search tool if enabled.
- gptme/tools/browser.py: Updates docstring to document native search feature and configuration examples.
- tests/test_llm_anthropic.py: Adds tests for web search tool functionality, covering enabled, disabled, and combined tool scenarios.
This description was created by Ellipsis for e9ee71f. It will automatically update as commits are pushed.