[Feature]: Expose limit parameter and document query operators in web_search tool

### Problem or Use Case

I worked with my Hermes Agent to improve the web_search tool and wanted to share the changes we made, which I think make it more functional. For context, I am running a local Firecrawl instance with SearXNG as its search backend.

**Motivation**
The current web_search tool only accepts a query string, hardcoding limit=5 and offering no documented way to use search query operators. This means:
- The LLM cannot request more than 5 results without workarounds, which is limiting for research-heavy tasks.
- Query operators like site:, filetype:pdf, intitle:, and -exclude work with most backends but are undocumented, so the LLM never discovers them.
- The tool description is minimal ("Search the web for information on any topic"), giving the LLM no guidance on advanced usage.


### Proposed Solution

### Verified changes (tested on self-hosted Firecrawl + SearXNG backend)
All changes are backend-agnostic — they only expose existing functionality that Firecrawl, Tavily, Exa, and Parallel already support.

### File: tools/web_tools.py

**1. Update WEB_SEARCH_SCHEMA**
```python
# Before
WEB_SEARCH_SCHEMA = {
    "name": "web_search",
    "description": "Search the web for information on any topic. Returns up to 5 relevant results with titles, URLs, and descriptions.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query to look up on the web"
            }
        },
        "required": ["query"]
    }
}

# After
WEB_SEARCH_SCHEMA = {
    "name": "web_search",
    "description": "Search the web for information. Returns results with titles, URLs, and descriptions. Use query operators for targeted filtering: site:domain (restrict to a domain), intitle:word (title must contain), allintitle:word (all words in title), filetype:ext (file type, e.g. filetype:pdf), inurl:word, allinurl:word, related:domain (find similar sites), -term (exclude), \"exact phrase\" (exact match). For large content extraction, use web_extract on returned URLs.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query. Supports operators: site:domain (restrict to domain), intitle:word, allintitle:word, filetype:ext (e.g. filetype:pdf), inurl:word, allinurl:word, related:domain (similar sites), -term (exclude results containing term), \"exact phrase\" (exact match). Example: 'site:arxiv.org LLM fine-tuning' or 'filetype:pdf machine learning survey'"
            },
            "limit": {
                "type": "integer",
                "description": "Maximum number of results to return (default: 10, max: 100). Use higher limits when you need more candidates for downstream extraction.",
                "default": 10
            }
        },
        "required": ["query"]
    }
}
```

**2. Update function signature and docstring**
```python
# Before
def web_search_tool(query: str, limit: int = 5) -> str:

# After
def web_search_tool(query: str, limit: int = 10) -> str:
```

**3. Update the handler lambda**
```python
# Before
handler=lambda args, **kw: web_search_tool(args.get("query", ""), limit=5),

# After
handler=lambda args, **kw: web_search_tool(args.get("query", ""), limit=args.get("limit", 10)),
```
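
The schema description advertises `max: 100`, but the handler above forwards whatever value the LLM supplies. A minimal sketch of one way to enforce the bound before calling the tool is shown below. The helper name `coerce_limit` and the two constants are hypothetical and not part of the existing codebase; the fallback behavior for bad input is an assumption, not something I tested against every backend.

```python
from typing import Any

DEFAULT_LIMIT = 10  # assumed to match the default proposed in the schema
MAX_LIMIT = 100     # assumed to match the "max: 100" stated in the description


def coerce_limit(raw: Any, default: int = DEFAULT_LIMIT, maximum: int = MAX_LIMIT) -> int:
    """Return an integer limit in [1, maximum]; fall back to default on bad input."""
    # bool is a subclass of int, so reject it explicitly instead of treating True as 1.
    if isinstance(raw, bool):
        return default
    try:
        value = int(raw)
    except (TypeError, ValueError, OverflowError):
        # None, non-numeric strings, NaN, and infinities all land here.
        return default
    # Clamp into the advertised range.
    return max(1, min(value, maximum))


# Hypothetical wiring into the handler from step 3:
# handler=lambda args, **kw: web_search_tool(
#     args.get("query", ""), limit=coerce_limit(args.get("limit"))
# ),
```

With this in place, a request for `limit=500` would be capped at 100, and a missing or malformed value would fall back to the default rather than reaching the backend.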

**A few notes:**
- Default limit 5 → 10: This doubles the default number of results, which means a higher credit/token cost per search for users on hosted (cloud-tier) backends. If maintainers prefer cost conservatism, keeping limit=5 as the default while still exposing the parameter is a reasonable alternative.
- Query operators need no backend-specific handling: Operators are passed through as part of the query string. Backends that support them (Firecrawl, Tavily) honor them; backends that don't should treat them as ordinary query text, so documenting them carries little risk. I have only verified this on self-hosted Firecrawl + SearXNG.
- No backend-specific code changes: All four backends (Firecrawl, Tavily, Exa, Parallel) already support limit. This change only wires an existing parameter through to the tool schema and handler.

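To illustrate the pass-through point above, here is a small sketch of what the tool-call arguments could look like once the operators are documented. `build_query` is a hypothetical helper written only for this example; the LLM would normally just write the operator string directly, and nothing like this helper exists in tools/web_tools.py.

```python
from typing import Iterable, Optional


def build_query(
    terms: str,
    site: Optional[str] = None,
    filetype: Optional[str] = None,
    exclude: Iterable[str] = (),
    exact: Optional[str] = None,
) -> str:
    """Compose a plain query string using the operators listed in the schema."""
    parts = [terms.strip()]
    if exact:
        parts.append(f'"{exact}"')          # "exact phrase"
    if site:
        parts.append(f"site:{site}")        # restrict to a domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict to a file type
    parts.extend(f"-{term}" for term in exclude)  # exclude terms
    # Drop empty pieces so a blank `terms` does not leave stray spaces.
    return " ".join(p for p in parts if p)


# Arguments as they would arrive at the handler: the operators are just text.
args = {
    "query": build_query("LLM fine-tuning", site="arxiv.org", exclude=["survey"]),
    "limit": 20,
}
```

The point of the sketch is that no code path needs to understand the operators: the resulting `query` is an ordinary string that is forwarded unchanged to whichever backend is configured.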

### Alternatives Considered

_No response_

### Feature Type

Performance / reliability

### Scope

Small (single file, < 50 lines)

### Contribution

- [ ] I'd like to implement this myself and submit a PR

### Debug Report (optional)

```shell

```

[Feature]: Expose limit parameter and document query operators in web_search tool #16696