fix(mcp_semantic_tool_filter): match tools with client-side namespace prefix (#26078) by sakenuGOD · Pull Request #26117 · BerriAI/litellm

sakenuGOD · 2026-04-20T16:01:15Z

Re-opening against litellm_oss_branch after the main-branch guard rejected the fork-based PR (#26116 was closed for that reason). Same commit, now on the correct base.

What was broken

SemanticMCPToolFilter._get_tools_by_names matched the canonical name stored in the router (e.g. fc_web_search-firecrawl_scrape) against the incoming tools[].name with plain equality. MCP clients that wrap every canonical name with their own additive namespace prefix — notably opencode (mcp/index.ts registers every tool as <client_alias>_<canonical>) — never matched, the filter returned [], the proxy forwarded tools: [] with tool_choice: \"auto\", and strict OpenAI-compatible upstreams rejected the request:

litellm.BadRequestError: Hosted_vllmException -
{\"error\":{\"message\":\"__all__: Invalid value for 'tool_choice': 'tool_choice' is only allowed when 'tools' are specified.\"}}

This is distinct from:

#22445 (registry-vs-expanded prefix mismatch on proxy side)
#24984 (non-MCP tool scope)

Here the mismatch is a client-side prefix layered on top of the already-prefixed canonical name.

Fix

Anchored suffix matching in _get_tools_by_names with a separator-character check.

A new static helper _name_matches_canonical(client_name, canonical) returns True when:

client_name == canonical, or
client_name.endswith(canonical) and the character immediately preceding the canonical suffix is _ or -.

The separator check is what keeps the match semantically correct — rain_gear is not treated as a namespaced version of canonical ear.

_get_tools_by_names now:

Builds an index of incoming tools by their client-facing name.
For each canonical name returned by the router, tries an exact lookup first and falls back to suffix matching.
Deduplicates: each incoming tool is returned at most once even if two canonical names happen to suffix-match the same incoming tool.
Passes the original tool object through unchanged, so the client-facing name is preserved and tool-call round-trips continue to work on the client side (matching the expected behavior called out in the issue).

Both _ and - are accepted as separators regardless of the proxy's MCP_TOOL_PREFIX_SEPARATOR setting, because the client has no way to know what the proxy is configured with.

Tests

New TestGetToolsByNames class in tests/test_litellm/proxy/_experimental/mcp_server/test_semantic_tool_filter.py:

Test	Scenario
`test_exact_match_unchanged`	Historical path still works
`test_client_prefix_with_underscore_separator`	The exact opencode scenario from #26078
`test_client_prefix_with_dash_separator`	Clients that use dash as alias separator
`test_suffix_without_separator_does_not_match`	`rain_gear` does not match canonical `ear`
`test_exact_match_preferred_over_prefixed`	Exact match wins over suffix match for ordering stability
`test_same_tool_not_returned_twice`	Two canonicals suffix-matching the same tool → one result, not two
`test_ordering_follows_router_output`	Returned tools follow the semantic router's order

pytest tests/test_litellm/proxy/_experimental/mcp_server/test_semantic_tool_filter.py — 14 passed (7 new + 7 pre-existing, none regressed).

Scope

Kept strictly to the matching bug called out in the issue. The hook's behavior when filter_tools legitimately returns [] (router finds nothing relevant) is unchanged and can be addressed separately if maintainers want a broader safeguard against forwarding tools: [] with tool_choice: \"auto\".

a client-side namespace prefix. `SemanticMCPToolFilter._get_tools_by_names` matched by exact equality between the canonical name stored in the router (`<server><MCP_TOOL_PREFIX_SEPARATOR><tool>`) and the name in the incoming `tools[]` list. MCP clients such as opencode wrap every tool name with their own additive alias prefix (`<client_alias>_<canonical>`), so the two never matched, the filter dropped every tool to zero, and the proxy forwarded `tools: []` with `tool_choice: auto` — which strict upstream providers reject with a 400. The fix adds anchored suffix matching with a separator check: the canonical must form the complete tail of the incoming name and be preceded by `_` or `-`. Exact matches still win over suffix matches, incoming tools are returned at most once, and the original tool object is passed through unchanged so the client-facing name survives for tool-call round-trips. Seven unit tests in a new TestGetToolsByNames class cover exact match, underscore- and dash-prefixed variants, non-separator-anchored suffixes (which must not match), exact-wins-over-prefixed precedence, deduplication when two canonicals suffix-match the same incoming tool, and ordering-follows-router-output. Fixes BerriAI#26078

CLAassistant · 2026-04-20T16:01:23Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ sakenuGOD
❌ krrish-berri-2
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

greptile-apps · 2026-04-20T16:03:54Z

Greptile Summary

This PR fixes SemanticMCPToolFilter._get_tools_by_names which was doing plain equality matching between canonical router names (e.g. fc_web_search-firecrawl_scrape) and client-facing tool names, causing MCP clients that add their own namespace prefix (e.g. opencode's litellm_fc_web_search-firecrawl_scrape) to produce an empty tool list and downstream 400 errors. The fix introduces _name_matches_canonical for separator-anchored suffix matching with a gate that prevents spurious collisions with non-MCP local functions, plus a shortest-prefix tie-breaker and id-based deduplication in the matching loop.

Confidence Score: 5/5

Safe to merge; the fix is narrow, well-tested, and all remaining findings are P2 style suggestions.

The core matching logic is correct (separator-anchored suffix, gate on MCP_TOOL_PREFIX_SEPARATOR presence, shortest-prefix tie-breaker, id-based dedup). Previously flagged concerns about the deduplication test and tie-breaker ambiguity have been addressed. The only remaining finding is that one test doesn't fully exercise its stated invariant — not a code defect.

No files require special attention.

Important Files Changed

Filename	Overview
litellm/proxy/_experimental/mcp_server/semantic_tool_filter.py	Adds _name_matches_canonical static helper and rewrites _get_tools_by_names to tolerate client-side namespace prefixes via separator-anchored suffix matching with deduplication and shortest-prefix tie-breaking. Logic is correct and well-documented.
tests/test_litellm/proxy/_experimental/mcp_server/test_semantic_tool_filter.py	Adds 8 new unit tests for _get_tools_by_names; previously-flagged deduplication test now properly uses two distinct canonicals, and a new shortest-prefix test covers the tie-breaker. One test (test_exact_match_preferred_over_prefixed) doesn't fully exercise its stated invariant due to canonical 'search' having no separator.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["_get_tools_by_names(tool_names, available_tools)"] --> B["Build available_by_name index\n(client_name → tool object)"]
    B --> C["For each canonical from router"]
    C --> D{"Exact match?\navailable_by_name.get(canonical)"}
    D -- Yes --> E["tool = exact match"]
    D -- No --> F["Suffix scan:\nfor client_name in available_by_name"]
    F --> G{"_name_matches_canonical\n(client_name, canonical)?"}
    G -- No --> F
    G -- Yes --> H{"len(client_name) < len(best_name)?"}
    H -- Yes --> I["best_name = client_name"]
    H -- No --> F
    I --> F
    F --> J["tool = available_by_name[best_name]"]
    E --> K{"id(tool) in used_ids?"}
    J --> K
    K -- No --> L["append tool\nadd id to used_ids"]
    K -- Yes --> M["skip (dedup)"]
    L --> C
    M --> C

_{Reviews (4): Last reviewed commit: "Merge branch 'litellm_oss_staging_04_22_..." | Re-trigger Greptile}

greptile-apps · 2026-04-20T16:03:57Z

+    def test_same_tool_not_returned_twice(self):
+        """
+        Two canonicals that both suffix-match the same incoming tool
+        must not produce a duplicate in the output list.
+        """
+        filter_instance = self._make_filter()
+        available_tools = [
+            {"name": "srv_read_file", "description": "read"}
+        ]
+
+        matched = filter_instance._get_tools_by_names(
+            ["read_file", "read_file"], available_tools
+        )
+
+        assert len(matched) == 1


Test doesn't exercise its own comment

The docstring says "Two canonicals that both suffix-match the same incoming tool," but the actual call passes ["read_file", "read_file"] — the same canonical twice. That only exercises the used_ids guard for duplicate canonical names, not the case where two distinct canonicals both suffix-match a single incoming tool. A canonical list like ["read_file", "file"] (both suffix-match "srv_read_file") would properly cover the stated scenario and give stronger regression protection.

Suggested change

def test_same_tool_not_returned_twice(self):

"""

Two canonicals that both suffix-match the same incoming tool

must not produce a duplicate in the output list.

"""

filter_instance = self._make_filter()

available_tools = [

{"name": "srv_read_file", "description": "read"}

]

matched = filter_instance._get_tools_by_names(

["read_file", "read_file"], available_tools

)

assert len(matched) == 1

def test_same_tool_not_returned_twice(self):

"""

Two canonicals that both suffix-match the same incoming tool

must not produce a duplicate in the output list.

"""

filter_instance = self._make_filter()

available_tools = [

{"name": "srv_read_file", "description": "read"}

]

matched = filter_instance._get_tools_by_names(

["read_file", "file"], available_tools

)

assert len(matched) == 1

greptile-apps · 2026-04-20T16:03:58Z

+        for canonical in tool_names:
+            tool = available_by_name.get(canonical)
+            if tool is None:
+                for client_name, candidate in available_by_name.items():
+                    if self._name_matches_canonical(client_name, canonical):
+                        tool = candidate
+                        break


First-in-dict-order wins when multiple tools suffix-match the same canonical

When no exact match is found and more than one available tool suffix-matches the same canonical (e.g. "my_tag_search" and "my_search" both matching canonical "search"), the suffix fallback returns whichever tool appears first in available_by_name's insertion order. In the opencode scenario described in the PR this is never ambiguous (one client prefix → one canonical), but if the proxy is ever used with a client that prefixes tools from multiple namespaces the wrong tool could be silently selected. Returning the shortest-prefix match (i.e. len(client_name) is minimised) would be a more principled tie-breaker with negligible overhead.

deduplication regression test (Greptile comments on BerriAI#26117) - test_same_tool_not_returned_twice now passes two distinct canonicals ("read_file" and "file") that both suffix-match the same incoming tool, rather than the same canonical twice, so the assertion actually exercises the used_ids dedup path instead of the duplicate-input-list path. - The suffix fallback in _get_tools_by_names now prefers the shortest incoming name that still qualifies under the separator-anchored match. In the one-prefix-per-client opencode scenario this is a no-op, but in multi-namespace configurations the shortest qualifying name is the least-wrapped one and is the most defensible deterministic choice, replacing the dict-insertion-order fallback. - Adds test_suffix_fallback_prefers_shortest_candidate covering the new tie-breaker directly. Still 15 tests passing locally (was 14).

sakenuGOD · 2026-04-20T16:08:08Z

Thanks for the review @greptile-apps! Both P2 points addressed in 930187e:

1. test_same_tool_not_returned_twice — you were right, passing the same canonical twice only exercised the duplicate-input path. Now passes two distinct canonicals (read_file and file) that both suffix-match srv_read_file, which is what the docstring claims and actually hits the used_ids guard.

2. Suffix-match tie-breaker — swapped dict-insertion-order for shortest-qualifying-name. Rationale is the same as you sketched: the incoming name closest in length to the canonical is the least-wrapped and the most defensible deterministic choice in multi-namespace configs. Added test_suffix_fallback_prefers_shortest_candidate to pin the behavior down.

Local run: 15/15 passing.

krrish-berri-2 · 2026-04-22T03:31:54Z

+                    tool = available_by_name[best_name]
+            if tool is not None and id(tool) not in used_ids:
+                matched.append(tool)
+                used_ids.add(id(tool))


will there be collisions if a local function matches a litellm mcp server tool?

or is this handled since litellm always prefixes it's mcp tools with the server name in front of it?

Good catch — yes, collision was possible on unprefixed canonical names. Addressed in ac7df0e:

Since LiteLLM always registers server-side MCP tools as <server_name><MCP_TOOL_PREFIX_SEPARATOR><tool_name> via add_server_prefix_to_name, a canonical that does not contain MCP_TOOL_PREFIX_SEPARATOR is not a namespaced MCP tool, so I gated the suffix-match fallback on the separator being present in the canonical. Added test_does_not_collide_with_local_function_on_unprefixed_canonical that spuriously matched before the fix and passes after.

The opencode scenario is unaffected because its canonical (fc_web_search-firecrawl_scrape) contains the separator. A few pre-existing TestGetToolsByNames fixtures that used bare canonicals for brevity (get_weather, search, etc.) were switched to realistic server-prefixed canonicals so they continue to exercise the suffix-fallback path under the new guard.

@krrish-berri-2

…CP_TOOL_PREFIX_SEPARATOR @krrish-berri-2 flagged a possible collision in the suffix fallback: a local user function whose name happens to end in a bare canonical substring (e.g. my_firecrawl_scrape vs canonical firecrawl_scrape) would be spuriously selected. Server-registered MCP tools are always emitted as <server_name><MCP_TOOL_PREFIX_SEPARATOR><tool_name> via add_server_prefix_to_name, so a canonical without the separator is not a namespaced MCP tool and does not warrant suffix matching. Added that guard to _name_matches_canonical with a regression test (test_does_not_collide_with_local_function_on_unprefixed_canonical) that reproduces the collision before the fix and is pinned after. Pre-existing TestGetToolsByNames fixtures that relied on bare canonicals (get_weather, search, read_file, write/delete/read) were switched to realistic server-prefixed ones so they continue to exercise the suffix-fallback path under the new guard. The opencode scenario (client prefix on already-server-prefixed canonical) is unchanged.

…ntic-filter-oss

codecov · 2026-04-23T02:09:22Z

Codecov Report

❌ Patch coverage is 94.11765% with 2 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...y/_experimental/mcp_server/semantic_tool_filter.py	94.11%	2 Missing ⚠️

📢 Thoughts on this report? Let us know!

gitguardian · 2026-04-23T02:22:43Z

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard.
Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secret in your pull request

GitGuardian id	GitGuardian status	Secret	Commit	Filename
29203053	Triggered	Generic Password	`f8eb263`	.circleci/config.yml	View secret

🛠 Guidelines to remediate hardcoded secrets

Understand the implications of revoking this secret by investigating where it is used in your code.
Replace and store your secret safely. Learn here the best practices.
Revoke and rotate this secret.
If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider

following these best practices for managing and storing secrets including API keys and other credentials
install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.

^{🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.}

sakenuGOD mentioned this pull request Apr 20, 2026

fix(mcp_semantic_tool_filter): match tools with client-side namespace prefix (#26078) #26116

Closed

greptile-apps Bot reviewed Apr 20, 2026

View reviewed changes

krrish-berri-2 reviewed Apr 22, 2026

View reviewed changes

krrish-berri-2 changed the base branch from litellm_oss_branch to litellm_oss_staging_04_22_2026 April 23, 2026 02:04

Merge branch 'litellm_oss_staging_04_22_2026' into fix/26078-mcp-sema…

f8eb263

…ntic-filter-oss

krrish-berri-2 merged commit 034f4fd into BerriAI:litellm_oss_staging_04_22_2026 Apr 23, 2026
29 of 30 checks passed

dborysenko mentioned this pull request Apr 25, 2026

[Bug]: mcp_semantic_tool_filter drops all tools when client sends MCP tool names with its own unique-ID suffix (e.g. LibreChat) #26507

Open

1 task

Prithvi1994 mentioned this pull request Apr 25, 2026

fix(proxy): handle client-side unique-ID suffixes in MCP semantic tool filter #26533

Open

GopalGB mentioned this pull request Apr 30, 2026

fix(mcp_semantic_tool_filter): resolve client-suffixed tool names (LibreChat-style) #26858

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(mcp_semantic_tool_filter): match tools with client-side namespace prefix (#26078)#26117

fix(mcp_semantic_tool_filter): match tools with client-side namespace prefix (#26078)#26117
krrish-berri-2 merged 4 commits into
BerriAI:litellm_oss_staging_04_22_2026from
sakenuGOD:fix/26078-mcp-semantic-filter-oss

sakenuGOD commented Apr 20, 2026

Uh oh!

CLAassistant commented Apr 20, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented Apr 20, 2026 •

edited

Loading

Important Files Changed

Uh oh!

greptile-apps Bot Apr 20, 2026

Uh oh!

greptile-apps Bot Apr 20, 2026

Uh oh!

sakenuGOD commented Apr 20, 2026

Uh oh!

krrish-berri-2 Apr 22, 2026

Uh oh!

sakenuGOD Apr 22, 2026

Uh oh!

Uh oh!

codecov Bot commented Apr 23, 2026 •

edited

Loading

Uh oh!

gitguardian Bot commented Apr 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

sakenuGOD commented Apr 20, 2026

What was broken

Fix

Tests

Scope

Uh oh!

CLAassistant commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

greptile-apps Bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps Bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

sakenuGOD commented Apr 20, 2026

Uh oh!

krrish-berri-2 Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

sakenuGOD Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

codecov Bot commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

gitguardian Bot commented Apr 23, 2026

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

CLAassistant commented Apr 20, 2026 •

edited

Loading

greptile-apps Bot commented Apr 20, 2026 •

edited

Loading

codecov Bot commented Apr 23, 2026 •

edited

Loading