Switch from Copilot Proxy to CAPI #318443
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Replaces the hardcoded ProxyAgenticEndpoint (with model vscode-agentic-search-router-a) used by the search subagent with a new SearchAgentChatEndpoint based on a CAPI-advertised search-agent family model, and gates the SearchSubagent/ExploreSubagent tools on availability of that model.
Changes:
- Introduce `SearchAgentChatEndpoint` extending `CopilotChatEndpoint` with custom token limits (260000 prompt / 16000 output) and a `SEARCH_AGENT_FAMILY` constant.
- Update `SearchSubagentToolCallingLoop` to discover the search-agent endpoint from `endpointProvider.getAllChatEndpoints()` instead of constructing `ProxyAgenticEndpoint`, falling back to the main agent endpoint on failure.
- Gate `SearchSubagent` and `ExploreSubagent` tool availability in `agentIntent` on the presence of a `search-agent` family endpoint.
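The first change in the list above (an endpoint subclass that pins its own token limits) can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `ModelMetadataSketch`, `BaseEndpointSketch`, and `SearchAgentEndpointSketch` are hypothetical stand-ins for the real metadata type, `CopilotChatEndpoint`, and `SearchAgentChatEndpoint`. The string `'search-agent'` is an assumed value for `SEARCH_AGENT_FAMILY`, and the advertised limits in the usage lines are made-up example numbers.

```typescript
// Hypothetical, simplified metadata shape; the real model metadata is richer.
interface ModelMetadataSketch {
    family: string;
    maxPromptTokens: number;
    maxOutputTokens: number;
}

// Assumed value; the PR only states that the constant names the search-agent family.
const SEARCH_AGENT_FAMILY_SKETCH = 'search-agent';

// Stand-in for CopilotChatEndpoint: reports whatever limits the metadata advertises.
class BaseEndpointSketch {
    protected readonly metadata: ModelMetadataSketch;
    constructor(metadata: ModelMetadataSketch) {
        this.metadata = metadata;
    }
    get family(): string { return this.metadata.family; }
    get maxPromptTokens(): number { return this.metadata.maxPromptTokens; }
    get maxOutputTokens(): number { return this.metadata.maxOutputTokens; }
}

// Stand-in for SearchAgentChatEndpoint: same model, but with the fixed
// search-agent limits from the PR overview (260000 prompt / 16000 output).
class SearchAgentEndpointSketch extends BaseEndpointSketch {
    get maxPromptTokens(): number { return 260000; }
    get maxOutputTokens(): number { return 16000; }
}

// Example values only: whatever the service advertises is overridden by the subclass.
const advertised: ModelMetadataSketch = {
    family: SEARCH_AGENT_FAMILY_SKETCH,
    maxPromptTokens: 128000,
    maxOutputTokens: 4096,
};
const searchEndpoint = new SearchAgentEndpointSketch(advertised);
console.log(searchEndpoint.family, searchEndpoint.maxPromptTokens, searchEndpoint.maxOutputTokens);
// prints "search-agent 260000 16000"
```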
Show a summary per file
| File | Description |
|---|---|
| extensions/copilot/src/platform/endpoint/node/searchAgentChatEndpoint.ts | New endpoint class wrapping CopilotChatEndpoint with search-agent-specific token limits. |
| extensions/copilot/src/extension/prompt/node/searchSubagentToolCallingLoop.ts | Switches endpoint resolution from ProxyAgenticEndpoint to CAPI-discovered SearchAgentChatEndpoint with fallback. |
| extensions/copilot/src/extension/intents/node/agentIntent.ts | Adds availability check for the search-agent family to gate subagent tools. |
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 3
    // Use the CAPI search-agent model. Fall back to the main agent endpoint if the model
    // is not available for this user
    try {
        const allEndpoints = await this.endpointProvider.getAllChatEndpoints();
        const searchAgentEndpoint = allEndpoints.find(e => e.family === SEARCH_AGENT_FAMILY);
        if (searchAgentEndpoint instanceof ChatEndpoint) {
            return this.instantiationService.createInstance(SearchAgentChatEndpoint, searchAgentEndpoint.modelMetadata);
        }
        this._logService.warn(`Search-agent model not available in CAPI, falling back to main agent endpoint`);
    } catch (error) {
        this._logService.warn(`Failed to get search-agent endpoint from CAPI, falling back to main agent: ${error}`);
    }
    return await this.endpointProvider.getChatEndpoint(this.options.request);
Keeping both paths; the checks happen at different times: agentIntent.ts decides whether we expose the tool when the request starts, and getEndpoint picks the endpoint at fetch time.
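A rough sketch of the two checks described in this reply, assuming simplified shapes: `ChatEndpointSketch`, `shouldExposeSearchTools`, and `resolveSearchEndpoint` are hypothetical names, the endpoint listing is modeled as a synchronous callback (the real `getAllChatEndpoints()` is asynchronous), and `'search-agent'` is an assumed value for `SEARCH_AGENT_FAMILY`.

```typescript
// Hypothetical minimal endpoint shape; the real IChatEndpoint has many more members.
interface ChatEndpointSketch {
    family: string;
    name: string;
}

const SEARCH_FAMILY = 'search-agent'; // assumed value of SEARCH_AGENT_FAMILY

// Check 1 (request start): expose the subagent tools only if the family is listed.
// A failed listing counts as "not available", similar to `.catch(() => [])` in the PR.
function shouldExposeSearchTools(listEndpoints: () => ChatEndpointSketch[]): boolean {
    let endpoints: ChatEndpointSketch[] = [];
    try {
        endpoints = listEndpoints();
    } catch {
        endpoints = [];
    }
    return endpoints.some(e => e.family === SEARCH_FAMILY);
}

// Check 2 (fetch time): prefer the search-agent endpoint; fall back to the main
// agent endpoint if it is missing from the list or the listing throws.
function resolveSearchEndpoint(
    listEndpoints: () => ChatEndpointSketch[],
    mainEndpoint: ChatEndpointSketch,
): ChatEndpointSketch {
    try {
        const found = listEndpoints().find(e => e.family === SEARCH_FAMILY);
        if (found) {
            return found;
        }
    } catch {
        // Listing failed; fall through to the main agent endpoint.
    }
    return mainEndpoint;
}
```

Because the two functions run at different moments, the model list can change or fail in between, which is why the fetch-time check keeps its own fallback even when the tool was exposed.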
    const allEndpoints = await endpointProvider.getAllChatEndpoints().catch(() => [] as IChatEndpoint[]);
    const searchAgentAvailable = allEndpoints.some(e => e.family === SEARCH_AGENT_FAMILY);
        } catch (error) {
            this._logService.warn(`Failed to get search-agent endpoint from CAPI, falling back to main agent: ${error}`);
        }
        return await this.endpointProvider.getChatEndpoint(this.options.request);
    }
can you elaborate on this? your changes should only apply to search agent, not to the background todo tool
^ just something I noticed, changes should be unrelated to this PR!
Ah yes, thanks for noticing that, I'll be pushing a fix for this
21c046c to dbfc90f
    } catch (error) {
        this._logService.warn(`Failed to get search-agent endpoint from CAPI, falling back to main agent: ${error}`);
    }
    return await this.endpointProvider.getChatEndpoint(this.options.request);
correct me if I'm wrong, but I thought we shouldn't be using the main agent in the search subagent if the model isn't available as it can end up charging the users much more now with UBB enabled. We should throw an error/gracefully exit the search subagent if the model isn't available
If the search subagent model isn't in the CAPI model list, we remove the tool (which is the behavior we agreed upon). If, however, the model is available but there's an error, falling back to the main agent should be ok. The CoGS delta when using the main agent as the search subagent model wasn't statistically significant, IIRC. And this case should happen very rarely, only when the search model is available but erroring.
We could change it to just fail immediately with a message back to the main agent model, but sometimes the model will keep trying to call the tool because it's available, and having it return errors over and over might actually increase costs more than substituting the main agent model for the subagent model so the system still works. I don't think one is clearly better from a CoGS perspective than the other.
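For comparison, the fail-fast alternative mentioned here (returning a message to the main agent model instead of substituting its endpoint) might look like the sketch below. It is illustrative only and not what the PR does: `EndpointInfoSketch`, `SearchResolution`, and `resolveSearchEndpointFailFast` are hypothetical names, the message strings are invented, and `'search-agent'` is an assumed family value.

```typescript
// Hypothetical minimal endpoint shape for illustration.
interface EndpointInfoSketch {
    family: string;
}

// Either a usable endpoint, or a message the tool would hand back to the main agent.
type SearchResolution =
    | { kind: 'endpoint'; endpoint: EndpointInfoSketch }
    | { kind: 'unavailable'; message: string };

// Fail-fast variant: never substitutes the main agent endpoint.
function resolveSearchEndpointFailFast(listEndpoints: () => EndpointInfoSketch[]): SearchResolution {
    try {
        const found = listEndpoints().find(e => e.family === 'search-agent'); // assumed family value
        if (found) {
            return { kind: 'endpoint', endpoint: found };
        }
        return { kind: 'unavailable', message: 'Search subagent model is not available.' };
    } catch (error) {
        return { kind: 'unavailable', message: `Search subagent model lookup failed: ${String(error)}` };
    }
}
```

The trade-off raised in the comment still applies: if the tool stays exposed, the main agent model may call it repeatedly and receive the `unavailable` message each time, so this variant is not obviously cheaper than the fallback.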
FYI I had to revert this via #320208 because it causes CI failures. See https://github.com/microsoft/vscode-engineering/issues/2967

Currently, we use copilot-proxy as a go-between to get access from VS Code to models hosted in Fireworks. This PR covers changes needed to switch to CAPI: