
Fix Ollama 404 error and streaming response parsing#3684

Closed
lsby wants to merge 6 commits into microsoft:main from lsby:fix-ollama-models-endpoint

Conversation

lsby (Contributor) commented Feb 12, 2026

Summary

This PR fixes a 404 error when calling Ollama models.

The root causes include:

  1. Incorrect API endpoint being used
  2. Incompatibility between the streaming response parser and Ollama’s response format
[Screenshot of the 404 error]


Environment

Ollama

  • Version: 0.15.5
  • Model: mistral-small3.2:latest (supports vision and tool calling)

VSCode

  • Version: 1.110.0-insider (user setup)
  • Commit: f43307e9e5511fa40b15ff8e3edbd84b225fc8d0

Issue 1: Incorrect API Endpoint

Ollama uses the API path /api/chat (see official documentation), not /chat/completions.

In the current implementation, OllamaLMProvider extends AbstractOpenAICompatibleLMProvider, which constructs the request path based on the OpenAI-compatible protocol. As a result, requests were incorrectly sent to /chat/completions, leading to a 404 error.


I overrode createOpenAIEndPoint in OllamaLMProvider to provide the correct path.
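A minimal sketch of the override described above. The class and method names (`AbstractOpenAICompatibleLMProvider`, `createOpenAIEndPoint`) come from this PR's description, but the constructor shape and signatures here are assumptions for illustration:

```typescript
// Sketch only: the real base class in the codebase may differ.
class AbstractOpenAICompatibleLMProvider {
	constructor(protected readonly baseUrl: string) {}

	// Default OpenAI-compatible path; against Ollama's native API this
	// produced the 404 described above.
	createOpenAIEndPoint(): string {
		return `${this.baseUrl}/chat/completions`;
	}
}

class OllamaLMProvider extends AbstractOpenAICompatibleLMProvider {
	// Override to target Ollama's native chat endpoint instead.
	override createOpenAIEndPoint(): string {
		return `${this.baseUrl}/api/chat`;
	}
}
```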


Issue 2: Incompatible Streaming Response Parsing

Ollama’s streaming responses use JSON Lines format (one JSON object per line).

The current parser does not handle this format, which prevents it from correctly parsing the streamed content.

To resolve this, I implemented a dedicated response parser processOllamaStreamResponse for Ollama.

Since the parser needs to be conditionally applied based on the endpoint type, I introduced a subclass in src/extension/byok/node/ollamaEndpoint to indicate whether the current endpoint targets Ollama.
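For reference, here is a minimal sketch of the JSON Lines accumulation such a parser needs to perform; the actual `processOllamaStreamResponse` in this PR may be structured differently. The key detail is that network chunks can split a line anywhere, so a trailing partial line must be buffered until the next chunk arrives:

```typescript
// Sketch of JSON Lines parsing for Ollama's native streaming format, where
// each line is a standalone JSON object, e.g.:
//   {"message":{"content":"Hel"},"done":false}
class JsonLinesAccumulator {
	private buffer = '';

	// Feed one raw network chunk; returns every complete JSON object received.
	push(chunk: string): unknown[] {
		this.buffer += chunk;
		const lines = this.buffer.split('\n');
		// Last element is '' (chunk ended on a newline) or a partial line:
		// keep it for the next call.
		this.buffer = lines.pop() ?? '';
		return lines
			.filter(line => line.trim().length > 0)
			.map(line => JSON.parse(line));
	}
}
```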

lsby (Contributor, Author) commented Feb 12, 2026

@microsoft-github-policy-service agree

kingeke commented Feb 12, 2026

LGTM

wssccc commented Feb 12, 2026

Ollama has a built-in OAI compatible endpoint /v1/chat/completions, which was working well previously. Why not just fix that path issue?

@zhichli zhichli assigned lszomoru and unassigned zhichli Feb 12, 2026
lsby (Contributor, Author) commented Feb 12, 2026

@wssccc

Thanks for pointing that out. I wasn’t aware that Ollama provides a built-in OpenAI-compatible endpoint at /v1/chat/completions.

After testing, it works as expected. Given that, updating the endpoint path is indeed a cleaner solution and avoids the need for a custom streaming parser.

I’ve revised the commits accordingly to use the compatible endpoint and removed the additional parsing logic. Thanks again for the suggestion.
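To illustrate the approach settled on above, here is a hedged sketch of building a request against Ollama's built-in OpenAI-compatible endpoint. The default port (11434) and the exact payload fields are assumptions based on the standard OpenAI chat completions format; the helper name is hypothetical:

```typescript
// Sketch: request construction for Ollama's OpenAI-compatible endpoint.
// Unlike /api/chat (raw JSON Lines), /v1/chat/completions streams in the
// OpenAI SSE style ("data: {...}" lines), so the existing parser applies.
interface ChatRequest {
	url: string;
	body: string;
}

function buildOpenAICompatRequest(baseUrl: string, model: string, prompt: string): ChatRequest {
	return {
		url: `${baseUrl}/v1/chat/completions`,
		body: JSON.stringify({
			model,
			messages: [{ role: 'user', content: prompt }],
			stream: true,
		}),
	};
}
```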

wrenchpilot left a comment

Tested update that includes the /v1/chat/completions endpoint. LGTM2.

oomek commented Feb 14, 2026

It's taking a while to approve just 6 lines of code :/

MarslMarcello commented

> It's taking a while to approve just 6 lines of code :/

CoPilot needs its time. 😅 🙃

hoppersoft commented
Can someone with write access review this PR, please?

adifyr commented Feb 18, 2026

Now, if y'all could just go ahead and approve this PR right 'ere, the rest of us can move on with our lives. Much obliged...

@vs-code-engineering vs-code-engineering bot added this to the February 2026 milestone Feb 18, 2026
kingeke commented Feb 18, 2026

Just one more person, come on, humanity!!!

@sandy081 sandy081 enabled auto-merge February 19, 2026 16:49
sandy081 (Member) commented

Reopening to try again to merge

@sandy081 sandy081 closed this Feb 19, 2026
auto-merge was automatically disabled February 19, 2026 16:50

Pull request was closed

@sandy081 sandy081 reopened this Feb 19, 2026
@sandy081 sandy081 enabled auto-merge February 19, 2026 16:58
sandy081 (Member) commented

Since this is stuck in approvals, I created another PR to merge this - #3858

@sandy081 sandy081 closed this Feb 19, 2026
auto-merge was automatically disabled February 19, 2026 17:00

Pull request was closed

riverar commented Feb 19, 2026

Thanks @sandy081. Can we talk about why this remained broken for over a week and required back-channel communication? Is there something we can do in the future to better alert folks to critical issues like this?

@grische grische mentioned this pull request Feb 19, 2026
@lsby lsby deleted the fix-ollama-models-endpoint branch February 20, 2026 04:57
riverar commented Feb 20, 2026

@sandy081 Can you confirm this is scheduled for release ~February 28, 2026? Is there a faster hotfix pipeline this could potentially be pushed into?

elemenopyunome commented

The only agent I'm able to use is stuck behind a paywall. Can this not be released sooner than a week from now?
