
fix(vllm): forward response_format to OpenAI-compatible API#4608

Merged
kartik-mem0 merged 1 commit into mem0ai:main from rakheesingh:fix/vllm-response-format on Mar 31, 2026

Conversation

@rakheesingh
Contributor

Summary

  • The vllm LLM provider accepted response_format as a parameter in generate_response() but never forwarded it into the actual API request params
  • This caused JSON parsing failures (Invalid JSON response: Expecting value) during memory extraction/update because the model returned free-form text instead of constrained JSON
  • Fix: add response_format to params, matching the pattern used by openai.py, groq.py, deepseek.py, and all other OpenAI-compatible providers

Closes #4607
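For illustration, a minimal sketch of the described two-line fix. The helper and parameter names here are assumptions for the example; the real change lives in `mem0/llms/vllm.py` inside `generate_response()`.

```python
# Hypothetical sketch of the fix: forward response_format into the
# request params instead of silently dropping it (the bug described above).
def build_request_params(model, messages, response_format=None):
    params = {"model": model, "messages": messages}
    # The two added lines, mirroring the pattern the PR attributes to
    # openai.py and the other OpenAI-compatible providers:
    if response_format:
        params["response_format"] = response_format
    return params
```

With `response_format={"type": "json_object"}` forwarded, an OpenAI-compatible server constrains the completion to valid JSON, which is what memory extraction expects to parse.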

Changes

  • mem0/llms/vllm.py — Added 2 lines to forward response_format into request params
  • tests/llms/test_vllm.py — Added 2 new tests:
    • test_generate_response_with_response_format — verifies response_format is passed to the API
    • test_generate_response_without_response_format — verifies it's omitted when not provided
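The two tests above can be sketched with a mocked client; this provider stub and its names are illustrative assumptions, not the actual mem0 test code.

```python
# Illustrative version of the two new tests: assert that response_format
# reaches the API call when given, and is omitted when not.
from unittest.mock import MagicMock

def generate_response(client, messages, response_format=None):
    params = {"model": "test-model", "messages": messages}
    if response_format:
        params["response_format"] = response_format
    return client.chat.completions.create(**params)

client = MagicMock()

# With response_format: it must appear in the API call kwargs.
generate_response(client, [], response_format={"type": "json_object"})
kwargs = client.chat.completions.create.call_args.kwargs
assert kwargs["response_format"] == {"type": "json_object"}

# Without response_format: it must be omitted entirely.
client.reset_mock()
generate_response(client, [])
kwargs = client.chat.completions.create.call_args.kwargs
assert "response_format" not in kwargs
```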

Test plan

  • All 4 vllm tests pass (2 existing + 2 new)
  • Fix is identical to the pattern in openai.py, lines 134-135

The vllm provider accepted response_format as a parameter but never
passed it into the request params, causing JSON parsing failures during
memory extraction and update when the model returned free-form text
instead of constrained JSON output.

Closes mem0ai#4607

Made-with: Cursor
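The "Expecting value" failure the commit message describes is simply `json.loads` choking on free-form prose; a minimal reproduction (the model output string is invented for the example):

```python
import json

# Without response_format, the model may answer in prose instead of JSON,
# and the memory-extraction parser fails with "Expecting value".
free_form = "Sure! Here are the extracted memories: the user likes tea."
try:
    json.loads(free_form)
except json.JSONDecodeError as e:
    error = str(e)  # begins with "Expecting value"

assert error.startswith("Expecting value")
```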
@CLAassistant

CLAassistant commented Mar 29, 2026

CLA assistant check
All committers have signed the CLA.

@kartik-mem0
Contributor

Thanks for the clean fix @rakheesingh — this looks good! The bug is clearly real, the fix follows the exact pattern used across all other providers, and the tests cover it well. Approving and merging this.

One thing we noticed during review: deepseek.py has the exact same bug — response_format is accepted as a parameter but never forwarded into the API request params. Would you be interested in opening a follow-up PR to fix that as well? Totally understand if not, but since you're already familiar with the pattern, it'd be a quick win! 🙂

@kartik-mem0 kartik-mem0 merged commit dbac832 into mem0ai:main Mar 31, 2026
6 of 7 checks passed
@rakheesingh
Contributor Author

> Would you be interested in opening a follow-up PR to fix that as well?

I'll raise it right now.

rakheesingh added a commit to rakheesingh/mem0 that referenced this pull request Mar 31, 2026
The deepseek provider accepted response_format as a parameter but never
passed it into the request params, causing JSON parsing failures during
memory extraction and update when the model returned free-form text
instead of constrained JSON output.

This is the same bug that was fixed in vllm.py (mem0ai#4608), as noted by
the maintainer during review.

Closes mem0ai#4607

Made-with: Cursor
rakheesingh added a commit to rakheesingh/mem0 that referenced this pull request Apr 3, 2026
The azure_openai provider accepted response_format as a parameter but
never passed it into the request params, causing JSON parsing failures
during memory extraction and update when the model returned free-form
text instead of constrained JSON output.

This is the same bug class fixed in vllm.py (mem0ai#4608) and deepseek.py
(mem0ai#4618), completing the fix across all OpenAI-compatible providers.

Made-with: Cursor


Development

Successfully merging this pull request may close these issues.

vllm provider drops response_format in generate_response, causing JSON parsing failures in memory extraction/update

4 participants