Describe the bug
Bug: OpenAIEmbedding fails with OpenAI-compatible proxies because the SDK defaults to encoding_format="base64", which proxies may not support
Description
mem0/embeddings/openai.py line 46 calls self.client.embeddings.create() without passing encoding_format:
# mem0/embeddings/openai.py (v1.0.3), lines 44-49
def embed(self, text, memory_action=None):
    text = text.replace("\n", " ")
    return (
        self.client.embeddings.create(
            input=[text], model=self.config.model, dimensions=self.config.embedding_dims
        )  # <- no encoding_format
        .data[0]
        .embedding
    )
The OpenAI Python SDK (v1.0+) intentionally defaults to encoding_format="base64" when not specified (see openai/resources/embeddings.py lines 107-108):
if not is_given(encoding_format):
    params["encoding_format"] = "base64"
The SDK then decodes the base64 response back to floats client-side. This works fine against api.openai.com, but breaks with OpenAI-compatible proxies (OpenRouter, LiteLLM, vLLM, etc.) that either:
- Don't support encoding_format="base64" at all
- Intermittently return an empty data array when base64 is requested
When the proxy returns empty data, the SDK's parser raises:
ValueError: No embedding data received
(from openai/resources/embeddings.py line 116)
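To make the failure mode concrete, here is a minimal standard-library sketch of the client-side step described above: decode a base64 payload into floats, and raise when the data array is empty. The names decode_embedding and parse_embeddings are hypothetical, and this is a simplified stand-in rather than the SDK's actual parser. It also assumes the payload is packed float32 in the machine's native byte order.

```python
import base64
from array import array


def decode_embedding(payload):
    """Return floats from either a float list or a base64 string of float32 values."""
    if not isinstance(payload, str):
        return list(payload)  # already floats, as with encoding_format="float"
    raw = base64.b64decode(payload)
    # Assumption: 4-byte floats in native byte order (little-endian on x86/ARM).
    return array("f", raw).tolist()


def parse_embeddings(data):
    """Simplified stand-in for the SDK parser: empty data is an error."""
    if not data:
        raise ValueError("No embedding data received")
    return [decode_embedding(item) for item in data]


# Round trip: pack three float32 values the way a server would for base64.
encoded = base64.b64encode(array("f", [0.5, -1.0, 2.0]).tobytes()).decode("ascii")
print(parse_embeddings([encoded]))  # prints [[0.5, -1.0, 2.0]]
```

Calling parse_embeddings([]) reproduces the error message quoted above, which is what a proxy returning an empty data array would trigger in this simplified model.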
Impact
Any mem0 user pointing openai_base_url at a non-OpenAI provider will get intermittent ValueError: No embedding data received errors on memory.add() and memory.search() calls. This is especially problematic because:
- mem0's own docs show openai_base_url configuration for compatible proxies
- The error is intermittent, making it hard to debug
- There's no way to configure encoding_format via BaseEmbedderConfig
Reproduction
from openai import OpenAI

client = OpenAI(
    api_key="sk-or-...",
    base_url="https://openrouter.ai/api/v1"
)

# This mirrors exactly what mem0 does -- no encoding_format
# SDK sends encoding_format="base64" to OpenRouter
# Intermittently raises: ValueError("No embedding data received")
for i in range(10):
    try:
        response = client.embeddings.create(
            input=["Hello world"],
            model="openai/text-embedding-3-small",
            dimensions=1536,
            # No encoding_format -> SDK defaults to "base64"
        )
        print(f"Call {i+1}: OK")
    except ValueError as e:
        print(f"Call {i+1}: FAILED -- {e}")

# With explicit encoding_format="float", all 10 succeed:
for i in range(10):
    response = client.embeddings.create(
        input=["Hello world"],
        model="openai/text-embedding-3-small",
        dimensions=1536,
        encoding_format="float",  # <- fix
    )
    assert len(response.data) == 1
    print(f"Call {i+1}: OK ({len(response.data[0].embedding)} dims)")
Suggested Fix
Option A (minimal): Add encoding_format="float" to the embed() call:
# mem0/embeddings/openai.py
def embed(self, text, memory_action=None):
    text = text.replace("\n", " ")
    return (
        self.client.embeddings.create(
            input=[text],
            model=self.config.model,
            dimensions=self.config.embedding_dims,
            encoding_format="float",  # <- explicit float for proxy compatibility
        )
        .data[0]
        .embedding
    )
Option B (configurable): Add encoding_format to BaseEmbedderConfig so users can choose:
# mem0/configs/embeddings/base.py
class BaseEmbedderConfig(BaseModel):
    ...
    encoding_format: str = "float"  # "float" or "base64"
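For Option B, the new field would also need to be validated and forwarded to the create() call. A rough sketch of that wiring follows, using a hypothetical EmbedderConfigSketch dataclass and build_create_kwargs helper in place of mem0's real config class and embedder.

```python
from dataclasses import dataclass

ALLOWED_ENCODING_FORMATS = ("float", "base64")


@dataclass
class EmbedderConfigSketch:
    """Hypothetical stand-in for BaseEmbedderConfig; not mem0's real class."""

    model: str = "text-embedding-3-small"
    embedding_dims: int = 1536
    encoding_format: str = "float"  # default chosen for proxy compatibility

    def __post_init__(self):
        # Reject typos early instead of letting the provider fail later.
        if self.encoding_format not in ALLOWED_ENCODING_FORMATS:
            raise ValueError(f"unsupported encoding_format: {self.encoding_format!r}")


def build_create_kwargs(config, text):
    """Assemble keyword arguments for client.embeddings.create()."""
    return {
        "input": [text.replace("\n", " ")],
        "model": config.model,
        "dimensions": config.embedding_dims,
        "encoding_format": config.encoding_format,
    }


kwargs = build_create_kwargs(EmbedderConfigSketch(), "Hello\nworld")
print(kwargs["encoding_format"])  # prints float
```

One caveat to verify against the SDK version in use: when encoding_format is passed explicitly, the SDK appears to return the response without decoding it, so a user who selects "base64" would likely receive a base64 string from .embedding and embed() would have to decode it.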
Environment
- mem0ai: 1.0.3
- openai: 2.21.0
- Proxy: OpenRouter (https://openrouter.ai/api/v1)
- Model: openai/text-embedding-3-small
Related Issues