Security: SQL injection in Databricks vector store via f-string interpolation #4073

@lighthousekeeper1212

Security Concern

The Databricks vector store implementation uses f-string interpolation for SQL queries, creating SQL injection vulnerabilities.

Vulnerable Code

File: mem0/vector_stores/databricks.py

# delete() function:
delete_sql = f"DELETE FROM {self.fully_qualified_table_name} WHERE memory_id = '{vector_id}'"

# insert() function:
insert_sql = f"INSERT INTO {self.fully_qualified_table_name} ({', '.join(self.column_names)}) VALUES {', '.join(value_tuples)}"

Comparison with Safe Implementations

Other vector stores in the same codebase correctly use parameterized queries:

  • pgvector: cur.execute(f"DELETE FROM ... WHERE id = %s", (vector_id,))
  • azure_mysql: Uses %s placeholders ✅
  • Databricks: Uses f-string interpolation ❌
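
The difference between the two styles can be reproduced locally. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for a real warehouse, and the `memories` table and helper names are hypothetical, not taken from the mem0 codebase. It uses a payload that widens the WHERE clause rather than one that stacks a second statement.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with two rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE memories (memory_id TEXT)")
    conn.executemany("INSERT INTO memories VALUES (?)", [("a",), ("b",)])
    return conn


def delete_interpolated(conn: sqlite3.Connection, vector_id: str) -> None:
    # Unsafe: the value becomes part of the SQL text itself.
    conn.execute(f"DELETE FROM memories WHERE memory_id = '{vector_id}'")


def delete_parameterized(conn: sqlite3.Connection, vector_id: str) -> None:
    # Safe: the value is bound separately and is only ever treated as data.
    conn.execute(
        "DELETE FROM memories WHERE memory_id = :vector_id",
        {"vector_id": vector_id},
    )


def row_count(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM memories").fetchone()[0]


payload = "x' OR '1'='1"

safe_db = make_db()
delete_parameterized(safe_db, payload)
print(row_count(safe_db))    # 2: no row has that literal id, nothing is deleted

unsafe_db = make_db()
delete_interpolated(unsafe_db, payload)
print(row_count(unsafe_db))  # 0: the WHERE clause became always-true
```

With the bound parameter, the payload is compared as an ordinary string; with interpolation, it rewrites the predicate.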

Impact

A malicious memory_id such as '; DELETE FROM table; -- would produce the following SQL:

DELETE FROM schema.table WHERE memory_id = ''; DELETE FROM table; --'
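
To make the mechanics concrete, here is a minimal sketch of the string construction. `build_delete_sql_unsafe` is a hypothetical helper that mirrors the f-string from delete(); it only builds the text and does not talk to Databricks. The second payload is an extra illustration showing that the WHERE clause can also be widened within a single statement.

```python
def build_delete_sql_unsafe(table_name: str, vector_id: str) -> str:
    """Mirror of the vulnerable pattern: the id is pasted into the SQL text."""
    return f"DELETE FROM {table_name} WHERE memory_id = '{vector_id}'"


# Payload from the issue: closes the quote, appends a statement, comments out the rest.
stacked = build_delete_sql_unsafe("schema.table", "'; DELETE FROM table; --")
print(stacked)
# DELETE FROM schema.table WHERE memory_id = ''; DELETE FROM table; --'

# A payload that stays within one statement but makes the predicate always true.
widened = build_delete_sql_unsafe("schema.table", "x' OR '1'='1")
print(widened)
# DELETE FROM schema.table WHERE memory_id = 'x' OR '1'='1'
```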

Recommended Fix

Use the Databricks Statement Execution API's parameterized queries:

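from databricks.sdk.service.sql import StatementParameterListItem

# The table name is assumed to come from store configuration, not caller input; only the value is bound.
delete_sql = f"DELETE FROM {self.fully_qualified_table_name} WHERE memory_id = :vector_id"
response = self.client.statement_execution.execute_statement(
    statement=delete_sql,
    warehouse_id=self.warehouse_id,
    # The Python SDK expects StatementParameterListItem objects rather than plain dicts.
    parameters=[StatementParameterListItem(name="vector_id", value=vector_id)],
    wait_timeout="30s",
)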
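
The insert() path could follow the same idea. The sketch below is an assumption about how that might look, not the project's implementation: `build_parameterized_insert` is a hypothetical helper that generates one named marker per value and returns the SQL text alongside name/value pairs, which would then be passed as statement parameters. It does not handle type hints or casting for non-string columns such as embeddings.

```python
from typing import Any, Dict, List, Sequence, Tuple


def build_parameterized_insert(
    table_name: str,
    column_names: Sequence[str],
    rows: Sequence[Sequence[Any]],
) -> Tuple[str, List[Dict[str, Any]]]:
    """Build an INSERT with named markers (:p0_0, :p0_1, ...) instead of inline values."""
    if not rows:
        raise ValueError("at least one row is required")

    placeholders: List[str] = []
    params: List[Dict[str, Any]] = []
    for r, row in enumerate(rows):
        if len(row) != len(column_names):
            raise ValueError(f"row {r} has {len(row)} values, expected {len(column_names)}")
        names = [f"p{r}_{c}" for c in range(len(row))]
        # One "(:p0_0, :p0_1)" group per row; the values never enter the SQL text.
        placeholders.append("(" + ", ".join(f":{n}" for n in names) + ")")
        params.extend({"name": n, "value": v} for n, v in zip(names, row))

    # Table and column names are assumed to come from trusted configuration.
    sql = (
        f"INSERT INTO {table_name} ({', '.join(column_names)}) "
        f"VALUES {', '.join(placeholders)}"
    )
    return sql, params


sql, params = build_parameterized_insert(
    "schema.table",
    ["memory_id", "payload"],
    [["id-1", "it's fine"], ["'; DELETE FROM table; --", "x"]],
)
print(sql)
# INSERT INTO schema.table (memory_id, payload) VALUES (:p0_0, :p0_1), (:p1_0, :p1_1)
```

Because quotes in the values never reach the statement text, the malicious id in the second row is stored as an ordinary string.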

Note

I attempted to use GitHub's private vulnerability reporting but it appears to be disabled. Consider enabling it at Settings → Code security → Private vulnerability reporting.


Discovered during security audit by Lighthouse Research Project (https://lighthouse1212.com)

Labels: P0-critical, bug, security
