feat(vector-store): Add Databricks Mosaic AI vector store support #3325

Merged
parshvadaftari merged 14 commits into mem0ai:main from hayescode:feat/mem0-databricks-vector-search
Aug 18, 2025

Conversation

@hayescode
Contributor

@hayescode hayescode commented Aug 15, 2025

Description

Add support for the Databricks Mosaic AI vector store to mem0. Databricks Mosaic AI provides a mature, high-performance vector store built for enterprise scale. This implementation gives users an additional vector store option alongside existing ones such as Qdrant and Redis.

This contribution adds:

Databricks connector with full vector operations (insert, search, delete, update, get)
Comprehensive configuration management
Authentication options for service principal and personal access tokens (PAT)
Support for both vector store types: STANDARD and STORAGE-OPTIMIZED
Complete documentation and examples
Extensive unit tests
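
As a rough illustration of the two authentication options and the two endpoint types listed above, the sketch below validates a configuration before any connection is attempted. This is not the code from this PR: the class `DatabricksAuthSketch`, its field names, and the `STORAGE_OPTIMIZED` spelling are assumptions made for the example, so check the merged documentation for the actual configuration keys.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed spellings of the two endpoint types named in the description.
ENDPOINT_TYPES = {"STANDARD", "STORAGE_OPTIMIZED"}


@dataclass
class DatabricksAuthSketch:
    """Hypothetical config holder; field names are illustrative only."""

    workspace_url: str
    access_token: Optional[str] = None   # personal access token (PAT)
    client_id: Optional[str] = None      # service principal ID
    client_secret: Optional[str] = None  # service principal secret
    endpoint_type: str = "STANDARD"

    def __post_init__(self) -> None:
        # Accept "storage-optimized" as well as "STORAGE_OPTIMIZED".
        self.endpoint_type = self.endpoint_type.upper().replace("-", "_")
        if self.endpoint_type not in ENDPOINT_TYPES:
            raise ValueError(f"unknown endpoint type: {self.endpoint_type}")

    def auth_mode(self) -> str:
        """Return which credential set is in use, rejecting ambiguous input."""
        has_pat = bool(self.access_token)
        has_sp = bool(self.client_id and self.client_secret)
        if has_pat and has_sp:
            raise ValueError("provide a PAT or a service principal, not both")
        if has_pat:
            return "pat"
        if has_sp:
            return "service_principal"
        raise ValueError("no complete credentials provided")
```

Failing early on ambiguous or missing credentials keeps authentication errors out of the first vector operation, where they would be harder to diagnose.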

closes #2501

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (does not change functionality, e.g. code style improvements, linting)
  • Documentation update

How Has This Been Tested?

Ran locally following CONTRIBUTING.md guidelines.

  • [x] Unit Test

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Maintainer Checklist

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Made sure Checks passed

@hayescode hayescode marked this pull request as ready for review August 15, 2025 19:04
Contributor

@parshvadaftari parshvadaftari left a comment

Please incorporate the requested changes.

@parshvadaftari
Contributor

@hayescode The tests are failing. Use make format and make lint, it will help you out.

@hayescode hayescode marked this pull request as draft August 15, 2025 20:28
@hayescode
Contributor Author

> @hayescode The tests are failing. Use make format and make lint, it will help you out.

@parshvadaftari I just discovered that the databricks-sdk library can handle both the vector index functionality and the underlying data management, which would make this much easier and more sustainable. I've moved this back to DRAFT while I do this, and I'll address your comments too. I'll let you know when it's ready again.

@parshvadaftari
Contributor

> @parshvadaftari I just discovered that the databricks-sdk library can handle both the vector index functionality and the underlying data management, which would make this much easier and more sustainable. I've moved this back to DRAFT while I do this, and I'll address your comments too. I'll let you know when it's ready again.

No worries, let me know when it's ready.

@hayescode
Contributor Author

@parshvadaftari Databricks can compute embeddings itself (this is the most common index in Databricks). The main Memory client seems to require embeddings. Will this be able to work? Are there other vector stores that compute embeddings themselves?

Databricks is unique (I'm used to Azure AI Search) in that there's an underlying table with all of the data, then another index built on top of that.
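
To make the table-plus-index layout described above concrete, here is a toy, in-memory sketch. It does not use the Databricks API: `IndexSketch`, `toy_embed`, and the `text`/`embedding` column names are hypothetical stand-ins. With an `embed_fn`, the index computes vectors from the text column itself (the "Databricks computes embeddings" case); without one, it reads precomputed vectors that the caller, such as mem0's embedder, wrote to the table.

```python
import math
from typing import Callable, Dict, List, Optional

Vector = List[float]


def toy_embed(text: str) -> Vector:
    """Stand-in for a managed embedding model: counts of 'a', 'b', 'c'."""
    return [float(text.count(ch)) for ch in "abc"]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v))
    return sum(x * y for x, y in zip(u, v)) / norm if norm else 0.0


class IndexSketch:
    """Index derived from a source table (row id -> row dict)."""

    def __init__(
        self,
        table: Dict[str, dict],
        embed_fn: Optional[Callable[[str], Vector]] = None,
    ) -> None:
        self.table = table
        self.embed_fn = embed_fn  # None means embeddings are precomputed
        self.vectors: Dict[str, Vector] = {}

    def sync(self) -> None:
        """Rebuild the index entries from the current table rows."""
        self.vectors = {
            row_id: self.embed_fn(row["text"]) if self.embed_fn else row["embedding"]
            for row_id, row in self.table.items()
        }

    def search(self, query: Vector, k: int = 1) -> List[str]:
        """Return the ids of the k rows most similar to the query vector."""
        ranked = sorted(
            self.vectors,
            key=lambda rid: cosine(query, self.vectors[rid]),
            reverse=True,
        )
        return ranked[:k]
```

In this sketch, writes go to the table and only become searchable after `sync()`, which mirrors the separation between the underlying data and the index built on top of it.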

@hayescode hayescode marked this pull request as ready for review August 16, 2025 07:35
@hayescode
Contributor Author

hayescode commented Aug 16, 2025

@parshvadaftari this is ready now. I converted this to use databricks-sdk instead of databricks-vectorsearch, since we also need non-vector-store operations like creating tables, getting the warehouse_id, etc. This should be a much better and more sustainable solution.

I couldn't get make format or make lint to run on my machine. I will also be away for a few days, so I enabled "Allow Edits by Maintainers". Please feel free to take this from here.

One oddity I noticed while making this, which I've also seen with Azure AI Search, is this mem0migrations collection name. I don't know why it is here, but I always have to override it again. Should this be refactored? It's unrelated to this PR so I didn't change it, but if we don't need any "migration", it should probably be removed or changed. Related: #2948

self.config.vector_store.config.collection_name = "mem0migrations"

@parshvadaftari
Contributor

@hayescode Good catch! This one doesn’t impact functionality — it’s mainly there for internal consistency and to keep things aligned across environments. Nothing to worry about on your end.

Contributor

@parshvadaftari parshvadaftari left a comment

Looks good to me.

@parshvadaftari parshvadaftari merged commit 3a1eff4 into mem0ai:main Aug 18, 2025
6 of 7 checks passed
@parshvadaftari
Contributor

@hayescode Thanks for contributing!

@hayescode hayescode deleted the feat/mem0-databricks-vector-search branch August 18, 2025 17:10
@hayescode hayescode mentioned this pull request Sep 3, 2025
17 tasks
jamebobob pushed a commit to jamebobob/mem0-vigil-recall that referenced this pull request Mar 29, 2026
Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support Databricks Mosaic AI Vector Search

2 participants