Skip to content

.Net: Align Chroma memory store behavior with other stores#2215

Merged
lemillermicrosoft merged 13 commits intomicrosoft:mainfrom
jsboige:chroma
Aug 10, 2023
Merged

.Net: Align Chroma memory store behavior with other stores#2215
lemillermicrosoft merged 13 commits intomicrosoft:mainfrom
jsboige:chroma

Conversation

@jsboige
Copy link
Contributor

@jsboige jsboige commented Jul 28, 2023

Motivation and Context

This is an attempt to account for the following PR.
Chroma logs an error when calling DoesCollectionExistAsync on a non existing collection, it should log at debug level instead
Also Volatile Store and QDrant return an empty collection when calling GetNearestMatchesAsync whereas Chroma throws an exception. This PR aligns behavior with other stores.

Description

Align chroma store behavior with other common memory stores

Contribution Checklist

@jsboige jsboige requested a review from a team as a code owner July 28, 2023 12:34
@shawncal shawncal added .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel memory connector labels Jul 28, 2023
@shawncal shawncal changed the title Align Chroma memory store behavior with other stores .Net: Align Chroma memory store behavior with other stores Jul 28, 2023
Copy link
Member

@dmytrostruk dmytrostruk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsboige thank you for contribution.

Based on our conversation in this PR #1989 and my comment:

Regarding alignment between different connectors - we should definitely do that at some point. And I think that for Search operation:

  • If collection exists but it's empty - we should return empty collection.
  • If collection doesn't exist - we should throw an exception to the caller, because it will clearly describe that something is wrong with memory connector setup.

In this case it will be easier to differentiate two scenarios and react on them appropriately.

For example, in Qdrant, memory record operations like Get, Upsert, Remove will throw exception in case you try to perform operation in non-existent collection. I don't see any reasons for Search operation to behave in different way and return empty result instead.

In case we make an agreement on this, then we should leave Chroma implementation as it is and align other connectors which don't follow this pattern.

Please let me know what you think about it.

@dmytrostruk dmytrostruk added the PR: feedback to address Waiting for PR owner to address comments/questions label Jul 28, 2023
Copy link
Member

@dmytrostruk dmytrostruk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsboige please see my comment above.

@jsboige
Copy link
Contributor Author

jsboige commented Jul 31, 2023

@dmytrostruk OK I'm happy either :

  • to revert the change in GetNearestMatchesAsync,, only keep the log event type change in this PR, leaving the distinct memory providers non aligned in their behavior concerning NearestMatch for missing collections.
  • to revert this one to throwing exceptions and also align the Volatile, Qdrant and potentially other providers to throw exceptions systematically in GetNearestMatchesAsync if the collection does not exist, but that will break the Copilot App until my other PR is merged, and potentially other apps that use the same pattern of create-on-insert / Search regardless of existence, so this is a breaking change that should be advertised as such, whether it is considered in this PR or postponed.
  • to keep the PR where the behavior is aligned to be more lenient allowing GetNearestMatchesAsync to return empty when collection does not exist yet. That was my choice because I think that this is the currently expected behavior, and it would make sense for all the providers to change in sync with the appropriate warning for better coherence.

Can you make a firm decision?

@dmytrostruk
Copy link
Member

@jsboige Please use this option for now:

  • to revert the change in GetNearestMatchesAsync,, only keep the log event type change in this PR, leaving the distinct memory providers non aligned in their behavior concerning NearestMatch for missing collections.

We will focus on aligning every connector to follow the same flow, I created issue for that - #2269

The preferred way would be to throw an exception when collection doesn't exist.

Thank you!

@jsboige
Copy link
Contributor Author

jsboige commented Aug 2, 2023

@dmytrostruk I reverted the over reach to only keep the logger update

Copy link
Member

@dmytrostruk dmytrostruk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsboige Thanks a lot!

@dmytrostruk dmytrostruk added PR: ready to merge PR has been approved by all reviewers, and is ready to merge. and removed PR: feedback to address Waiting for PR owner to address comments/questions labels Aug 3, 2023
@lemillermicrosoft lemillermicrosoft added this pull request to the merge queue Aug 10, 2023
Merged via the queue into microsoft:main with commit 508f4e8 Aug 10, 2023
madsbolaris pushed a commit to madsbolaris/semantic-kernel that referenced this pull request Aug 11, 2023
…#2215)

### Motivation and Context

<!-- Thank you for your contribution to the semantic-kernel repo!
Please help reviewers and future users, providing the following
information:
  1. Why is this change required?
  2. What problem does it solve?
  3. What scenario does it contribute to?
  4. If it fixes an open issue, please link to the issue here.
-->
This is an attempt to account for the [following
PR](microsoft#1989 (comment)).
Chroma logs an error when calling DoesCollectionExistAsync on a non
existing collection, it should log at debug level instead
Also Volatile Store and QDrant return an empty collection when calling
GetNearestMatchesAsync whereas Chroma throws an exception. This PR
aligns behavior with other stores.

### Description

<!-- Describe your changes, the overall approach, the underlying design.
These notes will help understanding how your code works. Thanks! -->
Align chroma store behavior with other common memory stores


### Contribution Checklist

<!-- Before submitting this PR, please make sure: -->

- [ ] The code builds clean without any errors or warnings
- [ ] The PR follows the [SK Contribution
Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [ ] All unit tests pass, and I have added new tests where possible
- [ ] I didn't break anyone 😄

---------

Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
Co-authored-by: Lee Miller <lemiller@microsoft.com>
SOE-YoungS pushed a commit to SOE-YoungS/semantic-kernel that referenced this pull request Nov 1, 2023
…#2215)

### Motivation and Context

<!-- Thank you for your contribution to the semantic-kernel repo!
Please help reviewers and future users, providing the following
information:
  1. Why is this change required?
  2. What problem does it solve?
  3. What scenario does it contribute to?
  4. If it fixes an open issue, please link to the issue here.
-->
This is an attempt to account for the [following
PR](microsoft#1989 (comment)).
Chroma logs an error when calling DoesCollectionExistAsync on a non
existing collection, it should log at debug level instead
Also Volatile Store and QDrant return an empty collection when calling
GetNearestMatchesAsync whereas Chroma throws an exception. This PR
aligns behavior with other stores.

### Description

<!-- Describe your changes, the overall approach, the underlying design.
These notes will help understanding how your code works. Thanks! -->
Align chroma store behavior with other common memory stores


### Contribution Checklist

<!-- Before submitting this PR, please make sure: -->

- [ ] The code builds clean without any errors or warnings
- [ ] The PR follows the [SK Contribution
Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [ ] All unit tests pass, and I have added new tests where possible
- [ ] I didn't break anyone 😄

---------

Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
Co-authored-by: Lee Miller <lemiller@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kernel Issues or pull requests impacting the core kernel memory connector .NET Issue or Pull requests regarding .NET code PR: ready to merge PR has been approved by all reviewers, and is ready to merge.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants