Skip to content

.Net: Pack of changes from MEVD API review#11882

Merged
roji merged 7 commits intomicrosoft:feature-vector-data-preb3from
roji:ApiReviewPack
May 5, 2025
Merged

.Net: Pack of changes from MEVD API review#11882
roji merged 7 commits intomicrosoft:feature-vector-data-preb3from
roji:ApiReviewPack

Conversation

@roji
Copy link
Member

@roji roji commented May 4, 2025

Note: this PR is based on top of #11879 - ignore first commit.

This PR contains a pack of several unrelated changes from API review; each is quite simple on its own, so I submitted them as a single PR. I recommend reviewing this commit-by-commit.

@roji roji requested a review from westey-m May 4, 2025 14:59
@roji roji requested a review from a team as a code owner May 4, 2025 14:59
@markwallace-microsoft markwallace-microsoft added .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel memory labels May 4, 2025
@github-actions github-actions bot changed the title Pack of changes from MEVD API review .Net: Pack of changes from MEVD API review May 4, 2025
@roji roji temporarily deployed to integration May 4, 2025 15:00 — with GitHub Actions Inactive
@roji roji force-pushed the ApiReviewPack branch from f96d44c to a085d34 Compare May 4, 2025 15:23
@roji roji temporarily deployed to integration May 4, 2025 15:23 — with GitHub Actions Inactive
/// Initializes a new instance of the <see cref="VectorStoreDataProperty"/> class by cloning the given source.
/// </summary>
/// <param name="source">The source to clone.</param>
public VectorStoreDataProperty(VectorStoreDataProperty source)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westey-m please note this separate commit that also makes the record definition property types mutable.

We had this conversation previously in #10879, but since then we introduced the MEVD model, so the record definition is no longer referenced or used in any way after the model building step; that should make it safe to make it fully mutable, as it's only used during the construction of the collection (just like the option bags are only used during GetAsync, SearchAsync, etc).

Hopefully this all makes sense - if not let me know, I'm happy to pull this commit out of the PR and to have a separate discussion on it.

Copy link
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed each commit separately and overall it looks good to me. I've found few nits, feel free to address them before merging if you want to.

I know it's very subjective, but IMO 'VectorSearchOptions' was better name than RecordSearchOptions, but SimilaritySearchOptions would be even better.

Thank you @roji !

/// Gets or sets a value indicating whether the key should be auto-generated by the vector store.
/// </summary>
public bool AutoGenerate { get; init; }
public bool AutoGenerate { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just realized that the property was exposed, but we don't ever respect it (yet).

Copy link
Member Author

@roji roji May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, true - good catch... Maybe safer to remove it for now, until we have at least one implementation which actually makes use of it - simply because we're not 100% sure yet what the feature will look like... Made a note, will do that in a separate PR later. Will remove this.

/// </summary>
public IReadOnlyList<VectorStoreProperty> Properties { get; init; } = s_emptyFields;
[AllowNull]
public IList<VectorStoreProperty> Properties
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still believe that we should expose just the list rather than an ability to provide your own collection. Anyway, it could be just a List here as we don't really benefit from exposing an abstraction (in case of IReadOnlyList it could be an array)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mainly did IList<T> to align with MEAI - see this comment. I don't really have strong feelings on this - if the user happens to have e.g. an array, IList allows that to be assigned directly (whereas List does not). Any specific thoughts/arguments for exposing List instead of IList?

(will go ahead and merge to move forward, but let's continue the conversation and I can always change it later)

roji added 4 commits May 5, 2025 15:26
IVectorSearch -> IVectorSearchable
IKeywordHybridSearch -> IKeywordHybridSearchable
GetRecordOptions -> RecordRetrievalOptions
GetFilteredRecordOptions -> FilteredRecordRetrievalOptions
VectorSearchOptions -> RecordSearchOptions
Originally these were generic in order to support custom mappers, but
these were removed.
@roji
Copy link
Member Author

roji commented May 5, 2025

I know it's very subjective, but IMO 'VectorSearchOptions' was better name than RecordSearchOptions, but SimilaritySearchOptions would be even better.

This was proposed during API review in order to align the SearchAsync option bag with the GetAsync ones; on the GetAsync side we (now) have RecordRetrievalOptions and FilteredRecordRetrievalOptions, so on the SearchAsync side the proposal was to have RecordSearchOptions (the verb Search in MEVD implicitly means "similarity" or "vector" search).

I can see where you're coming from though - VectorSearchOptions is possibly a little bit more clear. I'll go ahead and merge for now but I've taken a note to raise this again to see what people feel - we can always change it back.

roji added 2 commits May 5, 2025 15:55
This especially makes sense now that we have a model that gets
built based on the record definition (and attributes).

Closes microsoft#10879
@roji roji force-pushed the ApiReviewPack branch from 103a79f to 6d71500 Compare May 5, 2025 13:55
@roji roji temporarily deployed to integration May 5, 2025 13:56 — with GitHub Actions Inactive
@roji roji force-pushed the ApiReviewPack branch from 6d71500 to 3a260ea Compare May 5, 2025 14:13
@roji roji temporarily deployed to integration May 5, 2025 14:14 — with GitHub Actions Inactive
@roji roji merged commit 472e1d5 into microsoft:feature-vector-data-preb3 May 5, 2025
11 checks passed
@roji roji deleted the ApiReviewPack branch May 5, 2025 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kernel Issues or pull requests impacting the core kernel memory .NET Issue or Pull requests regarding .NET code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants