Fixed CosmosSqlMetadataProvider concurrency issue.#2695
Fixed CosmosSqlMetadataProvider concurrency issue.#2695RubenCerna2079 merged 2 commits intoAzure:mainfrom
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR fixes a concurrency issue in CosmosSqlMetadataProvider by replacing a non-thread-safe dictionary with a ConcurrentDictionary and adding argument null checks to its partition-key methods.
- Replaced
Dictionary<string, string>withConcurrentDictionary<string, string> - Added
ArgumentNullException.ThrowIfNullguards inGetPartitionKeyPathandSetPartitionKeyPath - Switched from
TryAdd/assignmentto a singleAddOrUpdatecall inSetPartitionKeyPath
Comments suppressed due to low confidence (4)
src/Core/Services/MetadataProviders/CosmosSqlMetadataProvider.cs:554
- Add unit tests that simulate concurrent calls to
SetPartitionKeyPathandGetPartitionKeyPathto verify the thread-safe behavior of the newConcurrentDictionaryimplementation.
_partitionKeyPaths.AddOrUpdate("${database}/{container}", partitionKeyPath, (key, oldValue) => partitionKeyPath);
src/Core/Services/MetadataProviders/CosmosSqlMetadataProvider.cs:554
- [nitpick] The key format
"${database}/{container}"is used in multiple places; consider extracting it into a private helper method to avoid duplication and reduce the risk of mismatches.
_partitionKeyPaths.AddOrUpdate("${database}/{container}", partitionKeyPath, (key, oldValue) => partitionKeyPath);
src/Core/Services/MetadataProviders/CosmosSqlMetadataProvider.cs:539
- Update the XML doc comments for
GetPartitionKeyPathandSetPartitionKeyPathto mention that they throwArgumentNullExceptionfor null inputs, so consumers know about these new exceptions.
public string? GetPartitionKeyPath(string database, string container)
src/Core/Services/MetadataProviders/CosmosSqlMetadataProvider.cs:540
- Introducing
ArgumentNullExceptionhere changes the behavior ofGetPartitionKeyPath(it now throws instead of returning null for null inputs). Confirm this breaking change is intended or document it in the public API.
ArgumentNullException.ThrowIfNull(database);
|
Hello @sajeetharan , I appreciate the team have lots of other work in progress already, but this fix looks pretty much good to go and should be a quick job to review thanks to the hard work of @michaelstaib. The problems this causes for us operationally are severe. We would be most grateful if you could bump this up the priority list and get it into a release very soon. Thank you! |
|
@michaelstaib Thanks @golfalot for sure , will track it with next release cc : @Aniruddh25 @sourabh1007 @JerryNixon |
|
/azp run |
|
Azure Pipelines successfully started running 6 pipeline(s). |
Aniruddh25
left a comment
There was a problem hiding this comment.
LGTM. Thank you for identifying and fixing this!
|
/azp run |
|
Azure Pipelines successfully started running 6 pipeline(s). |
## Why make this change? This change is made in order to add all of the commits for milestone 1.5 into its respective branch. ## What is this change? This change cherry-picks all of the commits that were added after the first release candidate. Cherry-picked commits: - #2648 #2657 #2617 #2659 #2655 #2633 #2667 #2673 #2650 #2695 #2702 #2688 ## How was this tested? - [ ] Integration Tests - [ ] Unit Tests ## Sample Request(s) --------- Co-authored-by: Sezal Chug <30494467+sezal98@users.noreply.github.com> Co-authored-by: sezalchug <sezalchug@microsoft.com> Co-authored-by: Tommaso Stocchi <tstocchi@microsoft.com> Co-authored-by: Aaron Powell <me@aaron-powell.com> Co-authored-by: aaronburtle <93220300+aaronburtle@users.noreply.github.com> Co-authored-by: Aniruddh Munde <anmunde@microsoft.com> Co-authored-by: Jerry Nixon <jerry.nixon@microsoft.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Michael Staib <michael@chillicream.com> Co-authored-by: souvikghosh04 <souvikofficial04@gmail.com> Co-authored-by: Souvik Ghosh <sogh@microsoft.com>
The CosmosSqlMetdataProvider uses a standard dictionary although the metadata provider is accessed and updated concurrently. This causes the issue in #2694. I have updated the implementation to use a ConcurrentDictionary.
Fixes #2694