perf: cache sorted model lists in registry to avoid per-request sorting #93
Conversation
ListModels(), ListModelsWithProvider(), and ListModelsWithProviderByCategory() were allocating and sorting a new slice on every call. For registries with hundreds of models under heavy concurrent load, this wastes CPU and adds GC pressure.

Cache sorted slices internally and rebuild lazily (double-checked locking) only when Initialize(), LoadFromCache(), or EnrichModels() modify the underlying models map.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 Walkthrough

Added lazy in-memory caching for sorted models and category-filtered results in the registry, with cache invalidation on updates. Introduced asynchronous initialization and periodic background refresh capabilities with configurable intervals and optional model list fetching.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks: ✅ 3 passed
Pull request overview
This PR adds performance caching to the ModelRegistry by maintaining sorted slices that are rebuilt lazily only when the underlying models change. The optimization targets three frequently-called methods (ListModels, ListModelsWithProvider, and ListModelsWithProviderByCategory) that previously allocated and sorted new slices on every invocation.
Changes:
- Added three cache fields to ModelRegistry: sortedModels, sortedModelsWithProvider, and categoryCache (per-category results)
- Implemented double-checked locking pattern for lazy cache initialization in all three List methods
- Added invalidateSortedCaches() helper, called by Initialize(), LoadFromCache(), and EnrichModels(), to clear caches when models change
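The body of invalidateSortedCaches() never appears in the diffs below. Assuming it simply nils out the three cache fields listed above, a sketch might look like the following; the stand-in type definitions are simplified stubs for core.Model, ModelWithProvider, and the registry, which live elsewhere in the repo:

```go
package main

import (
	"fmt"
	"sync"
)

// Simplified stand-ins so the sketch compiles on its own; the real
// core.Model and ModelWithProvider types live in the project.
type Model struct{ ID string }
type ModelWithProvider struct{ Model Model }
type ModelCategory string

type ModelRegistry struct {
	mu                       sync.RWMutex
	sortedModels             []Model
	sortedModelsWithProvider []ModelWithProvider
	categoryCache            map[ModelCategory][]ModelWithProvider
}

// invalidateSortedCaches clears every derived cache so the next List call
// rebuilds it. Callers must hold r.mu for writing. The real body is not
// shown in the diff; this is the straightforward reading of the description.
func (r *ModelRegistry) invalidateSortedCaches() {
	r.sortedModels = nil
	r.sortedModelsWithProvider = nil
	r.categoryCache = nil
}

func main() {
	r := &ModelRegistry{
		sortedModels:  []Model{{ID: "example-model"}},
		categoryCache: map[ModelCategory][]ModelWithProvider{"text_generation": nil},
	}
	r.mu.Lock()
	r.invalidateSortedCaches()
	r.mu.Unlock()
	fmt.Println(r.sortedModels == nil, r.categoryCache == nil)
}
```

Because the fields are only nilled, the next ListModels / ListModelsWithProvider / ListModelsWithProviderByCategory call pays the rebuild cost once, and subsequent calls hit the cache again.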
internal/providers/registry.go
Outdated
```diff
 func (r *ModelRegistry) ListModels() []core.Model {
 	r.mu.RLock()
-	defer r.mu.RUnlock()
+	if cached := r.sortedModels; cached != nil {
+		r.mu.RUnlock()
+		return cached
+	}
+	r.mu.RUnlock()
+
+	r.mu.Lock()
+	defer r.mu.Unlock()
+	// Double-check: another goroutine may have built it while we waited for the lock.
+	if r.sortedModels != nil {
+		return r.sortedModels
+	}
+
 	models := make([]core.Model, 0, len(r.models))
 	for _, info := range r.models {
 		models = append(models, info.Model)
 	}
-	sort.Slice(models, func(i, j int) bool { return models[i].ID < models[j].ID })
+
+	// Sort by model ID for consistent ordering across calls
+	sort.Slice(models, func(i, j int) bool {
+		return models[i].ID < models[j].ID
+	})
+
+	r.sortedModels = models
 	return models
```
The cached slice is returned directly to callers without making a defensive copy. This creates a critical data race: if a caller modifies the returned slice (e.g., sorting, appending, or modifying Model fields), those modifications will affect the cached slice visible to all other goroutines. Additionally, if callers retain references to the slice and a cache invalidation occurs, they could continue reading stale data while new goroutines see updated data, leading to inconsistencies.
The same issue exists in ListModelsWithProvider() and ListModelsWithProviderByCategory().
To fix this, return a copy of the cached slice. While this adds allocation overhead, it's still much better than the original behavior of sorting on every call, and it guarantees thread safety. The copy should be shallow (copy the slice but not the Model structs themselves, since Model structs contain only immutable fields or fields protected by the registry lock).
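The shallow-copy fix the reviewer describes can be sketched as follows; `copyModels`, the stand-alone `Model` type, and the demo in `main` are illustrative, not code from the PR:

```go
package main

import "fmt"

// Simplified stand-in for core.Model.
type Model struct{ ID string }

// copyModels returns a shallow copy of the cached slice, so callers can
// sort, append, or overwrite elements without touching the registry's
// shared cache. Only the slice header and its elements are copied; any
// pointers inside Model would still be shared, which the review deems
// acceptable since those fields are immutable or lock-protected.
func copyModels(cached []Model) []Model {
	return append([]Model(nil), cached...)
}

func main() {
	cache := []Model{{ID: "a"}, {ID: "b"}}
	out := copyModels(cache)
	out[0].ID = "mutated" // caller mutation...
	fmt.Println(cache[0].ID) // ...does not reach the cached slice
}
```

The `append([]T(nil), s...)` idiom allocates exactly once and is the usual way to produce a defensive copy in Go; each List method would apply it on every return path that hands out a cached slice.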
internal/providers/registry.go
Outdated
```diff
@@ -409,37 +445,53 @@ func (r *ModelRegistry) ListModelsWithProvider() []ModelWithProvider {
 			ProviderType: r.providerTypes[info.Provider],
 		})
 	}
-	sort.Slice(result, func(i, j int) bool { return result[i].Model.ID < result[j].Model.ID })
+
+	sort.Slice(result, func(i, j int) bool {
+		return result[i].Model.ID < result[j].Model.ID
+	})
+
+	r.sortedModelsWithProvider = result
 	return result
```
The cached slice is returned directly to callers without making a defensive copy. This creates the same critical data race as in ListModels(): callers can modify the returned slice, affecting the cached data visible to all goroutines. Return a shallow copy of the cached slice to prevent this race condition.
```diff
 func (r *ModelRegistry) ListModelsWithProviderByCategory(category core.ModelCategory) []ModelWithProvider {
 	if category == core.CategoryAll {
 		return r.ListModelsWithProvider()
 	}
 
 	r.mu.RLock()
-	defer r.mu.RUnlock()
+	if r.categoryCache != nil {
+		if cached, ok := r.categoryCache[category]; ok {
+			r.mu.RUnlock()
+			return cached
+		}
+	}
+	r.mu.RUnlock()
+
+	r.mu.Lock()
+	defer r.mu.Unlock()
+	if r.categoryCache != nil {
+		if cached, ok := r.categoryCache[category]; ok {
+			return cached
+		}
+	}
+
 	result := make([]ModelWithProvider, 0)
 	for _, info := range r.models {
-		if category != core.CategoryAll {
-			if info.Model.Metadata == nil || !hasCategory(info.Model.Metadata.Categories, category) {
-				continue
-			}
+		if info.Model.Metadata == nil || !hasCategory(info.Model.Metadata.Categories, category) {
+			continue
 		}
 		result = append(result, ModelWithProvider{
 			Model:        info.Model,
 			ProviderType: r.providerTypes[info.Provider],
 		})
 	}
-	sort.Slice(result, func(i, j int) bool { return result[i].Model.ID < result[j].Model.ID })
+
+	sort.Slice(result, func(i, j int) bool {
+		return result[i].Model.ID < result[j].Model.ID
+	})
+
+	if r.categoryCache == nil {
+		r.categoryCache = make(map[core.ModelCategory][]ModelWithProvider)
+	}
+	r.categoryCache[category] = result
 	return result
```
The cached slice is returned directly to callers without making a defensive copy. This creates the same critical data race as in the other List methods: callers can modify the returned slice, affecting the cached data visible to all goroutines. Return a shallow copy of the cached slice to prevent this race condition.
@coderabbitai review

✅ Actions performed: review triggered.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/providers/registry.go`:
- Around lines 375-377: The code returns the internal cached slice r.sortedModels (and the other cached slices) directly, which lets callers mutate shared state. Instead, always return a defensive copy when returning cached slices: after retrieving cached := r.sortedModels (and the analogous cached variables), make and return a copy (e.g., newSlice := append([]ModelType(nil), cached...)) while preserving the existing locking/unlocking (r.mu.RUnlock() / r.mu.Unlock()) and leaving the cache itself untouched. Apply the same defensive-copy pattern everywhere a cache is returned after being built, so callers never receive a reference to the internal slice.
- Around lines 491-494: The cache currently stores every incoming category (r.categoryCache[category] = result), allowing unbounded growth. Change it to cache only known categories: validate the category before inserting into r.categoryCache (e.g., compare it against the canonical set of core.ModelCategory values or a small allowedCategories whitelist) and skip caching if it is not recognized. Keep the existing nil-check/initialization of r.categoryCache intact.
…ry cache

Two issues in the sorted-list caching introduced in the previous commit:

1. Cached slices were returned by reference, allowing callers to mutate shared internal state (core.Model contains a *ModelMetadata pointer). Now all return paths copy the cached slice before returning.
2. categoryCache accepted any ModelCategory string value, allowing unbounded map growth from arbitrary input. Now only the 6 known categories (text_generation, embedding, image, audio, video, utility) are cached; unrecognized categories still produce correct results but skip the cache.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
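The whitelist described in point 2 can be sketched as follows; `knownCategories`, `shouldCache`, and the bare `ModelCategory` type are hypothetical names standing in for however the commit actually implements the guard:

```go
package main

import "fmt"

// Simplified stand-in for core.ModelCategory.
type ModelCategory string

// knownCategories mirrors the six categories the commit message lists.
// Caching is skipped for anything else so the map cannot grow without bound.
var knownCategories = map[ModelCategory]bool{
	"text_generation": true,
	"embedding":       true,
	"image":           true,
	"audio":           true,
	"video":           true,
	"utility":         true,
}

// shouldCache reports whether a category's result may be stored in
// categoryCache. Unrecognized categories are still computed and returned
// to the caller; they just bypass the cache.
func shouldCache(c ModelCategory) bool {
	return knownCategories[c]
}

func main() {
	fmt.Println(shouldCache("embedding"), shouldCache("totally-made-up"))
}
```

Inside ListModelsWithProviderByCategory, the guard would wrap only the `r.categoryCache[category] = result` assignment, leaving the rest of the build-and-return path unchanged.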
Summary by CodeRabbit

Performance Improvements
- Sorted model lists are cached in the registry and rebuilt lazily, instead of being re-sorted on every request.

New Features
- Asynchronous initialization and periodic background refresh with configurable intervals and optional model list fetching.