Skip to content

feat: support multi-provider model resolution with budget-based selection#953

Merged
Aureliolo merged 6 commits intomainfrom
feat/multi-provider-resolution
Mar 31, 2026
Merged

feat: support multi-provider model resolution with budget-based selection#953
Aureliolo merged 6 commits intomainfrom
feat/multi-provider-resolution

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

  • Change ModelResolver index from dict[str, ResolvedModel] to dict[str, tuple[ResolvedModel, ...]] so multiple providers can register the same model ID/alias without collision errors
  • Add ModelCandidateSelector protocol with QuotaAwareSelector (default, prefers providers with available quota then cheapest) and CheapestSelector implementations
  • Add QuotaTracker.peek_quota_available() synchronous method for lockless quota state snapshots used by the selector
  • Wire selector through ModelRouter and ProviderManagementService._build_router()
  • Harden error handling: empty-candidate guards, selector exception wrapping in resolve()/resolve_safe(), constructor validation, deterministic tie-breaking

Test plan

  • uv run python -m pytest tests/unit/providers/routing/ -n auto -v -- all routing tests pass (selector, multi-provider resolver, quota-aware integration, strategies, router)
  • uv run python -m pytest tests/unit/budget/test_quota_tracker.py -n auto -v -- peek_quota_available tests pass (fresh, exhausted, multi-window, rotation, multiple providers)
  • uv run python -m pytest tests/unit/budget/ -n auto -- all budget tests pass (enforcer, optimizer use updated resolver index type)
  • uv run ruff check src/ tests/ -- lint clean
  • uv run mypy src/ tests/ -- type-check clean
  • Verify: two providers with same model ID no longer raises ModelResolutionError
  • Verify: resolve() returns cheapest provider by default (no quota data)
  • Verify: with quota snapshot, resolve() prefers provider with available quota

Review coverage

Pre-reviewed by 7 agents (code-reviewer, type-design-analyzer, silent-failure-hunter, test-analyzer, conventions-enforcer, issue-verifier, docs-consistency). 10 findings addressed in second commit.

Closes #887

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 31, 2026

Walkthrough

The changes convert ModelResolver’s index to hold tuples of candidate ResolvedModel entries and introduce a ModelCandidateSelector protocol with QuotaAwareSelector and CheapestSelector implementations. ModelResolver accepts an injectable selector, exposes resolve_all(), and delegates selection to the selector; ModelRouter and ProviderManagementService constructors now accept an optional selector parameter. QuotaTracker.peek_quota_available() was added (synchronous snapshot) and internal loop-affinity checks/logging were introduced with QUOTA_LOOP_AFFINITY_VIOLATED. New routing events ROUTING_MULTI_PROVIDER_REGISTERED and ROUTING_CANDIDATE_SELECTED were added; tests and docs updated.

Suggested labels

autorelease: tagged

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main changes: adding multi-provider model resolution support with budget-based selection strategy, which matches the core objective of allowing multiple providers to serve the same model and selecting between them using quota-aware logic.
Description check ✅ Passed The description comprehensively covers the changeset, explaining the index restructuring, selector protocol with two implementations, quota peek method addition, and wiring through router and management service.
Linked Issues check ✅ Passed All core objectives from #887 are met: ModelResolver index changed to support multi-provider registration [resolver.py], ModelCandidateSelector protocol with QuotaAwareSelector and CheapestSelector [selector.py], QuotaTracker.peek_quota_available() for quota snapshots [quota_tracker.py], and integration through ModelRouter and ProviderManagementService [router.py, service.py].
Out of Scope Changes check ✅ Passed All changes directly address issue #887 objectives: quota tracker enhancements, resolver/router refactoring for multi-provider support, selector implementations, observability events, and comprehensive test coverage for the new functionality.
Docstring Coverage ✅ Passed Docstring coverage is 57.61% which is sufficient. The required threshold is 40.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 31, 2026 17:30 — with GitHub Actions Inactive
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 31, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA c70d0a7.
Ensure that dependencies are being submitted on PR branches. Re-running this action after a short time may resolve the issue. See the documentation for more information and troubleshooting advice.

Scanned Files

None

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements multi-provider support for model resolution, allowing the system to handle multiple providers for a single model ID or alias. It introduces a ModelCandidateSelector protocol with QuotaAwareSelector and CheapestSelector implementations to pick the best candidate based on cost and quota availability. Key changes include updates to the ModelResolver to manage candidate lists and the addition of a synchronous peek_quota_available method in the QuotaTracker. Critical syntax errors were identified in the ModelResolver exception handling blocks where multiple exceptions were not correctly formatted as tuples.

model = self._selector.select(candidates)
except ModelResolutionError:
raise
except MemoryError, RecursionError:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This line contains a syntax error in Python 3. To catch multiple exceptions in a single except block, they must be enclosed in a tuple. The current syntax except E1, E2: is invalid in Python 3 (it was used in Python 2 to mean except E1 as E2:).

        except (MemoryError, RecursionError):

return None
try:
return self._selector.select(candidates)
except MemoryError, RecursionError:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This line contains a syntax error. Multiple exceptions must be provided as a tuple: except (MemoryError, RecursionError):.

        except (MemoryError, RecursionError):

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/budget/test_quota_tracker.py`:
- Around line 541-549: The test test_fresh_tracker_all_available is declared
async but only exercises the synchronous method peek_quota_available on a
tracker created via _make_tracker inside the _patched_tracker_datetime()
context; change the test signature from async def to def so it runs
synchronously, remove any unused async machinery (no awaits), and keep the same
setup and assertions; ensure you only make this change for tests that do not
call async methods like record_usage()—leave those tests as async.

In `@tests/unit/providers/routing/test_quota_aware_selection.py`:
- Around line 25-30: The context manager _patched_tracker_datetime sets
mock_dt.side_effect = datetime unnecessarily; QuotaTracker only calls
datetime.now(), so remove the side_effect assignment from
_patched_tracker_datetime and keep only the patch that sets
mock_dt.now.return_value = _NOW to simplify the mock (reference symbols:
_patched_tracker_datetime, mock_dt, datetime.now, QuotaTracker).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a175a8aa-3228-431c-b08e-9624a7fac47a

📥 Commits

Reviewing files that changed from the base of the PR and between 1367225 and 26e48bb.

📒 Files selected for processing (14)
  • src/synthorg/budget/quota_tracker.py
  • src/synthorg/observability/events/routing.py
  • src/synthorg/providers/management/service.py
  • src/synthorg/providers/routing/__init__.py
  • src/synthorg/providers/routing/resolver.py
  • src/synthorg/providers/routing/router.py
  • src/synthorg/providers/routing/selector.py
  • tests/unit/budget/conftest.py
  • tests/unit/budget/test_enforcer.py
  • tests/unit/budget/test_quota_tracker.py
  • tests/unit/providers/routing/test_quota_aware_selection.py
  • tests/unit/providers/routing/test_resolver.py
  • tests/unit/providers/routing/test_resolver_multi_provider.py
  • tests/unit/providers/routing/test_selector.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Python version 3.14+ with PEP 649 native lazy annotations
Do NOT use from __future__ import annotations -- Python 3.14 has PEP 649
PEP 758 except syntax: use except A, B: (no parentheses) -- ruff enforces this on Python 3.14
Line length: 88 characters (ruff enforced)

Files:

  • tests/unit/providers/routing/test_resolver.py
  • tests/unit/budget/conftest.py
  • src/synthorg/observability/events/routing.py
  • tests/unit/budget/test_enforcer.py
  • tests/unit/budget/test_quota_tracker.py
  • src/synthorg/providers/management/service.py
  • src/synthorg/providers/routing/__init__.py
  • src/synthorg/budget/quota_tracker.py
  • src/synthorg/providers/routing/router.py
  • tests/unit/providers/routing/test_quota_aware_selection.py
  • tests/unit/providers/routing/test_resolver_multi_provider.py
  • tests/unit/providers/routing/test_selector.py
  • src/synthorg/providers/routing/resolver.py
  • src/synthorg/providers/routing/selector.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Async tests: asyncio_mode = "auto" -- no manual @pytest.mark.asyncio needed
Test timeout: 30 seconds per test (global in pyproject.toml -- do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Tests must use test-provider, test-small-001, etc. instead of real vendor names
Property-based testing in Python: use Hypothesis (@given + @settings); run with HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties (dev profile: 1000 examples)
Hypothesis profiles: ci (50 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var
For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins
For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number) -- it is cancellation-safe and carries no timing assumptions

Files:

  • tests/unit/providers/routing/test_resolver.py
  • tests/unit/budget/conftest.py
  • tests/unit/budget/test_enforcer.py
  • tests/unit/budget/test_quota_tracker.py
  • tests/unit/providers/routing/test_quota_aware_selection.py
  • tests/unit/providers/routing/test_resolver_multi_provider.py
  • tests/unit/providers/routing/test_selector.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict)
Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields -- including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants -- instead of manual whitespace validators
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code (exception: observability/setup.py and observability/sinks.py may use stdlib logging and print(..., file=sys.stderr) for bootstrap and handler-cleanup code)
Variable name for logger: always logger (not _logger, not log)
Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool); import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT
Structured kwargs for logging: always logger.info(EVENT, key=value) -- never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO
DEBUG logging for object creation, internal flow, entry/exit of key...

Files:

  • src/synthorg/observability/events/routing.py
  • src/synthorg/providers/management/service.py
  • src/synthorg/providers/routing/__init__.py
  • src/synthorg/budget/quota_tracker.py
  • src/synthorg/providers/routing/router.py
  • src/synthorg/providers/routing/resolver.py
  • src/synthorg/providers/routing/selector.py
src/synthorg/providers/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/providers/**/*.py: All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically; never implement retry logic in driver subclasses or calling code
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig
Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError
Non-retryable errors raise immediately without retry
Rate limiter respects RateLimitError.retry_after from providers -- automatically pauses future requests

Files:

  • src/synthorg/providers/management/service.py
  • src/synthorg/providers/routing/__init__.py
  • src/synthorg/providers/routing/router.py
  • src/synthorg/providers/routing/resolver.py
  • src/synthorg/providers/routing/selector.py
🧠 Learnings (13)
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-31T14:40:41.736Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:40:41.736Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.

Applied to files:

  • src/synthorg/providers/management/service.py
  • tests/unit/providers/routing/test_resolver_multi_provider.py
  • src/synthorg/providers/routing/resolver.py
  • src/synthorg/providers/routing/selector.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

  • src/synthorg/providers/management/service.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/api/**/*.py : API package (api/): Litestar REST + WebSocket with controllers, guards, channels, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint, provider management endpoint (CRUD + test + presets), backup endpoint, RFC 9457 structured errors, AppState hot-reload slots, service auto-wiring (Phase 1 at construction, Phase 2 on startup), lifecycle helpers

Applied to files:

  • src/synthorg/providers/management/service.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state

Applied to files:

  • src/synthorg/providers/management/service.py
📚 Learning: 2026-03-26T15:18:16.848Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-26T15:18:16.848Z
Learning: Applies to src/synthorg/api/**/*.py : Litestar API must include setup wizard, auth/, auto-wiring, and lifecycle management

Applied to files:

  • src/synthorg/providers/management/service.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

  • src/synthorg/providers/management/service.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/**/*.py : `RetryConfig` and `RateLimiterConfig` are set per-provider in `ProviderConfig`. Retryable errors: `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately.

Applied to files:

  • src/synthorg/providers/routing/resolver.py
📚 Learning: 2026-03-16T19:13:34.746Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T19:13:34.746Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately without retry. RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains.

Applied to files:

  • src/synthorg/providers/routing/resolver.py
🔇 Additional comments (20)
src/synthorg/observability/events/routing.py (1)

16-17: LGTM!

New event constants follow the established naming pattern (routing.<domain>.<action>) and are consistent with existing constants in the file.

src/synthorg/budget/quota_tracker.py (1)

383-422: LGTM!

The peek_quota_available() method is well-designed:

  • Clearly documents the lock-free read semantics and TOCTOU tolerance for heuristic decisions
  • Defensive .get() for subscription lookup handles edge cases
  • Early break on first exhausted window is an efficient optimization
  • Return type and docstring are complete
src/synthorg/providers/routing/selector.py (4)

1-27: LGTM!

Module setup follows guidelines:

  • Proper logger initialization with get_logger(__name__)
  • Event constant imported from domain-specific module
  • _cost_key helper provides deterministic tie-breaking by (cost, provider_name)

29-49: LGTM!

ModelCandidateSelector protocol is well-defined:

  • @runtime_checkable enables isinstance checks
  • Clear docstring documents the non-empty tuple contract and ModelResolutionError on empty input

52-103: LGTM!

QuotaAwareSelector implementation is solid:

  • Immutable after construction via MappingProxyType
  • Defaults missing providers to available (line 103) — sensible for providers without quota tracking
  • Falls back to all candidates when none have quota (line 89)
  • Debug logging only emits when actual selection occurs (multiple candidates)

106-126: LGTM!

CheapestSelector is a clean, simple implementation following the same patterns as QuotaAwareSelector.

src/synthorg/providers/routing/__init__.py (1)

16-20: LGTM!

New selector types are correctly exported and __all__ maintains alphabetical ordering.

Also applies to: 45-45, 49-49, 54-54

src/synthorg/providers/routing/resolver.py (5)

50-82: LGTM!

Constructor changes are well-designed:

  • Validates non-empty candidate tuples with clear error message
  • Defensive copy into validated dict before freezing
  • Defaults to QuotaAwareSelector() for quota-aware behavior out of the box

89-117: LGTM!

Multi-provider registration logic handles collisions gracefully:

  • Exact duplicates are skipped (DEBUG log)
  • Distinct providers are appended (INFO log with context)
  • No more ModelResolutionError on alias collision

196-212: LGTM!

Exception handling in resolve() is robust:

  • Re-raises ModelResolutionError from selector directly
  • Re-raises MemoryError and RecursionError (critical errors that shouldn't be swallowed)
  • Wraps unexpected exceptions in ModelResolutionError with context
  • Uses PEP 758 except syntax (except MemoryError, RecursionError:) as required for Python 3.14+

241-252: LGTM!

resolve_safe() correctly mirrors resolve() exception handling but returns None instead of raising for recoverable errors.


254-279: LGTM!

  • resolve_all() provides direct access to all candidates for a ref
  • all_models() correctly deduplicates by (provider_name, model_id) to handle multi-provider scenarios
tests/unit/budget/conftest.py (1)

363-368: LGTM!

make_resolver helper correctly updated to match the new multi-candidate index structure, wrapping each ResolvedModel in a 1-tuple.

tests/unit/budget/test_enforcer.py (1)

100-106: LGTM!

_make_resolver helper correctly updated:

  • Returns empty ModelResolver({}) for None input
  • Wraps each model in a 1-tuple to match the new multi-candidate index structure
src/synthorg/providers/routing/router.py (1)

44-66: LGTM! Clean dependency injection of the selector.

The changes correctly wire the optional selector parameter through to ModelResolver.from_config(), maintaining backward compatibility with the default QuotaAwareSelector. The docstring accurately documents the default behavior.

tests/unit/providers/routing/test_resolver.py (1)

172-202: LGTM! Test correctly validates multi-provider resolution.

The test properly verifies that duplicate aliases across providers now append candidates rather than raising an error, and that resolve_all() returns all variants.

Minor observation: the assertion assert model is not None (line 198) is redundant since resolve() raises ModelResolutionError when not found, but it doesn't hurt readability.

tests/unit/providers/routing/test_quota_aware_selection.py (1)

62-126: LGTM! Comprehensive integration tests for quota-aware selection.

The tests cover the key scenarios:

  1. Selecting available provider when cheaper one is exhausted
  2. Falling back to cheapest when all providers are exhausted
  3. Defaulting to cheapest when no quota tracking is configured

The deterministic time control via patching ensures reliable test behavior.

tests/unit/providers/routing/test_resolver_multi_provider.py (1)

1-323: LGTM! Comprehensive test coverage for multi-provider resolution.

Excellent test organization covering:

  • Multi-provider index creation without collision errors
  • resolve_all() returning all candidates
  • Selector injection and default behavior
  • Immutability guarantees (MappingProxyType, tuple candidates)
  • Edge cases (exact duplicates, three providers, empty tuple rejection)

The tests thoroughly validate the new multi-provider architecture.

tests/unit/providers/routing/test_selector.py (1)

1-152: LGTM! Thorough unit tests for selector implementations.

The tests comprehensively cover:

  • Protocol satisfaction verification
  • Quota-aware selection with various availability states
  • Fallback behavior when all providers exhausted
  • Deterministic tie-breaking by provider name (critical for reproducibility)
  • Error handling for empty candidate lists
  • Both QuotaAwareSelector and CheapestSelector implementations

The test for tie-breaking stability (lines 146-147) is particularly valuable for ensuring deterministic behavior.

src/synthorg/providers/management/service.py (1)

689-712: Clarify intent of the optional selector parameter.

The _build_router method accepts an optional selector parameter, but the only call site at line 658 (router = self._build_router(new_providers)) never passes one. Since the parameter is optional with a default of None, it works correctly, but consider adding a comment near the method definition explaining whether this is an intentional extension point for future use or testing, or if selector routing is planned to be wired through later.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 85.98131% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.02%. Comparing base (497848a) to head (c70d0a7).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/synthorg/budget/quota_tracker.py 70.00% 8 Missing and 4 partials ⚠️
src/synthorg/providers/routing/resolver.py 93.33% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #953      +/-   ##
==========================================
- Coverage   92.04%   92.02%   -0.03%     
==========================================
  Files         611      612       +1     
  Lines       32894    32984      +90     
  Branches     3158     3169      +11     
==========================================
+ Hits        30277    30352      +75     
- Misses       2079     2089      +10     
- Partials      538      543       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Aureliolo and others added 3 commits March 31, 2026 22:52
…tion

Change ModelResolver index from dict[str, ResolvedModel] to
dict[str, tuple[ResolvedModel, ...]] so multiple providers can
register the same model ID or alias without collision errors.

Add ModelCandidateSelector protocol with two implementations:
- QuotaAwareSelector (default): prefers providers with available
  quota, then cheapest
- CheapestSelector: always picks cheapest regardless of quota

Add QuotaTracker.peek_quota_available() sync method for lockless
quota state snapshots used by the selector at resolution time.

Wire selector through ModelRouter and ProviderManagementService.
Update existing tests that construct ModelResolver directly.

Closes #887

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address review agent findings:
- Guard against empty candidates in QuotaAwareSelector and
  CheapestSelector (raise ModelResolutionError instead of bare
  ValueError from min())
- Validate non-empty candidate tuples in ModelResolver constructor
- Wrap selector calls in resolve() and resolve_safe() to preserve
  their error contracts (domain errors only, resolve_safe never raises)
- Use defensive .get() in QuotaTracker.peek_quota_available()
- Add secondary sort key (provider_name) for deterministic tie-breaking
- Add DEBUG logging for exact-duplicate skip in _index_ref
- Add tests: 3-provider resolution, multi-window quota, empty
  candidates, tie-breaking by provider name

Pre-reviewed by 7 agents, 10 findings addressed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…decov

Source fixes:
- resolver.py: fix selector priority docstring, add logging before
  re-raising ModelResolutionError, distinguish expected vs unexpected
  errors in resolve_safe, bind exc for structured logs, use WARNING
  not ERROR for non-fatal path, compute sorted(index) once
- selector.py: expand QuotaAwareSelector.select docstring, add WARNING
  on all-providers-exhausted fallback, add CheapestSelector class
  docstring, remove unnecessary list() allocation
- service.py: document _build_router selector extension point intent
- quota_tracker.py: fix thread-safety docstring, fix inline comment

Test fixes:
- Add 6 error-wrapping tests (resolve/resolve_safe with failing selector)
- Add ModelRouter selector passthrough test
- Strengthen mock assertions with argument verification
- Consolidate _two_provider_config helper to conftest
- Fix api_key naming (sk-a -> sk-test-a)
- Make sync-only test non-async
- Replace redundant assert is not None with specific assertion
- Remove unnecessary mock_dt.side_effect assignment

Doc fixes:
- CLAUDE.md: add routing/ to providers description, add peek to budget
  description, add routing to logging events example
- operations.md: add Multi-Provider Model Resolution section, add
  peek_quota_available to Quota Degradation section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Aureliolo Aureliolo force-pushed the feat/multi-provider-resolution branch from 26e48bb to 6242529 Compare March 31, 2026 20:55
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 31, 2026 20:56 — with GitHub Actions Inactive
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/budget/quota_tracker.py`:
- Around line 383-424: Add a runtime guard at the start of peek_quota_available
to ensure it is called only from the tracker’s owning asyncio event loop: call
asyncio.get_running_loop() and compare it to the instance’s stored loop (e.g.
self._loop or self._event_loop that should be captured when the tracker is
created); if they differ (or get_running_loop() raises because there is no
running loop), raise a clear RuntimeError explaining peek_quota_available must
be called from the tracker’s event loop. Also ensure asyncio is imported and
that the tracker stores the owning loop on construction if it doesn’t already.

In `@tests/unit/budget/conftest.py`:
- Around line 363-367: The index-building loop in make_resolver currently
overwrites entries so duplicate candidates are lost; change the logic that
assigns to index[m.model_id] and index[m.alias] to append the new ResolvedModel
instead of replacing existing tuples (i.e., if an entry exists, create a new
tuple combining the existing tuple and (m,), otherwise set (m,)), ensuring both
model_id and alias keys preserve all candidates.

In `@tests/unit/providers/routing/test_resolver.py`:
- Around line 197-202: The test currently only asserts that a single
resolver.resolve("shared-alias") result is a member of the expected set which
won’t catch non-deterministic selection; update the test to call
resolver.resolve("shared-alias") multiple times (e.g., in a loop or repeated
calls) and assert that every returned model.model_id is identical across calls
and that this id is one of {"test-model-a","test-model-b"}; you can still keep
resolver.resolve_all("shared-alias") and the providers_seen check but add the
repeated-call consistency assertion to ensure determinism of resolver.resolve.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d0632741-dab2-4d57-9184-3acda7579cf3

📥 Commits

Reviewing files that changed from the base of the PR and between 26e48bb and 6242529.

📒 Files selected for processing (17)
  • CLAUDE.md
  • docs/design/operations.md
  • src/synthorg/budget/quota_tracker.py
  • src/synthorg/observability/events/routing.py
  • src/synthorg/providers/management/service.py
  • src/synthorg/providers/routing/__init__.py
  • src/synthorg/providers/routing/resolver.py
  • src/synthorg/providers/routing/router.py
  • src/synthorg/providers/routing/selector.py
  • tests/unit/budget/conftest.py
  • tests/unit/budget/test_enforcer.py
  • tests/unit/budget/test_quota_tracker.py
  • tests/unit/providers/routing/conftest.py
  • tests/unit/providers/routing/test_quota_aware_selection.py
  • tests/unit/providers/routing/test_resolver.py
  • tests/unit/providers/routing/test_resolver_multi_provider.py
  • tests/unit/providers/routing/test_selector.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Python version 3.14+ with PEP 649 native lazy annotations
Do NOT use from __future__ import annotations -- Python 3.14 has PEP 649
PEP 758 except syntax: use except A, B: (no parentheses) -- ruff enforces this on Python 3.14
Line length: 88 characters (ruff enforced)

Files:

  • tests/unit/providers/routing/conftest.py
  • src/synthorg/providers/routing/__init__.py
  • src/synthorg/observability/events/routing.py
  • tests/unit/providers/routing/test_resolver.py
  • tests/unit/budget/conftest.py
  • tests/unit/budget/test_quota_tracker.py
  • src/synthorg/budget/quota_tracker.py
  • src/synthorg/providers/management/service.py
  • tests/unit/budget/test_enforcer.py
  • tests/unit/providers/routing/test_quota_aware_selection.py
  • tests/unit/providers/routing/test_selector.py
  • tests/unit/providers/routing/test_resolver_multi_provider.py
  • src/synthorg/providers/routing/resolver.py
  • src/synthorg/providers/routing/router.py
  • src/synthorg/providers/routing/selector.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Async tests: asyncio_mode = "auto" -- no manual @pytest.mark.asyncio needed
Test timeout: 30 seconds per test (global in pyproject.toml -- do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Tests must use test-provider, test-small-001, etc. instead of real vendor names
Property-based testing in Python: use Hypothesis (@given + @settings); run with HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties (dev profile: 1000 examples)
Hypothesis profiles: ci (50 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var
For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins
For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number) -- it is cancellation-safe and carries no timing assumptions

Files:

  • tests/unit/providers/routing/conftest.py
  • tests/unit/providers/routing/test_resolver.py
  • tests/unit/budget/conftest.py
  • tests/unit/budget/test_quota_tracker.py
  • tests/unit/budget/test_enforcer.py
  • tests/unit/providers/routing/test_quota_aware_selection.py
  • tests/unit/providers/routing/test_selector.py
  • tests/unit/providers/routing/test_resolver_multi_provider.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict)
Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields -- including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants -- instead of manual whitespace validators
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code (exception: observability/setup.py and observability/sinks.py may use stdlib logging and print(..., file=sys.stderr) for bootstrap and handler-cleanup code)
Variable name for logger: always logger (not _logger, not log)
Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool); import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT
Structured kwargs for logging: always logger.info(EVENT, key=value) -- never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO
DEBUG logging for object creation, internal flow, entry/exit of key...

Files:

  • src/synthorg/providers/routing/__init__.py
  • src/synthorg/observability/events/routing.py
  • src/synthorg/budget/quota_tracker.py
  • src/synthorg/providers/management/service.py
  • src/synthorg/providers/routing/resolver.py
  • src/synthorg/providers/routing/router.py
  • src/synthorg/providers/routing/selector.py
src/synthorg/providers/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/providers/**/*.py: All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically; never implement retry logic in driver subclasses or calling code
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig
Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError
Non-retryable errors raise immediately without retry
Rate limiter respects RateLimitError.retry_after from providers -- automatically pauses future requests

Files:

  • src/synthorg/providers/routing/__init__.py
  • src/synthorg/providers/management/service.py
  • src/synthorg/providers/routing/resolver.py
  • src/synthorg/providers/routing/router.py
  • src/synthorg/providers/routing/selector.py
🧠 Learnings (56)
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/providers/**/*.py : Set `RetryConfig` and `RateLimiterConfig` per-provider in `ProviderConfig`

Applied to files:

  • tests/unit/providers/routing/conftest.py
📚 Learning: 2026-03-31T14:40:41.736Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:40:41.736Z
Learning: Applies to src/synthorg/providers/**/*.py : `RetryConfig` and `RateLimiterConfig` are set per-provider in `ProviderConfig`

Applied to files:

  • tests/unit/providers/routing/conftest.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Set `RetryConfig` and `RateLimiterConfig` per-provider in `ProviderConfig`.

Applied to files:

  • tests/unit/providers/routing/conftest.py
📚 Learning: 2026-03-16T19:13:36.562Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T19:13:36.562Z
Learning: Applies to src/synthorg/providers/**/*.py : RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig.

Applied to files:

  • tests/unit/providers/routing/conftest.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-31T14:40:41.736Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:40:41.736Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.

Applied to files:

  • src/synthorg/observability/events/routing.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mock `time.monotonic()` and `asyncio.sleep()` for determinism; use `asyncio.Event().wait()` for indefinite blocking instead of `asyncio.sleep(large_number)`

Applied to files:

  • tests/unit/budget/test_quota_tracker.py
  • tests/unit/providers/routing/test_quota_aware_selection.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/budget/test_quota_tracker.py
  • tests/unit/providers/routing/test_quota_aware_selection.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to {docs/design/operations.md,src/synthorg/providers/presets.py,.claude/**/*.{md,yml,yaml}} : Vendor names may appear only in: (1) Operations design page, (2) `.claude/` skill/agent files, (3) third-party import paths, (4) provider presets (`src/synthorg/providers/presets.py`), (5) tests using `test-provider`, `test-small-001`, etc.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

  • CLAUDE.md
  • src/synthorg/providers/management/service.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget tracking includes pre-flight/in-flight checks, auto-downgrade, billing periods, cost tiers, quota/subscription. CFO includes anomaly detection, efficiency analysis, downgrade recommendations.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/api/**/*.py : API package (api/): Litestar REST + WebSocket with controllers, guards, channels, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint, provider management endpoint (CRUD + test + presets), backup endpoint, RFC 9457 structured errors, AppState hot-reload slots, service auto-wiring (Phase 1 at construction, Phase 2 on startup), lifecycle helpers

Applied to files:

  • CLAUDE.md
  • src/synthorg/providers/management/service.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/api/**/*.py : REST API: Litestar framework, controllers with guards, channels for WebSocket, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint. RFC 9457 structured errors (ErrorCategory, ErrorCode, ErrorDetail, ProblemDetail, CATEGORY_TITLES, category_title, category_type_uri, content negotiation).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-26T15:18:16.848Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-26T15:18:16.848Z
Learning: Applies to src/synthorg/api/**/*.py : Litestar API must include setup wizard, auth/, auto-wiring, and lifecycle management

Applied to files:

  • CLAUDE.md
  • src/synthorg/providers/management/service.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/api/**/*.py : Use Litestar for REST + WebSocket API. Controllers, guards, channels, JWT + API key + WS ticket auth, RFC 9457 structured errors.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/communication/**/*.py : Communication package (communication/): message bus, dispatcher, messenger, channels, delegation, loop prevention, conflict resolution; meeting/ subpackage for meeting protocol (round-robin, position papers, structured phases), scheduler (frequency, participant resolver), orchestrator

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Budget: Cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError).

Applied to files:

  • CLAUDE.md
  • docs/design/operations.md
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/api/**/*.py : Use Litestar for REST API and WebSocket API with JWT + API key + WS ticket authentication, RFC 9457 structured errors, and content negotiation.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.

Applied to files:

  • CLAUDE.md
  • src/synthorg/providers/management/service.py
  • src/synthorg/providers/routing/resolver.py
  • src/synthorg/providers/routing/selector.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Settings: Runtime-editable settings persistence (DB > env > YAML > code defaults), typed definitions (9 namespaces), Fernet encryption for sensitive values, config bridge, ConfigResolver (typed composed reads for controllers), validation, registry, change notifications via message bus. Per-namespace setting definitions in definitions/ submodule (api, company, providers, memory, budget, security, coordination, observability, backup).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Documentation source in `docs/` (Markdown, built with Zensical). Design spec in `docs/design/` (7 pages: index, agents, organization, communication, engine, memory, operations). Architecture in `docs/architecture/` (overview, tech-stack, decision log). Roadmap in `docs/roadmap/`. Security in `docs/security.md`. Licensing in `docs/licensing.md`. Reference in `docs/reference/`. REST API reference in `docs/rest-api.md`. Library reference in `docs/api/` (auto-generated from docstrings). Custom templates in `docs/overrides/`. Config in `mkdocs.yml`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to docs/** : Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docs/design/*.md : Design spec pages: 7 pages in `docs/design/` — index, agents, organization, communication, engine, memory, operations

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state

Applied to files:

  • src/synthorg/providers/management/service.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

  • src/synthorg/providers/management/service.py
📚 Learning: 2026-03-31T14:40:41.736Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:40:41.736Z
Learning: Applies to tests/**/*.py : For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/providers/routing/test_quota_aware_selection.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to **/*.py : Use `except A, B:` (no parentheses) per PEP 758 exception syntax on Python 3.14

Applied to files:

  • src/synthorg/providers/routing/resolver.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (without parentheses) per PEP 758 for exception handling in Python 3.14

Applied to files:

  • src/synthorg/providers/routing/resolver.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling — PEP 758 exception syntax enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/providers/routing/resolver.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Use PEP 758 except syntax with `except A, B:` (no parentheses) for multiple exceptions—ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/providers/routing/resolver.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — enforced by ruff on Python 3.14

Applied to files:

  • src/synthorg/providers/routing/resolver.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Handle errors explicitly; never silently swallow exceptions

Applied to files:

  • src/synthorg/providers/routing/resolver.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow exceptions

Applied to files:

  • src/synthorg/providers/routing/resolver.py
📚 Learning: 2026-03-31T14:40:41.736Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:40:41.736Z
Learning: Applies to **/*.py : PEP 758 except syntax: use `except A, B:` (no parentheses) -- ruff enforces this on Python 3.14

Applied to files:

  • src/synthorg/providers/routing/resolver.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to **/*.py : Handle errors explicitly—never silently swallow exceptions.

Applied to files:

  • src/synthorg/providers/routing/resolver.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/**/*.py : `RetryConfig` and `RateLimiterConfig` are set per-provider in `ProviderConfig`. Retryable errors: `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately.

Applied to files:

  • src/synthorg/providers/routing/resolver.py
📚 Learning: 2026-03-16T19:13:34.746Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T19:13:34.746Z
Learning: Applies to src/synthorg/providers/**/*.py : Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately without retry. RetryExhaustedError signals that all retries failed — the engine layer catches this to trigger fallback chains.

Applied to files:

  • src/synthorg/providers/routing/resolver.py
🔇 Additional comments (29)
tests/unit/providers/routing/conftest.py (1)

88-117: Good multi-provider fixture coverage.

This helper cleanly models same-ref registration across two providers with distinct economics, which is exactly what selector tests need.

tests/unit/budget/test_enforcer.py (1)

103-106: Resolver helper now matches tuple-candidate index contract.

The models is None path and tuple-wrapped index construction are both aligned with the updated resolver interface.

src/synthorg/providers/routing/__init__.py (1)

16-20: Selector API exports are correctly wired.

Re-exporting selector protocol + implementations here keeps the routing public surface coherent.

Also applies to: 45-45, 49-49, 54-54

src/synthorg/observability/events/routing.py (1)

16-17: New routing event constants look correct.

Both names follow the existing domain event scheme and support the new multi-provider selection telemetry.

CLAUDE.md (1)

91-91: Documentation updates are aligned with the implementation changes.

The new budget/provider/routing-event guidance is consistent with the multi-provider selector architecture introduced in this PR.

Also applies to: 101-101, 143-143

src/synthorg/providers/routing/router.py (1)

44-66: LGTM!

The selector parameter is cleanly threaded through to ModelResolver.from_config(). The keyword-only parameter pattern (*,) is appropriate here, and the docstring correctly documents the default behavior (QuotaAwareSelector()).

tests/unit/budget/test_quota_tracker.py (1)

537-632: LGTM!

The TestPeekQuotaAvailable test class provides comprehensive coverage for the synchronous peek_quota_available() API. The test method signatures are correct:

  • test_fresh_tracker_all_available and test_empty_tracker are sync (def) since they don't call async methods.
  • Other tests are async def since they call record_usage() before checking availability.

The datetime patching approach provides deterministic test behavior.

src/synthorg/providers/management/service.py (1)

689-721: LGTM!

The _build_router method cleanly accepts and forwards the selector parameter. The docstring's Note section clearly documents this as an extension point for future quota integration at the service layer. The current caller (_validate_and_persist) correctly uses the default behavior.

tests/unit/providers/routing/test_quota_aware_selection.py (2)

25-29: LGTM!

The datetime mock has been simplified per the previous review feedback - side_effect assignment removed. The context manager now only patches datetime.now().


62-126: LGTM!

The integration tests effectively verify the complete quota-aware selection flow. The three test cases cover the essential scenarios:

  1. Quota exhaustion forcing selection of more expensive provider
  2. All providers exhausted falling back to cheapest
  3. Default behavior (no quota config) selecting cheapest
docs/design/operations.md (2)

170-192: LGTM!

The documentation clearly explains the multi-provider resolution flow, selector behaviors, and how strategies integrate with the selection mechanism. The table format makes the selector comparison easy to understand.


338-343: LGTM!

Good documentation of the peek_quota_available() API, including the explicit TOCTOU tolerance note for routing-time decisions. This accurately reflects the design trade-off between lock-free performance and snapshot consistency.

tests/unit/providers/routing/test_selector.py (2)

40-121: LGTM!

Comprehensive unit tests for QuotaAwareSelector covering all key behaviors: protocol satisfaction, quota filtering, cost-based selection, fallback when all exhausted, unknown provider handling, single candidate cases, empty input error, and deterministic tie-breaking.


126-152: LGTM!

Good coverage for CheapestSelector including protocol satisfaction, cost-based selection, single candidate, tie-breaking, and empty input error handling.

tests/unit/providers/routing/test_resolver_multi_provider.py (4)

64-196: LGTM!

Comprehensive test coverage for the multi-provider index behavior including: no collision errors for same model ID/alias across providers, resolve_all() returning all candidates, all_models() and sorted variants, three-provider scenarios, and duplicate elimination.


201-280: LGTM!

Good coverage for selector integration: default selector type verification, injected selector invocation through resolve() and resolve_safe(), and selector preservation via from_config().


309-380: LGTM!

Excellent coverage for exception handling semantics: wrapping arbitrary exceptions as ModelResolutionError, propagating MemoryError unchanged, re-raising ModelResolutionError directly, and resolve_safe() returning None on errors. These tests ensure the error handling contract is correctly implemented.


386-399: LGTM!

The test correctly verifies that ModelRouter forwards the selector to its internal ModelResolver. Accessing router._resolver.selector is acceptable here for white-box testing of the pass-through behavior.

src/synthorg/providers/routing/selector.py (3)

24-27: LGTM!

The _cost_key helper provides deterministic ordering by using total_cost_per_1k as the primary key and provider_name as the tie-breaker, ensuring stable selection across runs.


52-123: LGTM!

QuotaAwareSelector implementation is well-designed:

  • Immutable after construction via MappingProxyType
  • Correct fallback behavior: prefers quota-available providers, falls back to cheapest when all exhausted
  • Providers absent from quota map are correctly treated as available (line 122)
  • Warning log is appropriately scoped (only when quota map exists and all providers exhausted)
  • Debug logging for observability when multiple candidates exist

125-150: LGTM!

CheapestSelector is a clean, stateless implementation that correctly selects by cost alone. The debug logging for multi-candidate selections provides good observability.

src/synthorg/providers/routing/resolver.py (8)

1-33: LGTM!

Module docstring clearly explains the multi-provider behavior and selector integration. Imports are well-organized with proper event constant imports from the routing domain.


51-83: LGTM!

Constructor correctly validates non-empty candidate tuples and uses MappingProxyType for immutability. The defensive tuple(v) copy on line 72 ensures callers cannot mutate the index after construction.


90-118: LGTM!

Multi-provider registration logic correctly differentiates exact duplicates (skipped with DEBUG log) from distinct providers for the same ref (appended with INFO log). The use of a mutable list here that's converted to tuple in from_config is an appropriate internal pattern.


120-170: LGTM!

Factory method cleanly separates index building from construction. The multi_provider_refs logging at line 168 provides valuable observability for detecting multi-provider configurations at startup.


198-221: LGTM!

Exception handling is well-structured: ModelResolutionError is logged and re-raised, critical errors (MemoryError, RecursionError) propagate immediately, and unexpected exceptions are wrapped with context. The PEP 758 exception syntax on line 209 is correct. Based on learnings: "Use except A, B: (no parentheses) per PEP 758 exception syntax on Python 3.14".


231-274: LGTM!

The resolve_safe method appropriately differentiates error handling: critical errors (MemoryError, RecursionError) still propagate, expected failures (ModelResolutionError) log at DEBUG and return None, while unexpected exceptions log at WARNING with full traceback. This follows the guideline to never silently swallow exceptions. Based on learnings: "Handle errors explicitly; never silently swallow exceptions".


276-285: LGTM!

Clean implementation for accessing all candidates without triggering selection. The empty tuple default is appropriate for the "not found" case and documented in the docstring.


287-301: LGTM!

Updated deduplication using (provider_name, model_id) pairs correctly distinguishes the same model served by different providers. This is essential for the multi-provider feature to properly enumerate all available model variants.

… determinism)

- quota_tracker.py: add runtime event-loop guard to peek_quota_available
  that raises RuntimeError if called from a thread or wrong event loop
  after async methods have been used; store owning loop lazily via
  _capture_loop() on first async operation
- tests/unit/budget/conftest.py: fix make_resolver index-building to
  append candidates instead of overwriting, preserving all variants
  when multiple models share an ID or alias
- tests/unit/providers/routing/test_resolver.py: add repeated-call
  determinism assertion to collision detection test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/budget/quota_tracker.py`:
- Around line 110-127: In _assert_loop_safe(), before each RuntimeError is
raised (both the RuntimeError when no running loop but self._loop is set, and
the RuntimeError when current_loop is different from self._loop), add a
logger.warning (or logger.error per guideline) that logs the owning loop
(self._loop) and the current loop (current_loop or "None" when no running loop)
along with a short message about loop-affinity violation; keep the existing
RuntimeError raises and messages intact so behavior doesn't change, only emit
the structured log with context prior to raising in the _assert_loop_safe()
method of QuotaTracker/QuotaTracker class.
- Around line 457-461: The peek snapshot wrongly treats providers with
usage.tokens == quota.max_tokens as available because `_is_window_exhausted` is
called with `estimated_tokens=0`; update the peek logic in quota_tracker (the
block using `quota = quota_map.get(window_type)` and calling
`_is_window_exhausted(usage, quota, estimated_tokens=0)`) to treat exact-cap
providers as exhausted—either call `_is_window_exhausted` with
`estimated_tokens=1` or change the check to use a >= comparison (usage.tokens >=
quota.max_tokens) so that `QuotaAwareSelector`
(src/synthorg/providers/routing/selector.py) will not prefer providers with zero
remaining token headroom.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 61249b05-c147-4c30-aa6e-407110d1c377

📥 Commits

Reviewing files that changed from the base of the PR and between 6242529 and f61b9ee.

📒 Files selected for processing (3)
  • src/synthorg/budget/quota_tracker.py
  • tests/unit/budget/conftest.py
  • tests/unit/providers/routing/test_resolver.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Backend
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Web
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Python version 3.14+ with PEP 649 native lazy annotations
Do NOT use from __future__ import annotations -- Python 3.14 has PEP 649
PEP 758 except syntax: use except A, B: (no parentheses) -- ruff enforces this on Python 3.14
Line length: 88 characters (ruff enforced)

Files:

  • tests/unit/providers/routing/test_resolver.py
  • tests/unit/budget/conftest.py
  • src/synthorg/budget/quota_tracker.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Async tests: asyncio_mode = "auto" -- no manual @pytest.mark.asyncio needed
Test timeout: 30 seconds per test (global in pyproject.toml -- do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Tests must use test-provider, test-small-001, etc. instead of real vendor names
Property-based testing in Python: use Hypothesis (@given + @settings); run with HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties (dev profile: 1000 examples)
Hypothesis profiles: ci (50 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var
For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins
For tasks that must block indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number) -- it is cancellation-safe and carries no timing assumptions

Files:

  • tests/unit/providers/routing/test_resolver.py
  • tests/unit/budget/conftest.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict)
Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields -- including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants -- instead of manual whitespace validators
Async concurrency: prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls); prefer structured concurrency over bare create_task
Functions: < 50 lines, files < 800 lines
Errors: handle explicitly, never silently swallow
Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__)
Never use import logging / logging.getLogger() / print() in application code (exception: observability/setup.py and observability/sinks.py may use stdlib logging and print(..., file=sys.stderr) for bootstrap and handler-cleanup code)
Variable name for logger: always logger (not _logger, not log)
Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool); import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT
Structured kwargs for logging: always logger.info(EVENT, key=value) -- never logger.info("msg %s", val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO
DEBUG logging for object creation, internal flow, entry/exit of key...

Files:

  • src/synthorg/budget/quota_tracker.py
🧠 Learnings (2)
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget tracking includes pre-flight/in-flight checks, auto-downgrade, billing periods, cost tiers, quota/subscription. CFO includes anomaly detection, efficiency analysis, downgrade recommendations.

Applied to files:

  • src/synthorg/budget/quota_tracker.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)

Applied to files:

  • src/synthorg/budget/quota_tracker.py
🔇 Additional comments (2)
tests/unit/budget/conftest.py (1)

363-370: LGTM! Correctly accumulates multi-provider candidates.

The index now properly appends each ResolvedModel to existing tuples rather than overwriting, which aligns with the new ModelResolver.__init__ signature that expects dict[str, tuple[ResolvedModel, ...]]. This resolves the previously flagged issue where duplicate candidates for shared refs were being dropped.

tests/unit/providers/routing/test_resolver.py (1)

172-205: LGTM! Test correctly validates multi-provider resolution and determinism.

The test properly verifies:

  1. from_config succeeds with duplicate aliases (no longer raises)
  2. Determinism via 5 repeated resolve() calls (addresses the past review concern)
  3. resolve_all() returns both candidates
  4. Both providers are represented in the candidate set

The determinism check using only model_id comparison is sufficient here since the two providers have distinct model IDs.

…peek

- _assert_loop_safe: log at WARNING/ERROR with owning_loop and
  current_loop context before raising RuntimeError (CLAUDE.md rule)
- Add QUOTA_LOOP_AFFINITY_VIOLATED event constant
- peek_quota_available: use estimated_tokens=1 instead of 0 so
  providers at exactly the token cap are treated as exhausted
  (zero headroom means next request will fail)
- Add test_token_exact_cap_shows_unavailable test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/design/operations.md`:
- Around line 338-343: Update the QuotaTracker documentation for
peek_quota_available(): explicitly state that
QuotaTracker.peek_quota_available() enforces asyncio loop-affinity and will
raise a RuntimeError if invoked from a different thread or event loop, that it
reads cached counters without acquiring the async lock (so TOCTOU is tolerated),
and that the returned dict[str, bool] snapshot omits any providers that have no
configured quota; also mention that QuotaAwareSelector relies on this behavior
when making heuristic routing choices.
- Around line 184-187: Update the wording to avoid implying unconditional
wiring: change the sentence about QuotaAwareSelector so it states that
QuotaAwareSelector is constructed with a snapshot from
QuotaTracker.peek_quota_available() only "when injected/built with a snapshot"
rather than implying ModelResolver/ModelRouter always receive that selector;
reference ModelResolver, ModelRouter, QuotaAwareSelector, and
QuotaTracker.peek_quota_available() in the revised sentence to make the
conditional/optional nature explicit.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 04d851e1-5e06-4c32-afd6-b6a0b91c5cbc

📥 Commits

Reviewing files that changed from the base of the PR and between e0fd65d and c70d0a7.

📒 Files selected for processing (2)
  • CLAUDE.md
  • docs/design/operations.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Type Check
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Build Sandbox
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (1)
docs/design/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

When approved deviations occur, update the relevant docs/design/ page to reflect the new reality

Files:

  • docs/design/operations.md
🧠 Learnings (25)
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget tracking includes pre-flight/in-flight checks, auto-downgrade, billing periods, cost tiers, quota/subscription. CFO includes anomaly detection, efficiency analysis, downgrade recommendations.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/api/**/*.py : API package (api/): Litestar REST + WebSocket with controllers, guards, channels, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint, provider management endpoint (CRUD + test + presets), backup endpoint, RFC 9457 structured errors, AppState hot-reload slots, service auto-wiring (Phase 1 at construction, Phase 2 on startup), lifecycle helpers

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/budget/**/*.py : Budget package (budget/): cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-26T15:18:16.848Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-26T15:18:16.848Z
Learning: Applies to src/synthorg/api/**/*.py : Litestar API must include setup wizard, auth/, auto-wiring, and lifecycle management

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/api/**/*.py : REST API: Litestar framework, controllers with guards, channels for WebSocket, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint. RFC 9457 structured errors (ErrorCategory, ErrorCode, ErrorDetail, ProblemDetail, CATEGORY_TITLES, category_title, category_type_uri, content negotiation).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/communication/**/*.py : Communication package (communication/): message bus, dispatcher, messenger, channels, delegation, loop prevention, conflict resolution; meeting/ subpackage for meeting protocol (round-robin, position papers, structured phases), scheduler (frequency, participant resolver), orchestrator

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/api/**/*.py : Use Litestar for REST + WebSocket API. Controllers, guards, channels, JWT + API key + WS ticket auth, RFC 9457 structured errors.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/engine/**/*.py : Engine package (engine/): agent orchestration, parallel execution, task decomposition, routing, TaskEngine (centralized single-writer), task lifecycle/recovery/shutdown, workspace isolation, coordination (4 dispatchers: SAS/centralized/decentralized/context-dependent, wave execution), approval gates (escalation detection, context parking/resume), stagnation detection (ToolRepetitionDetector, corrective prompt injection), AgentRuntimeState (execution status), context budget management, conversation compaction (oldest-turns summarizer)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/api/**/*.py : Use Litestar for REST API and WebSocket API with JWT + API key + WS ticket authentication, RFC 9457 structured errors, and content negotiation.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to src/synthorg/providers/**/*.py : Providers: LLM provider abstraction (LiteLLM adapter), auth types (api_key/oauth/custom_header/none), presets (PROVIDER_PRESETS), runtime CRUD (ProviderManagementService with asyncio.Lock serialization), hot-reload via AppState swap.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Settings: Runtime-editable settings persistence (DB > env > YAML > code defaults), typed definitions (9 namespaces), Fernet encryption for sensitive values, config bridge, ConfigResolver (typed composed reads for controllers), validation, registry, change notifications via message bus. Per-namespace setting definitions in definitions/ submodule (api, company, providers, memory, budget, security, coordination, observability, backup).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/memory/**/*.py : Memory package (memory/): pluggable MemoryBackend protocol, backends/ (Mem0 adapter), retrieval pipeline (ranking, RRF fusion, injection, formatting, non-inferable filtering), shared org memory (org/), consolidation/archival (density-aware: DensityClassifier, AbstractiveSummarizer, ExtractivePreserver, DualModeConsolidationStrategy)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Documentation source in `docs/` (Markdown, built with Zensical). Design spec in `docs/design/` (7 pages: index, agents, organization, communication, engine, memory, operations). Architecture in `docs/architecture/` (overview, tech-stack, decision log). Roadmap in `docs/roadmap/`. Security in `docs/security.md`. Licensing in `docs/licensing.md`. Reference in `docs/reference/`. REST API reference in `docs/rest-api.md`. Library reference in `docs/api/` (auto-generated from docstrings). Custom templates in `docs/overrides/`. Config in `mkdocs.yml`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to docs/** : Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Budget: Cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError).

Applied to files:

  • docs/design/operations.md
🔇 Additional comments (3)
CLAUDE.md (3)

91-91: Accurate budget-domain scope update.

This line correctly documents the new quota snapshot usage for routing-time selection and aligns with the PR’s quota-aware resolution objective.


101-101: Provider routing description is aligned with implementation changes.

The added routing/ and selector details are clear and match the multi-provider model resolution strategy introduced in this PR.


143-143: Good observability convention update.

Including ROUTING_CANDIDATE_SELECTED in the event-constant examples keeps the logging guidance in sync with the new routing behavior.

Comment on lines +184 to +187
The selector is injected into `ModelResolver` (and transitively into `ModelRouter`)
at construction time. `QuotaAwareSelector` is constructed with a snapshot from
`QuotaTracker.peek_quota_available()`, which returns a synchronous `dict[str, bool]`
of per-provider quota availability.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Documentation overstates quota-snapshot wiring path.

Line 185-Line 187 currently reads as unconditional behavior, but the service layer still appears to default to QuotaAwareSelector() unless a selector is explicitly injected. Please reword to “when injected/built with a snapshot” to match current wiring.

Proposed doc fix
-The selector is injected into `ModelResolver` (and transitively into `ModelRouter`)
-at construction time.  `QuotaAwareSelector` is constructed with a snapshot from
-`QuotaTracker.peek_quota_available()`, which returns a synchronous `dict[str, bool]`
-of per-provider quota availability.
+The selector is injected into `ModelResolver` (and transitively into `ModelRouter`)
+at construction time. When quota-aware routing is wired at the service layer,
+`QuotaAwareSelector` can be constructed with a snapshot from
+`QuotaTracker.peek_quota_available()`, which returns a synchronous `dict[str, bool]`
+of per-provider quota availability.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
The selector is injected into `ModelResolver` (and transitively into `ModelRouter`)
at construction time. `QuotaAwareSelector` is constructed with a snapshot from
`QuotaTracker.peek_quota_available()`, which returns a synchronous `dict[str, bool]`
of per-provider quota availability.
The selector is injected into `ModelResolver` (and transitively into `ModelRouter`)
at construction time. When quota-aware routing is wired at the service layer,
`QuotaAwareSelector` can be constructed with a snapshot from
`QuotaTracker.peek_quota_available()`, which returns a synchronous `dict[str, bool]`
of per-provider quota availability.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/operations.md` around lines 184 - 187, Update the wording to
avoid implying unconditional wiring: change the sentence about
QuotaAwareSelector so it states that QuotaAwareSelector is constructed with a
snapshot from QuotaTracker.peek_quota_available() only "when injected/built with
a snapshot" rather than implying ModelResolver/ModelRouter always receive that
selector; reference ModelResolver, ModelRouter, QuotaAwareSelector, and
QuotaTracker.peek_quota_available() in the revised sentence to make the
conditional/optional nature explicit.

Comment on lines +338 to +343
`QuotaTracker` also exposes a synchronous `peek_quota_available()` method that returns
a `dict[str, bool]` snapshot of per-provider quota availability. This is used by the
`QuotaAwareSelector` at routing time to prefer providers with remaining quota. The
method reads cached counters without acquiring the async lock (safe on the single-threaded
asyncio event loop) and tolerates TOCTOU for heuristic selection decisions.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add explicit peek_quota_available() runtime constraints.

This section should also note that peek_quota_available() enforces loop-affinity and can raise RuntimeError when called from the wrong context, and that providers without configured quotas are omitted from the snapshot.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/operations.md` around lines 338 - 343, Update the QuotaTracker
documentation for peek_quota_available(): explicitly state that
QuotaTracker.peek_quota_available() enforces asyncio loop-affinity and will
raise a RuntimeError if invoked from a different thread or event loop, that it
reads cached counters without acquiring the async lock (so TOCTOU is tolerated),
and that the returned dict[str, bool] snapshot omits any providers that have no
configured quota; also mention that QuotaAwareSelector relies on this behavior
when making heuristic routing choices.

@Aureliolo Aureliolo merged commit 146b782 into main Mar 31, 2026
28 checks passed
@Aureliolo Aureliolo deleted the feat/multi-provider-resolution branch March 31, 2026 22:06
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 31, 2026 22:06 — with GitHub Actions Inactive
Aureliolo added a commit that referenced this pull request Apr 1, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.5.4](v0.5.3...v0.5.4)
(2026-04-01)


### Features

* artifact and project management UI in web dashboard
([#954](#954))
([00a0430](00a0430))
* embed MkDocs build output in React web dashboard at /docs
([#948](#948))
([f229fc2](f229fc2))
* personality preset discovery API and user-defined preset CRUD
([#952](#952))
([497848a](497848a))
* support multi-provider model resolution with budget-based selection
([#953](#953))
([146b782](146b782))
* support per-agent memory retention overrides
([#209](#209))
([#951](#951))
([020c610](020c610))


### Documentation

* write user guides and tutorials
([#949](#949))
([1367225](1367225))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: Support multi-provider model resolution with budget-based selection

1 participant