Skip to content

feat(settings): implement settings change subscriptions for service hot-reload#526

Merged
Aureliolo merged 7 commits intomainfrom
feat/settings-hot-reload
Mar 17, 2026
Merged

feat(settings): implement settings change subscriptions for service hot-reload#526
Aureliolo merged 7 commits intomainfrom
feat/settings-hot-reload

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

  • Add generic mechanism for services to subscribe to #settings message bus channel and react to runtime setting changes
  • SettingsSubscriber protocol: declares (namespace, key) pairs + async callback
  • SettingsChangeDispatcher: polls #settings bus channel, dispatches to matching subscribers, skips restart_required=True settings with operator WARNING
  • ProviderSettingsSubscriber: rebuilds ModelRouter on routing_strategy change (extracts .value from SettingValue), advisory logging for default_provider/retry_max_attempts
  • MemorySettingsSubscriber: logs that non-restart settings take effect on next use
  • AppState: swappable provider_registry/model_router slots with swap_* methods for reference replacement
  • Add #settings to default message bus channels
  • Wire dispatcher into app lifecycle (start after bridge, stop before bridge)
  • Done callback on dispatcher task detects silent poll-loop death

Test plan

  • test_subscriber_protocol.py — runtime-checkable conformance, structural subtyping
  • test_dispatcher.py — lifecycle (start/stop/double-start/idempotent), dispatch routing (match/non-match/multi), restart_required skip, error isolation, metadata extraction (missing/partial/default), done_callback running flag reset
  • test_provider_subscriber.py — watched keys, strategy change swaps router (with real SettingValue mock), rebuild failure propagates UnknownStrategyError, advisory keys are no-ops
  • test_memory_subscriber.py — watched keys (excludes backend), info logging, idempotent
  • test_state_swap.py — swap replaces reference, swap from None, 503 when None, has_* flags
  • test_app.py — existing lifecycle tests updated for new dispatcher parameter
  • Full suite: 8933 passed, 94.39% coverage

Review coverage

Pre-reviewed by 12 agents: code-reviewer, python-reviewer, test-analyzer, silent-failure-hunter, type-design-analyzer, logging-audit, conventions-enforcer, async-concurrency-reviewer, resilience-audit, docs-consistency, comment-analyzer, issue-resolution-verifier. 12 findings addressed (1 critical bug: SettingValue.value extraction, 1 critical: done_callback, 1 major: error propagation, rest medium).

Closes #503

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 17, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 17, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f6e5d20e-c39d-4483-aec9-ddb2f69c8fa4

📥 Commits

Reviewing files that changed from the base of the PR and between f5a8292 and fd5c999.

📒 Files selected for processing (1)
  • tests/unit/settings/test_dispatcher.py

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Runtime settings dispatcher that watches a settings channel, starts/stops with the app, and notifies subscribers.
    • Hot-swap support for provider registry and model router at runtime.
    • Built-in memory and provider settings subscribers.
    • New API entry for runtime-editable settings (/api/v1/settings).
  • Observability

    • Added events for dispatcher/subscriber lifecycle, errors, and service swap notifications.
  • Tests

    • Comprehensive unit tests for dispatcher, subscribers, and state-swap behavior.
  • Documentation

    • Operations docs updated to describe hot-reload settings behavior and subscribers.

Walkthrough

Adds a settings hot-reload subsystem: a runtime-checkable SettingsSubscriber protocol, a SettingsChangeDispatcher that polls the #settings message bus and routes change notifications to subscribers, concrete subscribers (ProviderSettingsSubscriber, MemorySettingsSubscriber), AppState swappable slots for provider_registry and model_router, and lifecycle wiring into app startup/shutdown.

Changes

Cohort / File(s) Summary
Core settings API & dispatcher
src/synthorg/settings/subscriber.py, src/synthorg/settings/dispatcher.py, src/synthorg/settings/__init__.py
Adds SettingsSubscriber protocol and SettingsChangeDispatcher (ensures #settings channel, polls, extracts metadata, filters restart_required, dispatches to matching subscribers, provides start/stop lifecycle; exposes protocol in package).
Concrete subscribers
src/synthorg/settings/subscribers/__init__.py, src/synthorg/settings/subscribers/memory_subscriber.py, src/synthorg/settings/subscribers/provider_subscriber.py
Adds MemorySettingsSubscriber (watches memory keys, logs notifications) and ProviderSettingsSubscriber (watches provider keys; rebuilds and swaps ModelRouter on routing_strategy changes; logs advisory key changes and errors).
App lifecycle & wiring
src/synthorg/api/app.py
Builds and wires SettingsChangeDispatcher into create_app, lifecycle, startup/shutdown, and failure-cleanup via _build_settings_dispatcher; updates startup/shutdown/cleanup signatures to accept dispatcher and tracks start/stop for proper cleanup.
AppState hot-swap extensions
src/synthorg/api/state.py
Adds optional _provider_registry and _model_router slots, has_provider_registry / has_model_router, provider_registry / model_router accessors (raise 503 if absent), and swap_provider_registry / swap_model_router methods with swap logging.
Messaging & observability
src/synthorg/communication/config.py, src/synthorg/observability/events/settings.py
Adds default #settings channel and new observability constants for dispatcher/subscriber/service swap lifecycle events.
Settings definitions
src/synthorg/settings/definitions/providers.py
Makes routing_strategy enum values dynamic by deriving them from STRATEGY_MAP.keys() instead of a hard-coded list.
Tests & docs
tests/unit/api/test_app.py, tests/unit/api/test_state_swap.py, tests/unit/settings/*, docs/design/operations.md
Updates tests for updated startup/shutdown arity; adds comprehensive dispatcher and subscriber unit tests; adds docs entry for /api/v1/settings and hot-reload behavior.

Sequence Diagram

sequenceDiagram
    participant MB as MessageBus
    participant Disp as SettingsChangeDispatcher
    participant Sub as SettingsSubscriber
    participant State as AppState

    MB->>Disp: publish change to `#settings`
    Disp->>Disp: ensure channel / poll message
    Disp->>Disp: extract metadata (namespace,key,restart_required)
    alt restart_required == True
        Disp->>Disp: log restart-required and skip notify
    else
        Disp->>Disp: select subscribers matching (namespace,key)
        Disp->>Sub: on_settings_changed(namespace, key)
        Sub->>Sub: read settings / rebuild service (if applicable)
        Sub->>State: swap_* (replace instance in AppState)
        State-->>Sub: acknowledge swap
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 45.93% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: implementing settings change subscriptions for service hot-reload, which is the primary focus of this PR.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, covering all key components (SettingsSubscriber, SettingsChangeDispatcher, concrete subscribers, AppState hot-swap support) and linking to the resolved issue.
Linked Issues check ✅ Passed The code changes fully implement all objectives from issue #503: generic subscription mechanism, dispatcher listening to #settings channel, subscriber declarations, atomic swaps, restart_required filtering, and graceful error handling.
Out of Scope Changes check ✅ Passed All code changes are directly scoped to implementing settings change subscriptions, hot-reload dispatcher, concrete subscribers, and AppState swap support as defined in issue #503; no unrelated changes detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/settings-hot-reload
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch feat/settings-hot-reload
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust system for dynamic, runtime configuration updates within the application. By establishing a generic subscription mechanism via a new message bus channel, services can now react immediately to changes in specific settings without requiring a full application restart. This significantly enhances the application's flexibility and operational efficiency, particularly for settings related to provider routing and memory configuration, enabling a more agile response to evolving operational needs.

Highlights

  • Settings Change Subscription Mechanism: Implemented a generic mechanism allowing services to subscribe to the '#settings' message bus channel and react to runtime setting changes.
  • SettingsSubscriber Protocol: Introduced a SettingsSubscriber protocol that defines (namespace, key) pairs and an asynchronous callback for handling setting changes.
  • SettingsChangeDispatcher: Developed a SettingsChangeDispatcher to poll the '#settings' bus channel, dispatch changes to matching subscribers, and skip settings marked as restart_required=True with a warning.
  • ProviderSettingsSubscriber: Added a ProviderSettingsSubscriber that rebuilds the ModelRouter when the routing_strategy setting changes, extracting the value from SettingValue.
  • MemorySettingsSubscriber: Included a MemorySettingsSubscriber to log advisory messages for non-restart-required memory settings, indicating they take effect on next use.
  • AppState Hot-Reload Slots: Enhanced AppState with swappable provider_registry and model_router slots, along with swap_* methods for reference replacement to support hot-reloading.
  • Message Bus Integration: Added the '#settings' channel to the default message bus channels and wired the settings dispatcher into the application lifecycle, starting it after the bridge and stopping it before.
  • Dispatcher Robustness: Implemented a done callback on the dispatcher task to detect silent poll-loop termination, improving system resilience.
Changelog
  • CLAUDE.md
    • Updated documentation to reflect new AppState hot-reload slots and settings dispatcher lifecycle.
  • src/synthorg/api/app.py
    • Imported SettingsChangeDispatcher and concrete subscribers.
    • Added settings_dispatcher parameter to lifecycle functions (_build_lifecycle, _safe_startup, _safe_shutdown, _cleanup_on_failure).
    • Integrated settings_dispatcher startup and shutdown into the application's safe lifecycle management.
    • Added a new helper function _build_settings_dispatcher to instantiate the dispatcher with its subscribers.
  • src/synthorg/api/state.py
    • Imported new observability events and provider-related types (ProviderRegistry, ModelRouter).
    • Added _model_router and _provider_registry attributes to AppState's __slots__ and __init__.
    • Implemented has_provider_registry, provider_registry, swap_provider_registry properties and methods for hot-reloading the provider registry.
    • Implemented has_model_router, model_router, swap_model_router properties and methods for hot-reloading the model router.
  • src/synthorg/communication/config.py
    • Added '#settings' to the list of default message bus channels.
  • src/synthorg/observability/events/settings.py
    • Added new event constants for settings dispatcher and subscriber activities, including start/stop, errors, notifications, and service swaps.
  • src/synthorg/settings/init.py
    • Exported the new SettingsSubscriber protocol.
  • src/synthorg/settings/dispatcher.py
    • Added SettingsChangeDispatcher class to poll the '#settings' channel and dispatch messages to registered subscribers.
    • Implemented start and stop methods for managing the dispatcher's polling loop.
    • Included logic to handle restart_required settings by logging warnings and skipping subscriber calls.
    • Provided error isolation for individual subscriber failures and a done callback for unexpected loop exits.
    • Added _extract_metadata helper function to parse message metadata for settings changes.
  • src/synthorg/settings/subscriber.py
    • Added SettingsSubscriber protocol defining the interface for services reacting to settings changes, including watched_keys, subscriber_name, and on_settings_changed.
  • src/synthorg/settings/subscribers/init.py
    • Created an __init__.py file to expose MemorySettingsSubscriber and ProviderSettingsSubscriber.
  • src/synthorg/settings/subscribers/memory_subscriber.py
    • Added MemorySettingsSubscriber to log advisory messages for changes to non-restart-required memory settings.
  • src/synthorg/settings/subscribers/provider_subscriber.py
    • Added ProviderSettingsSubscriber to rebuild the ModelRouter when the providers/routing_strategy setting changes, and to log advisory messages for other provider settings.
  • tests/unit/api/test_app.py
    • Updated _safe_startup and _safe_shutdown calls in tests to include the new settings_dispatcher parameter.
  • tests/unit/api/test_state_swap.py
    • Added new unit tests for AppState's provider_registry and model_router swap methods, covering availability checks, return values, and reference replacement.
  • tests/unit/settings/test_dispatcher.py
    • Added comprehensive unit tests for SettingsChangeDispatcher, covering lifecycle, dispatch routing, error isolation, metadata extraction, and done callback behavior.
  • tests/unit/settings/test_memory_subscriber.py
    • Added unit tests for MemorySettingsSubscriber, verifying protocol conformance, watched keys, subscriber name, and idempotent behavior of on_settings_changed.
  • tests/unit/settings/test_provider_subscriber.py
    • Added unit tests for ProviderSettingsSubscriber, confirming protocol conformance, watched keys, subscriber name, and the logic for rebuilding the ModelRouter on strategy changes, including error propagation and advisory keys.
  • tests/unit/settings/test_subscriber_protocol.py
    • Added unit tests to verify the structural conformance of classes to the SettingsSubscriber protocol, including checks for missing methods and property types.
Activity
  • Pre-reviewed by 12 agents: code-reviewer, python-reviewer, test-analyzer, silent-failure-hunter, type-design-analyzer, logging-audit, conventions-enforcer, async-concurrency-reviewer, resilience-audit, docs-consistency, comment-analyzer, issue-resolution-verifier.
  • 12 findings addressed, including 1 critical bug related to SettingValue.value extraction, 1 critical issue with done_callback, and 1 major issue with error propagation, with the rest being medium severity.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 17, 2026 19:11 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a robust mechanism for hot-reloading settings, which is an excellent feature for runtime configuration management. The implementation includes a new SettingsChangeDispatcher, a SettingsSubscriber protocol, and concrete subscribers for provider and memory settings, all well-integrated into the application lifecycle. The accompanying tests are comprehensive and cover many edge cases. However, I've identified several critical syntax errors in the new SettingsChangeDispatcher that use outdated Python 2 exception handling syntax. These will cause SyntaxError exceptions in Python 3 and must be fixed.

await self._dispatch(envelope.message)
except asyncio.CancelledError:
break
except MemoryError, RecursionError:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This except syntax is for Python 2. In Python 3, multiple exceptions must be specified as a tuple. This will cause a SyntaxError at runtime.

            except (MemoryError, RecursionError):

break
except MemoryError, RecursionError:
raise
except OSError, TimeoutError:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Similar to the previous comment, this except syntax is for Python 2 and will cause a SyntaxError in Python 3.

            except (OSError, TimeoutError):

namespace=namespace,
key=key,
)
except MemoryError, RecursionError:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This is another instance of Python 2 except syntax. It should be updated for Python 3 compatibility to avoid a SyntaxError.

Suggested change
except MemoryError, RecursionError:
except (MemoryError, RecursionError):

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 96.41256% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.18%. Comparing base (4603c9e) to head (fd5c999).
⚠️ Report is 2 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/synthorg/settings/dispatcher.py 95.19% 3 Missing and 2 partials ⚠️
src/synthorg/api/app.py 91.30% 1 Missing and 1 partial ⚠️
...nthorg/settings/subscribers/provider_subscriber.py 97.14% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #526      +/-   ##
==========================================
+ Coverage   93.15%   93.18%   +0.02%     
==========================================
  Files         517      522       +5     
  Lines       24936    25157     +221     
  Branches     2369     2386      +17     
==========================================
+ Hits        23230    23443     +213     
- Misses       1351     1356       +5     
- Partials      355      358       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/settings/test_dispatcher.py`:
- Around line 471-501: The test is timing-sensitive and uses a fixed sleep;
replace the sleep with awaiting the dispatcher task directly: after starting the
dispatcher and asserting d._task is not None, await d._task (or await
asyncio.wait_for(d._task, timeout=...) to bound test runtime) so the test
deterministically waits for the poll loop to exit on the ValueError from the
overridden _ErrorBus.receive; then assert d._running is False. Reference
SettingsChangeDispatcher, TestDoneCallback, _ErrorBus.receive, and
d._task/d._running when locating the change.

In `@tests/unit/settings/test_provider_subscriber.py`:
- Around line 94-115: Move the local import of UnknownStrategyError out of the
test body and add it to the module-level imports; specifically, import
UnknownStrategyError from synthorg.providers.routing.errors at the top of the
test module and remove the inline "from synthorg.providers.routing.errors import
UnknownStrategyError" from inside test_rebuild_failure_propagates so the test
references UnknownStrategyError directly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: dbe2b4c4-71bf-469e-bf46-7883ebb765d8

📥 Commits

Reviewing files that changed from the base of the PR and between 181eb8a and 5277967.

📒 Files selected for processing (17)
  • CLAUDE.md
  • src/synthorg/api/app.py
  • src/synthorg/api/state.py
  • src/synthorg/communication/config.py
  • src/synthorg/observability/events/settings.py
  • src/synthorg/settings/__init__.py
  • src/synthorg/settings/dispatcher.py
  • src/synthorg/settings/subscriber.py
  • src/synthorg/settings/subscribers/__init__.py
  • src/synthorg/settings/subscribers/memory_subscriber.py
  • src/synthorg/settings/subscribers/provider_subscriber.py
  • tests/unit/api/test_app.py
  • tests/unit/api/test_state_swap.py
  • tests/unit/settings/test_dispatcher.py
  • tests/unit/settings/test_memory_subscriber.py
  • tests/unit/settings/test_provider_subscriber.py
  • tests/unit/settings/test_subscriber_protocol.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (go)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649 with native lazy annotations
Use except A, B: syntax (no parentheses) for exception handling in Python 3.14, as enforced by ruff PEP 758
All public functions and methods require type hints; mypy strict mode enforced
Docstrings: Google style, required on all public classes and functions, enforced by ruff D rules
Create new objects instead of mutating existing ones; never mutate. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with adopted conventions: @computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr from core.types for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (enforced by ruff)
Functions must be < 50 lines; files must be < 800 lines
Handle errors explicitly, never silently swallow them
Validate at system boundaries (user input, external APIs, config files)

Files:

  • src/synthorg/api/app.py
  • src/synthorg/settings/__init__.py
  • src/synthorg/settings/subscribers/memory_subscriber.py
  • src/synthorg/settings/subscribers/provider_subscriber.py
  • tests/unit/api/test_state_swap.py
  • src/synthorg/observability/events/settings.py
  • tests/unit/settings/test_provider_subscriber.py
  • tests/unit/settings/test_memory_subscriber.py
  • src/synthorg/communication/config.py
  • tests/unit/settings/test_subscriber_protocol.py
  • src/synthorg/settings/subscriber.py
  • src/synthorg/settings/subscribers/__init__.py
  • src/synthorg/settings/dispatcher.py
  • src/synthorg/api/state.py
  • tests/unit/api/test_app.py
  • tests/unit/settings/test_dispatcher.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every module with business logic MUST have from synthorg.observability import get_logger then logger = get_logger(__name__); never use import logging / logging.getLogger() / print() in application code
Logger variable name must always be logger (not _logger, not log)
Event names must always use constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT
Always use structured logging: logger.info(EVENT, key=value) — never logger.info('msg %s', val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
DEBUG logging for object creation, internal flow, and entry/exit of key functions
All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code.
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig; retryable errors (is_retryable=True) include RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError
Non-retryable errors raise immediately without retry; RetryExhaustedError signals that all retries failed — engine layer catches this to trigger fallback chains
Rate limiter respects RateLimitError.retry_after from providers — automatically pauses future requests
Library reference: auto-generated from docstrings via mkdocstrings + Griffe (AST-based, no imports) in docs/api/

Files:

  • src/synthorg/api/app.py
  • src/synthorg/settings/__init__.py
  • src/synthorg/settings/subscribers/memory_subscriber.py
  • src/synthorg/settings/subscribers/provider_subscriber.py
  • src/synthorg/observability/events/settings.py
  • src/synthorg/communication/config.py
  • src/synthorg/settings/subscriber.py
  • src/synthorg/settings/subscribers/__init__.py
  • src/synthorg/settings/dispatcher.py
  • src/synthorg/api/state.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow for test categorization
Use asyncio_mode = 'auto' in pytest — no manual @pytest.mark.asyncio needed
Per-test timeout: 30 seconds
Prefer @pytest.mark.parametrize for testing similar cases
Use test-provider, test-small-001, etc. in tests instead of real vendor names
Use Python Hypothesis for property-based testing with @given + @settings decorators
NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic

Files:

  • tests/unit/api/test_state_swap.py
  • tests/unit/settings/test_provider_subscriber.py
  • tests/unit/settings/test_memory_subscriber.py
  • tests/unit/settings/test_subscriber_protocol.py
  • tests/unit/api/test_app.py
  • tests/unit/settings/test_dispatcher.py
🧠 Learnings (5)
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/observability/events/settings.py
  • src/synthorg/api/state.py
📚 Learning: 2026-03-17T17:24:41.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T17:24:41.990Z
Learning: Applies to src/synthorg/**/*.py : Event names must always use constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • src/synthorg/observability/events/settings.py
  • src/synthorg/api/state.py
📚 Learning: 2026-03-17T17:24:41.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T17:24:41.990Z
Learning: Applies to src/synthorg/**/*.py : Library reference: auto-generated from docstrings via mkdocstrings + Griffe (AST-based, no imports) in `docs/api/`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T17:24:41.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T17:24:41.990Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`; never use `import logging` / `logging.getLogger()` / `print()` in application code

Applied to files:

  • src/synthorg/api/state.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • src/synthorg/api/state.py
🧬 Code graph analysis (10)
src/synthorg/api/app.py (3)
src/synthorg/settings/dispatcher.py (3)
  • SettingsChangeDispatcher (39-215)
  • stop (93-105)
  • start (68-91)
src/synthorg/settings/subscribers/memory_subscriber.py (1)
  • MemorySettingsSubscriber (19-57)
src/synthorg/settings/subscribers/provider_subscriber.py (1)
  • ProviderSettingsSubscriber (25-110)
src/synthorg/settings/__init__.py (1)
src/synthorg/settings/subscriber.py (1)
  • SettingsSubscriber (11-55)
src/synthorg/settings/subscribers/memory_subscriber.py (4)
src/synthorg/settings/subscribers/provider_subscriber.py (3)
  • watched_keys (57-59)
  • subscriber_name (62-64)
  • on_settings_changed (66-89)
tests/unit/settings/test_dispatcher.py (4)
  • watched_keys (62-63)
  • subscriber_name (66-67)
  • on_settings_changed (69-70)
  • on_settings_changed (76-78)
src/synthorg/settings/subscriber.py (3)
  • watched_keys (31-33)
  • subscriber_name (36-38)
  • on_settings_changed (40-55)
tests/unit/settings/test_subscriber_protocol.py (5)
  • watched_keys (12-13)
  • watched_keys (31-32)
  • subscriber_name (16-17)
  • subscriber_name (35-36)
  • on_settings_changed (19-24)
tests/unit/api/test_state_swap.py (2)
src/synthorg/api/errors.py (1)
  • ServiceUnavailableError (222-231)
src/synthorg/api/state.py (6)
  • provider_registry (300-305)
  • has_provider_registry (295-297)
  • swap_provider_registry (307-327)
  • model_router (335-337)
  • has_model_router (330-332)
  • swap_model_router (339-359)
tests/unit/settings/test_provider_subscriber.py (3)
src/synthorg/api/state.py (3)
  • AppState (36-359)
  • model_router (335-337)
  • settings_service (246-248)
src/synthorg/settings/subscriber.py (4)
  • SettingsSubscriber (11-55)
  • watched_keys (31-33)
  • subscriber_name (36-38)
  • on_settings_changed (40-55)
src/synthorg/settings/subscribers/provider_subscriber.py (4)
  • ProviderSettingsSubscriber (25-110)
  • watched_keys (57-59)
  • subscriber_name (62-64)
  • on_settings_changed (66-89)
tests/unit/settings/test_memory_subscriber.py (3)
src/synthorg/settings/subscriber.py (4)
  • SettingsSubscriber (11-55)
  • watched_keys (31-33)
  • subscriber_name (36-38)
  • on_settings_changed (40-55)
src/synthorg/settings/subscribers/memory_subscriber.py (4)
  • MemorySettingsSubscriber (19-57)
  • watched_keys (31-33)
  • subscriber_name (36-38)
  • on_settings_changed (40-57)
src/synthorg/settings/subscribers/provider_subscriber.py (3)
  • watched_keys (57-59)
  • subscriber_name (62-64)
  • on_settings_changed (66-89)
src/synthorg/settings/subscribers/__init__.py (2)
src/synthorg/settings/subscribers/memory_subscriber.py (1)
  • MemorySettingsSubscriber (19-57)
src/synthorg/settings/subscribers/provider_subscriber.py (1)
  • ProviderSettingsSubscriber (25-110)
src/synthorg/settings/dispatcher.py (4)
src/synthorg/settings/subscriber.py (4)
  • SettingsSubscriber (11-55)
  • watched_keys (31-33)
  • on_settings_changed (40-55)
  • subscriber_name (36-38)
src/synthorg/settings/subscribers/memory_subscriber.py (3)
  • watched_keys (31-33)
  • on_settings_changed (40-57)
  • subscriber_name (36-38)
src/synthorg/settings/subscribers/provider_subscriber.py (3)
  • watched_keys (57-59)
  • on_settings_changed (66-89)
  • subscriber_name (62-64)
tests/unit/settings/test_subscriber_protocol.py (5)
  • watched_keys (12-13)
  • watched_keys (31-32)
  • on_settings_changed (19-24)
  • subscriber_name (16-17)
  • subscriber_name (35-36)
tests/unit/api/test_app.py (1)
src/synthorg/api/app.py (2)
  • _safe_startup (357-457)
  • _safe_shutdown (460-509)
tests/unit/settings/test_dispatcher.py (4)
src/synthorg/settings/dispatcher.py (3)
  • SettingsChangeDispatcher (39-215)
  • start (68-91)
  • stop (93-105)
src/synthorg/settings/subscribers/memory_subscriber.py (3)
  • watched_keys (31-33)
  • subscriber_name (36-38)
  • on_settings_changed (40-57)
src/synthorg/settings/subscribers/provider_subscriber.py (3)
  • watched_keys (57-59)
  • subscriber_name (62-64)
  • on_settings_changed (66-89)
src/synthorg/settings/subscriber.py (3)
  • watched_keys (31-33)
  • subscriber_name (36-38)
  • on_settings_changed (40-55)
🔇 Additional comments (42)
src/synthorg/communication/config.py (1)

23-31: Default channel update is correct.

"#settings" is added in the right place and remains compatible with existing channel validation flow.

src/synthorg/settings/subscribers/__init__.py (1)

3-13: Package exports look good.

The re-exports and __all__ are consistent and expose both concrete subscribers as intended.

src/synthorg/api/state.py (1)

26-29: Hot-swap state additions are solid.

The new provider/model router slots, guarded accessors, atomic swap methods, and structured swap logs are implemented cleanly.

Also applies to: 65-69, 94-95, 112-113, 294-359

tests/unit/api/test_state_swap.py (1)

33-106: Great coverage for swap semantics.

These tests validate both guard behavior and reference replacement paths for provider_registry and model_router.

src/synthorg/settings/__init__.py (1)

22-22: Public API export is correct.

Exposing SettingsSubscriber at package level is consistent with the new settings subscription architecture.

Also applies to: 38-38

tests/unit/settings/test_memory_subscriber.py (1)

11-46: Memory subscriber tests are well targeted.

Protocol shape, watched-key contract, and callback behavior are all covered with clear unit assertions.

tests/unit/settings/test_subscriber_protocol.py (1)

39-69: Protocol conformance tests look good.

The suite validates both valid and invalid runtime protocol behavior and checks key interface properties.

tests/unit/api/test_app.py (1)

110-110: Lifecycle test signature updates are correct.

The modified _safe_startup / _safe_shutdown calls align with the new dispatcher-aware parameter order.

Also applies to: 128-128, 158-158, 203-214

tests/unit/settings/test_provider_subscriber.py (3)

1-54: LGTM! Well-structured test helpers and fixtures.

The helper functions provide clean abstractions for creating test fixtures. The _setting_value helper correctly constructs SettingValue objects matching the expected return type from SettingsService.get().


56-72: LGTM! Protocol conformance tests are comprehensive.

The tests correctly verify that ProviderSettingsSubscriber conforms to the SettingsSubscriber protocol, including the isinstance check (enabled by @runtime_checkable), watched keys coverage, and subscriber name.


117-146: LGTM! Advisory settings correctly tested as no-ops.

These tests properly verify that default_provider and retry_max_attempts changes do not trigger router rebuilds, which aligns with the design that these are advisory-only settings read through ConfigResolver at use time.

src/synthorg/settings/dispatcher.py (7)

1-36: LGTM! Module setup follows guidelines.

Proper module docstring, imports organized correctly, logger setup follows the required pattern (get_logger(__name__)), and event constants are imported from the domain-specific module.


39-66: LGTM! Class design is well-documented.

The class docstring clearly explains the dispatcher's behavior, including restart_required filtering and error isolation semantics. The constructor signature correctly accepts an immutable tuple of subscribers.


68-91: LGTM! Start method with proper lifecycle guarding.

The method correctly prevents double-start with a clear error message, ensures the channel exists before subscribing, and attaches a done callback to detect silent task death.


93-127: LGTM! Clean shutdown and done callback implementation.

The stop() method is properly idempotent, and _on_task_done correctly handles both cancelled tasks (no-op) and unexpected exits (resets _running flag and logs appropriately).


139-177: LGTM! Robust poll loop with appropriate error handling.

The loop correctly uses PEP 758 except syntax (except MemoryError, RecursionError: and except OSError, TimeoutError:), re-raises unrecoverable errors, tracks consecutive errors with a sensible threshold, and breaks on unexpected exceptions while logging context.


179-215: LGTM! Dispatch logic correctly isolates subscriber errors.

The method properly skips missing metadata, handles restart_required filtering with a warning, and catches subscriber exceptions without breaking the loop. The PEP 758 except syntax is correctly applied.


218-243: LGTM! Metadata extraction with sensible defaults.

The helper correctly validates required fields (namespace, key), logs a warning for malformed messages with diagnostic context, and defaults restart_required to False when absent.

src/synthorg/settings/subscriber.py (1)

1-55: LGTM! Clean protocol definition.

The SettingsSubscriber protocol is well-designed with:

  • @runtime_checkable enabling isinstance checks for protocol conformance testing
  • Clear docstrings explaining the dispatcher's responsibility for restart_required filtering
  • Minimal interface surface with just the essential properties and callback method
CLAUDE.md (1)

117-133: LGTM! Documentation accurately reflects new components.

The package structure documentation is properly updated to include:

  • AppState hot-reload slots and swap methods in the api/ description
  • SettingsSubscriber protocol and SettingsChangeDispatcher in the settings/ description
  • New subscribers/ subdirectory with concrete subscriber implementations
tests/unit/settings/test_dispatcher.py (8)

1-47: LGTM! Well-designed test helpers.

The _settings_message and _envelope helpers provide clean abstractions for constructing test messages matching the expected SettingsService format. The message structure correctly includes restart_required as a string in metadata extra.


49-79: LGTM! Test doubles are minimal and focused.

_FakeSubscriber and _ErrorSubscriber correctly implement the SettingsSubscriber protocol interface with just enough behavior for testing. The error subscriber cleanly raises RuntimeError for error isolation tests.


81-151: LGTM! Controllable fake message bus.

_FakeBus provides a clean test double with queue-based message delivery, subscription tracking, and all required MessageBus protocol methods. The implementation is minimal but sufficient for dispatcher testing.


208-246: LGTM! Lifecycle tests are thorough.

Tests correctly verify subscription setup, double-start prevention, idempotent stop, and safe stop-without-start behavior. The try/finally patterns ensure proper cleanup even if assertions fail.


251-331: LGTM! Dispatch routing tests cover key scenarios.

Tests comprehensively verify:

  • Routing to matching subscribers
  • Filtering non-matching subscribers
  • Multiple matching subscribers receiving the same notification
  • Skipping restart_required=True settings
  • Dispatching non-restart-required settings

337-379: LGTM! Error isolation tests verify fault tolerance.

Tests confirm that:

  • A failing subscriber doesn't prevent other subscribers from being notified
  • The poll loop survives subscriber errors and continues processing subsequent messages

385-465: LGTM! Metadata extraction edge cases covered.

Tests verify handling of:

  • Missing metadata (both namespace and key absent)
  • Partial metadata (namespace only)
  • Missing restart_required defaulting to False

186-203: 🧹 Nitpick | 🔵 Trivial

Consider mocking time for deterministic drain behavior.

The _drain helper uses real time with asyncio.sleep(), which could introduce flakiness under system load. Per coding guidelines, timing-sensitive tests should mock time.monotonic() and asyncio.sleep() for determinism.

However, for this integration-style test helper that processes async queues, the current approach with generous timeouts (2.0s default) is pragmatic and unlikely to cause issues in practice.

⛔ Skipped due to learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T17:24:41.990Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic
src/synthorg/settings/subscribers/provider_subscriber.py (4)

1-22: LGTM! Module setup and constants are well-organized.

Proper logger setup, TYPE_CHECKING guards for type-only imports, and the watched keys are clearly defined as a module-level constant with all three provider-namespace settings.


25-64: LGTM! Class design with clear documentation.

The class docstring thoroughly explains:

  • Which setting triggers a rebuild (routing_strategy)
  • Why other settings are advisory-only (read through ConfigResolver)
  • Error propagation semantics (dispatcher handles logging)

66-89: LGTM! Settings change handler with appropriate branching.

The on_settings_changed method correctly:

  • Triggers _rebuild_router() only for routing_strategy
  • Logs advisory notifications for other keys using structured logging with the event constant

91-110: LGTM! Router rebuild follows immutability guidelines.

The _rebuild_router method correctly:

  • Reads the new strategy value from SettingsService
  • Uses model_copy(update=...) to create a new routing config (per coding guidelines)
  • Creates a copy of providers with dict(self._config.providers)
  • Atomically swaps the new router into AppState

Errors intentionally propagate to the dispatcher for logging, which aligns with the documented design.

src/synthorg/observability/events/settings.py (1)

19-33: LGTM! Event constants follow established patterns.

The new constants are well-organized with a section comment and follow the existing SETTINGS_<COMPONENT>_<EVENT> naming convention. Coverage includes all dispatcher lifecycle events, subscriber notification/error events, and service swap events.

Based on learnings: "Event names must always use constants from domain-specific modules under synthorg.observability.events."

src/synthorg/settings/subscribers/memory_subscriber.py (2)

1-16: LGTM! Clear module documentation and watched keys.

The comment explaining that memory/backend has restart_required=True provides valuable context for why it's not in the watched set. The watched keys are appropriately limited to settings that can take effect without restart.


19-57: LGTM! Clean advisory subscriber implementation.

The MemorySettingsSubscriber correctly implements the protocol with:

  • No __init__ needed (stateless subscriber)
  • Proper watched_keys and subscriber_name properties
  • on_settings_changed that logs notifications using structured logging

The design appropriately defers actual behavior changes to use-time through ConfigResolver.

src/synthorg/api/app.py (7)

63-67: LGTM!

The imports are correctly structured and align with the new settings dispatcher integration.


173-239: LGTM!

The settings_dispatcher parameter is correctly threaded through the lifecycle hooks, maintaining proper startup/shutdown ordering.


261-312: LGTM!

The cleanup logic correctly handles the settings dispatcher in reverse startup order, ensuring proper cleanup on failure. The use of _try_stop provides consistent error handling.


357-457: LGTM!

The settings dispatcher is correctly integrated into the startup sequence after the bridge (ensuring the message bus bridge is ready before the dispatcher starts polling). The error handling and cleanup-on-failure logic properly propagate the new started_settings_dispatcher flag.


460-509: LGTM!

The shutdown sequence correctly stops the settings dispatcher after the task engine but before the bridge and message bus, maintaining the proper reverse-of-startup ordering. The docstring accurately reflects the updated shutdown flow.


648-671: LGTM!

The settings dispatcher is constructed at the appropriate point in create_app after app_state is available (needed by ProviderSettingsSubscriber) and correctly passed to the lifecycle builder.


713-731: LGTM!

The factory function correctly:

  1. Guards against missing message_bus or settings_service by returning None
  2. Constructs subscribers with the appropriate dependencies
  3. Returns a properly configured SettingsChangeDispatcher with both subscribers

The implementation aligns with the dispatcher and subscriber interfaces documented in the relevant code snippets.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
tests/unit/settings/test_dispatcher.py (1)

202-217: ⚠️ Potential issue | 🟠 Major

_drain() still races on slow CI.

qsize() == 0 only means receive() dequeued the envelope; it does not mean _dispatch() or the subscriber callback finished. The fixed sleeps are masking that race, so these assertions can still flap under load. Please switch this helper to an explicit completion signal instead of queue polling plus sleep.

Based on learnings, "NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/settings/test_dispatcher.py` around lines 202 - 217, The test
helper _drain races because it polls bus._queue.qsize() and uses fixed sleeps;
instead, change it to await an explicit completion signal from the
dispatcher/bus: add an asyncio.Event or Future to _FakeBus (or have
SettingsChangeDispatcher set an event at the end of _dispatch/when subscriber
callback completes), have receive() or the dispatcher set that event when an
envelope is fully processed, and then make _drain await that event with a
timeout (instead of polling qsize and sleeping); keep the timeout behavior and
use loop.time()/deadline as before for safety.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/settings/dispatcher.py`:
- Around line 204-223: The loop in _dispatch currently accesses
subscriber.watched_keys and subscriber.subscriber_name outside the isolation
try, so attribute errors or property-side effects can bubble up and kill the
dispatcher; move the watched_keys membership check and retrieval of
subscriber.subscriber_name into the same try block that calls
subscriber.on_settings_changed (or fetch them into locals inside the try), and
preserve the existing exception behavior by re-raising MemoryError and
RecursionError while catching/logging any other Exception (using exc_info=True)
and continuing to the next subscriber so that failures in subscriber properties
or logging never escape into _poll_loop.
- Around line 239-255: The current code calls .lower() on
extra.get("restart_required", "True") which assumes a string and can raise
AttributeError for non-string values; update the logic in the dispatcher (where
restart_required is computed and _ChangeMetadata is returned) to safely handle
non-string values: first grab the raw value (e.g. value =
extra.get("restart_required", "True")), then if isinstance(value, str) compute
restart_required = value.lower() != "false", elif isinstance(value, bool) set
restart_required = value, else set restart_required = True (fail-safe); ensure
you return _ChangeMetadata(namespace, key, restart_required).

In `@src/synthorg/settings/subscribers/provider_subscriber.py`:
- Around line 105-127: The SettingsService.get call is outside the try so
read/storage errors bypass the SETTINGS_SERVICE_SWAP_FAILED log and result.value
may be undefined; wrap the await
self._settings_service.get("providers","routing_strategy") call inside the try
(or add a preceding try/except around it) so any exception logs
SETTINGS_SERVICE_SWAP_FAILED via logger.error with appropriate context
(service="model_router" and attempted_strategy set to result.value if available
or None), then re-raise; ensure ModelRouter instantiation and
config.routing.model_copy remain inside the same try block and do not reference
result.value when the get failed.

In `@tests/unit/settings/test_dispatcher.py`:
- Around line 193-195: The fixture started_dispatcher has an incomplete
AsyncGenerator annotation; change its return type from
AsyncGenerator[SettingsChangeDispatcher] to
AsyncGenerator[SettingsChangeDispatcher, None] so it supplies the required
YieldType and SendType parameters for mypy strict mode; update the signature of
the async def started_dispatcher(...) to use
AsyncGenerator[SettingsChangeDispatcher, None].

---

Duplicate comments:
In `@tests/unit/settings/test_dispatcher.py`:
- Around line 202-217: The test helper _drain races because it polls
bus._queue.qsize() and uses fixed sleeps; instead, change it to await an
explicit completion signal from the dispatcher/bus: add an asyncio.Event or
Future to _FakeBus (or have SettingsChangeDispatcher set an event at the end of
_dispatch/when subscriber callback completes), have receive() or the dispatcher
set that event when an envelope is fully processed, and then make _drain await
that event with a timeout (instead of polling qsize and sleeping); keep the
timeout behavior and use loop.time()/deadline as before for safety.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 86bdf3f0-4a9f-4d8a-883a-c4444ea2111d

📥 Commits

Reviewing files that changed from the base of the PR and between 5277967 and 1772717.

📒 Files selected for processing (8)
  • docs/design/operations.md
  • src/synthorg/api/state.py
  • src/synthorg/settings/definitions/providers.py
  • src/synthorg/settings/dispatcher.py
  • src/synthorg/settings/subscribers/provider_subscriber.py
  • tests/unit/api/test_app.py
  • tests/unit/settings/test_dispatcher.py
  • tests/unit/settings/test_provider_subscriber.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Build Backend
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Web
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: No from __future__ import annotations — Python 3.14 has PEP 649 with native lazy annotations
Use except A, B: syntax (no parentheses) for exception handling in Python 3.14, as enforced by ruff PEP 758
All public functions and methods require type hints; mypy strict mode enforced
Docstrings: Google style, required on all public classes and functions, enforced by ruff D rules
Create new objects instead of mutating existing ones; never mutate. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with adopted conventions: @computed_field for derived values instead of storing + validating redundant fields; use NotBlankStr from core.types for all identifier/name fields (including optional and tuple variants) instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Line length: 88 characters (enforced by ruff)
Functions must be < 50 lines; files must be < 800 lines
Handle errors explicitly, never silently swallow them
Validate at system boundaries (user input, external APIs, config files)

Files:

  • tests/unit/settings/test_provider_subscriber.py
  • src/synthorg/api/state.py
  • src/synthorg/settings/dispatcher.py
  • src/synthorg/settings/definitions/providers.py
  • tests/unit/settings/test_dispatcher.py
  • tests/unit/api/test_app.py
  • src/synthorg/settings/subscribers/provider_subscriber.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow for test categorization
Use asyncio_mode = 'auto' in pytest — no manual @pytest.mark.asyncio needed
Per-test timeout: 30 seconds
Prefer @pytest.mark.parametrize for testing similar cases
Use test-provider, test-small-001, etc. in tests instead of real vendor names
Use Python Hypothesis for property-based testing with @given + @settings decorators
NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic

Files:

  • tests/unit/settings/test_provider_subscriber.py
  • tests/unit/settings/test_dispatcher.py
  • tests/unit/api/test_app.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every module with business logic MUST have from synthorg.observability import get_logger then logger = get_logger(__name__); never use import logging / logging.getLogger() / print() in application code
Logger variable name must always be logger (not _logger, not log)
Event names must always use constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT
Always use structured logging: logger.info(EVENT, key=value) — never logger.info('msg %s', val)
All error paths must log at WARNING or ERROR with context before raising
All state transitions must log at INFO level
DEBUG logging for object creation, internal flow, and entry/exit of key functions
All provider calls go through BaseCompletionProvider which applies retry + rate limiting automatically. Never implement retry logic in driver subclasses or calling code.
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig; retryable errors (is_retryable=True) include RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError
Non-retryable errors raise immediately without retry; RetryExhaustedError signals that all retries failed — engine layer catches this to trigger fallback chains
Rate limiter respects RateLimitError.retry_after from providers — automatically pauses future requests
Library reference: auto-generated from docstrings via mkdocstrings + Griffe (AST-based, no imports) in docs/api/

Files:

  • src/synthorg/api/state.py
  • src/synthorg/settings/dispatcher.py
  • src/synthorg/settings/definitions/providers.py
  • src/synthorg/settings/subscribers/provider_subscriber.py
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Docs: Markdown in docs/, built with Zensical; Design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations); Architecture in docs/architecture/ (overview, tech-stack, decision log); Roadmap in docs/roadmap/; Security in docs/security.md; Licensing in docs/licensing.md; Reference in docs/reference/

Files:

  • docs/design/operations.md
🧠 Learnings (8)
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/api/state.py
📚 Learning: 2026-03-17T17:24:41.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T17:24:41.990Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`; never use `import logging` / `logging.getLogger()` / `print()` in application code

Applied to files:

  • src/synthorg/api/state.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • src/synthorg/api/state.py
📚 Learning: 2026-03-17T17:24:41.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T17:24:41.990Z
Learning: Applies to src/synthorg/**/*.py : Event names must always use constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • src/synthorg/api/state.py
📚 Learning: 2026-03-17T17:24:41.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T17:24:41.990Z
Learning: Applies to **/*.py : Use `except A, B:` syntax (no parentheses) for exception handling in Python 3.14, as enforced by ruff PEP 758

Applied to files:

  • src/synthorg/settings/dispatcher.py
📚 Learning: 2026-03-17T17:24:41.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T17:24:41.990Z
Learning: Applies to **/*.py : Handle errors explicitly, never silently swallow them

Applied to files:

  • src/synthorg/settings/dispatcher.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/settings/dispatcher.py
📚 Learning: 2026-03-17T17:24:41.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T17:24:41.990Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic

Applied to files:

  • tests/unit/settings/test_dispatcher.py
🧬 Code graph analysis (5)
tests/unit/settings/test_provider_subscriber.py (2)
src/synthorg/settings/subscriber.py (4)
  • SettingsSubscriber (11-55)
  • watched_keys (31-33)
  • subscriber_name (36-38)
  • on_settings_changed (40-55)
src/synthorg/settings/subscribers/provider_subscriber.py (4)
  • ProviderSettingsSubscriber (28-128)
  • watched_keys (60-62)
  • subscriber_name (65-67)
  • on_settings_changed (69-92)
src/synthorg/settings/dispatcher.py (4)
src/synthorg/settings/subscriber.py (4)
  • SettingsSubscriber (11-55)
  • watched_keys (31-33)
  • on_settings_changed (40-55)
  • subscriber_name (36-38)
src/synthorg/settings/subscribers/provider_subscriber.py (3)
  • watched_keys (60-62)
  • on_settings_changed (69-92)
  • subscriber_name (65-67)
src/synthorg/settings/subscribers/memory_subscriber.py (3)
  • watched_keys (31-33)
  • on_settings_changed (40-57)
  • subscriber_name (36-38)
tests/unit/settings/test_subscriber_protocol.py (5)
  • watched_keys (12-13)
  • watched_keys (31-32)
  • on_settings_changed (19-24)
  • subscriber_name (16-17)
  • subscriber_name (35-36)
tests/unit/settings/test_dispatcher.py (4)
src/synthorg/settings/dispatcher.py (3)
  • SettingsChangeDispatcher (48-224)
  • start (77-100)
  • stop (102-114)
src/synthorg/settings/subscribers/provider_subscriber.py (3)
  • watched_keys (60-62)
  • subscriber_name (65-67)
  • on_settings_changed (69-92)
src/synthorg/settings/subscribers/memory_subscriber.py (3)
  • watched_keys (31-33)
  • subscriber_name (36-38)
  • on_settings_changed (40-57)
src/synthorg/settings/subscriber.py (3)
  • watched_keys (31-33)
  • subscriber_name (36-38)
  • on_settings_changed (40-55)
tests/unit/api/test_app.py (3)
src/synthorg/api/app.py (2)
  • _safe_startup (357-457)
  • _safe_shutdown (460-509)
src/synthorg/api/state.py (2)
  • persistence (142-144)
  • AppState (36-361)
tests/unit/api/conftest.py (2)
  • root_config (178-179)
  • approval_store (173-174)
src/synthorg/settings/subscribers/provider_subscriber.py (2)
tests/unit/settings/test_dispatcher.py (4)
  • watched_keys (64-65)
  • subscriber_name (68-69)
  • on_settings_changed (71-72)
  • on_settings_changed (78-80)
src/synthorg/settings/subscriber.py (3)
  • watched_keys (31-33)
  • subscriber_name (36-38)
  • on_settings_changed (40-55)

Comment on lines +204 to +223
for subscriber in self._subscribers:
if (namespace, key) not in subscriber.watched_keys:
continue
try:
await subscriber.on_settings_changed(namespace, key)
logger.info(
SETTINGS_SUBSCRIBER_NOTIFIED,
subscriber=subscriber.subscriber_name,
namespace=namespace,
key=key,
)
except MemoryError, RecursionError:
raise
except Exception:
logger.error(
SETTINGS_SUBSCRIBER_ERROR,
subscriber=subscriber.subscriber_name,
namespace=namespace,
key=key,
exc_info=True,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Broaden the subscriber isolation block.

subscriber.watched_keys and subscriber.subscriber_name are still evaluated outside the truly isolated path. If either property raises, _dispatch() falls into _poll_loop()'s fatal except Exception and the dispatcher dies, which breaks the isolation contract documented in src/synthorg/settings/subscriber.py.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/settings/dispatcher.py` around lines 204 - 223, The loop in
_dispatch currently accesses subscriber.watched_keys and
subscriber.subscriber_name outside the isolation try, so attribute errors or
property-side effects can bubble up and kill the dispatcher; move the
watched_keys membership check and retrieval of subscriber.subscriber_name into
the same try block that calls subscriber.on_settings_changed (or fetch them into
locals inside the try), and preserve the existing exception behavior by
re-raising MemoryError and RecursionError while catching/logging any other
Exception (using exc_info=True) and continuing to the next subscriber so that
failures in subscriber properties or logging never escape into _poll_loop.

Comment on lines +105 to +127
result = await self._settings_service.get(
"providers",
"routing_strategy",
)
try:
config = self._app_state.config
new_routing = config.routing.model_copy(
update={"strategy": result.value},
)
new_router = ModelRouter(
new_routing,
dict(config.providers),
)
except MemoryError, RecursionError:
raise
except Exception:
logger.error(
SETTINGS_SERVICE_SWAP_FAILED,
service="model_router",
attempted_strategy=result.value,
exc_info=True,
)
raise
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Log settings-read failures as swap failures too.

SettingsService.get() sits outside the try, so storage/read failures bypass SETTINGS_SERVICE_SWAP_FAILED entirely. That drops the service-specific context this class promises, and the current handler cannot safely reference result.value if the read never succeeded.

🛠️ Suggested fix
-        result = await self._settings_service.get(
-            "providers",
-            "routing_strategy",
-        )
+        attempted_strategy: str | None = None
         try:
+            result = await self._settings_service.get(
+                "providers",
+                "routing_strategy",
+            )
+            attempted_strategy = result.value
             config = self._app_state.config
             new_routing = config.routing.model_copy(
-                update={"strategy": result.value},
+                update={"strategy": attempted_strategy},
             )
             new_router = ModelRouter(
                 new_routing,
                 dict(config.providers),
             )
@@
             logger.error(
                 SETTINGS_SERVICE_SWAP_FAILED,
                 service="model_router",
-                attempted_strategy=result.value,
+                attempted_strategy=attempted_strategy,
                 exc_info=True,
             )
             raise

As per coding guidelines, "All error paths must log at WARNING or ERROR with context before raising".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/settings/subscribers/provider_subscriber.py` around lines 105 -
127, The SettingsService.get call is outside the try so read/storage errors
bypass the SETTINGS_SERVICE_SWAP_FAILED log and result.value may be undefined;
wrap the await self._settings_service.get("providers","routing_strategy") call
inside the try (or add a preceding try/except around it) so any exception logs
SETTINGS_SERVICE_SWAP_FAILED via logger.error with appropriate context
(service="model_router" and attempted_strategy set to result.value if available
or None), then re-raise; ensure ModelRouter instantiation and
config.routing.model_copy remain inside the same try block and do not reference
result.value when the get failed.

Comment on lines +193 to +195
async def started_dispatcher(
dispatcher: SettingsChangeDispatcher,
) -> AsyncGenerator[SettingsChangeDispatcher]:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In Python typing, what is the correct generic form of typing.AsyncGeneratorfor an async pytest fixture that yieldsSettingsChangeDispatcher and does not receive sent values?

💡 Result:

Use None for the SendType:

from typing import AsyncGenerator

async def fixture(...) -> AsyncGenerator[SettingsChangeDispatcher, None]:
    yield dispatcher

AsyncGenerator is parameterized as AsyncGenerator[YieldType, SendType], and SendType defaults to None when you don’t accept values via asend()/send(). [1]


Fix the fixture's AsyncGenerator type annotation.

The started_dispatcher fixture uses an incomplete AsyncGenerator parameterization. AsyncGenerator requires both YieldType and SendType parameters: AsyncGenerator[YieldType, SendType]. Since this fixture yields a value but doesn't accept sent values, the correct annotation is:

async def started_dispatcher(
    dispatcher: SettingsChangeDispatcher,
) -> AsyncGenerator[SettingsChangeDispatcher, None]:

This is required by mypy strict mode as stated in the coding guidelines.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/settings/test_dispatcher.py` around lines 193 - 195, The fixture
started_dispatcher has an incomplete AsyncGenerator annotation; change its
return type from AsyncGenerator[SettingsChangeDispatcher] to
AsyncGenerator[SettingsChangeDispatcher, None] so it supplies the required
YieldType and SendType parameters for mypy strict mode; update the signature of
the async def started_dispatcher(...) to use
AsyncGenerator[SettingsChangeDispatcher, None].

…ot-reload

Add a generic mechanism for services to subscribe to #settings message
bus channel and react to runtime setting changes by rebuilding affected
service instances.

- SettingsSubscriber protocol: declares (namespace, key) pairs + callback
- SettingsChangeDispatcher: polls #settings bus channel, dispatches to
  subscribers, skips restart_required settings with operator warning
- ProviderSettingsSubscriber: rebuilds ModelRouter on routing_strategy
  change, advisory-only for default_provider/retry_max_attempts
- MemorySettingsSubscriber: logs that non-restart settings take effect
  on next use (restart_required handled by dispatcher)
- AppState: add swappable provider_registry/model_router slots with
  swap methods for atomic reference replacement
- Add #settings to default message bus channels
- Wire dispatcher into app lifecycle (start after bridge, stop before)

Closes #503
- Fix critical bug: extract .value from SettingValue in provider
  subscriber (SettingsService.get returns SettingValue, not str)
- Remove inner try/except from _rebuild_router — let dispatcher
  handle errors with full subscriber context
- Add done_callback on dispatcher poll task to detect silent death
  and reset _running flag
- Log warning on malformed messages in _extract_metadata
- Log warning before raising in start() when already running
- Remove dead ("memory", "backend") from MemorySettingsSubscriber
  watched_keys (restart_required=True, filtered by dispatcher)
- Fix misleading docstrings (atomically → replace, dispatcher
  filtering behavior, _extract_metadata default note)
- Update test mocks to return SettingValue (not raw strings)
- Add missing tests: partial metadata, restart_required default,
  done_callback running flag reset
- Update CLAUDE.md package structure for settings/ and api/

Pre-reviewed by 12 agents, 12 findings addressed
…and CodeRabbit

Critical: restart_required default changed from False to True (fail-safe).
Major: CancelledError re-raised in poll loop (fixes spurious shutdown WARNING),
_rebuild_router error handling with SETTINGS_SERVICE_SWAP_FAILED event,
strategy enum_values aligned with STRATEGY_MAP, _drain helper fixes,
deterministic test timing, started_dispatcher fixture, new tests for
consecutive errors/ensure_channel/dispatcher failure/service.get failure.
Medium: _ERROR_BACKOFF constant, NamedTuple for metadata, docstring fixes,
stale config reference fix, design page updates.
- Broaden subscriber isolation in _dispatch (watched_keys/subscriber_name
  inside try block, getattr fallback for error logging)
- Non-string restart_required safety via str() cast
- Move SettingsService.get() inside try in _rebuild_router so read
  failures also log SETTINGS_SERVICE_SWAP_FAILED
- Revert AsyncGenerator annotation (ruff UP043: None is default)
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/synthorg/api/app.py (1)

357-457: 🛠️ Refactor suggestion | 🟠 Major

Refactor _safe_startup into smaller helpers to meet function-size policy.

This function now combines too many responsibilities and exceeds the 50-line limit. Split per-service startup steps into focused helpers and keep _safe_startup as orchestration-only.

♻️ Refactor shape (example)
+async def _start_settings_dispatcher(
+    settings_dispatcher: SettingsChangeDispatcher | None,
+) -> bool:
+    if settings_dispatcher is None:
+        return False
+    await settings_dispatcher.start()
+    return True
+
 async def _safe_startup(...):
-    # full startup logic inline
+    # orchestrate small startup helpers + centralized rollback flags
+    started_settings_dispatcher = await _start_settings_dispatcher(
+        settings_dispatcher
+    )

As per coding guidelines: "Functions must be less than 50 lines."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/api/app.py` around lines 357 - 457, The _safe_startup function
is too large—extract per-service startup helpers (e.g., _start_persistence,
_start_message_bus, _start_bridge, _start_settings_dispatcher,
_start_task_engine, _start_meeting_scheduler) that encapsulate the try/except,
logger.exception calls and set the corresponding started_* boolean, and call
_init_persistence from within the persistence helper; keep _safe_startup as an
orchestration routine that awaits/invokes those helpers in order and still calls
_cleanup_on_failure on any exception. Ensure each helper accepts the same
parameters used in the diff (persistence, message_bus, bridge,
settings_dispatcher, task_engine, meeting_scheduler, app_state) or just the
specific component and returns its started flag (or raises) so _safe_startup
only sequences calls and aggregates started_* flags for the existing
_cleanup_on_failure call.
♻️ Duplicate comments (1)
tests/unit/settings/test_dispatcher.py (1)

202-218: ⚠️ Potential issue | 🟠 Major

Replace sleep/poll waits with explicit synchronization primitives.

Line [215], Line [217], and Line [560] still rely on fixed sleeps/polling. These waits can flake under load; use event-driven synchronization (asyncio.Event/Condition) plus asyncio.wait_for(...) for deterministic completion.

#!/bin/bash
# Inspect timing-sensitive polling/sleep patterns in this test file.
rg -n -C2 'asyncio\.sleep\(|loop\.time\(\)|qsize\(\)' tests/unit/settings/test_dispatcher.py

As per coding guidelines: "NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins."

Also applies to: 555-560

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/settings/test_dispatcher.py` around lines 202 - 218, The _drain
helper currently polls bus._queue and uses fixed asyncio.sleep calls which cause
flakiness; replace this with an event-driven synchronization: add an
asyncio.Event/Condition to _FakeBus (e.g., queue_empty_event) that the
dispatcher/consumer sets when the bus has processed the last message, then
rewrite _drain to await that event using asyncio.wait_for(timeout) instead of
loop.time/qsize/polling and remove the extra fixed sleep; ensure the
dispatcher/consumer (SettingsChangeDispatcher message handler) triggers the
event when the queue becomes empty so tests deterministically wake.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/synthorg/api/app.py`:
- Around line 357-457: The _safe_startup function is too large—extract
per-service startup helpers (e.g., _start_persistence, _start_message_bus,
_start_bridge, _start_settings_dispatcher, _start_task_engine,
_start_meeting_scheduler) that encapsulate the try/except, logger.exception
calls and set the corresponding started_* boolean, and call _init_persistence
from within the persistence helper; keep _safe_startup as an orchestration
routine that awaits/invokes those helpers in order and still calls
_cleanup_on_failure on any exception. Ensure each helper accepts the same
parameters used in the diff (persistence, message_bus, bridge,
settings_dispatcher, task_engine, meeting_scheduler, app_state) or just the
specific component and returns its started flag (or raises) so _safe_startup
only sequences calls and aggregates started_* flags for the existing
_cleanup_on_failure call.

---

Duplicate comments:
In `@tests/unit/settings/test_dispatcher.py`:
- Around line 202-218: The _drain helper currently polls bus._queue and uses
fixed asyncio.sleep calls which cause flakiness; replace this with an
event-driven synchronization: add an asyncio.Event/Condition to _FakeBus (e.g.,
queue_empty_event) that the dispatcher/consumer sets when the bus has processed
the last message, then rewrite _drain to await that event using
asyncio.wait_for(timeout) instead of loop.time/qsize/polling and remove the
extra fixed sleep; ensure the dispatcher/consumer (SettingsChangeDispatcher
message handler) triggers the event when the queue becomes empty so tests
deterministically wake.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5c0f19e6-4f5f-49b8-a691-1fcd52ba58f1

📥 Commits

Reviewing files that changed from the base of the PR and between 1772717 and 43e3456.

📒 Files selected for processing (19)
  • CLAUDE.md
  • docs/design/operations.md
  • src/synthorg/api/app.py
  • src/synthorg/api/state.py
  • src/synthorg/communication/config.py
  • src/synthorg/observability/events/settings.py
  • src/synthorg/settings/__init__.py
  • src/synthorg/settings/definitions/providers.py
  • src/synthorg/settings/dispatcher.py
  • src/synthorg/settings/subscriber.py
  • src/synthorg/settings/subscribers/__init__.py
  • src/synthorg/settings/subscribers/memory_subscriber.py
  • src/synthorg/settings/subscribers/provider_subscriber.py
  • tests/unit/api/test_app.py
  • tests/unit/api/test_state_swap.py
  • tests/unit/settings/test_dispatcher.py
  • tests/unit/settings/test_memory_subscriber.py
  • tests/unit/settings/test_provider_subscriber.py
  • tests/unit/settings/test_subscriber_protocol.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Deploy Preview
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Backend
  • GitHub Check: Build Web
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do NOT use from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14.
Type hints required on all public functions. Use mypy strict mode.
Google-style docstrings required on all public classes and functions. Enforced by ruff D rules.
Line length: 88 characters. Enforced by ruff.
Functions must be less than 50 lines. Files must be less than 800 lines.

Files:

  • src/synthorg/settings/subscribers/memory_subscriber.py
  • src/synthorg/settings/definitions/providers.py
  • src/synthorg/api/state.py
  • src/synthorg/observability/events/settings.py
  • tests/unit/settings/test_subscriber_protocol.py
  • src/synthorg/settings/subscriber.py
  • src/synthorg/communication/config.py
  • tests/unit/api/test_state_swap.py
  • src/synthorg/settings/subscribers/__init__.py
  • tests/unit/settings/test_dispatcher.py
  • tests/unit/settings/test_memory_subscriber.py
  • src/synthorg/settings/dispatcher.py
  • src/synthorg/settings/__init__.py
  • tests/unit/settings/test_provider_subscriber.py
  • tests/unit/api/test_app.py
  • src/synthorg/settings/subscribers/provider_subscriber.py
  • src/synthorg/api/app.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Create new objects for immutability, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement. For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 conventions: BaseModel, model_validator, computed_field, ConfigDict. For derived values use @computed_field instead of storing + validating redundant fields. Use NotBlankStr (from core.types) for all identifier/name fields — including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants — instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task. Existing code is being migrated incrementally.
Handle errors explicitly, never silently swallow. Validate at system boundaries (user input, external APIs, config files).
Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__). Never use import logging / logging.getLogger() / print() in application code.
Variable name for logger: always logger (not _logger, not log).
Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool, GIT_COMMAND_START from events.git, `CONTEXT_BUDGET_FILL_U...

Files:

  • src/synthorg/settings/subscribers/memory_subscriber.py
  • src/synthorg/settings/definitions/providers.py
  • src/synthorg/api/state.py
  • src/synthorg/observability/events/settings.py
  • src/synthorg/settings/subscriber.py
  • src/synthorg/communication/config.py
  • src/synthorg/settings/subscribers/__init__.py
  • src/synthorg/settings/dispatcher.py
  • src/synthorg/settings/__init__.py
  • src/synthorg/settings/subscribers/provider_subscriber.py
  • src/synthorg/api/app.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. Vendor names may only appear in: (1) Operations design page provider list (docs/design/operations.md), (2) .claude/ skill/agent files, (3) third-party import paths/module names. Tests must use test-provider, test-small-001, etc.

Files:

  • src/synthorg/settings/subscribers/memory_subscriber.py
  • src/synthorg/settings/definitions/providers.py
  • src/synthorg/api/state.py
  • src/synthorg/observability/events/settings.py
  • tests/unit/settings/test_subscriber_protocol.py
  • src/synthorg/settings/subscriber.py
  • src/synthorg/communication/config.py
  • tests/unit/api/test_state_swap.py
  • src/synthorg/settings/subscribers/__init__.py
  • tests/unit/settings/test_dispatcher.py
  • tests/unit/settings/test_memory_subscriber.py
  • src/synthorg/settings/dispatcher.py
  • src/synthorg/settings/__init__.py
  • tests/unit/settings/test_provider_subscriber.py
  • tests/unit/api/test_app.py
  • src/synthorg/settings/subscribers/provider_subscriber.py
  • src/synthorg/api/app.py
src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

REST API: Litestar framework, controllers with guards, channels for WebSocket, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint. RFC 9457 structured errors (ErrorCategory, ErrorCode, ErrorDetail, ProblemDetail, CATEGORY_TITLES, category_title, category_type_uri, content negotiation).

Files:

  • src/synthorg/api/state.py
  • src/synthorg/api/app.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow. Coverage: 80% minimum. Async: asyncio_mode = 'auto' — no manual @pytest.mark.asyncio needed. Timeout: 30 seconds per test. Parallelism: pytest-xdist via -n auto — ALWAYS include -n auto when running pytest, never run tests sequentially.
Prefer @pytest.mark.parametrize for testing similar cases.
Property-based testing: use Hypothesis (@given + @settings). Hypothesis profiles: ci (200 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var. Run dev profile: HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties.
NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins.

Files:

  • tests/unit/settings/test_subscriber_protocol.py
  • tests/unit/api/test_state_swap.py
  • tests/unit/settings/test_dispatcher.py
  • tests/unit/settings/test_memory_subscriber.py
  • tests/unit/settings/test_provider_subscriber.py
  • tests/unit/api/test_app.py
🧠 Learnings (19)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.502Z
Learning: Settings: Runtime-editable settings persistence (DB > env > YAML > code defaults), typed definitions (9 namespaces), Fernet encryption for sensitive values, config bridge, ConfigResolver (typed composed reads for controllers), validation, registry, change notifications via message bus. Per-namespace setting definitions in definitions/ submodule (api, company, providers, memory, budget, security, coordination, observability, backup).
📚 Learning: 2026-03-17T21:13:19.501Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.501Z
Learning: Applies to src/synthorg/api/**/*.py : REST API: Litestar framework, controllers with guards, channels for WebSocket, JWT + API key + WS ticket auth, approval gate integration, coordination endpoint, collaboration endpoint, settings endpoint. RFC 9457 structured errors (ErrorCategory, ErrorCode, ErrorDetail, ProblemDetail, CATEGORY_TITLES, category_title, category_type_uri, content negotiation).

Applied to files:

  • CLAUDE.md
  • docs/design/operations.md
📚 Learning: 2026-03-17T21:13:19.502Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.502Z
Learning: Package structure: all code organized under `src/synthorg/` (src layout). Tests in `tests/`. Web dashboard in `web/`. CLI in `cli/`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T21:13:19.502Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.502Z
Learning: Python CLI module (`src/synthorg/cli/`) is superseded by top-level `cli/` Go binary.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T21:13:19.502Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.502Z
Learning: Documentation source in `docs/` (Markdown, built with Zensical). Design spec in `docs/design/` (7 pages: index, agents, organization, communication, engine, memory, operations). Architecture in `docs/architecture/` (overview, tech-stack, decision log). Roadmap in `docs/roadmap/`. Security in `docs/security.md`. Licensing in `docs/licensing.md`. Reference in `docs/reference/`. REST API reference in `docs/rest-api.md`. Library reference in `docs/api/` (auto-generated from docstrings). Custom templates in `docs/overrides/`. Config in `mkdocs.yml`.

Applied to files:

  • CLAUDE.md
  • docs/design/operations.md
📚 Learning: 2026-03-17T21:13:19.502Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.502Z
Learning: Settings: Runtime-editable settings persistence (DB > env > YAML > code defaults), typed definitions (9 namespaces), Fernet encryption for sensitive values, config bridge, ConfigResolver (typed composed reads for controllers), validation, registry, change notifications via message bus. Per-namespace setting definitions in definitions/ submodule (api, company, providers, memory, budget, security, coordination, observability, backup).

Applied to files:

  • CLAUDE.md
  • docs/design/operations.md
  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T21:13:19.502Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.502Z
Learning: Persistence backend: pluggable PersistenceBackend protocol in `src/synthorg/persistence/`, SQLite initial, SettingsRepository (namespaced settings CRUD).

Applied to files:

  • CLAUDE.md
  • src/synthorg/api/app.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/api/state.py
  • src/synthorg/observability/events/settings.py
📚 Learning: 2026-03-17T21:13:19.501Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.501Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code.

Applied to files:

  • src/synthorg/api/state.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • src/synthorg/api/state.py
📚 Learning: 2026-03-17T21:13:19.501Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.501Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`, `GIT_COMMAND_START` from `events.git`, `CONTEXT_BUDGET_FILL_UPDATED` from `events.context_budget`). Each domain has its own module. Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • src/synthorg/api/state.py
  • src/synthorg/observability/events/settings.py
📚 Learning: 2026-03-17T21:13:19.502Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.502Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • docs/design/operations.md
  • src/synthorg/api/app.py
📚 Learning: 2026-03-17T21:13:19.502Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.502Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-17T21:13:19.502Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.502Z
Learning: Budget: Cost tracking, budget enforcement (pre-flight/in-flight checks, auto-downgrade), billing periods, cost tiers, quota/subscription tracking, CFO cost optimization (anomaly detection, efficiency analysis, downgrade recommendations, approval decisions), spending reports, budget errors (BudgetExhaustedError, DailyLimitExceededError, QuotaExhaustedError).

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-17T21:13:19.502Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.502Z
Learning: HR engine: hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration sampling, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping).

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-17T21:13:19.501Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.501Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins.

Applied to files:

  • tests/unit/settings/test_dispatcher.py
📚 Learning: 2026-03-15T11:48:14.867Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T11:48:14.867Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/settings/dispatcher.py
📚 Learning: 2026-03-17T21:13:19.501Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.501Z
Learning: Applies to **/*.py : Use PEP 758 except syntax: `except A, B:` (no parentheses) — ruff enforces this on Python 3.14.

Applied to files:

  • src/synthorg/settings/dispatcher.py
📚 Learning: 2026-03-17T21:13:19.502Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.502Z
Learning: Communication: Message bus, dispatcher, messenger, channels, delegation, loop prevention, conflict resolution. Meeting protocol (round-robin, position papers, structured phases), scheduler (frequency, participant resolver), orchestrator in meeting/ submodule.

Applied to files:

  • src/synthorg/api/app.py
🧬 Code graph analysis (9)
tests/unit/settings/test_subscriber_protocol.py (1)
src/synthorg/settings/subscriber.py (4)
  • SettingsSubscriber (11-55)
  • watched_keys (31-33)
  • subscriber_name (36-38)
  • on_settings_changed (40-55)
src/synthorg/settings/subscriber.py (4)
src/synthorg/settings/subscribers/provider_subscriber.py (3)
  • watched_keys (60-62)
  • subscriber_name (65-67)
  • on_settings_changed (69-92)
tests/unit/settings/test_dispatcher.py (4)
  • watched_keys (64-65)
  • subscriber_name (68-69)
  • on_settings_changed (71-72)
  • on_settings_changed (78-80)
src/synthorg/settings/subscribers/memory_subscriber.py (3)
  • watched_keys (31-33)
  • subscriber_name (36-38)
  • on_settings_changed (40-57)
tests/unit/settings/test_subscriber_protocol.py (5)
  • watched_keys (12-13)
  • watched_keys (31-32)
  • subscriber_name (16-17)
  • subscriber_name (35-36)
  • on_settings_changed (19-24)
tests/unit/api/test_state_swap.py (2)
src/synthorg/api/errors.py (1)
  • ServiceUnavailableError (222-231)
src/synthorg/api/state.py (7)
  • AppState (36-361)
  • provider_registry (298-303)
  • has_provider_registry (293-295)
  • swap_provider_registry (305-329)
  • model_router (337-339)
  • has_model_router (332-334)
  • swap_model_router (341-361)
src/synthorg/settings/subscribers/__init__.py (2)
src/synthorg/settings/subscribers/memory_subscriber.py (1)
  • MemorySettingsSubscriber (19-57)
src/synthorg/settings/subscribers/provider_subscriber.py (1)
  • ProviderSettingsSubscriber (28-130)
tests/unit/settings/test_dispatcher.py (1)
src/synthorg/settings/dispatcher.py (3)
  • SettingsChangeDispatcher (48-224)
  • start (77-100)
  • stop (102-114)
src/synthorg/settings/dispatcher.py (3)
src/synthorg/settings/subscriber.py (4)
  • SettingsSubscriber (11-55)
  • watched_keys (31-33)
  • on_settings_changed (40-55)
  • subscriber_name (36-38)
src/synthorg/settings/subscribers/provider_subscriber.py (3)
  • watched_keys (60-62)
  • on_settings_changed (69-92)
  • subscriber_name (65-67)
src/synthorg/settings/subscribers/memory_subscriber.py (3)
  • watched_keys (31-33)
  • on_settings_changed (40-57)
  • subscriber_name (36-38)
src/synthorg/settings/__init__.py (1)
src/synthorg/settings/subscriber.py (1)
  • SettingsSubscriber (11-55)
tests/unit/api/test_app.py (2)
src/synthorg/api/app.py (2)
  • _safe_startup (357-457)
  • _safe_shutdown (460-509)
src/synthorg/api/state.py (2)
  • persistence (142-144)
  • AppState (36-361)
src/synthorg/api/app.py (3)
src/synthorg/settings/dispatcher.py (3)
  • SettingsChangeDispatcher (48-224)
  • stop (102-114)
  • start (77-100)
src/synthorg/settings/subscribers/memory_subscriber.py (1)
  • MemorySettingsSubscriber (19-57)
src/synthorg/settings/subscribers/provider_subscriber.py (1)
  • ProviderSettingsSubscriber (28-130)
🔇 Additional comments (31)
src/synthorg/settings/definitions/providers.py (1)

3-3: LGTM — Dynamic enum_values from STRATEGY_MAP.

Deriving enum_values from STRATEGY_MAP.keys() ensures the setting definition stays synchronized with available strategies. The sorted() call provides deterministic ordering.

Also applies to: 29-29

src/synthorg/communication/config.py (1)

30-30: LGTM — Adding #settings channel.

The new channel supports the settings change notification mechanism. Placement and naming are consistent with existing channels.

src/synthorg/settings/__init__.py (1)

22-22: LGTM — Public API export of SettingsSubscriber.

Clean addition following the existing pattern of re-exports in this module.

Also applies to: 38-38

src/synthorg/settings/subscribers/__init__.py (1)

1-13: LGTM — Package initialization with clean exports.

Standard re-export pattern for the concrete subscriber implementations.

src/synthorg/observability/events/settings.py (1)

19-33: LGTM — Comprehensive event constants for settings subsystem.

The new constants cover the full lifecycle of dispatchers, subscribers, and service swaps. Naming follows the established settings.<component>.<action> convention. Based on learnings, event names should use constants from domain-specific modules, and this implementation correctly extends the settings domain.

src/synthorg/settings/subscribers/memory_subscriber.py (1)

1-57: LGTM — Clean MemorySettingsSubscriber implementation.

The implementation correctly follows the SettingsSubscriber protocol with proper logging setup, structured logging using event constants, and idempotent behavior. The docstrings clearly explain that restart-required keys are filtered by the dispatcher.

tests/unit/api/test_state_swap.py (2)

1-106: LGTM — Comprehensive tests for AppState swap methods.

Good coverage of the hot-reload swap functionality:

  • Accessor behavior when None vs configured
  • has_* presence flags
  • Swap replaces reference correctly
  • Swap from None initial state

The @pytest.mark.unit markers are correctly applied per testing guidelines.


13-19: 🧹 Nitpick | 🔵 Trivial

Consider adding return type hint to _make_state.

The helper functions _make_registry and _make_router have return type hints, but _make_state is missing one. For consistency:

✏️ Suggested type hint addition
-def _make_state(**overrides: object) -> AppState:
+def _make_state(**overrides: object) -> AppState:

Actually, looking again, the return type is already present. Disregard.

tests/unit/settings/test_memory_subscriber.py (1)

1-46: LGTM!

Tests are well-structured and appropriately cover:

  • Protocol conformance via isinstance check
  • watched_keys contents including negative case for restart_required keys
  • subscriber_name property
  • on_settings_changed behavior (no-raise, idempotent)

The async tests correctly rely on asyncio_mode = 'auto' without manual markers.

docs/design/operations.md (2)

976-976: LGTM!

API surface table correctly updated with the new /api/v1/settings endpoint.


1045-1045: LGTM!

Hot-reload documentation is comprehensive and accurately describes:

  • SettingsChangeDispatcher polling behavior
  • #settings bus channel routing
  • restart_required=True filtering with WARNING log
  • Concrete subscriber behaviors (ProviderSettingsSubscriber rebuilds ModelRouter, MemorySettingsSubscriber for advisory logging)
tests/unit/settings/test_subscriber_protocol.py (1)

1-69: LGTM!

Excellent protocol conformance tests:

  • Runtime-checkable verification with both conforming and non-conforming test doubles
  • Property return type validation (frozenset, str)
  • Async method awaitable check

The test doubles (_ConformingSubscriber, _MissingMethod) are appropriately minimal and well-documented.

tests/unit/settings/test_provider_subscriber.py (1)

1-162: LGTM!

Comprehensive test coverage for ProviderSettingsSubscriber:

  • Protocol conformance (isinstance, watched_keys, subscriber_name)
  • Router rebuild on routing_strategy change with swap verification
  • Error propagation for UnknownStrategyError and SettingsService failures
  • No-op behavior for advisory keys (default_provider, retry_max_attempts)

Helper functions (_setting_value, _make_state, _make_subscriber) reduce boilerplate effectively.

CLAUDE.md (2)

117-117: LGTM!

Package structure documentation correctly updated to include AppState hot-reload slots (provider_registry, model_router with swap methods) and settings dispatcher lifecycle integration.


131-133: LGTM!

Settings subsystem documentation accurately reflects the new components:

  • SettingsSubscriber protocol in subscriber.py
  • SettingsChangeDispatcher in dispatcher.py with polling, routing, and restart_required filtering
  • subscribers/ subpackage with concrete implementations
src/synthorg/settings/dispatcher.py (5)

204-224: Subscriber isolation looks good.

The try block now properly encompasses:

  • subscriber.watched_keys access (line 206)
  • subscriber.on_settings_changed call (line 208)
  • subscriber.subscriber_name access for logging (line 211)

The fallback to "unknown" via getattr in the exception handler (line 220) handles cases where subscriber_name access itself fails. This addresses the isolation contract documented in the protocol.


251-256: Fail-safe restart_required handling is correct.

Using str(restart_raw).lower() (line 255) safely handles non-string values (bool, int, etc.) without raising AttributeError. The fail-safe default to True when metadata is corrupted prevents accidental hot-reload of restart-required settings.


1-46: LGTM!

Well-structured module setup:

  • Proper imports with TYPE_CHECKING guard
  • Module logger via get_logger(__name__)
  • Clear constants for configuration (_POLL_TIMEOUT, _ERROR_BACKOFF, _MAX_CONSECUTIVE_ERRORS)
  • _ChangeMetadata NamedTuple for structured metadata extraction

48-101: LGTM!

SettingsChangeDispatcher class is well-designed:

  • Clear docstring with behavior documentation
  • start() properly guards against double-start
  • Done callback (_on_task_done) for detecting poll-loop death
  • Correct use of asyncio.create_task with descriptive name

148-186: LGTM!

Poll loop implements proper resilience:

  • Consecutive error tracking with configurable threshold
  • MemoryError/RecursionError re-raised immediately (non-recoverable)
  • OSError/TimeoutError handled with backoff and max-error limit
  • Generic Exception breaks the loop with logging (fail-safe)
  • PEP 758 except syntax correctly used throughout
tests/unit/api/test_app.py (6)

164-206: LGTM!

New test test_settings_dispatcher_failure_cleans_up properly verifies:

  • When dispatcher start fails, persistence and bus are cleaned up
  • Uses AsyncMock for the dispatcher's async start/stop methods
  • Follows the established pattern from other failure tests in this class

110-110: LGTM!

Call site correctly updated with the new parameter order matching _safe_startup signature.


128-128: LGTM!

_safe_shutdown call correctly updated with 6 parameters.


158-162: LGTM!

Task engine failure test correctly updated with new parameter order.


218-218: LGTM!

Shutdown test correctly updated for new arity.


247-259: LGTM!

Meeting scheduler lifecycle test correctly updated for both _safe_startup and _safe_shutdown calls with the new parameter positions.

src/synthorg/settings/subscribers/provider_subscriber.py (3)

94-130: LGTM!

_rebuild_router correctly implements the suggested fix from the past review:

  • attempted_strategy initialized to None before the try block (line 105)
  • SettingsService.get() call is inside the try block (lines 107-110)
  • Error logging includes attempted_strategy context even when it's None (line 126)
  • swap_model_router only called after successful construction (line 130)
  • PEP 758 except syntax correctly used (line 120)

1-68: LGTM!

Well-structured subscriber implementation:

  • Clear module docstring and class docstring documenting behavior
  • _WATCHED constant with all three provider keys
  • Proper TYPE_CHECKING guard for type-only imports
  • Module logger via get_logger(__name__)
  • Properties correctly return immutable types (frozenset, str)

69-93: LGTM!

on_settings_changed correctly routes:

  • routing_strategy_rebuild_router() for actual rebuild
  • Other keys → INFO log with advisory note

This matches the documented behavior in the class docstring.

src/synthorg/api/state.py (1)

305-361: Hot-swap service methods are well implemented.

The swap methods keep replacement atomic (no await points) and emit structured swap events with before/after context.

src/synthorg/api/app.py (1)

412-421: Dispatcher lifecycle integration order looks correct.

Starting the settings dispatcher after bridge startup and stopping it before bridge shutdown is the right dependency order.

Also applies to: 486-491

…scriber

Event-driven synchronization: _FakeSubscriber.notified asyncio.Event
fires when on_settings_changed completes. Tests await the event
deterministically instead of polling qsize() + fixed sleeps.
Negative-case tests use _wait_for_queue_drain (queue empty + one tick).
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (3)
tests/unit/settings/test_dispatcher.py (3)

195-197: ⚠️ Potential issue | 🟠 Major

Fix fixture return type: AsyncGenerator is missing SendType.

Line 197 should use both generic args for strict typing; otherwise mypy strict will complain.

♻️ Proposed fix
 `@pytest.fixture`
 async def started_dispatcher(
     dispatcher: SettingsChangeDispatcher,
-) -> AsyncGenerator[SettingsChangeDispatcher]:
+) -> AsyncGenerator[SettingsChangeDispatcher, None]:
#!/bin/bash
# Verify AsyncGenerator parameterization in this test module
rg -n "AsyncGenerator\\[" tests/unit/settings/test_dispatcher.py -C2

As per coding guidelines "Type hints required on all public functions. Use mypy strict mode."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/settings/test_dispatcher.py` around lines 195 - 197, The fixture
started_dispatcher's return type uses AsyncGenerator with only the yield type;
add the missing SendType generic (use None for fixtures that don't accept sent
values) so the signature becomes AsyncGenerator[SettingsChangeDispatcher, None];
update the type annotation on the started_dispatcher function (and any related
imports if necessary) to use AsyncGenerator[SettingsChangeDispatcher, None].

577-583: ⚠️ Potential issue | 🟠 Major

Use existing event-driven helper instead of manual time polling.

Lines 578-583 reintroduce sleep-based polling; this test can deterministically wait with _wait_for_subscriber.

♻️ Proposed fix
         await d.start()
         try:
-            # Wait for the subscriber to be called
-            loop = asyncio.get_running_loop()
-            deadline = loop.time() + 10.0
-            while len(sub.calls) == 0:
-                if loop.time() > deadline:
-                    pytest.fail("Timed out waiting for subscriber call")
-                await asyncio.sleep(0.05)
+            await _wait_for_subscriber(sub, timeout=10.0)
             assert ("ns", "k") in sub.calls
         finally:
             await d.stop()

Based on learnings "For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/settings/test_dispatcher.py` around lines 577 - 583, Replace the
manual polling loop that uses asyncio.get_running_loop(), deadline, loop.time()
and asyncio.sleep with the existing event-driven test helper
_wait_for_subscriber; specifically remove the while loop that checks
len(sub.calls) and instead call await _wait_for_subscriber(sub) (or the helper's
correct async signature) to wait deterministically for the subscriber call;
ensure any imports or fixtures providing _wait_for_subscriber remain available
and delete the unused loop, deadline and sleep-related code paths.

220-240: ⚠️ Potential issue | 🟠 Major

Replace queue-drain polling with an event-driven wait.

Lines 233-239 use time-based polling and sleeps, which can still flake under CI load. Prefer waiting on a dequeue event signaled by _FakeBus.receive().

♻️ Proposed fix
 class _FakeBus:
@@
     def __init__(self) -> None:
         self._running = True
         self._queue: asyncio.Queue[DeliveryEnvelope | None] = asyncio.Queue()
+        self._dequeued = asyncio.Event()
@@
     def enqueue(self, envelope: DeliveryEnvelope) -> None:
+        self._dequeued.clear()
         self._queue.put_nowait(envelope)
@@
     async def receive(
@@
     ) -> DeliveryEnvelope | None:
         try:
-            return await asyncio.wait_for(
+            item = await asyncio.wait_for(
                 self._queue.get(),
                 timeout=timeout,
             )
+            self._dequeued.set()
+            return item
         except TimeoutError:
             return None
@@
 async def _wait_for_queue_drain(
@@
 ) -> None:
@@
-    loop = asyncio.get_running_loop()
-    deadline = loop.time() + timeout
-    while bus._queue.qsize() > 0:
-        if loop.time() > deadline:
-            msg = "Queue drain timed out"
-            raise TimeoutError(msg)
-        await asyncio.sleep(0.01)
-    # One event-loop tick for the dispatcher to finish processing
+    await asyncio.wait_for(bus._dequeued.wait(), timeout=timeout)
+    bus._dequeued.clear()
+    # one event-loop tick for the dispatcher to finish processing
     await asyncio.sleep(0)

As per coding guidelines "For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/settings/test_dispatcher.py` around lines 220 - 240, The polling
loop in _wait_for_queue_drain should be replaced with an event-driven wait that
relies on the fake bus signaling rather than time.sleep; update
_wait_for_queue_drain to await a dequeue notification from _FakeBus (e.g., await
asyncio.wait_for(bus.receive() or await bus._dequeue_event.wait(), using the
existing receive()/dequeue event provided by _FakeBus) with the same timeout and
raise TimeoutError on timeout, and remove the qsize polling and sleep(0) tick;
also ensure tests that are timing-sensitive mock time.monotonic() and
asyncio.sleep() as per guidelines so the new event-driven wait is deterministic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@tests/unit/settings/test_dispatcher.py`:
- Around line 195-197: The fixture started_dispatcher's return type uses
AsyncGenerator with only the yield type; add the missing SendType generic (use
None for fixtures that don't accept sent values) so the signature becomes
AsyncGenerator[SettingsChangeDispatcher, None]; update the type annotation on
the started_dispatcher function (and any related imports if necessary) to use
AsyncGenerator[SettingsChangeDispatcher, None].
- Around line 577-583: Replace the manual polling loop that uses
asyncio.get_running_loop(), deadline, loop.time() and asyncio.sleep with the
existing event-driven test helper _wait_for_subscriber; specifically remove the
while loop that checks len(sub.calls) and instead call await
_wait_for_subscriber(sub) (or the helper's correct async signature) to wait
deterministically for the subscriber call; ensure any imports or fixtures
providing _wait_for_subscriber remain available and delete the unused loop,
deadline and sleep-related code paths.
- Around line 220-240: The polling loop in _wait_for_queue_drain should be
replaced with an event-driven wait that relies on the fake bus signaling rather
than time.sleep; update _wait_for_queue_drain to await a dequeue notification
from _FakeBus (e.g., await asyncio.wait_for(bus.receive() or await
bus._dequeue_event.wait(), using the existing receive()/dequeue event provided
by _FakeBus) with the same timeout and raise TimeoutError on timeout, and remove
the qsize polling and sleep(0) tick; also ensure tests that are timing-sensitive
mock time.monotonic() and asyncio.sleep() as per guidelines so the new
event-driven wait is deterministic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3452a607-9960-4a2d-820d-27459c89e6ec

📥 Commits

Reviewing files that changed from the base of the PR and between 43e3456 and f5a8292.

📒 Files selected for processing (1)
  • tests/unit/settings/test_dispatcher.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do NOT use from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14.
Type hints required on all public functions. Use mypy strict mode.
Google-style docstrings required on all public classes and functions. Enforced by ruff D rules.
Line length: 88 characters. Enforced by ruff.
Functions must be less than 50 lines. Files must be less than 800 lines.

Files:

  • tests/unit/settings/test_dispatcher.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow. Coverage: 80% minimum. Async: asyncio_mode = 'auto' — no manual @pytest.mark.asyncio needed. Timeout: 30 seconds per test. Parallelism: pytest-xdist via -n auto — ALWAYS include -n auto when running pytest, never run tests sequentially.
Prefer @pytest.mark.parametrize for testing similar cases.
Property-based testing: use Hypothesis (@given + @settings). Hypothesis profiles: ci (200 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var. Run dev profile: HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties.
NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins.

Files:

  • tests/unit/settings/test_dispatcher.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases. Vendor names may only appear in: (1) Operations design page provider list (docs/design/operations.md), (2) .claude/ skill/agent files, (3) third-party import paths/module names. Tests must use test-provider, test-small-001, etc.

Files:

  • tests/unit/settings/test_dispatcher.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.502Z
Learning: Settings: Runtime-editable settings persistence (DB > env > YAML > code defaults), typed definitions (9 namespaces), Fernet encryption for sensitive values, config bridge, ConfigResolver (typed composed reads for controllers), validation, registry, change notifications via message bus. Per-namespace setting definitions in definitions/ submodule (api, company, providers, memory, budget, security, coordination, observability, backup).
📚 Learning: 2026-03-17T21:13:19.501Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T21:13:19.501Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins.

Applied to files:

  • tests/unit/settings/test_dispatcher.py
🧬 Code graph analysis (1)
tests/unit/settings/test_dispatcher.py (1)
src/synthorg/settings/dispatcher.py (3)
  • SettingsChangeDispatcher (48-224)
  • start (77-100)
  • stop (102-114)

Replace manual polling loop with the event-driven helper.
@Aureliolo Aureliolo merged commit 53f908e into main Mar 17, 2026
25 of 27 checks passed
@Aureliolo Aureliolo deleted the feat/settings-hot-reload branch March 17, 2026 22:02
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 17, 2026 22:02 — with GitHub Actions Inactive
Aureliolo added a commit that referenced this pull request Mar 17, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.3.1](v0.3.0...v0.3.1)
(2026-03-17)


### Features

* **api:** RFC 9457 Phase 2 — ProblemDetail and content negotiation
([#496](#496))
([30f7c49](30f7c49))
* **cli:** verify container image signatures and SLSA provenance on pull
([#492](#492))
([bef272d](bef272d)),
closes [#491](#491)
* **engine:** implement context budget management in execution loops
([#520](#520))
([181eb8a](181eb8a)),
closes [#416](#416)
* implement settings persistence layer (DB-backed config)
([#495](#495))
([4bd99f7](4bd99f7)),
closes [#450](#450)
* **memory:** implement dual-mode archival in memory consolidation
([#524](#524))
([4603c9e](4603c9e)),
closes [#418](#418)
* migrate config consumers to read through SettingsService
([#510](#510))
([32f553d](32f553d))
* **settings:** implement settings change subscriptions for service
hot-reload ([#526](#526))
([53f908e](53f908e)),
closes [#503](#503)
* **settings:** register API config in SettingsService with 2-phase init
([#518](#518))
([29f7481](29f7481))
* **tools:** add SSRF prevention for git clone URLs
([#505](#505))
([492dd0d](492dd0d))
* **tools:** wire RootConfig.git_clone to GitCloneTool instantiation
([#519](#519))
([b7d8172](b7d8172))


### Bug Fixes

* **api:** replace JWT query parameter with one-time ticket for
WebSocket auth
([#493](#493))
([22a25f6](22a25f6)),
closes [#343](#343)


### Documentation

* add uv cache lock contention handling to worktree skill
([#500](#500))
([bd85a8d](bd85a8d))
* document RFC 9457 dual response formats in OpenAPI schema
([#506](#506))
([8dd2524](8dd2524))


### Maintenance

* upgrade jsdom from 28 to 29
([#499](#499))
([1ea2249](1ea2249))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: implement settings change subscriptions for service hot-reload

1 participant