Comparing changes

## Summary

Bridge `search_memory` and `recall_memory` tools into the standard `ToolRegistry`/`ToolInvoker` dispatch pipeline. The `ToolBasedInjectionStrategy` already handles memory retrieval logic; this adds `BaseTool` wrappers that delegate to it, following the established `registry_with_approval_tool` pattern.

## Changes

### New files

- **`src/synthorg/memory/tools.py`** -- `SearchMemoryTool`, `RecallMemoryTool` (`BaseTool` subclasses), `create_memory_tools` factory, `registry_with_memory_tools` augmentation function
- **`tests/unit/memory/test_memory_tools.py`** -- 48 unit tests

### Modified files

- **`src/synthorg/core/enums.py`** -- Added `ToolCategory.MEMORY`, `ActionType.MEMORY_READ`
- **`src/synthorg/engine/agent_engine.py`** -- Added `memory_injection_strategy` parameter, wired into `_make_tool_invoker`
- **`src/synthorg/memory/tool_retriever.py`** -- Made schema/name constants public, wrapped schemas with `MappingProxyType`, extracted error message constants
- **`src/synthorg/security/action_type_mapping.py`** -- Added `MEMORY -> MEMORY_READ` mapping
- **`src/synthorg/security/action_types.py`** -- Added `ActionTypeCategory.MEMORY`
- **`src/synthorg/security/rules/risk_classifier.py`** -- Added `MEMORY_READ` as LOW risk
- **`src/synthorg/security/timeout/risk_tier_classifier.py`** -- Added `MEMORY_READ` as LOW risk
- **`src/synthorg/tools/permissions.py`** -- Added `MEMORY` to SANDBOXED/RESTRICTED/STANDARD access levels
- **`docs/design/memory.md`** -- Added ToolRegistry integration details to Tool-Based Retrieval section
- **`docs/design/operations.md`** -- Added MEMORY to Tool Categories table and `memory:read` to Action Type Taxonomy
- **`CLAUDE.md`** -- Updated memory/ package description

## Design decisions

- **Per-agent tool binding**: Tools are constructed with `agent_id` baked in (via `NotBlankStr`), preventing cross-agent memory leakage
- **Thin wrappers**: `BaseTool.execute()` delegates entirely to `ToolBasedInjectionStrategy.handle_tool_call()` -- no duplicate logic
- **Shared error constants**: Error message strings defined once in `tool_retriever.py`, imported by `tools.py` for `_is_error_response` detection
- **Graceful degradation**: `registry_with_memory_tools` catches construction failures and returns the original registry
- **MEMORY at all access levels**: Read-only, agent-scoped -- safe from SANDBOXED upward

## Test plan

- 48 new tests covering: tool properties, delegation, error paths, schema isolation, factory, registry augmentation, round-trips, error response detection, generic exception paths, graceful degradation
- Updated existing parametrize lists for MEMORY category in permissions, risk classifiers, action type mapping, and enum count tests
- Full suite: 13151 passed, 0 failed

## Review coverage

Pre-reviewed by 8 agents (code-reviewer, conventions-enforcer, silent-failure-hunter, type-design-analyzer, docs-consistency, test-quality-reviewer, issue-resolution-verifier, logging-audit). 15 findings addressed.

Closes #207
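The per-agent binding, thin-wrapper, and graceful-degradation decisions in the memory tools description can be illustrated with a minimal sketch. Everything here is a stand-in: `MemoryStrategy`, `SearchMemoryToolSketch`, `registry_with_memory_tools_sketch`, the schema contents, and the `handle_tool_call` signature are assumptions for illustration, not the actual SynthOrg classes.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Any, Dict, List, Mapping, Protocol, Tuple

# Hypothetical schema; wrapping it in MappingProxyType makes the top level read-only.
SEARCH_MEMORY_SCHEMA: Mapping[str, Any] = MappingProxyType(
    {"type": "object", "required": ("query",)}
)


class MemoryStrategy(Protocol):
    """Stand-in for ToolBasedInjectionStrategy (this signature is assumed)."""

    def handle_tool_call(
        self, tool_name: str, arguments: Mapping[str, Any], agent_id: str
    ) -> str: ...


@dataclass(frozen=True)
class SearchMemoryToolSketch:
    """Thin wrapper: holds the agent id and forwards every call to the strategy."""

    strategy: MemoryStrategy
    agent_id: str

    def __post_init__(self) -> None:
        # Mimics a NotBlankStr-style check: a blank agent id is rejected up front.
        if not self.agent_id.strip():
            raise ValueError("agent_id must not be blank")

    def execute(self, arguments: Mapping[str, Any]) -> str:
        # No retrieval logic here; the bound agent_id cannot be overridden by the caller.
        return self.strategy.handle_tool_call("search_memory", arguments, self.agent_id)


def registry_with_memory_tools_sketch(
    registry: Dict[str, Any], strategy: MemoryStrategy, agent_id: str
) -> Dict[str, Any]:
    """Return an augmented copy, or the original registry if construction fails."""
    try:
        tool = SearchMemoryToolSketch(strategy, agent_id)
    except ValueError:
        return registry
    return {**registry, "search_memory": tool}


class RecordingStrategy:
    """Test double that records calls instead of touching a memory backend."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Dict[str, Any], str]] = []

    def handle_tool_call(
        self, tool_name: str, arguments: Mapping[str, Any], agent_id: str
    ) -> str:
        self.calls.append((tool_name, dict(arguments), agent_id))
        return f"{agent_id}: results for {arguments['query']!r}"
```

Because the agent id is fixed at construction time, the arguments supplied by a model at call time have no way to address another agent's memory in this sketch.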

## Summary

Restructure Hypothesis property-based testing to be deterministic in CI and capture failures persistently across worktrees.

### Changes

- **CI profile** (`phases=[Phase.explicit]`): only runs explicit `@example()` cases -- fully deterministic and reproducible. No random generation in CI.
- **Dev profile** (1000 examples): local random fuzzing with failure capture.
- **Fuzz profile** (10,000 examples, no deadline): dedicated long-running fuzzing sessions.
- **Write-only shared DB** at `~/.synthorg/hypothesis-examples/`: captures every failing example from dev/fuzz runs to a central location outside any worktree. Failures are logged for analysis but **never replayed** automatically (avoids blocking all test runs until fixed).
- **`_WriteOnlyDatabase`** wrapper: custom `ExampleDatabase` subclass that writes to the shared DB but returns empty on `fetch()`.

### Motivation

Previously, Hypothesis ran random generation in CI (non-deterministic, caused spurious failures) and the local `.hypothesis/` example database was inside each worktree (lost on worktree deletion). Now:

- CI is fully reproducible -- same inputs every run
- Random fuzzing happens locally where developers can investigate failures
- Failures are captured centrally for periodic analysis and conversion to `@example()` decorators

### Test plan

- `uv run python -m pytest tests/ -m unit -n 8 -k properties` -- 66 passed (explicit examples), 89 skipped (no random gen in CI profile)
- `HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n 8 -k properties` -- 155 passed (full random fuzzing)
- `uv run mypy src/ tests/` -- clean
- Full suite: 13005 passed
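The write-only behaviour described for `_WriteOnlyDatabase` can be sketched without any third-party dependency. This is an illustration of the pattern only: `ExampleStore`, `InMemoryStore`, and `WriteOnlyStore` are hypothetical names, and in the real code the wrapper would presumably subclass Hypothesis's `ExampleDatabase` and wrap a directory-backed store at the shared path. Treating `delete()` as a no-op is also an assumption made here so that captured failures stay available for analysis.

```python
from typing import Dict, Iterable, Iterator, Protocol, Set


class ExampleStore(Protocol):
    """The three operations a Hypothesis-style example database exposes."""

    def save(self, key: bytes, value: bytes) -> None: ...
    def fetch(self, key: bytes) -> Iterable[bytes]: ...
    def delete(self, key: bytes, value: bytes) -> None: ...


class InMemoryStore:
    """Stand-in for the shared on-disk store, used only for this demonstration."""

    def __init__(self) -> None:
        self._data: Dict[bytes, Set[bytes]] = {}

    def save(self, key: bytes, value: bytes) -> None:
        self._data.setdefault(key, set()).add(value)

    def fetch(self, key: bytes) -> Iterator[bytes]:
        return iter(sorted(self._data.get(key, set())))

    def delete(self, key: bytes, value: bytes) -> None:
        self._data.get(key, set()).discard(value)


class WriteOnlyStore:
    """Records every failing example but never hands one back for replay."""

    def __init__(self, backing: ExampleStore) -> None:
        self._backing = backing

    def save(self, key: bytes, value: bytes) -> None:
        # Failures are forwarded to the shared store for later analysis.
        self._backing.save(key, value)

    def fetch(self, key: bytes) -> Iterator[bytes]:
        # Always empty: a saved failure can never block a later test run.
        return iter(())

    def delete(self, key: bytes, value: bytes) -> None:
        # Assumption: captured failures are kept until someone triages them.
        return None
```

Returning nothing from `fetch()` is what separates capture from replay: a failure found during fuzzing is preserved, while the next run starts from a clean slate until the case is promoted to an explicit `@example()`.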

…#1040)

## Summary

Implements workflow execution: when a user activates a `WorkflowDefinition`, the engine creates concrete `Task` instances following the graph topology, respecting dependencies and conditional edges.

### What changed

**Core execution engine** (`src/synthorg/engine/workflow/`):

- `execution_service.py` -- `WorkflowExecutionService` with `activate`, `get_execution`, `list_executions`, `cancel_execution`
- `execution_models.py` -- `WorkflowExecution` and `WorkflowNodeExecution` frozen Pydantic models with cross-field validators
- `condition_eval.py` -- Safe string-based condition evaluator (no `eval`/`exec`) supporting boolean literals, key lookups, equality, inequality
- `graph_utils.py` -- Shared topological sort (Kahn's algorithm) and adjacency map construction, extracted from yaml_export.py

**Activation algorithm** (topological walk):

1. Validate the definition via `validate_workflow()`
2. Build adjacency maps and topological sort via shared `graph_utils`
3. Walk nodes in topological order: START/END/SPLIT/JOIN marked COMPLETED, AGENT_ASSIGNMENT propagates agent name, TASK creates concrete task via TaskEngine with upstream dependencies wired, CONDITIONAL evaluates expression and skips untaken branch
4. Persist execution in RUNNING status

**API** (`src/synthorg/api/controllers/workflow_executions.py`):

- `POST /activate/{workflow_id}` -- Activate a workflow definition (201)
- `GET /by-definition/{workflow_id}` -- List executions for a definition
- `GET /{execution_id}` -- Get a specific execution
- `POST /{execution_id}/cancel` -- Cancel an execution (with status guard)

**Persistence** (`src/synthorg/persistence/`):

- `WorkflowExecutionRepository` protocol with CRUD + `list_by_status`
- `SQLiteWorkflowExecutionRepository` with optimistic concurrency, JSON-serialized node executions, composite indexes
- `workflow_executions` table with FK to `workflow_definitions`, CHECK constraints, 6 indexes

**Observability**:

- `workflow_execution.py` event constants (activation, task creation, condition evaluation, lifecycle)
- Persistence event constants for execution CRUD

### Test plan

- **13186 tests pass** (11 new tests added)
- `test_condition_eval.py` -- 22 tests covering all expression types, safety, edge cases
- `test_execution_models.py` -- 16 tests covering validation, immutability, cross-field invariants, NaN rejection
- `test_execution_service.py` -- 14 tests covering simple/sequential/parallel/conditional workflows, agent assignment, errors, cancel (including terminal status guard)
- `test_workflow_executions.py` -- 7 API controller tests (activate, list, get, cancel endpoints)
- `test_workflow_execution_repo.py` -- 10 SQLite CRUD tests (roundtrip, version conflict, list, delete)
- `test_graph_utils.py` -- 8 tests (linear, diamond, cycle detection, empty, disconnected, plain dict return)
- Protocol compliance test for `WorkflowExecutionRepository`
- Migration test updated for new table + 6 indexes

### Review coverage

Pre-reviewed by 16 specialized agents, 33 findings addressed:

- 5 critical (dependency wiring bug, cancel status guard, cross-field validators, docs drift, package structure)
- 16 major (model_copy, audit trail, function lengths, error handling, logging, indexes, rollback)
- 12 medium (magic numbers, silent fallbacks, unbounded input, docstrings, test quality)

Closes #1004

Release-As: 0.6.0
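The two building blocks named in the workflow execution description, a Kahn's-algorithm topological sort and an evaluator that never calls `eval`/`exec`, might look roughly like the sketch below. The function names, the `==`/`!=` operator spelling, and the handling of missing keys are assumptions for illustration; the real `graph_utils.py` and `condition_eval.py` may differ.

```python
from collections import deque
from typing import Any, Dict, List, Mapping, Tuple


def topological_sort(nodes: List[str], edges: List[Tuple[str, str]]) -> List[str]:
    """Kahn's algorithm; raises ValueError when the graph contains a cycle."""
    adjacency: Dict[str, List[str]] = {node: [] for node in nodes}
    in_degree: Dict[str, int] = {node: 0 for node in nodes}
    for source, target in edges:
        adjacency[source].append(target)
        in_degree[target] += 1

    # Start from nodes with no upstream dependencies.
    ready = deque(node for node in nodes if in_degree[node] == 0)
    order: List[str] = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for downstream in adjacency[node]:
            in_degree[downstream] -= 1
            if in_degree[downstream] == 0:
                ready.append(downstream)

    # Any node never reaching in-degree zero sits on a cycle.
    if len(order) != len(nodes):
        raise ValueError("workflow graph contains a cycle")
    return order


def evaluate_condition(expression: str, context: Mapping[str, Any]) -> bool:
    """Evaluate a tiny expression language using string operations only."""
    expr = expression.strip()
    if expr.lower() in ("true", "false"):
        return expr.lower() == "true"
    # "!=" is checked first so that it is never mistaken for a key lookup.
    for operator in ("!=", "=="):
        if operator in expr:
            key, _, expected = expr.partition(operator)
            actual = context.get(key.strip())
            matches = actual is not None and str(actual) == expected.strip()
            return matches if operator == "==" else not matches
    # Bare identifier: truthiness of the looked-up value, False when absent.
    return bool(context.get(expr))
```

Walking the returned order guarantees that every TASK node is visited after all of its upstream nodes, which is what allows dependencies to be wired at creation time.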

## Summary

Local providers (Ollama, LM Studio) have management APIs that SynthOrg didn't use. The setup wizard showed wrong model counts (LiteLLM static DB vs live discovery), and there was no way to download, delete, or configure local models from the dashboard.

### Bug fix

- **Model count mismatch**: For presets with `auth_type=NONE` (Ollama, LM Studio, vLLM), skip `models_from_litellm()` and always use live discovery. The static DB returned stale/wrong models (Ollama got old entries; LM Studio/vLLM got OpenAI cloud models via `litellm_provider="openai"`).

### Backend

- **Preset capability flags**: `supports_model_pull`, `supports_model_delete`, `supports_model_config` on `ProviderPreset` (Ollama: all true; LM Studio/vLLM: deferred)
- **`LocalModelParams`**: Per-model launch parameters (`num_ctx`, `num_gpu_layers`, `num_threads`, `num_batch`, `repeat_penalty`) stored on `ProviderModelConfig`
- **`preset_name`** on `ProviderConfig` for capability flag resolution in API responses
- **`OllamaModelManager`**: Pull via streaming `/api/pull` (newline-delimited JSON), delete via `/api/delete`, with `LocalModelManager` protocol
- **Service layer**: `pull_model` (async generator), `delete_model` (with auto-refresh), `update_model_config` (immutable update under lock)
- **API endpoints**: `POST /{name}/models/pull` (SSE streaming), `DELETE /{name}/models/{model_id}` (204), `PUT /{name}/models/{model_id}/config` (returns updated model)
- **Error handling**: httpx transport errors caught in pull stream, SSE generator catch-all, terminal events on stream disconnect, malformed JSON logging

### Dashboard

- **Model pull dialog**: Input + ProgressGauge + cancel via AbortController, proper overlay with a11y (role=dialog, aria-modal, Escape handler)
- **Model delete**: ConfirmDialog (destructive variant) per model row
- **Model config drawer**: Per-model parameter editor (num_ctx, num_gpu_layers, num_threads, num_batch, repeat_penalty) with NaN-safe parsing
- **Refresh button**: RefreshCw with spin animation, calls discover-models
- **All controls gated** by preset capability flags (`supports_model_pull/delete/config`)
- **Storybook stories** for ModelPullDialog and ModelConfigDrawer

### Documentation

- Updated CLAUDE.md package structure with local model management
- Updated docs/design/operations.md: API Surface table, Provider Management section, preset capability flags, Web UI features

## Test plan

- 13,070 Python tests pass (30 new tests for local model management)
- 2,342 web tests pass (195 test files)
- Zero lint warnings (ruff, mypy, ESLint)
- Clean TypeScript type check

### Visual testing

- After bug fix: re-run setup wizard, verify auto-detect and configured provider show same model count
- After pull endpoint: `curl -X POST localhost:3001/api/v1/providers/ollama/models/pull -H "Content-Type: application/json" -d '{"model_name":"llama3.2:1b"}' -N`
- After dashboard: navigate to Providers > Ollama detail, verify real models, test pull/delete/config controls

## Review coverage

Pre-reviewed by 8 agents (code-reviewer, conventions-enforcer, frontend-reviewer, silent-failure-hunter, type-design-analyzer, api-contract-drift, docs-consistency, issue-resolution-verifier). 22 findings identified and fixed.

Closes #1030
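The newline-delimited JSON handling mentioned for the pull stream, including logging of malformed lines and a terminal event, can be sketched as a pure parser over an iterable of text lines. `PullProgress` and `parse_pull_stream` are hypothetical names, and the `status`, `total`, `completed`, and `error` fields are assumed to follow the shape of Ollama's pull progress objects; the real `OllamaModelManager` reads from an httpx stream rather than a list.

```python
import json
import logging
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional

logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class PullProgress:
    """One progress event derived from a single NDJSON line (illustrative shape)."""

    status: str
    percent: Optional[float]
    done: bool


def parse_pull_stream(lines: Iterable[str]) -> Iterator[PullProgress]:
    """Turn newline-delimited JSON progress lines into typed events."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            # Malformed lines are logged and skipped rather than aborting the pull.
            logger.warning("skipping malformed pull progress line: %r", line)
            continue
        if not isinstance(payload, dict):
            continue
        if "error" in payload:
            # An error object ends the stream with a terminal event.
            yield PullProgress(f"error: {payload['error']}", None, True)
            return
        status = str(payload.get("status", ""))
        total = payload.get("total")
        completed = payload.get("completed", 0)
        percent: Optional[float] = None
        if isinstance(total, (int, float)) and total > 0:
            percent = round(100.0 * completed / total, 1)
        yield PullProgress(status, percent, status == "success")
```

Keeping the parser independent of the transport makes it straightforward to unit test with canned lines, while the transport-error handling stays in the layer that owns the HTTP connection.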

## Summary

Add ceremony policy configuration to the web dashboard at all 3 resolution levels (project, department, per-ceremony), with strategy-specific config and visual feedback.

## What changed

### Backend

- **Ceremony policy API controller** (`/ceremony-policy`) with 3 endpoints: project policy query, resolved policy with field-level origin tracking (`PolicySourceBadge`), active sprint strategy (for warning banners)
- **Department ceremony policy endpoints** on `/departments/{name}/ceremony-policy` (GET/PUT/DELETE) with data-loss protection (raises on failed reads instead of silently returning empty state)
- **7 ceremony settings** registered in coordination namespace: strategy, strategy_config, velocity_calculator, auto_transition, transition_threshold, dept_ceremony_policies, ceremony_policy_overrides
- **`active_strategy` property** added to `CeremonyScheduler`
- **Observability event constants** for ceremony policy API operations

### Frontend

- **CeremonyPolicyPage** at `/settings/coordination/ceremony-policy` with:
  - Strategy picker (all 8 strategies) with descriptions and velocity unit indicator
  - Strategy-specific config panels for all 8 strategies (task-driven, calendar, hybrid, event-driven, budget-driven, throughput-adaptive, external-trigger, milestone-driven)
  - Auto-transition toggle and threshold controls
  - Strategy change warning banner when pending strategy differs from active sprint
  - `PolicySourceBadge` showing resolved field origins (project/department/default)
- **DepartmentOverridesPanel** with per-department inherit/override toggles and expandable config forms
- **CeremonyListPanel** with per-ceremony inherit/override for individual ceremonies (sprint_planning, standup, sprint_review, retrospective)
- **DepartmentCeremonyOverride** integrated into DepartmentEditDrawer in org-edit page
- **PolicySourceBadge** and **InheritToggle** shared UI components with Storybook stories
- **Zustand store**, API endpoint module, TypeScript types, ceremony constants

### Review fixes (pre-reviewed by 10 agents, 30 findings addressed)

- Fixed boolean-to-integer bug in StrEnum serialization (`isinstance` instead of `hasattr`)
- Fixed data-loss chain in read-modify-write pattern (raises on failed reads)
- Added `deepcopy` at system boundary for department policy storage
- Added error handling for invalid settings values with structured logging
- Added `co-occurrence` validator on `ActiveCeremonyStrategyResponse`
- Used `NotBlankStr` for identifier fields, tightened `value: Any` to union type
- Added `HTTP_204_NO_CONTENT` to DELETE endpoint
- Added `aria-label`, `aria-expanded` for accessibility
- Added `.catch()` handlers and error state display for all async operations
- Added JSON parse error feedback in CodeMirror editors
- Fixed design tokens (`space-y-section-gap`, `p-card`)
- Added `readonly` to frozen response types

### Documentation

- Updated CLAUDE.md Package Structure (`api/` description)
- Updated web/CLAUDE.md component inventory (PolicySourceBadge, InheritToggle) and stores list
- Updated `docs/design/ceremony-scheduling.md` roadmap (moved #979 to shipped)

## Test plan

- `uv run python -m pytest tests/ -m unit -n 8 -k ceremony` -- 31 tests pass
- `npm --prefix web run type-check` -- zero errors
- `npm --prefix web run lint` -- zero warnings
- `npm --prefix web run test` -- 2342 tests pass
- Visual: navigate to Settings > Coordination > Ceremony Policy link
- Visual: strategy picker shows 8 options with descriptions and velocity unit
- Visual: strategy change warning banner appears when strategy differs from active sprint
- Visual: department overrides with inherit/override toggles
- Visual: per-ceremony overrides for individual ceremonies
- Visual: org-edit department drawer has ceremony policy section

Closes #979
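The resolved-policy endpoint with field-level origin tracking can be pictured with a small sketch. The precedence used here (department override, then project setting, then built-in default), the default values, and the names `ResolvedField` and `resolve_policy` are assumptions for illustration; only the three origin labels come from the description above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Optional

# Hypothetical built-in defaults; the real values are not stated in the description.
DEFAULTS: Dict[str, Any] = {
    "strategy": "task-driven",
    "auto_transition": True,
    "transition_threshold": 1.0,
}


@dataclass(frozen=True)
class ResolvedField:
    """A resolved value plus the level it came from, as a source badge would show."""

    value: Any
    source: str  # "department", "project", or "default"


def resolve_policy(
    project: Mapping[str, Any],
    department: Optional[Mapping[str, Any]] = None,
) -> Dict[str, ResolvedField]:
    """Resolve each field independently so that origins can differ per field."""
    resolved: Dict[str, ResolvedField] = {}
    for name, default in DEFAULTS.items():
        if department is not None and department.get(name) is not None:
            resolved[name] = ResolvedField(department[name], "department")
        elif project.get(name) is not None:
            resolved[name] = ResolvedField(project[name], "project")
        else:
            resolved[name] = ResolvedField(default, "default")
    return resolved
```

Resolving per field rather than per policy is what lets a department inherit the project strategy while overriding only, say, the transition threshold.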

🤖 I have created a release *beep* *boop*

---

## [0.6.0](v0.5.9...v0.6.0) (2026-04-03)

### Features

* dashboard UI for ceremony policy settings ([#1038](#1038)) ([865554c](865554c)), closes [#979](#979)
* implement tool-based memory retrieval injection strategy ([#1039](#1039)) ([329270e](329270e)), closes [#207](#207)
* local model management for Ollama and LM Studio ([#1037](#1037)) ([e1b14d3](e1b14d3)), closes [#1030](#1030)
* workflow execution -- instantiate tasks from WorkflowDefinition ([#1040](#1040)) ([e9235e3](e9235e3)), closes [#1004](#1004)

### Maintenance

* shared Hypothesis failure DB + deterministic CI profile ([#1041](#1041)) ([901ae92](901ae92))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).


Commits on Apr 3, 2026
