
Commit 0611b53

feat: implement core tool categories and granular sub-constraints (#1101)
## Summary

Implements #1034 (core tool categories) and #220 (granular tool access sub-constraints) as a single cohesive change. Adds 6 new tools across 3 categories, a shared SSRF prevention layer, and a granular sub-constraint enforcement pipeline integrated into the existing tool invoker.

## Tool Categories (#1034)

### Web tools (`tools/web/`)

- **HttpRequestTool**: GET/POST/PUT/DELETE via httpx with SSRF prevention (shared `NetworkPolicy`, IP blocklist, DNS validation). Redirects disabled to prevent SSRF bypass. Response truncation at configurable max bytes.
- **WebSearchTool**: Vendor-agnostic search via `WebSearchProvider` protocol -- no concrete implementation shipped (inject via MCP bridge or custom provider).
- **HtmlParserTool**: Text/links/metadata extraction via stdlib `html.parser`. Strips script/style tags. Operates on pre-fetched content (no HTTP).

### Database tools (`tools/database/`)

- **SqlQueryTool**: Parameterized SQL execution via aiosqlite. Read-only by default with defense-in-depth: statement keyword classification AND SQLite URI-based `?mode=ro`. Query timeout via `asyncio.wait_for`. Table name validation regex for PRAGMA queries.
- **SchemaInspectTool**: `list_tables` (sqlite_master) + `describe_table` (PRAGMA table_info with safe identifier validation).

### Terminal tools (`tools/terminal/`)

- **ShellCommandTool**: Sandboxed command execution via `SandboxBackend` delegation. Command allow/blocklist. Working directory support. Output truncation. Returns error when no sandbox configured.

### Shared infrastructure

- **`network_validator.py`**: Extracted SSRF blocklist and DNS validation from `git_url_validator` (backward compatible). `NetworkPolicy` model reusable across tool categories. Case-insensitive scheme validation.
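The address checks behind a shared SSRF layer like the one described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the contents of `network_validator.py`: the names `is_blocked_ip` and `is_allowed_scheme` and the exact set of blocked ranges are assumptions. In a real validator, every address returned by DNS resolution would presumably be passed through the same check before the request is sent.

```python
import ipaddress

# Hypothetical allowlist; the project's NetworkPolicy may differ.
ALLOWED_SCHEMES = ("http", "https")


def is_blocked_ip(address: str) -> bool:
    """Return True when a resolved address must not be contacted.

    Hypothetical helper for illustration only.
    """
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Fail closed: anything that does not parse as an IP is rejected.
        return True
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) so the IPv4 rules apply.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def is_allowed_scheme(scheme: str) -> bool:
    """Case-insensitive scheme check, so 'HTTPS' and 'https' are treated alike."""
    return scheme.lower() in ALLOWED_SCHEMES
```

Failing closed on unparseable input means a malformed resolver result is rejected rather than let through.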
## Granular Sub-Constraints (#220)

### Models (`sub_constraints.py`)

- Five constraint dimension enums: `FileSystemScope`, `NetworkMode`, `GitAccess`, `CodeExecutionIsolation`, `TerminalAccess`
- `ToolSubConstraints` frozen Pydantic model with per-level defaults matching operations.md section 11.2
- `get_sub_constraints()` resolution with custom override support

### Enforcement (`sub_constraint_enforcer.py`)

- `SubConstraintEnforcer` checks network (blocks WEB when NONE), terminal (blocks TERMINAL when NONE), git (blocks push for LOCAL_ONLY/READ_AND_BRANCH, blocks clone for LOCAL_ONLY), and requires_approval (escalation for matching action type prefixes)
- Integrated into `ToolPermissionChecker` (optional `sub_constraints` param)
- Wired into `ToolInvoker` pipeline between permission check and param validation

### Agent model

- `ToolPermissions` gains `sub_constraints: ToolSubConstraints | None` field for per-agent overrides

## Integration

- Tool factory extended with `_build_web/database/terminal_tools` builders
- `RootConfig` gains optional `web`, `database`, `terminal` config fields
- Event constants for web, database, terminal, sub_constraint domains
- New dependency: `httpx==0.28.1` (async HTTP client)

## Security Highlights

- SSRF: shared IP blocklist (IPv4+IPv6), DNS resolution validation, scheme restriction, redirect disabled, fail-closed on unparseable IPs
- SQL injection: parameterized queries + table name regex + SQLite read-only URI mode
- Command injection: sandbox delegation + allow/blocklist (documented as best-effort safety net, sandbox is primary defense)
- Sub-constraints: network=NONE blocks web tools, terminal=NONE blocks shell tools, git=LOCAL_ONLY blocks clone+push

## Test Plan

- 170+ new unit tests across 22 test files
- All 14,827 existing unit tests continue to pass
- Pre-reviewed by 10 agents, 25 findings addressed (security fixes, conventions, docs)

## Review Coverage

- code-reviewer, security-reviewer, type-design-analyzer, silent-failure-hunter, pr-test-analyzer, async-concurrency-reviewer, conventions-enforcer, logging-audit, issue-resolution-verifier, docs-consistency

Closes #1034
Closes #220
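The network, terminal, and git rules listed for `SubConstraintEnforcer` can be sketched as a small decision function. This is an illustration under stated assumptions, not the project's implementation: the real model is a frozen Pydantic class, while this sketch uses a dataclass, and the enum members `OPEN` and `FULL`, the `check` function, and the string category and action names are all hypothetical. Only `NONE`, `LOCAL_ONLY`, and `READ_AND_BRANCH` come from the commit message.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class NetworkMode(Enum):
    NONE = "none"
    OPEN = "open"  # assumed member, not named in the commit message


class TerminalAccess(Enum):
    NONE = "none"
    FULL = "full"  # assumed member


class GitAccess(Enum):
    LOCAL_ONLY = "local_only"
    READ_AND_BRANCH = "read_and_branch"
    FULL = "full"  # assumed member


@dataclass(frozen=True)
class SubConstraints:
    """Stand-in for ToolSubConstraints; defaults here are the most restrictive."""

    network: NetworkMode = NetworkMode.NONE
    terminal: TerminalAccess = TerminalAccess.NONE
    git: GitAccess = GitAccess.LOCAL_ONLY


def check(constraints: SubConstraints, category: str, action: str = "") -> Optional[str]:
    """Return a denial reason, or None when the invocation is allowed."""
    if category == "web" and constraints.network is NetworkMode.NONE:
        return "network access is disabled"
    if category == "terminal" and constraints.terminal is TerminalAccess.NONE:
        return "terminal access is disabled"
    if category == "git":
        if action == "push" and constraints.git in (
            GitAccess.LOCAL_ONLY,
            GitAccess.READ_AND_BRANCH,
        ):
            return "git push is not permitted"
        if action == "clone" and constraints.git is GitAccess.LOCAL_ONLY:
            return "git clone is not permitted"
    return None
```

Returning a reason string instead of a bare boolean keeps the denial loggable, which would fit an event such as `SUB_CONSTRAINT_DENIED`.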
1 parent 31e7273 commit 0611b53

58 files changed

Lines changed: 5995 additions & 80 deletions


CLAUDE.md

Lines changed: 3 additions & 3 deletions
@@ -106,7 +106,7 @@ src/synthorg/
 settings/ # Runtime-editable settings (DB > env > YAML > code), Fernet encryption, ConfigResolver, definitions/, subscribers/ (SecuritySubscriber for discovery allowlist hot-reload)
 security/ # Rule engine, audit log, output scanner, progressive trust, autonomy levels, timeout policies, LLM fallback evaluator, custom policy rules, risk scoring (pluggable RiskScorer protocol, multi-dimensional RiskScore, DefaultRiskScorer), enforcement modes (active/shadow/disabled via SecurityEnforcementMode), risk override (SecOps risk tier reclassification via RiskTierOverride + SecOpsRiskClassifier), SSRF violation tracking (SsrfViolation model, pending/allowed/denied status for self-healing discovery allowlist)
 templates/ # Pre-built company templates (inheritance tree), template merge engine, personality presets, preset discovery/CRUD service, model requirements, tier-to-model matching, locale-aware name generation, workflow config rendering, pack_loader (additive team packs), packs/ (built-in pack YAMLs), uses_packs composition
-tools/ # Tool registry, built-in tools, git SSRF prevention, MCP bridge, sandbox factory (gVisor default overrides via merge_gvisor_defaults), invocation tracking, sandbox/ (4-domain SandboxPolicy model (filesystem/network/process/inference), SandboxRuntimeResolver (gVisor probe + per-category runtime resolution with fallback), SandboxCredentialManager (env var credential stripping), SandboxAuthProxy (LLM traffic auth proxy stub))
+tools/ # Tool registry, built-in tools, git SSRF prevention, MCP bridge, sandbox factory (gVisor default overrides via merge_gvisor_defaults), invocation tracking, network_validator (shared SSRF), sub_constraints (per-level constraint models), sub_constraint_enforcer (granular enforcement), web/ (HTTP requests, HTML parsing, web search), database/ (SQL query, schema inspection), terminal/ (sandboxed shell commands), sandbox/ (4-domain SandboxPolicy model (filesystem/network/process/inference), SandboxRuntimeResolver (gVisor probe + per-category runtime resolution with fallback), SandboxCredentialManager (env var credential stripping), SandboxAuthProxy (LLM traffic auth proxy stub))

 web/src/ # React 19 dashboard (see web/CLAUDE.md for full structure)
 cli/ # Go CLI binary (see cli/CLAUDE.md for full structure)
@@ -146,7 +146,7 @@ See `web/CLAUDE.md` for the full component inventory, design token rules, and po
 - **Every module** with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`
 - **Never** use `import logging` / `logging.getLogger()` / `print()` in application code (exception: `observability/setup.py`, `observability/sinks.py`, `observability/syslog_handler.py`, and `observability/http_handler.py` may use stdlib `logging` and `print(..., file=sys.stderr)` for handler construction, bootstrap, and error reporting code that runs before or during logging system configuration)
 - **Variable name**: always `logger` (not `_logger`, not `log`)
-- **Event names**: always use constants from the domain-specific module under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`, `GIT_COMMAND_START` from `events.git`, `CONTEXT_BUDGET_FILL_UPDATED` from `events.context_budget`, `BACKUP_STARTED` from `events.backup`, `SETUP_COMPLETED` from `events.setup`, `ROUTING_CANDIDATE_SELECTED` from `events.routing`, `SHIPPING_HTTP_BATCH_SENT` from `events.shipping`, `EVAL_REPORT_COMPUTED` from `events.evaluation`, `PROMPT_PROFILE_SELECTED` from `events.prompt`, `PROCEDURAL_MEMORY_START` from `events.procedural_memory`, `PERF_LLM_JUDGE_STARTED` from `events.performance`, `TASK_ENGINE_OBSERVER_FAILED` from `events.task_engine`, `WORKFLOW_EXEC_COMPLETED` from `events.workflow_execution`, `BLUEPRINT_INSTANTIATE_START` from `events.blueprint`, `WORKFLOW_DEF_ROLLED_BACK` from `events.workflow_definition`, `WORKFLOW_VERSION_SAVED` from `events.workflow_version`, `MEMORY_FINE_TUNE_STARTED` from `events.memory`, `REPORTING_GENERATION_STARTED` from `events.reporting`, `RISK_BUDGET_SCORE_COMPUTED` from `events.risk_budget`, `LLM_STRATEGY_SYNTHESIZED` and `DISTILLATION_CAPTURED` from `events.consolidation`, `MEMORY_DIVERSITY_RERANKED`, `MEMORY_DIVERSITY_RERANK_FAILED`, and `MEMORY_REFORMULATION_ROUND` from `events.memory`, `NOTIFICATION_DISPATCHED` and `NOTIFICATION_DISPATCH_FAILED` from `events.notification`, `QUALITY_STEP_CLASSIFIED` from `events.quality`, `HEALTH_TICKET_EMITTED` from `events.health`, `TRAJECTORY_SCORING_START` from `events.trajectory`, `COORD_METRICS_AMDAHL_COMPUTED` from `events.coordination_metrics`). Each domain has its own module -- see `src/synthorg/observability/events/` for the full inventory of constants. Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
+- **Event names**: always use constants from the domain-specific module under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`, `GIT_COMMAND_START` from `events.git`, `CONTEXT_BUDGET_FILL_UPDATED` from `events.context_budget`, `BACKUP_STARTED` from `events.backup`, `SETUP_COMPLETED` from `events.setup`, `ROUTING_CANDIDATE_SELECTED` from `events.routing`, `SHIPPING_HTTP_BATCH_SENT` from `events.shipping`, `EVAL_REPORT_COMPUTED` from `events.evaluation`, `PROMPT_PROFILE_SELECTED` from `events.prompt`, `PROCEDURAL_MEMORY_START` from `events.procedural_memory`, `PERF_LLM_JUDGE_STARTED` from `events.performance`, `TASK_ENGINE_OBSERVER_FAILED` from `events.task_engine`, `WORKFLOW_EXEC_COMPLETED` from `events.workflow_execution`, `BLUEPRINT_INSTANTIATE_START` from `events.blueprint`, `WORKFLOW_DEF_ROLLED_BACK` from `events.workflow_definition`, `WORKFLOW_VERSION_SAVED` from `events.workflow_version`, `MEMORY_FINE_TUNE_STARTED` from `events.memory`, `REPORTING_GENERATION_STARTED` from `events.reporting`, `RISK_BUDGET_SCORE_COMPUTED` from `events.risk_budget`, `LLM_STRATEGY_SYNTHESIZED` and `DISTILLATION_CAPTURED` from `events.consolidation`, `MEMORY_DIVERSITY_RERANKED`, `MEMORY_DIVERSITY_RERANK_FAILED`, and `MEMORY_REFORMULATION_ROUND` from `events.memory`, `NOTIFICATION_DISPATCHED` and `NOTIFICATION_DISPATCH_FAILED` from `events.notification`, `QUALITY_STEP_CLASSIFIED` from `events.quality`, `HEALTH_TICKET_EMITTED` from `events.health`, `TRAJECTORY_SCORING_START` from `events.trajectory`, `COORD_METRICS_AMDAHL_COMPUTED` from `events.coordination_metrics`, `WEB_REQUEST_START` and `WEB_SSRF_BLOCKED` from `events.web`, `DB_QUERY_START` and `DB_WRITE_BLOCKED` from `events.database`, `TERMINAL_COMMAND_START` and `TERMINAL_COMMAND_BLOCKED` from `events.terminal`, `SUB_CONSTRAINT_RESOLVED` and `SUB_CONSTRAINT_DENIED` from `events.sub_constraint`). Each domain has its own module -- see `src/synthorg/observability/events/` for the full inventory of constants. Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`
 - **Structured kwargs**: always `logger.info(EVENT, key=value)` -- never `logger.info("msg %s", val)`
 - **All error paths** must log at WARNING or ERROR with context before raising
 - **All state transitions** must log at INFO
@@ -240,7 +240,7 @@ See `web/CLAUDE.md` for the full component inventory, design token rules, and po

 - **Pinned**: all versions use `==` in `pyproject.toml`
 - **Groups**: `test` (pytest + plugins, hypothesis), `dev` (includes test + ruff, mypy, pre-commit, commitizen, pip-audit)
-- **Required**: `mem0ai` (Mem0 memory backend -- the default backend), `mmh3` (murmurhash3 for BM25 sparse vector encoding in hybrid search), `cryptography` (Fernet encryption for sensitive settings at rest), `faker` (multi-locale agent name generation for templates and setup wizard)
+- **Required**: `mem0ai` (Mem0 memory backend -- the default backend), `mmh3` (murmurhash3 for BM25 sparse vector encoding in hybrid search), `cryptography` (Fernet encryption for sensitive settings at rest), `faker` (multi-locale agent name generation for templates and setup wizard), `httpx` (async HTTP client for web tools)
 - **Install**: `uv sync` installs everything (dev group is default)
 - **Web dashboard**: Node.js 22+, TypeScript 6.0+, dependencies in `web/package.json` (React 19, react-router, shadcn/ui, Base UI, Tailwind CSS 4, Zustand, @tanstack/react-query, @xyflow/react, @dagrejs/dagre, d3-force, @dnd-kit, Recharts, Framer Motion, cmdk-base, js-yaml, Axios, Lucide React, @fontsource-variable/geist, @fontsource-variable/geist-mono, @fontsource-variable/jetbrains-mono, @fontsource-variable/inter, @fontsource/ibm-plex-mono, @fontsource/ibm-plex-sans, CodeMirror 6, Storybook 10, MSW, msw-storybook-addon, Vitest, @vitest/coverage-v8, @testing-library/react, fast-check, ESLint, @eslint-react/eslint-plugin, eslint-plugin-security, Playwright, @lhci/cli, rollup-plugin-visualizer, cross-env)
 - **CLI**: Go 1.26+, dependencies in `cli/go.mod` (Cobra, charmbracelet/huh, charmbracelet/lipgloss, sigstore-go, go-containerregistry, go-tuf)

docs/design/operations.md

Lines changed: 6 additions & 3 deletions
@@ -818,9 +818,12 @@ triggered by engine-level operations. No LLM in the security classification path
 description: "Per-agent custom configuration."
 ```

-The current `ToolPermissionChecker` implements **category-level gating only** -- each access
-level maps to a set of permitted `ToolCategory` values. The granular sub-constraints shown
-above (network mode, containerization) are planned for Docker/K8s sandbox backends.
+The `ToolPermissionChecker` implements two layers of enforcement: **category-level gating**
+(each access level maps to permitted `ToolCategory` values) and **granular sub-constraints**
+(`SubConstraintEnforcer`) checking file system scope, network mode, terminal access, git access,
+code execution isolation, and approval requirements against each tool invocation. Per-agent
+overrides can customize all six dimensions via `ToolPermissions.sub_constraints`. K8s sandbox
+backend integration is planned for Phase 3-4.

 ### Progressive Trust

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@ dependencies = [
     "argon2-cffi==25.1.0",
     "cryptography==46.0.6",
     "faker==40.12.0",
+    "httpx==0.28.1",
     "jinja2==3.1.6",
     "jsonschema==4.26.0",
     "litellm==1.83.0",

src/synthorg/config/defaults.py

Lines changed: 3 additions & 0 deletions
@@ -45,4 +45,7 @@ def default_config_dict() -> dict[str, object]:
         "backup": {},
         "workflow": {},
         "notifications": {},
+        "web": None,
+        "database": None,
+        "terminal": None,
     }

src/synthorg/config/schema.py

Lines changed: 20 additions & 0 deletions
@@ -37,9 +37,12 @@
 from synthorg.providers.enums import AuthType
 from synthorg.security.config import SecurityConfig
 from synthorg.security.trust.config import TrustConfig
+from synthorg.tools.database.config import DatabaseConfig # noqa: TC001
 from synthorg.tools.git_url_validator import GitCloneNetworkPolicy
 from synthorg.tools.mcp.config import MCPConfig
 from synthorg.tools.sandbox.sandboxing_config import SandboxingConfig
+from synthorg.tools.terminal.config import TerminalConfig # noqa: TC001
+from synthorg.tools.web.config import WebToolsConfig # noqa: TC001

 logger = get_logger(__name__)

@@ -573,6 +576,11 @@ class RootConfig(BaseModel):
         backup: Backup and restore configuration.
         workflow: Workflow type configuration.
         notifications: Notification subsystem configuration.
+        web: Web tool configuration (``None`` = default web config).
+        database: Database tool configuration (``None`` = no database
+            tools).
+        terminal: Terminal tool configuration (``None`` = default
+            terminal config).
     """

     model_config = ConfigDict(frozen=True, allow_inf_nan=False)
@@ -708,6 +716,18 @@ class RootConfig(BaseModel):
         default_factory=NotificationConfig,
         description="Notification subsystem configuration",
     )
+    web: WebToolsConfig | None = Field(
+        default=None,
+        description="Web tool configuration (None = default web config)",
+    )
+    database: DatabaseConfig | None = Field(
+        default=None,
+        description="Database tool configuration (None = no database tools)",
+    )
+    terminal: TerminalConfig | None = Field(
+        default=None,
+        description="Terminal tool configuration (None = default terminal config)",
+    )

     @model_validator(mode="after")
     def _validate_unique_agent_names(self) -> Self:

src/synthorg/core/agent.py

Lines changed: 8 additions & 0 deletions
@@ -24,6 +24,7 @@
 from synthorg.core.types import ModelTier, NotBlankStr # noqa: TC001
 from synthorg.observability import get_logger
 from synthorg.observability.events.config import CONFIG_VALIDATION_FAILED
+from synthorg.tools.sub_constraints import ToolSubConstraints # noqa: TC001

 logger = get_logger(__name__)

@@ -298,6 +299,9 @@ class ToolPermissions(BaseModel):
             are available.
         allowed: Explicitly allowed tool names.
         denied: Explicitly denied tool names.
+        sub_constraints: Optional per-agent sub-constraints overriding
+            the access level defaults. When ``None``, the checker
+            resolves defaults from the access level.
     """

     model_config = ConfigDict(frozen=True, allow_inf_nan=False)
@@ -314,6 +318,10 @@ class ToolPermissions(BaseModel):
         default=(),
         description="Explicitly denied tools",
     )
+    sub_constraints: ToolSubConstraints | None = Field(
+        default=None,
+        description="Per-agent sub-constraint overrides",
+    )

     @model_validator(mode="after")
     def _validate_no_overlap(self) -> Self:
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+"""Event constants for database tool operations."""
+
+from typing import Final
+
+DB_QUERY_START: Final[str] = "db.query.start"
+DB_QUERY_SUCCESS: Final[str] = "db.query.success"
+DB_QUERY_FAILED: Final[str] = "db.query.failed"
+DB_QUERY_TIMEOUT: Final[str] = "db.query.timeout"
+DB_SCHEMA_INSPECT_START: Final[str] = "db.schema.inspect.start"
+DB_SCHEMA_INSPECT_SUCCESS: Final[str] = "db.schema.inspect.success"
+DB_SCHEMA_INSPECT_FAILED: Final[str] = "db.schema.inspect.failed"
+DB_WRITE_BLOCKED: Final[str] = "db.write.blocked"
+DB_CONNECTION_OPENED: Final[str] = "db.connection.opened"
+DB_CONNECTION_FAILED: Final[str] = "db.connection.failed"
+DB_CONNECTION_CLOSED: Final[str] = "db.connection.closed"
+DB_CONFIG_DEFAULT_MISSING: Final[str] = "db.config.default_missing"
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+"""Event constants for tool sub-constraint enforcement."""
+
+from typing import Final
+
+SUB_CONSTRAINT_RESOLVED: Final[str] = "sub_constraint.resolved"
+SUB_CONSTRAINT_ENFORCED: Final[str] = "sub_constraint.enforced"
+SUB_CONSTRAINT_DENIED: Final[str] = "sub_constraint.denied"
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+"""Event constants for terminal/shell tool operations."""
+
+from typing import Final
+
+TERMINAL_COMMAND_START: Final[str] = "terminal.command.start"
+TERMINAL_COMMAND_SUCCESS: Final[str] = "terminal.command.success"
+TERMINAL_COMMAND_FAILED: Final[str] = "terminal.command.failed"
+TERMINAL_COMMAND_TIMEOUT: Final[str] = "terminal.command.timeout"
+TERMINAL_COMMAND_BLOCKED: Final[str] = "terminal.command.blocked"
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+"""Event constants for web tool operations."""
+
+from typing import Final
+
+WEB_REQUEST_START: Final[str] = "web.request.start"
+WEB_REQUEST_SUCCESS: Final[str] = "web.request.success"
+WEB_REQUEST_FAILED: Final[str] = "web.request.failed"
+WEB_REQUEST_TIMEOUT: Final[str] = "web.request.timeout"
+WEB_SSRF_BLOCKED: Final[str] = "web.ssrf.blocked"
+WEB_SSRF_DISABLED: Final[str] = "web.ssrf.disabled"
+WEB_DNS_FAILED: Final[str] = "web.dns.failed"
+WEB_SEARCH_START: Final[str] = "web.search.start"
+WEB_SEARCH_SUCCESS: Final[str] = "web.search.success"
+WEB_SEARCH_FAILED: Final[str] = "web.search.failed"
+WEB_PARSE_START: Final[str] = "web.parse.start"
+WEB_PARSE_SUCCESS: Final[str] = "web.parse.success"
+WEB_PARSE_FAILED: Final[str] = "web.parse.failed"
