Implement progressive trust system with pluggable TrustStrategy protocol (DESIGN_SPEC §11.3) #43

@Aureliolo

Context

Implement the progressive trust system defined in DESIGN_SPEC §11.3. Agents can earn broader tool access over time through configurable trust strategies. Every strategy sits behind a TrustStrategy protocol, so new strategies can be added without modifying existing ones.

Trust Strategies (§11.3)

Strategy 1: Disabled (Static Access)

Trust is disabled. Agents receive their configured access level at hire time and it never changes. Useful when the human manages permissions manually.

Strategy 2: Weighted Score (Single Track)

A single trust score computed from weighted factors: task difficulty completed, error rate, time active, and human feedback. One global trust level per agent. Human approval required for promotion to elevated regardless of score.
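As a rough illustration of the weighted approach, the sketch below combines the four factors into one score and maps it to a trust level. The weights, thresholds, the 0.0-1.0 normalization, and the names `TrustFactors`, `trust_score`, and `level_for` are all assumptions made for this example; DESIGN_SPEC §11.3 may define them differently.

```python
from dataclasses import dataclass
from enum import IntEnum


class TrustLevel(IntEnum):
    SANDBOXED = 0
    RESTRICTED = 1
    STANDARD = 2
    ELEVATED = 3


@dataclass(frozen=True)
class TrustFactors:
    # All inputs are assumed to be pre-normalized to the range 0.0-1.0.
    task_difficulty: float
    error_rate: float  # lower is better, so it is inverted when scoring
    time_active: float
    human_feedback: float


# Hypothetical weights (they sum to 1.0) and promotion thresholds.
WEIGHTS = {
    "task_difficulty": 0.3,
    "error_rate": 0.3,
    "time_active": 0.1,
    "human_feedback": 0.3,
}
THRESHOLDS = [
    (0.75, TrustLevel.ELEVATED),
    (0.50, TrustLevel.STANDARD),
    (0.25, TrustLevel.RESTRICTED),
]


def trust_score(f: TrustFactors) -> float:
    """Weighted sum of the factors; a high error rate lowers the score."""
    return (
        WEIGHTS["task_difficulty"] * f.task_difficulty
        + WEIGHTS["error_rate"] * (1.0 - f.error_rate)
        + WEIGHTS["time_active"] * f.time_active
        + WEIGHTS["human_feedback"] * f.human_feedback
    )


def level_for(score: float, human_approved: bool = False) -> TrustLevel:
    """Map a score to a level; elevated is capped at standard without approval."""
    for threshold, level in THRESHOLDS:
        if score >= threshold:
            if level is TrustLevel.ELEVATED and not human_approved:
                return TrustLevel.STANDARD
            return level
    return TrustLevel.SANDBOXED
```

Keeping the human gate inside `level_for` means a perfect score alone never yields elevated access, which matches the "regardless of score" requirement.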

Strategy 3: Per-Category Trust Tracks

Separate trust tracks per tool category (filesystem, git, deployment, database, network). An agent can be "standard" for files but "sandboxed" for deployment. Human approval gate required for any production-touching category.
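One way to picture per-category tracks is a nested mapping from agent to category to level. The sketch below is illustrative only: the class name `PerCategoryTrust`, the one-step promotion rule, and the choice of `deployment` and `database` as the production-touching categories are assumptions, not something the spec text above confirms.

```python
from enum import IntEnum


class TrustLevel(IntEnum):
    SANDBOXED = 0
    RESTRICTED = 1
    STANDARD = 2
    ELEVATED = 3


CATEGORIES = ("filesystem", "git", "deployment", "database", "network")
# Assumption: which categories count as production-touching.
PRODUCTION_TOUCHING = {"deployment", "database"}


class PerCategoryTrust:
    def __init__(self) -> None:
        # agent_id -> {category: TrustLevel}; missing entries mean sandboxed.
        self._tracks: dict[str, dict[str, TrustLevel]] = {}

    def level(self, agent_id: str, category: str) -> TrustLevel:
        return self._tracks.get(agent_id, {}).get(category, TrustLevel.SANDBOXED)

    def promote(
        self, agent_id: str, category: str, approved_by_human: bool = False
    ) -> TrustLevel:
        """Raise one category by one step; returns the resulting level."""
        if category not in CATEGORIES:
            raise ValueError(f"unknown tool category: {category}")
        current = self.level(agent_id, category)
        if current is TrustLevel.ELEVATED:
            return current
        target = TrustLevel(current + 1)
        needs_human = (
            category in PRODUCTION_TOUCHING or target is TrustLevel.ELEVATED
        )
        if needs_human and not approved_by_human:
            return current  # gate not satisfied, level unchanged
        self._tracks.setdefault(agent_id, {})[category] = target
        return target
```

With this shape, the same agent can hold `STANDARD` on `filesystem` while remaining `SANDBOXED` on `deployment`, as in the example above.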

Strategy 4: Milestone Gates (ATF-Inspired)

Explicit capability milestones aligned with the Cloud Security Alliance Agentic Trust Framework. Automated promotion for low-risk levels. Human approval gates for elevated access. Trust is time-bound and subject to periodic re-verification: it decays if the agent is idle for extended periods or its error rate increases.
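The decay and re-verification rules could look roughly like the sketch below. The idle window, re-verification interval, error-rate ceiling, and one-level-per-check decay are hypothetical values chosen for illustration, as are the names `TrustRecord`, `apply_decay`, `needs_reverification`, and `promotion_requires_human`; the actual milestones come from the spec and the ATF.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import IntEnum


class TrustLevel(IntEnum):
    SANDBOXED = 0
    RESTRICTED = 1
    STANDARD = 2
    ELEVATED = 3


# Hypothetical policy values, not taken from the spec.
MAX_IDLE = timedelta(days=30)
REVERIFY_EVERY = timedelta(days=90)
MAX_ERROR_RATE = 0.10


@dataclass(frozen=True)
class TrustRecord:
    level: TrustLevel
    last_active: datetime
    last_verified: datetime
    error_rate: float


def apply_decay(record: TrustRecord, now: datetime) -> TrustLevel:
    """Drop one level if the agent was idle too long or errs too often."""
    idle = now - record.last_active > MAX_IDLE
    unreliable = record.error_rate > MAX_ERROR_RATE
    if (idle or unreliable) and record.level > TrustLevel.SANDBOXED:
        return TrustLevel(record.level - 1)
    return record.level


def needs_reverification(record: TrustRecord, now: datetime) -> bool:
    """Trust is time-bound: it must be re-verified after a fixed interval."""
    return now - record.last_verified >= REVERIFY_EVERY


def promotion_requires_human(target: TrustLevel) -> bool:
    """Assumption: only the elevated level sits behind a human gate."""
    return target >= TrustLevel.ELEVATED
```

Passing `now` explicitly rather than reading the clock inside the functions keeps the decay rules deterministic in unit tests.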

Acceptance Criteria

Protocol Interface

  • TrustStrategy protocol defined with standard operations (evaluate, promote, demote, check_access)
  • Strategy selection configurable via YAML (trust.strategy: "disabled" | "weighted" | "per_category" | "milestone")
  • New strategies addable without modifying existing ones
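A minimal sketch of how the protocol and strategy selection might fit together is shown below. Only the four operation names and the `trust.strategy` key come from this issue; the method signatures, the `register` decorator, `build_strategy`, and the constructor arguments of `DisabledStrategy` are assumptions for illustration. The config is shown as an already-parsed dict so the example does not depend on a YAML library.

```python
from enum import IntEnum
from typing import Callable, Protocol, runtime_checkable


class TrustLevel(IntEnum):
    SANDBOXED = 0
    RESTRICTED = 1
    STANDARD = 2
    ELEVATED = 3


@runtime_checkable
class TrustStrategy(Protocol):
    def evaluate(self, agent_id: str) -> TrustLevel: ...
    def promote(self, agent_id: str, approved_by_human: bool = False) -> TrustLevel: ...
    def demote(self, agent_id: str, reason: str) -> TrustLevel: ...
    def check_access(self, agent_id: str, tool: str) -> bool: ...


# Registry keyed by the value of trust.strategy in the config.
STRATEGIES: dict[str, type] = {}


def register(name: str) -> Callable[[type], type]:
    def decorator(cls: type) -> type:
        STRATEGIES[name] = cls
        return cls
    return decorator


@register("disabled")
class DisabledStrategy:
    """Static access: levels are fixed at hire time and never change."""

    def __init__(self, levels: dict[str, TrustLevel],
                 tool_min_level: dict[str, TrustLevel]) -> None:
        self._levels = dict(levels)
        self._tool_min_level = dict(tool_min_level)

    def evaluate(self, agent_id: str) -> TrustLevel:
        return self._levels[agent_id]

    def promote(self, agent_id: str, approved_by_human: bool = False) -> TrustLevel:
        return self._levels[agent_id]  # no-op: trust tracking is disabled

    def demote(self, agent_id: str, reason: str) -> TrustLevel:
        return self._levels[agent_id]  # no-op: trust tracking is disabled

    def check_access(self, agent_id: str, tool: str) -> bool:
        required = self._tool_min_level.get(tool)
        if required is None:
            return False  # unknown tools are denied by default
        return self._levels[agent_id] >= required


def build_strategy(config: dict, **kwargs) -> TrustStrategy:
    """Instantiate the strategy named by config['trust']['strategy']."""
    name = config["trust"]["strategy"]
    return STRATEGIES[name](**kwargs)
```

Because strategies register themselves under a name, adding a fifth strategy would mean adding one new decorated class, with no change to `build_strategy` or to the existing implementations.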

Implementations

  • Disabled strategy — static access levels, no trust tracking
  • Weighted strategy — single score, weighted factors, promotion thresholds
  • Per-category strategy — per-tool-category trust tracks, separate promotion criteria
  • Milestone strategy — explicit milestones, auto-promote for low-risk, human gates for elevated, trust decay + re-verification

Common Requirements

  • Trust level tracking per agent (sandboxed, restricted, standard, elevated)
  • Human approval required for promotion to elevated (all strategies except disabled)
  • Trust level determines available tools (tool access matrix)
  • Demotion support (trust can be reduced by human or on policy violation)
  • Trust level change audit trail
  • Unit tests for each strategy and promotion path (>80% coverage)
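The common requirements above (tool access matrix, human gate for elevated, demotion, audit trail) can be sketched together as follows. The tool names in `TOOL_ACCESS`, the cumulative-access rule, and the names `TrustLedger`, `TrustChange`, and `allowed_tools` are hypothetical; this is one possible shape, not the spec's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import IntEnum


class TrustLevel(IntEnum):
    SANDBOXED = 0
    RESTRICTED = 1
    STANDARD = 2
    ELEVATED = 3


# Hypothetical tool access matrix: tools newly unlocked at each level.
TOOL_ACCESS = {
    TrustLevel.SANDBOXED: {"read_file"},
    TrustLevel.RESTRICTED: {"write_file"},
    TrustLevel.STANDARD: {"git_commit"},
    TrustLevel.ELEVATED: {"deploy"},
}


def allowed_tools(level: TrustLevel) -> set[str]:
    """Assumption: access is cumulative, so higher levels keep lower tools."""
    tools: set[str] = set()
    for lvl, names in TOOL_ACCESS.items():
        if lvl <= level:
            tools |= names
    return tools


@dataclass(frozen=True)
class TrustChange:
    agent_id: str
    old: TrustLevel
    new: TrustLevel
    actor: str
    reason: str
    at: datetime


class TrustLedger:
    def __init__(self) -> None:
        self._levels: dict[str, TrustLevel] = {}
        self.audit_log: list[TrustChange] = []

    def level(self, agent_id: str) -> TrustLevel:
        return self._levels.get(agent_id, TrustLevel.SANDBOXED)

    def change(self, agent_id: str, new: TrustLevel, actor: str,
               reason: str, human_approved: bool = False) -> None:
        """Apply a promotion or demotion and record it in the audit trail."""
        old = self.level(agent_id)
        if new is TrustLevel.ELEVATED and new > old and not human_approved:
            raise PermissionError("promotion to elevated needs human approval")
        self._levels[agent_id] = new
        self.audit_log.append(
            TrustChange(agent_id, old, new, actor, reason,
                        datetime.now(timezone.utc))
        )
```

Rejected promotions raise before anything is written, so the audit log in this sketch only contains changes that actually took effect.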

Dependencies

Design Spec Reference

  • §11.3 — Progressive Trust (4 strategies behind TrustStrategy protocol)

Updated 2026-03-06: Rewritten to reflect DESIGN_SPEC §11.3 expansion from simple 3-level progression to 4 pluggable strategies behind TrustStrategy protocol.


Design Decisions Finalized

  • D2 — Quality Scoring: Pluggable QualityScoringStrategy protocol. Initial implementation: layered combination (Layer 1: CI signals, free; Layer 2: LLM judge; Layer 3: human override). Start with Layer 1 only.
  • D3 — Collaboration Scoring: Pluggable CollaborationScoringStrategy protocol. Initial: automated behavioral telemetry (delegation_success_rate, response_latency, conflict_resolution_constructiveness, meeting_contribution_rate, loop_prevention_score, handoff_completeness).

Common pattern: All strategies use pluggable protocol interfaces with one initial implementation. Alternative strategies are documented in DESIGN_SPEC.md for future implementation.
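To make the common pattern concrete, here is a hedged sketch of D2's layered quality scoring with only the CI layer implemented. The protocol's method signature, the `TaskOutcome` fields, the 0.8/0.2 weighting, and the "first layer that returns a score wins" combination rule are all assumptions for illustration; the decision above only names the protocol and the three layers.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, Sequence


@dataclass(frozen=True)
class TaskOutcome:
    # Hypothetical CI signals plus an optional human-assigned score.
    tests_passed: int
    tests_total: int
    lint_clean: bool
    human_score: Optional[float] = None


class QualityScoringStrategy(Protocol):
    def score(self, outcome: TaskOutcome) -> Optional[float]: ...


class CISignalScorer:
    """Layer 1: score from CI signals only (no LLM or human cost)."""

    def score(self, outcome: TaskOutcome) -> Optional[float]:
        if outcome.tests_total == 0:
            return None  # nothing to measure
        pass_rate = outcome.tests_passed / outcome.tests_total
        return 0.8 * pass_rate + (0.2 if outcome.lint_clean else 0.0)


class HumanOverrideScorer:
    """Layer 3: a human-provided score, if one exists."""

    def score(self, outcome: TaskOutcome) -> Optional[float]:
        return outcome.human_score


class LayeredScorer:
    """Consults layers in priority order; the first non-None score wins."""

    def __init__(self, layers: Sequence[QualityScoringStrategy]) -> None:
        self._layers = list(layers)

    def score(self, outcome: TaskOutcome) -> Optional[float]:
        for layer in self._layers:
            result = layer.score(outcome)
            if result is not None:
                return result
        return None
```

Starting with Layer 1 only would then be `LayeredScorer([CISignalScorer()])`, and later layers could be prepended without touching the CI scorer.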

Labels

  • prio:medium (Should do, but not blocking)
  • scope:medium (1-3 days of work)
  • spec:agent-system (DESIGN_SPEC Section 3 - Agent System)
  • spec:company-structure (DESIGN_SPEC Section 4 - Company Structure)
  • spec:hr (DESIGN_SPEC Section 8 - HR & Workforce Management)
  • spec:human-interaction (DESIGN_SPEC Section 13 - Human Interaction Layer)
  • spec:security (DESIGN_SPEC Section 12 - Security & Approval System)
  • spec:tools (DESIGN_SPEC Section 11 - Tool & Capability System)
  • type:feature (New feature implementation)
