feat: capability-aware prompt profiles for model tier adaptation #805

@Aureliolo

Description

Context

Research (2026-03-24): OpenPencil uses capability-aware prompt engineering with full/standard/basic profiles per model tier. Deep dive revealed a consequential gap in SynthOrg: auto-downgrade changes the model tier but never adapts the prompt.

A small model receives the same verbose personality, nested acceptance criteria, and org policies as a large model. This likely degrades output quality on cheaper models.

Current State

  • ModelRequirement has a capabilities: tuple[str, ...] field marked "Future-use" that is never read
  • build_system_prompt() trims to a token budget but has no model-tier awareness
  • A single Jinja2 DEFAULT_TEMPLATE serves all tiers
  • Auto-downgrade (budget/enforcer.py) changes the model but leaves the prompt unchanged

Scope

PromptProfile registry (engine/prompt_profiles.py)

  • PromptProfile frozen Pydantic model: tier, max_personality_tokens, include_org_policies, simplify_acceptance_criteria, autonomy_detail_level (full/summary/minimal), personality_mode (full/condensed/minimal)
  • PromptProfileRegistry maps ModelTier -> PromptProfile
  • Three built-in profiles:
    • full (large): all sections, full personality, full criteria, full org policies
    • standard (medium): condensed personality (2-3 key traits), bullet-point criteria, org policies included
    • basic (small): minimal personality (role + 1-line), single-sentence criteria, no org policies/company context
  • Authority and security sections NEVER stripped regardless of profile
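
A minimal sketch of the registry described above, under stated assumptions: a frozen dataclass stands in for the frozen Pydantic model so the example has no third-party dependency, the ModelTier member names are guesses based on the large/medium/small wording, and the token limits are placeholder values that do not come from this issue. The helper name get_profile is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from types import MappingProxyType
from typing import Literal, Optional


class ModelTier(str, Enum):
    # Assumed member names; the issue only speaks of large/medium/small tiers.
    LARGE = "large"
    MEDIUM = "medium"
    SMALL = "small"


@dataclass(frozen=True)
class PromptProfile:
    # Stand-in for the frozen Pydantic model; field names follow the issue text.
    tier: ModelTier
    max_personality_tokens: int
    include_org_policies: bool
    simplify_acceptance_criteria: bool
    autonomy_detail_level: Literal["full", "summary", "minimal"]
    personality_mode: Literal["full", "condensed", "minimal"]


# Read-only mapping ModelTier -> PromptProfile.
# The token limits (500/200/80) are placeholders, not values from the issue.
_PROFILES = MappingProxyType({
    ModelTier.LARGE: PromptProfile(ModelTier.LARGE, 500, True, False, "full", "full"),
    ModelTier.MEDIUM: PromptProfile(ModelTier.MEDIUM, 200, True, True, "summary", "condensed"),
    ModelTier.SMALL: PromptProfile(ModelTier.SMALL, 80, False, True, "minimal", "minimal"),
})


def get_profile(tier: Optional[ModelTier]) -> PromptProfile:
    """Return the profile for a tier; None falls back to the full profile."""
    if tier is None:
        return _PROFILES[ModelTier.LARGE]
    return _PROFILES[tier]
```

Falling back to the full profile when no tier is given keeps today's behavior for callers that do not pass a tier yet. Authority and security sections have no toggle in the profile at all, which is one way to guarantee they can never be stripped.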

Template adaptation (engine/prompt_template.py)

  • Add profile-conditional Jinja2 sections (conditionals, not template duplication)
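
As a rough illustration of conditionals inside one template (rather than one template per tier), the sketch below renders a tiny Jinja2 template against a profile. The template text, the variable names (role, personality, org_policies, authority), and the render_prompt helper are all invented for this example and are not the project's DEFAULT_TEMPLATE; the profile is passed as a plain dict to keep the block self-contained.

```python
from jinja2 import Environment

# Illustrative template only; section wording and variable names are assumptions.
TEMPLATE_SRC = """\
You are {{ role }}.
{% if profile.personality_mode == "full" %}
{{ personality.full }}
{% elif profile.personality_mode == "condensed" %}
{{ personality.condensed }}
{% else %}
{{ personality.minimal }}
{% endif %}
{% if profile.include_org_policies %}
Org policies: {{ org_policies }}
{% endif %}
Authority: {{ authority }}
"""

# trim_blocks/lstrip_blocks keep the {% ... %} lines from leaving blank lines.
_env = Environment(trim_blocks=True, lstrip_blocks=True)
_template = _env.from_string(TEMPLATE_SRC)


def render_prompt(profile: dict) -> str:
    """Render the sample template for a profile given as a plain dict."""
    return _template.render(
        role="a backend engineer",
        profile=profile,
        personality={
            "full": "Methodical, candid, detail-oriented, and fond of long design notes.",
            "condensed": "Methodical and candid.",
            "minimal": "Backend engineer.",
        },
        org_policies="All changes require review.",
        authority="May merge to feature branches only.",
    )
```

The Authority line sits outside every conditional, so it is rendered for all profiles.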

Integration (engine/prompt.py, engine/agent_engine.py)

  • build_system_prompt() gains model_tier: ModelTier | None parameter
  • Engine passes resolved tier (post-downgrade) to prompt builder
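
The wiring could look roughly like the sketch below. It is deliberately simplified: tiers are plain strings, a small lookup table stands in for the PromptProfile registry, and run_agent with its budget_exhausted flag is a hypothetical stand-in for the engine plus the auto-downgrade in budget/enforcer.py. Only the build_system_prompt name and the model_tier parameter come from this issue; the other parameters are invented.

```python
from typing import Optional

# Hypothetical per-tier setting standing in for the PromptProfile registry.
_INCLUDE_ORG_POLICIES = {"large": True, "medium": True, "small": False}


def build_system_prompt(
    role: str,
    org_policies: str,
    authority: str,
    model_tier: Optional[str] = None,
) -> str:
    """Assemble a prompt; model_tier=None keeps the current full prompt."""
    tier = model_tier if model_tier is not None else "large"
    sections = [f"Role: {role}"]
    if _INCLUDE_ORG_POLICIES[tier]:
        sections.append(f"Org policies: {org_policies}")
    # Authority is appended unconditionally: it is never stripped.
    sections.append(f"Authority: {authority}")
    return "\n".join(sections)


def run_agent(requested_tier: str, budget_exhausted: bool) -> str:
    """Resolve the tier first, then build the prompt for the resolved tier."""
    # Stand-in for auto-downgrade: an exhausted budget forces the small tier.
    resolved_tier = "small" if budget_exhausted else requested_tier
    return build_system_prompt(
        role="backend engineer",
        org_policies="All changes require review.",
        authority="May merge to feature branches only.",
        model_tier=resolved_tier,
    )
```

The point of the sketch is the ordering: the prompt is built after the downgrade decision, so the prompt always matches the model that will actually receive it.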

Optional: personality preset variants (templates/presets.py)

  • condensed_description and minimal_description fields on personality presets
  • Prompt builder selects based on PromptProfile.personality_mode

Optional: activate capabilities field

  • ModelRequirement.capabilities tags (reasoning, tool_use, long_context) influence profile selection beyond tier
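
The issue does not say how capability tags should influence selection, so the rule below is purely illustrative: a small-tier model tagged long_context is promoted from the basic profile to standard, on the guess that it can absorb a longer prompt. The function name select_profile_name and the promotion rule are assumptions, not a proposed design.

```python
# Tier -> built-in profile name, as listed in the Scope section.
_BASE_PROFILE = {"large": "full", "medium": "standard", "small": "basic"}


def select_profile_name(tier: str, capabilities: tuple[str, ...] = ()) -> str:
    """Pick a profile name from the tier, then adjust it using capability tags."""
    name = _BASE_PROFILE[tier]
    # Illustrative rule only: long-context small models get the standard profile.
    if name == "basic" and "long_context" in capabilities:
        return "standard"
    return name
```

Keeping the tier as the primary key and treating tags as adjustments means an empty capabilities tuple reproduces plain tier-based selection.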

Deliverables

  • PromptProfile model and registry with 3 built-in profiles
  • Jinja2 template conditionals for profile-driven rendering
  • build_system_prompt() model-tier awareness
  • Engine wiring (pass tier from identity to prompt builder)
  • Event constants for profile selection logging
  • Unit tests for profile selection and tier-aware prompt rendering
  • Design spec update (docs/design/engine.md)

Research

  • Deep dive: research/capability-aware-prompts.md (project memory)
  • Source: OpenPencil -- MIT, 1.5k stars

Additional Research (2026-03-26)

Cognitive Gating via Answer Separability

Source: SpecEyes (arXiv:2603.23483)

Speculative cognitive gating: a lightweight model pre-checks if the full agent tool chain is actually needed for a given task. Bypasses 30-70% of tasks for 1.1-3.35x throughput improvement.

Answer Separability Metric (S_sep):

  • Measures the margin between the top logit and its top-K competitors, normalized by their standard deviation
  • Scale-invariant and calibration-free
  • Uses min-token aggregation (worst-case guard) across all generated tokens
  • Sharp bimodal separation between correct and incorrect answers
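
One possible reading of the bullets above, sketched in Python: per token, take the gap between the top logit and the mean of its K nearest competitors, divide by the competitors' standard deviation, and then take the minimum over all generated tokens. The exact formula in the paper may differ; the use of the competitor mean, the population standard deviation, the default K, and the eps guard are assumptions made for this example.

```python
import statistics
from typing import Sequence


def token_separability(logits: Sequence[float], k: int = 5, eps: float = 1e-6) -> float:
    """Margin of the top logit over its top-k competitors, in units of their spread."""
    ranked = sorted(logits, reverse=True)
    top, competitors = ranked[0], ranked[1:k + 1]
    if len(competitors) < 2:
        raise ValueError("need at least two competitor logits")
    margin = top - statistics.fmean(competitors)
    # eps avoids division by zero when all competitors are equal (an assumption).
    return margin / (statistics.pstdev(competitors) + eps)


def answer_separability(per_token_logits: Sequence[Sequence[float]], k: int = 5) -> float:
    """Min-token aggregation: the least confident token decides the score."""
    return min(token_separability(logits, k) for logits in per_token_logits)
```

Because both the margin and the standard deviation scale linearly with the logits, multiplying every logit by a constant leaves the score essentially unchanged, which matches the scale-invariance claim. Taking the minimum means a single ambiguous token is enough to pull the whole answer below the gate.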

Heterogeneous Parallel Serving Funnel:

  • Lightweight stateless model runs batched in parallel for "easy" tasks
  • Only the residual set (low S_sep) hits the full sequential agentic pipeline
  • Throughput speedup approx 1 / (1 - beta * alpha) where beta = screening ratio, alpha = acceptance rate
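
The approximation above is easy to sanity-check numerically. The helper name funnel_speedup is made up for this example; the formula is the one stated in the bullet, which as written does not account for the cost of the screening pass itself.

```python
def funnel_speedup(beta: float, alpha: float) -> float:
    """Approximate throughput speedup 1 / (1 - beta * alpha).

    beta: screening ratio (fraction of tasks sent through the lightweight model)
    alpha: acceptance rate (fraction of screened tasks answered directly)
    """
    bypassed = beta * alpha  # fraction of tasks that skip the full pipeline
    if not 0.0 <= bypassed < 1.0:
        raise ValueError("beta * alpha must be in [0, 1)")
    return 1.0 / (1.0 - bypassed)
```

For example, with beta = 0.8 and alpha = 0.5, 40% of tasks bypass the full pipeline and the estimate is 1 / 0.6, roughly 1.67x. With nothing bypassed the estimate is exactly 1.0, that is, no speedup.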

Application to SynthOrg: integrate with model routing strategies -- a fast pre-check using the "small" tier model could determine whether to invoke the full tool chain or return a direct answer, reducing both latency and cost.
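
A hedged sketch of what such a pre-check might look like in a router. Everything here is hypothetical: route_task, the callable signatures, and the threshold are illustration only, and the small model is assumed to return both a draft answer and a separability score for it.

```python
from typing import Callable, Tuple


def route_task(
    task: str,
    small_model: Callable[[str], Tuple[str, float]],
    full_pipeline: Callable[[str], str],
    threshold: float,
) -> Tuple[str, str]:
    """Return (answer, route), where route is "direct" or "full"."""
    # The small-tier model drafts an answer and reports its separability score.
    draft, s_sep = small_model(task)
    if s_sep >= threshold:
        return draft, "direct"  # confident enough: skip the tool chain
    # Low separability: fall back to the full sequential agentic pipeline.
    return full_pipeline(task), "full"
```

The threshold would have to be tuned empirically; set too low, wrong direct answers slip through, and set too high, nothing bypasses the full pipeline.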

Metadata

Assignees

No one assigned

    Labels

    • prio:medium (Should do, but not blocking)
    • scope:medium (1-3 days of work)
    • spec:providers (DESIGN_SPEC Section 9 - Model Provider Layer)
    • spec:task-workflow (DESIGN_SPEC Section 6 - Task & Workflow Engine)
    • spec:templates (DESIGN_SPEC Section 14 - Templates & Builder)
    • type:feature (New feature implementation)
    • v0.6 (Minor version v0.6)
    • v0.6.2 (Patch release v0.6.2)
