
feat: implement procedural memory auto-generation from agent failures #420

@Aureliolo

Description


Summary

When an agent fails a task, automatically generate a structured procedural memory entry ("next time, do X when encountering Y") via a separate proposer analysis; the failed agent must not write its own lesson learned.

Motivation

From EvoSkill (arXiv:2603.02766): three-agent separation (Executor/Proposer/Skill-Builder) prevents self-modification bias and produces reusable skills. SynthOrg's MemoryCategory.PROCEDURAL ("Skills & how-to") exists in the design but has no auto-generation mechanism. Current engine/recovery.py classifies failures for reassignment but never extracts learnings.

Design

Failure Analysis Emission

Extend engine/recovery.py to emit a structured failure analysis payload:

  • What went wrong (error category, context)
  • What capability was missing
  • What steps would help next time
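A minimal sketch of what that payload could look like; the class and field names are illustrative assumptions, not the final `recovery.py` API:

```python
from dataclasses import dataclass, field

@dataclass
class FailureAnalysis:
    """Hypothetical structured failure-analysis payload emitted by recovery.py."""
    error_category: str      # what went wrong (e.g. "timeout", "tool_error")
    context: str             # task context at the point of failure
    missing_capability: str  # what capability the agent lacked
    suggested_steps: list[str] = field(default_factory=list)  # what would help next time

    def to_payload(self) -> dict:
        """Serialize for hand-off to the proposer analysis step."""
        return {
            "error_category": self.error_category,
            "context": self.context,
            "missing_capability": self.missing_capability,
            "suggested_steps": self.suggested_steps,
        }
```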

Proposer Analysis (Separate from Failed Agent)

A separate LLM call (NOT the failed agent) analyzes the failure and produces a structured procedural memory entry.
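A sketch of the proposer boundary, assuming `llm` is any `str -> str` callable (a real implementation would use the project's LLM client); the helper name, prompt, and output keys are hypothetical. The key point from the design is that this call uses a fresh proposer prompt and is never made by the failed agent itself:

```python
import json

# Hypothetical proposer prompt; output schema (trigger/lesson/steps) is an assumption.
PROPOSER_PROMPT = """You are a proposer agent. Analyze this failure report and
return JSON with keys: trigger, lesson, steps.

Failure report:
{report}"""

def propose_procedural_memory(failure_report: dict, llm) -> dict:
    """Run the proposer analysis on a failure payload from a different agent."""
    raw = llm(PROPOSER_PROMPT.format(report=json.dumps(failure_report, indent=2)))
    entry = json.loads(raw)
    # Minimal validation before the entry is persisted as PROCEDURAL memory.
    for key in ("trigger", "lesson", "steps"):
        if key not in entry:
            raise ValueError(f"proposer output missing {key!r}")
    return entry
```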

SKILL.md Materialization Format

Store as structured entry following Agent Skills format:

  • Description + trigger conditions + steps
  • Three-tier progressive disclosure (discovery ~100 tokens, activation <5000, execution on-demand)
  • Git-native versioning, portable to 30+ agent platforms
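An illustrative materializer for that layout. The exact frontmatter keys of the Agent Skills format are assumptions here; only the tiering idea (short discovery header, fuller activation body) is taken from the design above:

```python
def materialize_skill_md(name: str, description: str, trigger: str, steps: list[str]) -> str:
    """Render a procedural memory entry as a SKILL.md-style document (sketch)."""
    frontmatter = "\n".join([
        "---",
        f"name: {name}",
        f"description: {description}",  # discovery tier: kept short (~100 tokens)
        "---",
    ])
    # Activation tier: trigger conditions plus numbered steps.
    body = "\n".join(
        [f"## Trigger\n{trigger}", "## Steps"]
        + [f"{i}. {step}" for i, step in enumerate(steps, 1)]
    )
    return f"{frontmatter}\n\n{body}\n"
```

Because the output is a plain markdown file, it versions naturally under git, matching the git-native portability goal.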

Retrieval Integration

Procedural memories surface via memory/retriever.py ranking when an agent encounters a similar task context in the future.
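A toy sketch of that ranking step; real retrieval in memory/retriever.py presumably uses embeddings, but token overlap against the stored trigger conditions illustrates the matching idea (function name and entry shape are assumptions):

```python
def rank_procedural(task_context: str, entries: list[dict], top_k: int = 3) -> list[dict]:
    """Rank procedural memory entries by trigger overlap with the task context."""
    ctx = set(task_context.lower().split())

    def score(entry: dict) -> float:
        trig = set(entry["trigger"].lower().split())
        # Fraction of the trigger's tokens present in the task context.
        return len(ctx & trig) / (len(trig) or 1)

    return sorted(entries, key=score, reverse=True)[:top_k]
```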

Affected Files

  • src/ai_company/engine/recovery.py (failure analysis payload)
  • src/ai_company/memory/ (PROCEDURAL category storage + retrieval)
  • New: proposer analysis module

Research


Additional Research (2026-03-26)

Self-Improvement Patterns

XSkill Continual Learning (arXiv:2603.12056):

  • Dual-stream architecture: experiences (action-level granularity) + skills (task-level abstraction) extracted from past trajectories
  • Reinforces the PROCEDURAL memory category design: experiences feed skill distillation

Hyperagents Compounding Meta-Improvement (arXiv:2603.19461, Meta AI):

  • Self-referential agents where the improvement process itself is editable (DGM-Hyperagent)
  • Task agent + meta agent in a single modifiable program
  • Autonomously discovers persistent memory and performance tracking
  • Results: paper review 0.0->0.710, robotics 0.060->0.372, zero-shot Olympiad math 0.630
  • Meta-improvements transfer across domains and compound across runs
  • Caution: full autonomous self-modification conflicts with progressive trust; scope to procedural memory/skill layer only

EvoSkill Pareto Selection (already referenced):

  • Add detail: Pareto frontier selection retains only non-dominated skill configurations
  • +7.3% on OfficeQA, +12.1% on SealQA, +5.3% zero-shot transfer to BrowseComp
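A short sketch of Pareto-frontier selection as described for EvoSkill: keep only skill configurations that no other configuration dominates on every metric. The metric names are illustrative:

```python
def pareto_frontier(configs: list[dict], metrics: tuple[str, ...]) -> list[dict]:
    """Return the non-dominated skill configurations (higher metric = better)."""
    def dominates(a: dict, b: dict) -> bool:
        # a dominates b if a is >= on all metrics and strictly > on at least one.
        return (all(a[m] >= b[m] for m in metrics)
                and any(a[m] > b[m] for m in metrics))

    return [c for c in configs
            if not any(dominates(other, c) for other in configs if other is not c)]
```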

Metadata


Labels

  • prio:medium (Should do, but not blocking)
  • scope:medium (1-3 days of work)
  • spec:memory (DESIGN_SPEC Section 7 - Memory & Persistence)
  • spec:task-workflow (DESIGN_SPEC Section 6 - Task & Workflow Engine)
  • type:feature (New feature implementation)
  • v0.6 (Minor version v0.6)
  • v0.6.3 (Patch release v0.6.3)
