Sigil

A framework for LLM-guided code optimization.

Sigil treats code optimization as search:

  • Specs declare what code may change
  • Evals declare how candidates are measured
  • Optimizers propose candidates via LLM and select the best
  • Workspaces record every candidate with full provenance
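The four pieces fit together as a propose-evaluate-select loop. A minimal sketch of that shape in Python (hypothetical names, not Sigil's actual API):

```python
def optimize(propose, evaluate, seed_candidate, num_candidates=5):
    """Generic propose-evaluate-select loop: the shape of the search Sigil runs.

    propose(parent)     -> a new candidate (e.g. an LLM-generated patch)
    evaluate(candidate) -> a score (lower is better, per the eval objective)
    """
    best, best_score = seed_candidate, evaluate(seed_candidate)
    for _ in range(num_candidates):
        candidate = propose(best)      # LLM proposes a variant of the best so far
        score = evaluate(candidate)    # eval stages + metrics
        if score < best_score:         # keep only improvements
            best, best_score = candidate, score
    return best, best_score
```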

Installation

# Clone the repository
git clone https://github.com/your-org/sigil.git
cd sigil

# Install with uv (recommended)
uv venv
source .venv/bin/activate
uv pip install -e .

# Or with pip
pip install -e .

Quick Start

1. Create a Spec

A spec defines what code Sigil can modify. Use EVOLVE-BLOCK markers in your source:

# fibonacci.py
def fibonacci(n: int) -> int:
    # SIGIL:EVOLVE-BLOCK-START fib_impl
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)
    # SIGIL:EVOLVE-BLOCK-END fib_impl
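One plausible way such markers can be located is a regex scan over the source (a sketch for illustration, not Sigil's actual implementation):

```python
import re

# Matches a START marker, captures the body lazily, and requires a matching
# END marker with the same block name (via a backreference).
EVOLVE_RE = re.compile(
    r"#\s*SIGIL:EVOLVE-BLOCK-START\s+(?P<name>\w+)\n"
    r"(?P<body>.*?)"
    r"#\s*SIGIL:EVOLVE-BLOCK-END\s+(?P=name)",
    re.DOTALL,
)

def find_evolve_blocks(source: str) -> dict[str, str]:
    """Map each block name to the code between its START/END markers."""
    return {m.group("name"): m.group("body") for m in EVOLVE_RE.finditer(source)}
```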

Generate a spec scaffold:

sigil generate-spec my_optimizer

Or create .sigil/my_optimizer.sigil.yaml manually:

version: "0.1"
name: my_optimizer
repo_root: ".."

pins:
  - id: fib_impl
    language: python
    files:
      - fibonacci.py
    evolve_block: fib_impl

evals:
  - my_eval.eval.yaml

2. Create an Eval

An eval defines how candidates are measured. Generate an eval scaffold:

sigil generate-eval --spec my_optimizer my_eval

Or create .sigil/my_eval.eval.yaml:

version: "0.1"
name: my_eval

# Cascade: stages run in order, short-circuit on failure
stages:
  - id: syntax
    command: python -m py_compile fibonacci.py
    accept: "exit_code==0"
    timeout_s: 10

  - id: tests
    command: pytest test_fibonacci.py -v
    accept: "exit_code==0"
    timeout_s: 60

# Metrics: measured after stages pass
metrics:
  - id: benchmark_ms
    kind: timer
    command: python benchmark.py
    parse: "regex:Time: ([0-9.]+)ms"

# Optimization objective
aggregate:
  objective: "minimize:benchmark_ms"

budgets:
  candidate_timeout_s: 120
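The `parse: "regex:..."` directive presumably extracts a number from the metric command's output; a minimal interpretation (the first capture group is taken as the value):

```python
import re

def parse_metric(directive: str, output: str) -> float:
    """Apply a 'regex:<pattern>' parse directive to command output.

    The value of the metric is the first capture group, parsed as a float.
    """
    kind, _, pattern = directive.partition(":")
    if kind != "regex":
        raise ValueError(f"unsupported parse directive: {kind}")
    match = re.search(pattern, output)
    if match is None:
        raise ValueError(f"pattern {pattern!r} not found in output")
    return float(match.group(1))
```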

3. Run Optimization

# With stub provider (for testing)
sigil run --spec my_optimizer --workspace dev --num 5 --provider stub

# With real LLM (requires API key)
export ANTHROPIC_API_KEY=your-key
sigil run --spec my_optimizer --workspace prod --num 10 --provider anthropic

4. Inspect Results

# View candidates and metrics
sigil inspect --spec my_optimizer --workspace dev

# Show the lineage tree
sigil inspect --spec my_optimizer --workspace dev --show-lineage

# Show diff for best candidate
sigil inspect --spec my_optimizer --workspace dev --show-diff

5. Select the Best

sigil select --spec my_optimizer --workspace dev \
    --candidate <candidate_id> --name v1_optimized

CLI Reference

Spec Commands

# Validate a spec file
sigil validate-spec .sigil/my_spec.sigil.yaml

# Generate a spec scaffold
sigil generate-spec <name> [--language python] [--files "src/**/*.py"]

# Show extracted context for a pin
sigil show-pin --spec <name> --pin <pin_id> [--redact/--no-redact]

Eval Commands

# Validate an eval file
sigil validate-eval .sigil/my_eval.eval.yaml

# Generate an eval scaffold
sigil generate-eval --spec <spec> <name> [--type cascade|metrics]

Run Commands

sigil run --spec <spec> --workspace <workspace> \
    [--eval <eval>] \
    [--optimizer simple] \
    [--provider stub|anthropic|openai] \
    [--backend local] \
    [--num <candidates>] \
    [--max-repair <attempts>] \
    [--seed <seed>] \
    [--parallel <workers>]

Workspace Commands

# Inspect a workspace
sigil inspect --spec <spec> --workspace <workspace> \
    [--run <run_id>] \
    [--show-lineage] \
    [--show-diff] \
    [--show-artifacts] \
    [--candidate <id>]

# Select a candidate
sigil select --spec <spec> --workspace <workspace> \
    --candidate <id> --name <name>

Diagnostics

# Check environment and providers
sigil doctor

# List available providers
sigil providers

# Validate a patch file
sigil validate-patch --spec <spec> --patch-file <file> [--apply]

Concepts

Pins

Pins identify mutable code regions. Sigil supports three types:

  1. EVOLVE-BLOCK: Marked regions in source files

    # SIGIL:EVOLVE-BLOCK-START my_block
    # ... code that can be modified ...
    # SIGIL:EVOLVE-BLOCK-END my_block
  2. Symbol: Function or class by name (Python only)

    pins:
      - id: my_func
        language: python
        files: ["module.py"]
        symbol: my_function
  3. File: Entire file is mutable

    pins:
      - id: whole_file
        language: python
        files: ["target.py"]
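A symbol pin has to locate a function or class body by name; Python's `ast` module makes that straightforward. A sketch of the idea (not Sigil's code):

```python
import ast

def extract_symbol(source: str, symbol: str) -> str:
    """Return the source text of a top-level function or class named `symbol`."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if node.name == symbol:
                # get_source_segment recovers the exact original text of the node
                return ast.get_source_segment(source, node)
    raise LookupError(f"symbol {symbol!r} not found")
```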

Evals

Evals define how candidates are measured:

  • Stages: Sequential checks that short-circuit on failure (e.g., syntax → tests → benchmark)
  • Metrics: Numeric measurements (timer, checker, numeric)
  • Aggregate: How to compare candidates (minimize/maximize objective)
  • Budgets: Timeouts and limits
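The short-circuit behavior of stages can be sketched with `subprocess`, using the same fields as the eval YAML above (acceptance simplified to `exit_code == 0`; hypothetical, not Sigil's runner):

```python
import subprocess

def run_cascade(stages: list[dict]) -> tuple[bool, str]:
    """Run stages in order; stop at the first failure.

    Each stage is a dict with 'id', 'command', and optional 'timeout_s'.
    Returns (passed, failed_stage_id); failed_stage_id is "" on success.
    """
    for stage in stages:
        try:
            result = subprocess.run(
                stage["command"], shell=True,
                capture_output=True, timeout=stage.get("timeout_s", 60),
            )
        except subprocess.TimeoutExpired:
            return False, stage["id"]
        if result.returncode != 0:
            return False, stage["id"]  # short-circuit: later stages never run
    return True, ""
```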

Workspaces

Workspaces store optimization runs:

.sigil/
  <spec>/
    workspaces/
      <workspace>/
        runs/
          <run_id>/
            run.json           # Config
            index.json         # Candidate index
            traces.jsonl       # Event trace
            candidates/
              <candidate_id>/
                patch.diff
                result.json
                ...
        selected/
          <name> -> runs/<run_id>/candidates/<id>
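Given that layout, a run's candidates can be collected by walking the tree. A sketch under the assumption that each `result.json` is a JSON object (its fields are not documented here, so they are passed through as-is):

```python
import json
from pathlib import Path

def list_candidates(run_dir: Path) -> list[dict]:
    """Collect each candidate's result.json from a run directory."""
    candidates = []
    for cand_dir in sorted((run_dir / "candidates").iterdir()):
        result_file = cand_dir / "result.json"
        if result_file.exists():
            record = json.loads(result_file.read_text())
            record["candidate_id"] = cand_dir.name  # directory name is the id
            candidates.append(record)
    return candidates
```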

Providers

LLM providers for generating proposals:

  • stub: Deterministic responses for testing
  • anthropic: Claude API (requires ANTHROPIC_API_KEY)
  • openai: OpenAI API (requires OPENAI_API_KEY)

Backends

Execution backends for evaluation:

  • local: Process-based isolation with parallel workers

Examples

See the examples/ directory for example projects.

Development

# Install dev dependencies
uv pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=sigil

# Type checking
mypy sigil

Project Structure

sigil/
├── sigil/
│   ├── __init__.py      # Version
│   ├── cli.py           # CLI commands
│   ├── data_models.py   # Core data models (Spec, Eval, Pin, etc.)
│   ├── loaders.py       # YAML loading and validation
│   ├── paths.py         # Path resolution utilities
│   ├── pins.py          # Pin extraction and context handling
│   ├── patches.py       # Patch parsing and application
│   ├── workspace.py     # Workspace management
│   ├── evals.py         # Eval execution
│   ├── providers.py     # LLM provider abstraction
│   ├── backend.py       # Evaluation backend
│   ├── optimizer.py     # Optimization loop
│   ├── tracing.py       # JSONL tracing
│   ├── scaffolds.py     # Scaffold generation
│   └── utils.py         # Error handling, formatting
├── tests/               # Test suite
├── examples/            # Example projects
├── SPEC.md              # Original design specification
└── PLAN.md              # Implementation plan

License

[Add your license here]

Acknowledgments

Inspired by ideas from OpenEvolve and the broader LLM-guided code optimization community.
