A framework for LLM-guided code optimization.
Sigil treats code optimization as search:
- Specs declare what code may change
- Evals declare how candidates are measured
- Optimizers propose candidates via LLM and select the best
- Workspaces record every candidate with full provenance
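The four roles above can be pictured as one small loop. The sketch below is illustrative only and is not Sigil's optimizer: the `Candidate` fields, the `search` function, and the `propose`/`evaluate` callables are hypothetical names chosen for this example, and it assumes a lower score is better.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Candidate:
    """One proposed rewrite plus minimal provenance (hypothetical fields)."""
    candidate_id: str
    parent_id: Optional[str]
    source: str
    score: Optional[float] = None  # None means the candidate failed evaluation


def search(
    baseline: str,
    propose: Callable[[str, int], str],
    evaluate: Callable[[str], Optional[float]],
    num: int,
) -> tuple[Candidate, list[Candidate]]:
    """Propose `num` candidates from the baseline; return (best, everything recorded)."""
    root = Candidate("c0", None, baseline, evaluate(baseline))
    recorded = [root]
    for i in range(1, num + 1):
        cand = Candidate(f"c{i}", root.candidate_id, propose(baseline, i))
        cand.score = evaluate(cand.source)
        recorded.append(cand)  # failures are recorded too, so no attempt is lost
    scored = [c for c in recorded if c.score is not None]
    if not scored:
        raise ValueError("no candidate passed evaluation")
    best = min(scored, key=lambda c: c.score)  # assumption: lower score is better
    return best, recorded
```

In a real run, `propose` would call an LLM provider and `evaluate` would run the eval; here both are plain callables so the loop can be exercised without any API key.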
# Clone the repository
git clone https://github.com/your-org/sigil.git
cd sigil
# Install with uv (recommended)
uv venv
source .venv/bin/activate
uv pip install -e .
# Or with pip
pip install -e .
A spec defines what code Sigil can modify. Use EVOLVE-BLOCK markers in your source:
# fibonacci.py
def fibonacci(n: int) -> int:
    # SIGIL:EVOLVE-BLOCK-START fib_impl
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)
    # SIGIL:EVOLVE-BLOCK-END fib_impl
Generate a spec scaffold:
sigil generate-spec my_optimizer
Or create .sigil/my_optimizer.sigil.yaml manually:
version: "0.1"
name: my_optimizer
repo_root: ".."
pins:
  - id: fib_impl
    language: python
    files:
      - fibonacci.py
    evolve_block: fib_impl
evals:
  - my_eval.eval.yaml
An eval defines how to measure candidates. The following command generates an eval scaffold:
sigil generate-eval --spec my_optimizer my_eval
Or create .sigil/my_eval.eval.yaml manually:
version: "0.1"
name: my_eval
# Cascade: stages run in order, short-circuit on failure
stages:
  - id: syntax
    command: python -m py_compile fibonacci.py
    accept: "exit_code==0"
    timeout_s: 10
  - id: tests
    command: pytest test_fibonacci.py -v
    accept: "exit_code==0"
    timeout_s: 60
# Metrics: measured after stages pass
metrics:
  - id: benchmark_ms
    kind: timer
    command: python benchmark.py
    parse: "regex:Time: ([0-9.]+)ms"
# Optimization objective
aggregate:
  objective: "minimize:benchmark_ms"
budgets:
  candidate_timeout_s: 120
# With stub provider (for testing)
sigil run --spec my_optimizer --workspace dev --num 5 --provider stub
# With real LLM (requires API key)
export ANTHROPIC_API_KEY=your-key
sigil run --spec my_optimizer --workspace prod --num 10 --provider anthropic
# View candidates and metrics
sigil inspect --spec my_optimizer --workspace dev
# Show the lineage tree
sigil inspect --spec my_optimizer --workspace dev --show-lineage
# Show diff for best candidate
sigil inspect --spec my_optimizer --workspace dev --show-diff
sigil select --spec my_optimizer --workspace dev \
  --candidate <candidate_id> --name v1_optimized
# Validate a spec file
sigil validate-spec .sigil/my_spec.sigil.yaml
# Generate a spec scaffold
sigil generate-spec <name> [--language python] [--files "src/**/*.py"]
# Show extracted context for a pin
sigil show-pin --spec <name> --pin <pin_id> [--redact/--no-redact]
# Validate an eval file
sigil validate-eval .sigil/my_eval.eval.yaml
# Generate an eval scaffold
sigil generate-eval --spec <spec> <name> [--type cascade|metrics]
sigil run --spec <spec> --workspace <workspace> \
  [--eval <eval>] \
  [--optimizer simple] \
  [--provider stub|anthropic|openai] \
  [--backend local] \
  [--num <candidates>] \
  [--max-repair <attempts>] \
  [--seed <seed>] \
  [--parallel <workers>]
# Inspect a workspace
sigil inspect --spec <spec> --workspace <workspace> \
  [--run <run_id>] \
  [--show-lineage] \
  [--show-diff] \
  [--show-artifacts] \
  [--candidate <id>]
# Select a candidate
sigil select --spec <spec> --workspace <workspace> \
  --candidate <id> --name <name>
# Check environment and providers
sigil doctor
# List available providers
sigil providers
# Validate a patch file
sigil validate-patch --spec <spec> --patch-file <file> [--apply]
Pins identify mutable code regions. Sigil supports three types:
- EVOLVE-BLOCK: Marked regions in source files
    # SIGIL:EVOLVE-BLOCK-START my_block
    # ... code that can be modified ...
    # SIGIL:EVOLVE-BLOCK-END my_block
- Symbol: Function or class by name (Python only)
    pins:
      - id: my_func
        language: python
        files: ["module.py"]
        symbol: my_function
- File: Entire file is mutable
    pins:
      - id: whole_file
        language: python
        files: ["target.py"]
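To make the EVOLVE-BLOCK pin type concrete, here is a minimal sketch of how a marked region could be located in a source file. It is not Sigil's implementation (which lives in `pins.py`); the function name `extract_evolve_block` and its return shape are assumptions made for this example, and only the marker syntax comes from the text above.

```python
import re

# Marker syntax taken from the README; the regexes themselves are an assumption.
START = re.compile(r"#\s*SIGIL:EVOLVE-BLOCK-START\s+(\S+)")
END = re.compile(r"#\s*SIGIL:EVOLVE-BLOCK-END\s+(\S+)")


def extract_evolve_block(source: str, block_id: str) -> tuple[int, int, str]:
    """Return (first_line, last_line, text) of the region between the markers.

    Line numbers are 1-indexed and exclude the marker lines themselves.
    """
    lines = source.splitlines()
    start = None
    for i, line in enumerate(lines):
        m = START.search(line)
        if m and m.group(1) == block_id:
            start = i
            continue
        m = END.search(line)
        if m and m.group(1) == block_id:
            if start is None:
                raise ValueError(f"END marker for {block_id!r} appears before START")
            body = lines[start + 1 : i]
            return start + 2, i, "\n".join(body)
    raise ValueError(f"block {block_id!r} not found or not closed")
```

Everything outside the returned range stays fixed, which is what lets a proposal be checked against the pin before it is applied.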
Evals define how candidates are measured:
- Stages: Sequential checks that short-circuit on failure (e.g., syntax → tests → benchmark)
- Metrics: Numeric measurements (timer, checker, numeric)
- Aggregate: How to compare candidates (minimize/maximize objective)
- Budgets: Timeouts and limits
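The way stages, metrics, and the aggregate objective fit together can be sketched in a few functions. This is an illustration under stated assumptions, not the code in `evals.py`: the function names are hypothetical, commands are passed as argument lists rather than the strings used in the eval YAML, only the `exit_code==0` accept rule and the `regex:` parse rule are handled, and the budget is not applied to the metric command.

```python
import re
import subprocess
import sys
from typing import Optional


def run_stage(command: list[str], timeout_s: float) -> bool:
    """Run one stage; accept it only when the exit code is 0."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False
    return proc.returncode == 0


def parse_metric(output: str, parse: str) -> Optional[float]:
    """Apply a 'regex:<pattern>' rule; return the first capture group as a float."""
    prefix = "regex:"
    if not parse.startswith(prefix):
        raise ValueError(f"unsupported parse rule: {parse!r}")
    match = re.search(parse[len(prefix):], output)
    return float(match.group(1)) if match else None


def evaluate(stages: list[dict], metric: dict) -> Optional[float]:
    """Run stages in order, stop at the first failure, then measure the metric."""
    for stage in stages:
        if not run_stage(stage["command"], stage["timeout_s"]):
            return None  # short-circuit: later stages and metrics never run
    proc = subprocess.run(metric["command"], capture_output=True, text=True)
    return parse_metric(proc.stdout, metric["parse"])


def pick_best(objective: str, scores: dict[str, float]) -> str:
    """Choose a candidate id from an objective such as 'minimize:benchmark_ms'."""
    direction, _, _metric_id = objective.partition(":")
    chooser = {"minimize": min, "maximize": max}[direction]
    return chooser(scores, key=scores.get)


if __name__ == "__main__":
    stages = [{"command": [sys.executable, "-c", "pass"], "timeout_s": 10}]
    metric = {
        "command": [sys.executable, "-c", "print('Time: 3.0ms')"],
        "parse": "regex:Time: ([0-9.]+)ms",
    }
    print(evaluate(stages, metric))
```

Returning `None` for a failed cascade keeps failed candidates distinguishable from slow ones, so they can be excluded before the objective is applied.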
Workspaces store optimization runs:
.sigil/
  <spec>/
    workspaces/
      <workspace>/
        runs/
          <run_id>/
            run.json          # Config
            index.json        # Candidate index
            traces.jsonl      # Event trace
            candidates/
              <candidate_id>/
                patch.diff
                result.json
                ...
        selected/
          <name> -> runs/<run_id>/candidates/<id>
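Because the layout is plain directories and files, a workspace can be walked with standard tools. The helper below is a hypothetical sketch (the name `candidate_dirs` is not part of Sigil) that relies only on the directory names shown above and assumes `.sigil/` sits directly under the given root; it does not read `index.json` or `result.json`, whose schemas are not described here.

```python
from pathlib import Path


def candidate_dirs(root: Path, spec: str, workspace: str) -> list[Path]:
    """List every candidate directory in a workspace, across all runs, sorted by path."""
    runs = root / ".sigil" / spec / "workspaces" / workspace / "runs"
    # Pattern mirrors the layout: runs/<run_id>/candidates/<candidate_id>/
    return sorted(p for p in runs.glob("*/candidates/*") if p.is_dir())
```

For example, `candidate_dirs(Path("."), "my_optimizer", "dev")` would return one path per candidate recorded by earlier `sigil run` invocations in the `dev` workspace, or an empty list if none exist.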
LLM providers for generating proposals:
- stub: Deterministic responses for testing
- anthropic: Claude API (requires ANTHROPIC_API_KEY)
- openai: OpenAI API (requires OPENAI_API_KEY)
Execution backends for evaluation:
- local: Process-based isolation with parallel workers
See the examples/ directory:
- fibonacci_opt: Optimize a naive Fibonacci implementation
# Install dev dependencies
uv pip install -e ".[dev]"
# Run tests
pytest
# Run with coverage
pytest --cov=sigil
# Type checking
mypy sigil
sigil/
├── sigil/
│ ├── __init__.py # Version
│ ├── cli.py # CLI commands
│ ├── data_models.py # Core data models (Spec, Eval, Pin, etc.)
│ ├── loaders.py # YAML loading and validation
│ ├── paths.py # Path resolution utilities
│ ├── pins.py # Pin extraction and context handling
│ ├── patches.py # Patch parsing and application
│ ├── workspace.py # Workspace management
│ ├── evals.py # Eval execution
│ ├── providers.py # LLM provider abstraction
│ ├── backend.py # Evaluation backend
│ ├── optimizer.py # Optimization loop
│ ├── tracing.py # JSONL tracing
│ ├── scaffolds.py # Scaffold generation
│ └── utils.py # Error handling, formatting
├── tests/ # Test suite
├── examples/ # Example projects
├── SPEC.md # Original design specification
└── PLAN.md # Implementation plan
[Add your license here]
Inspired by ideas from OpenEvolve and the broader LLM-guided code optimization community.