Lift declarative artifacts into the right substrate. Diagrams, playbooks, and queries already encode dataflow or control flow; Philip turns them into runnable Burr state machines and Hamilton DAGs you can audit, replay, and introspect. Drive the resulting FSMs under MCP gating via Theodosia, or run any lifted artifact locally as a regular Python application.
Sources lifting to Burr state machines:
- Ansible playbooks (YAML): `philip.from_playbook(path)`. Tasks become actions, conditions become guards, failures become classified transitions.
- Mermaid stateDiagram-v2: `philip.from_mermaid(path)`. Paste a diagram, get a runnable FSM. Multi-outbound branches lift to `_choice` guards the actor picks at runtime.
Sources lifting to Hamilton DAGs (install the hamilton extra):
- Mermaid flowchart / graph: `philip.from_mermaid_flow(path)`. Each node becomes a Hamilton function whose parameter names declare its upstream dependencies, exactly mirroring the diagram's edges.
- SQL with CTEs: `philip.from_sql_cte(sql)`. Each CTE becomes a function; external tables become Driver inputs. Read a 500-line query as a typed, visualizable dataflow.
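The "parameter names declare upstream dependencies" convention can be illustrated without Hamilton installed. The sketch below is not Philip's or Hamilton's implementation; the node functions, the `NODES` registry, and the `resolve` helper are hypothetical names chosen for illustration. It only shows how a function's signature can double as its list of incoming edges.

```python
import inspect

# Hypothetical node functions, roughly what a lift of
# `raw_orders --> orders --> totals --> report` might look like.
def orders(raw_orders):
    """Keep positive amounts; `raw_orders` is an external input."""
    return [amount for amount in raw_orders if amount > 0]

def totals(orders):
    """Depends on `orders` because its parameter is named `orders`."""
    return sum(orders)

def report(totals, orders):
    """Two parameters, so two incoming edges."""
    return f"{len(orders)} orders, total {totals}"

NODES = {fn.__name__: fn for fn in (orders, totals, report)}

def resolve(name, inputs, cache=None):
    """Compute node `name`, resolving each parameter by name, recursively."""
    cache = {} if cache is None else cache
    if name in inputs:            # external input supplied by the caller
        return inputs[name]
    if name not in cache:         # compute each node at most once
        fn = NODES[name]
        params = inspect.signature(fn).parameters
        cache[name] = fn(**{p: resolve(p, inputs, cache) for p in params})
    return cache[name]

result = resolve("report", {"raw_orders": [10, -3, 5]})
```

Here `result` is built by walking the same edges a diagram would show: `report` pulls `totals` and `orders`, and `orders` pulls the external `raw_orders`.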
More sources land in subsequent minor releases: AWS Step Functions ASL,
BPMN, dbt manifests, and additional DAG sources. Hamilton itself ships
no from_X of its own; Burr ships no from_X of its own. Philip is
the lift layer both substrates were missing.

    import philip

    # Ansible YAML -> Burr Application
    app = philip.from_playbook("site.yml")

    # Mermaid diagram -> Burr Application
    app = philip.from_mermaid("docs/incident_response.mmd")
    last, _, state = app.run(halt_after=["done", "escalate"])

    # Static report for a playbook
    report = philip.inspect("site.yml")
    print(report.rendered_markdown())

    # SQL with CTEs -> Hamilton module
    module = philip.from_sql_cte(open("query.sql").read())
    from hamilton.driver import Driver
    Driver({}, module).visualize_execution(["query"], output_file_path="dag.png")

Today there are two choices for AI-operated infrastructure. Run a playbook unattended (rigid, no judgment, can't adapt to context). Or let a free-form agent run shell commands (no constraint, no audit). Both extremes are wrong for most real ops work.
A third option exists. The playbook author writes the contract: which steps are reachable, in what order, where verification must happen, which failures route to recovery. The model (or human) operates within that contract. It picks, interprets logs, refuses when unsure, composes with other tools. It cannot invent steps, skip verification, or escape the procedure.
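A contract of this kind can be pictured as a transition table that the actor may choose from but cannot extend. The sketch below is a plain-Python illustration, not Philip or Burr code; the step names, `CONTRACT`, `ContractViolation`, and `step` are all hypothetical.

```python
# Hypothetical contract: each step maps to the steps reachable from it.
CONTRACT = {
    "deploy": {"verify"},              # verification cannot be skipped
    "verify": {"done", "rollback"},
    "rollback": {"verify", "escalate"},
    "escalate": set(),
    "done": set(),
}

class ContractViolation(Exception):
    """Raised when the actor picks a transition the author never declared."""

def step(current, chosen):
    """Return `chosen` if the contract allows it from `current`, else raise."""
    allowed = CONTRACT[current]
    if chosen not in allowed:
        raise ContractViolation(
            f"{current!r} -> {chosen!r} is not allowed; options: {sorted(allowed)}"
        )
    return chosen

# The actor (model or human) picks at each branch point.
path = ["deploy"]
for choice in ("verify", "rollback", "escalate"):
    path.append(step(path[-1], choice))

# Jumping from deploy straight to done is rejected structurally.
try:
    step("deploy", "done")
    skipped_verification = True
except ContractViolation:
    skipped_verification = False
```

The actor still exercises judgment (rolling back, then escalating), but only along edges the author wrote down.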
Three independent sources validate this architecture:
- STRATUS (IBM Research, NeurIPS 2025) demonstrated FSM-organized SRE agents beat free-form ones by at least 1.5x on AIOpsLab and ITBench.
- ITBench (IBM Research, Apache 2.0) is the open benchmark suite the STRATUS work was evaluated on.
- Wikimedia's Spicerack is the operational existence proof. The most transparent SRE organization in the public world looked at Ansible in 2014, decided it was not controllable enough for production response, and built a Python framework with explicit phases and structural error handling from scratch.
Burr is the structured-Python substrate Spicerack would have used if it existed in 2014. Philip lifts your existing Ansible playbooks onto that substrate without rewriting anything.

    uv add philip-machine

Pulls ansible-core and ansible-runner transitively. Install Ansible collections via ansible-galaxy for your modules of choice:

    ansible-galaxy collection install community.general community.docker ansible.posix

| Situation | `ansible-playbook` | Philip + Burr |
|---|---|---|
| Mid-procedure decisions on runtime context | `when:` over already-known vars | Explicit transitions; an agent or human picks based on full state |
| Resume from arbitrary point | `--start-at-task` hack; registers lost | Burr persister rebuilds full state including registers |
| Approval gates | `pause:` is stdin-bound; AWX bolt-on | First-class states; approver acts through the same surface as any actor |
| Auditable rationale | Task logs only | Every transition logged with actor's choice and state snapshot |
| Counterfactual replay | None | `fork_at(sequence_id)` walks the alternate path |
| Composition with non-Ansible work | Shell out and pray | Coordinate Ansible steps alongside any Python in the same audit surface |
| Refusal as a structural action | Run or fail | First-class refusal transitions; "I don't know, escalating" routes structurally |
ansible-playbook remains correct for deterministic, unattended, batch
runs. Philip is the right tool when the playbook has decision points,
needs verification gates, must be auditable for postmortem, or composes
with non-Ansible work.
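The "auditable rationale" and "counterfactual replay" rows describe an append-only record of steps that can be branched at a chosen sequence id. The sketch below is a conceptual stand-in in plain Python, not Burr's persister or its forking API; `Step`, `AuditLog`, and this `fork_at` method are hypothetical and only echo the name used in the table.

```python
import copy
from dataclasses import dataclass, field

@dataclass
class Step:
    sequence_id: int
    action: str
    actor_choice: str   # the transition the actor picked after this action
    state: dict         # snapshot of state after the action ran

@dataclass
class AuditLog:
    steps: list = field(default_factory=list)

    def record(self, action, actor_choice, state):
        """Append a step with a deep-copied state snapshot."""
        self.steps.append(
            Step(len(self.steps), action, actor_choice, copy.deepcopy(state))
        )

    def fork_at(self, sequence_id):
        """Return a new log holding steps 0..sequence_id, free to diverge."""
        return AuditLog(copy.deepcopy(self.steps[: sequence_id + 1]))

log = AuditLog()
log.record("investigate", "mitigate", {"severity": "high"})
log.record("mitigate", "verify", {"severity": "high", "mitigated": True})

# Postmortem question: what if we had escalated after investigating?
alt = log.fork_at(0)
alt.record("escalate", "done", {"severity": "high", "escalated": True})
```

The original log is untouched by the fork, so both the actual path and the alternate one remain available for comparison.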
Philip lifts a defined subset of Ansible playbook syntax. Supported:
- One play per file (multi-play raises `UnsupportedPlaybookConstruct`).
- Tasks with `name`, exactly one module reference, and a module-args dict.
- `when:` predicates (string expressions; lists are AND-joined).
- `register:` capturing the full module result into a state key.
- `become:` per-task or per-play.
- `failed_when:` and `ignore_errors:` as guard transitions.
- `gather_facts: yes` lowers to a leading `ansible.builtin.setup` action.
- Play-level `vars:` populate `with_state(...)`.
- `block:` (group-only) inlines its tasks with the block's `when:` AND-propagated to each inner task.
- `include_tasks:` and `import_tasks:` with literal filesystem paths.
- `notify:` and `handlers:` with a deferred handler gated on `_last_changed`.
- `loop:` and `with_items:` with literal list values.
- `set_fact:` (bare and FQCN).
The following raise `UnsupportedPlaybookConstruct`:
- `rescue:` and `always:` (deferred to a later release).
- `loop_control:`, `with_dict:`, `with_fileglob:`, `with_subelements:`.
- Jinja-templated `loop:` values.
- Jinja-templated `include_tasks:` paths.
- `import_role:`, `include_role:`, `include:`.
- `roles:` blocks, `pre_tasks:`, `post_tasks:`.
- `serial:`, `strategy:`, `max_fail_percentage:`, `any_errors_fatal:`.
- Multi-play files.
The supported subset covers the common single-play remediation and day-2 procedures the FSM lift adds value to.
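Two of the rules above, rejecting unsupported keywords and AND-propagating a block's `when:` onto its inner tasks, can be sketched over already-parsed task dicts. This is an illustration only, not Philip's lifter: `UnsupportedConstruct` stands in for `UnsupportedPlaybookConstruct`, `UNSUPPORTED_KEYS` covers only some of the task-level keywords listed, and the parenthesized string form of the joined condition is an assumption.

```python
# Task-level keywords taken from the unsupported list (partial, illustrative).
UNSUPPORTED_KEYS = {
    "rescue", "always", "loop_control", "with_dict",
    "with_fileglob", "with_subelements",
    "import_role", "include_role", "include",
}

class UnsupportedConstruct(Exception):
    """Stand-in for the UnsupportedPlaybookConstruct error."""

def as_list(when):
    """Normalize a `when:` value (missing, string, or list) to a list."""
    if when is None:
        return []
    return list(when) if isinstance(when, list) else [when]

def flatten(tasks, inherited=()):
    """Inline `block:` groups, AND-joining inherited and local conditions."""
    flat = []
    for task in tasks:
        bad = UNSUPPORTED_KEYS.intersection(task)
        if bad:
            raise UnsupportedConstruct(f"unsupported keys: {sorted(bad)}")
        conditions = list(inherited) + as_list(task.get("when"))
        if "block" in task:
            flat.extend(flatten(task["block"], conditions))
        else:
            inner = {k: v for k, v in task.items() if k != "when"}
            # Empty condition list means the task is unconditional.
            inner["when"] = " and ".join(f"({c})" for c in conditions) or None
            flat.append(inner)
    return flat

tasks = [
    {"name": "check", "ansible.builtin.ping": {}},
    {"when": "deploy_enabled", "block": [
        {"name": "restart",
         "ansible.builtin.service": {"name": "nginx", "state": "restarted"},
         "when": ["is_prod", "not maintenance"]},
    ]},
]
flat = flatten(tasks)
```

After flattening, the `restart` task carries the block's condition together with its own two conditions, while `check` stays unconditional.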

    import philip

    app = philip.from_mermaid("incident_response.mmd")

Or directly from text:

    text = """
    stateDiagram-v2
        [*] --> Acknowledge
        Acknowledge --> Investigate : on_alert
        Investigate --> Mitigate
        Investigate --> Escalate : severity == "critical"
        Mitigate --> Verify
        Verify --> Done
        Escalate --> Done
        Done --> [*]
    """
    app = philip.from_mermaid_text(text)

Lifting rules:
- `[*] --> X` declares the entrypoint. Exactly one is required.
- `X --> [*]` marks `X` as a terminal. Every terminal routes through a synthesized `done` action so the Burr graph is closed.
- `A --> B` is an unconditional transition.
- `A --> B : label` carries an edge label. Labels that look like Python predicates (contain comparison operators, `and`, `or`, `not`, `in`, `is`) lift directly to `burr.core.expr` guards. Labels that look like event names lift to `_choice == "<label>"` guards when the source has multiple outbound edges; otherwise they are documentation only.
- Comments (`%%`), `classDef`, `class`, `note`, `direction`, and `state` declarations without a body are ignored.
- Composite states (`state X { ... }`) raise `MermaidLiftError`. Inline them in your diagram before lifting.
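The label rules above can be sketched as a small classifier over diagram edges. This is a rough approximation for illustration, not Philip's parser: the regular expressions, the `lift` function, and its return shape are assumptions, and it leaves out the synthesized `done` action and the composite-state check.

```python
import re

# One edge per line: SRC --> DST, optionally followed by ": label".
EDGE = re.compile(r"^\s*(\[\*\]|\w+)\s*-->\s*(\[\*\]|\w+)\s*(?::\s*(.+?))?\s*$")
# Heuristic for "looks like a Python predicate".
PREDICATE = re.compile(r"==|!=|<=|>=|<|>|\b(?:and|or|not|in|is)\b")

def lift(text):
    """Return (entrypoint, terminals, [(src, dst, guard_or_None), ...])."""
    entry, terminals, edges = None, set(), []
    for line in text.splitlines():
        match = EDGE.match(line.split("%%")[0])   # drop trailing comments
        if not match:
            continue                               # header and other lines
        src, dst, label = match.groups()
        if src == "[*]":
            if entry is not None:
                raise ValueError("more than one [*] --> X entrypoint")
            entry = dst
        elif dst == "[*]":
            terminals.add(src)
        else:
            edges.append((src, dst, label))
    if entry is None:
        raise ValueError("exactly one [*] --> X entrypoint is required")

    outbound = {}
    for src, _, _ in edges:
        outbound[src] = outbound.get(src, 0) + 1

    guarded = []
    for src, dst, label in edges:
        if label is None:
            guard = None                           # unconditional
        elif PREDICATE.search(label):
            guard = label                          # predicate lifts as-is
        elif outbound[src] > 1:
            guard = f'_choice == "{label}"'        # event name at a branch
        else:
            guard = None                           # documentation only
        guarded.append((src, dst, guard))
    return entry, terminals, guarded

diagram = """
stateDiagram-v2
    [*] --> Acknowledge
    Acknowledge --> Investigate : on_alert
    Investigate --> Mitigate
    Investigate --> Escalate : severity == "critical"
    Mitigate --> Verify
    Verify --> Done
    Escalate --> Done
    Done --> [*]
"""
entry, terminals, guarded = lift(diagram)
```

On the incident-response diagram, `on_alert` stays documentation only because `Acknowledge` has a single outbound edge, while the `severity == "critical"` label is kept as a predicate guard.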
Pair with Theodosia to mount a diagram as an MCP server:

    import philip
    import theodosia

    theodosia.mount(
        philip.from_mermaid("incident_response.mmd"),
        name="incident",
    ).run()

Theodosia mounts a Burr Application as an MCP server. Philip lifts an Ansible playbook to a Burr Application. The composition is two lines:

    import philip
    import theodosia

    theodosia.mount(philip.from_playbook("site.yml"), name="nginx-deploy").run()

Now any MCP client (Claude Code, Cursor, fast-agent, mcphost, a custom agent built on the Agent SDK) can drive the playbook step by step. The FSM enforces structural constraints. The model picks transitions at branch points. Every step is audit-trailed. Forking at any `sequence_id` gives you counterfactual replay for postmortem.
Theodosia and Philip are independent packages. Philip does not depend on Theodosia.

    philip run <playbook.yml>                     # execute end to end
    philip inspect <playbook.yml>                 # static report: variables + failure topology
    philip inspect <playbook.yml> --format json
    philip graph <playbook.yml> [--format mermaid|dot|text]
    philip lint <playbook.yml>                    # dry-convert with structural summary
    philip emit <playbook.yml>                    # round-trip lift -> emit canonical YAML

Python API:

    import philip

    # Lift
    app = philip.from_playbook("site.yml")
    yaml_text = philip.to_playbook(app)

    # Static introspection
    report = philip.inspect("site.yml")
    report.variables              # variable provenance DAG
    report.undefined_variables    # references with no defining site
    report.unused_definitions     # bound but never referenced
    report.failure_topology       # per-action FAILURE_KIND routing
    report.actions_with_recovery  # actions with a true recovery branch
    report.unhandled_failures     # actions where a failure has no transition
    report.rendered_markdown()    # human-readable report

    # Hand-write actions backed by Ansible modules
    @philip.module_action("ansible.builtin.command", reads=["target"], writes=["uptime"])
    def get_uptime(state):
        return {"cmd": "uptime"}

    # Or call modules directly
    result = philip.run_module("ansible.builtin.ping", {}, host="myhost")

    # Connection management
    host = philip.host(ansible_host="example.com", ansible_user="deploy")

    # Polling sub-graphs
    wait = philip.wait_until(
        "ansible.builtin.wait_for",
        args={"port": 8080, "timeout": 1},
        max_attempts=30,
    )

Every module-backed action writes structural failure sentinels into state on each step:
- `_last_action`
- `_last_failed`
- `_last_changed`
- `_last_unreachable`
- `_last_msg`
- `_last_failure_kind` (one of `ok`, `unreachable`, `auth_failed`, `timeout`, `module_error`)
Transitions branch on these without each action having to opt in via
`writes=`. The classification is conservative and pattern-based: the
`unreachable` and `failed` result flags are trusted, and the diagnostic
`msg` is scanned for known phrases that indicate authentication failures
and timeouts. The labels are loosely aligned with the MAST taxonomy.
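A conservative, pattern-based classifier of this shape might look like the sketch below. It is not Philip's classifier: the phrase lists and the precedence between checks are assumptions made for illustration; only the five labels and the trusted `unreachable` and `failed` flags come from the description above.

```python
# Assumed phrase lists; the real set of phrases is not given in this document.
AUTH_PHRASES = ("permission denied", "authentication failed")
TIMEOUT_PHRASES = ("timed out", "timeout")

def classify(result):
    """Map a module result dict to one of the five failure kinds."""
    unreachable = bool(result.get("unreachable"))
    failed = bool(result.get("failed"))
    if not (unreachable or failed):
        return "ok"                      # flags are trusted: no failure
    msg = str(result.get("msg", "")).lower()
    # Assumed precedence: a recognizable message refines the coarse flag.
    if any(phrase in msg for phrase in AUTH_PHRASES):
        return "auth_failed"
    if any(phrase in msg for phrase in TIMEOUT_PHRASES):
        return "timeout"
    # No known phrase matched: fall back to what the flags say.
    return "unreachable" if unreachable else "module_error"

kinds = [
    classify({"changed": True}),
    classify({"unreachable": True, "msg": "Permission denied (publickey)."}),
    classify({"failed": True, "msg": "Timeout when waiting for 127.0.0.1:8080"}),
    classify({"unreachable": True, "msg": "No route to host"}),
    classify({"failed": True, "msg": "non-zero return code"}),
]
```

Falling back to the flags whenever no phrase matches keeps the classifier conservative: an unfamiliar message never produces a more specific label than the result itself supports.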
Apache 2.0.
LLM-assisted development was used during construction.