GitHub - modelguide/modelguide: Open-source voice agent orchestration framework - build production voice AI pipelines without vendor lock-in

Own your agent stack.

ModelGuide is the open-source orchestration layer for production voice-first agents.
Keep your runtime. Wire up integrations once. Define agent behavior with playbooks, SOPs, and guardrails.
Build → generate tests → simulate → score → improve → ship. A closed feedback loop you own.

No vendor lock-in. Bring your own models, runtimes, channels, and deployment.

Start with a reference implementation → LiveKit · Pipecat · ElevenLabs · Mastra

Quick Start · Reference Implementations · Connect Your Agent · Admin Guide · Build a Connector · Roadmap

The Missing Feedback Loop

Getting an agent to talk is easy. Making it reliable is the hard part.

A bad conversation happens. Someone reviews it manually. A prompt gets tweaked. But no reusable test is created, no eval is added, and the same failure comes back later in a slightly different form.

The missing layer is the feedback loop around the runtime: business tool access, policy enforcement, session history, QA workflows, evals, provisioning, and deployment.

ModelGuide gives you that layer as open source — so you can turn failures into tests, tests into better instructions, and ship voice agents on any stack without rebuilding production infrastructure from scratch. Start with voice. Extend to other customer-facing channels when needed.

What ModelGuide Is

ModelGuide sits between your agent runtime and your business systems. It is not a voice runtime and it is not a hosted black box. It is the orchestration layer you own.

Connect business systems once over MCP
Assign the right tools to each agent with confirmation gates and secure credentials
Compile SOPs and guardrails into agent behavior
Record sessions with transcripts, tool traces, CSAT, and QA tags
Run evals and simulations against real workflows
Provision new organizations from repeatable YAML blueprints
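
As a rough illustration of per-agent tool assignment with confirmation gates, here is a minimal sketch. Every name in it (`ToolAssignment`, `ConfirmationPolicy`, `gate`, and the tool names) is hypothetical and chosen for illustration; it does not reflect ModelGuide's actual schema or API.

```typescript
// Hypothetical types: none of these names come from the ModelGuide codebase.
type ConfirmationPolicy = "never" | "always" | "when_destructive";

interface ToolAssignment {
  tool: string; // name of a business tool exposed to the agent
  confirmation: ConfirmationPolicy;
}

interface ToolCall {
  tool: string;
  destructive: boolean; // true if the call changes state in the business system
}

type GateDecision = "allow" | "needs_confirmation" | "deny";

// Decide what happens to a tool call given one agent's assignments.
function gate(assignments: ToolAssignment[], call: ToolCall): GateDecision {
  const assignment = assignments.find((a) => a.tool === call.tool);
  if (assignment === undefined) return "deny"; // tool is not assigned to this agent
  if (assignment.confirmation === "never") return "allow";
  if (assignment.confirmation === "always") return "needs_confirmation";
  // "when_destructive": only state-changing calls need a confirmation step
  return call.destructive ? "needs_confirmation" : "allow";
}

// Example assignments for an imaginary support agent.
const supportAgent: ToolAssignment[] = [
  { tool: "lookup_order", confirmation: "never" },
  { tool: "issue_refund", confirmation: "always" },
  { tool: "update_address", confirmation: "when_destructive" },
];

console.log(gate(supportAgent, { tool: "issue_refund", destructive: true }));
```

In this sketch an unassigned tool is denied outright, which is one reasonable default; a real deployment might log or escalate instead.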

Why Builders Use ModelGuide

Builder need	What ModelGuide gives you
Closed feedback loop	Run simulations and evals, turn failed conversations into reusable test cases and evaluators, and recompile better instructions
Less production glue code	Connect tools, sessions, SOPs, evals, and operator workflows without rebuilding the harness around every runtime
Runtime portability	Keep LiveKit, Pipecat, ElevenLabs, Mastra, or your own runtime. The business layer stays portable.
One place for agent context	Manage tools, SOPs, guardrails, confirmation policies, and review workflows from a single control layer
Reviewable behavior	Full session records, tool traces, CSAT, QA tags, and eval results — complements your observability stack
Self-hostable production infrastructure	Open-source, self-hostable, with multi-tenant auth, encrypted secrets, and row-level security

ModelGuide focuses on agent behavior and review: transcripts, tool traces, CSAT, QA tags, SOP adherence, and eval results. Keep Langfuse, Datadog, Honeycomb, or OpenTelemetry for lower-level runtime telemetry and infrastructure tracing.

Quick Start

Prerequisites: Docker 24+, Bun 1.1+, Node 22+

git clone https://github.com/modelguide/modelguide.git
cd modelguide
make quickstart

Then in separate terminals:

make api-dev    # API at http://localhost:3000
make ui-dev     # Dashboard at http://localhost:3001

Open http://localhost:3001. The seed creates three industry-vertical organizations — retail, medical call center, B2B industrial — each with Medusa e-commerce and Zendesk helpdesk connectors, two agents, and ~300 realistic sessions. Log in with delivered+admin-glowbox@resend.dev (magic link printed to API console).

Full vertical matrix, dev accounts, and session scenarios: docs/guide/seed-data.md.

How Teams Use ModelGuide

1. Define what your agent should do. Describe the persona, connect your business systems, set the rules and guardrails. ModelGuide keeps that operational context in one place.

2. Generate the instructions your runtime uses. ModelGuide compiles that context into agent instructions and exposes the approved business tools over MCP.

3. Generate test assets automatically. ModelGuide creates synthetic conversations, eval suites, evaluators, and QA workflows to test the agent before it reaches production traffic.

4. Run the feedback loop. ModelGuide runs simulations, scores behavior, and gives your team transcripts, tool traces, CSAT, QA tags, and eval results to review.

5. Tighten the operating context. Use failures to update SOPs, guardrails, persona, tools, and compiled instructions until the automated checks consistently look right.

6. Validate manually before launch. Once the agent passes the automated checks, run manual tests in your runtime and confirm the experience is good enough to ship.

The closed feedback loop is already here: define the context, compile the instructions, generate tests, run simulations, score behavior, and improve the agent from failures. Over time, more of the prompt and context fixes can be automated.
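
The loop described above can be sketched in a few lines. This is a simplified model under stated assumptions, not ModelGuide's evals engine: the types (`RecordedSession`, `RegressionTest`, `AgentFn`) and the phrase-matching scorer are hypothetical stand-ins for whatever evaluators a team actually uses.

```typescript
// Hypothetical shapes for illustration; not ModelGuide's actual schema.
interface Turn {
  role: "caller" | "agent";
  text: string;
}

interface RecordedSession {
  id: string;
  turns: Turn[];
  qaTags: string[];
}

interface RegressionTest {
  label: string;
  callerTurns: string[];
  mustNotSay: string[]; // phrases a reviewer flagged in the failed session
}

// Stand-in for a runtime: maps one caller utterance to one agent reply.
type AgentFn = (callerText: string) => string;

// Turn a failed session into a reusable test by replaying the caller's side.
function toRegressionTest(session: RecordedSession, forbidden: string[]): RegressionTest {
  return {
    label: `regression-${session.id}`,
    callerTurns: session.turns.filter((t) => t.role === "caller").map((t) => t.text),
    mustNotSay: forbidden,
  };
}

// Score is the fraction of agent replies that avoid every forbidden phrase.
function scoreAgent(agent: AgentFn, test: RegressionTest): number {
  if (test.callerTurns.length === 0) return 1;
  const clean = test.callerTurns.filter((text) => {
    const reply = agent(text).toLowerCase();
    return !test.mustNotSay.some((phrase) => reply.includes(phrase.toLowerCase()));
  });
  return clean.length / test.callerTurns.length;
}

const failedSession: RecordedSession = {
  id: "s-42",
  qaTags: ["policy-violation"],
  turns: [
    { role: "caller", text: "Can I get a refund?" },
    { role: "agent", text: "Sure, refund guaranteed!" },
    { role: "caller", text: "Thanks, where is my order?" },
    { role: "agent", text: "Let me check that for you." },
  ],
};

const regression = toRegressionTest(failedSession, ["refund guaranteed"]);

// The agent before and after an instruction fix, modeled as plain functions.
const agentBefore: AgentFn = (t) =>
  t.includes("refund") ? "Sure, refund guaranteed!" : "Let me check that for you.";
const agentAfter: AgentFn = (t) =>
  t.includes("refund") ? "I can open a refund request for review." : "Let me check that for you.";

console.log(scoreAgent(agentBefore, regression), scoreAgent(agentAfter, regression));
```

Here the original agent scores 0.5 on the regression test and the fixed agent scores 1, so the same failure would be caught if it came back later.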

Reference Implementations

The reference implementations prove that the orchestration layer stays portable across runtimes and channels.

Start with the LiveKit implementation for the fastest end-to-end path. Use the Pipecat or ElevenLabs examples if your team already runs there. The Mastra example shows the same orchestration layer extending beyond voice when you need another customer-facing channel.

Runtime	Why it exists	Path
LiveKit Agents (flagship)	Fastest path to a production voice agent with telephony, MCP tool wiring, session tracking, eval tests, and deployment docs	`examples/agents/livekit-agent/`
Pipecat	Same orchestration model for teams already committed to Pipecat	`examples/agents/pipecat-agent/`
ElevenLabs Conversational AI	Manage platform agent config, tools, and prompts from version-controlled local definitions	`examples/agents/elevenlabs-agent/`
Mastra	Email "Where Is My Order?" example showing the orchestration layer extends beyond voice when you need another customer-facing channel	`examples/agents/mastra-wismo-email-agent/`

Provisioning an Organization

The mg CLI provisions a new organization from a directory of YAML files — users, connectors, agents with compiled instructions, SOPs, guardrails, and demo sessions — in one command. Safe to re-run against the same directory.

bun run src/cli/mg.ts setup /path/to/my-org/

Full flag reference, per-command usage, and Railway instructions: docs/guide/cli.md.
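
One common way to make provisioning safe to re-run is to upsert each resource under a stable key, so a second run against unchanged files makes no changes. The sketch below models that idea only; `BlueprintResource`, `ProvisionStore`, and `applyBlueprint` are hypothetical names, and the real mg CLI may work differently.

```typescript
// Hypothetical model of idempotent provisioning; not the mg CLI's implementation.
interface BlueprintResource {
  kind: "user" | "connector" | "agent" | "sop";
  slug: string; // stable identifier taken from the blueprint
  spec: string; // serialized desired configuration
}

type ProvisionStore = Map<string, BlueprintResource>;

interface ApplyResult {
  created: number;
  updated: number;
  unchanged: number;
}

// Upsert every desired resource by its kind/slug key and count what changed.
function applyBlueprint(store: ProvisionStore, desired: BlueprintResource[]): ApplyResult {
  const result: ApplyResult = { created: 0, updated: 0, unchanged: 0 };
  for (const resource of desired) {
    const key = `${resource.kind}/${resource.slug}`;
    const existing = store.get(key);
    if (existing === undefined) {
      result.created += 1;
    } else if (existing.spec !== resource.spec) {
      result.updated += 1;
    } else {
      result.unchanged += 1;
      continue; // nothing to write
    }
    store.set(key, resource);
  }
  return result;
}

const orgStore: ProvisionStore = new Map();
const blueprint: BlueprintResource[] = [
  { kind: "connector", slug: "zendesk", spec: "helpdesk" },
  { kind: "agent", slug: "support", spec: "v1" },
];

console.log(applyBlueprint(orgStore, blueprint)); // first run creates both resources
console.log(applyBlueprint(orgStore, blueprint)); // second run leaves both unchanged
```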

Roadmap

🚧 Sub-agents & Workflow Builder — Compose multi-step agent workflows with branching and handoffs

🚧 OTEL + A/B Testing via Langfuse — OpenTelemetry traces, prompt variant experiments, side-by-side comparison

🚧 Agentic Insights — Custom funnels tracking agent behavior through business-defined conversion paths

🚧 Closed-loop instruction tuning — turn repeated eval and simulation failures into suggested SOP, guardrail, and instruction fixes

📋 More Blueprints — Contact center ships first; healthcare intake, field service, B2B sales next

📋 Connector Marketplace — Community-built integrations

Deployment

Use Docker Compose for local and staging environments (make docker-up) and Railway for production. The Railway architecture is PostgreSQL + API + UI + a Caddy load balancer. The load balancer is the only public-facing service: it routes /api/* and /mcp to the API and everything else to the UI over Railway's internal network. Configuration is kept as code in a railway.toml per service. Full setup and deploy steps are in railway/DEPLOY.md.
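
The routing rule is small enough to express as a function. This sketch only mirrors the rule as stated here; the function name and the decision to match /mcp exactly (and not sub-paths) are assumptions, and the actual Caddy configuration in the repository is the source of truth.

```typescript
type Upstream = "api" | "ui";

// Mirror of the load balancer rule: /api/* and /mcp go to the API, the rest to the UI.
// Assumption: /mcp is matched exactly; the real Caddy config may match differently.
function upstreamFor(path: string): Upstream {
  if (path.startsWith("/api/")) return "api";
  if (path === "/mcp") return "api";
  return "ui";
}

console.log(upstreamFor("/api/sessions"), upstreamFor("/mcp"), upstreamFor("/dashboard"));
```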

Tech Stack

Layer	Technology
API	Hono + Bun
Agent Protocol	MCP (`@modelcontextprotocol/sdk`)
Database	PostgreSQL 16 + Drizzle ORM
Dashboard	TanStack Start + React 19 + Tailwind CSS v4
Auth	JWT + magic links (users) · API keys (agents)
API Docs	Scalar (auto-generated from OpenAPI)

No proprietary components. Every layer is inspectable, replaceable, forkable.

Production foundations include RBAC with separate admin/support/agent auth paths, encrypted secrets, row-level security, and a full CI pipeline running lint, typecheck, unit, integration, and MCP-protocol tests on every PR. See ADR-005 for the SOP primitive, ADR-007 and ADR-009 for the evals engine.

Documentation

Resource	Description
MCP Integration Guide	Connect your AI agent via MCP
Admin Guide	Configure connectors, agents, and tools through the dashboard
Adding a Connector	Build a new connector manifest, handlers, and tests
`mg` CLI — Provisioning	Provision organizations from YAML
Seed Data	Dev accounts, orgs, and session scenarios
Architecture Decisions	ADRs for significant design choices
Deployment Guide	Railway production deployment
Contributing	Setup, workflow, project structure, conventions

Contributing

Contributions welcome. No CLA. See CONTRIBUTING.md for the full guide.

# Run checks before submitting
make api-test          # Unit + integration tests
make ui-test           # UI component tests
make api-lint-check    # Linting
make api-typecheck     # Type checking

Check the open issues and look for the good first issue label. Fork → branch → PR with tests.

License

MIT

Built by ModelGuide · The open-source orchestration framework for production voice-first agents · 🇵🇱 Poland
