Winner - Best Use of Gemini
Project built at CUHackit 2026 Hackathon
Open-Stitch is an AI-powered video editing system that uses multimodal models to automatically generate edited videos from raw clips and natural language instructions.
Built with FastAPI, Vite + React, Remotion, Gemini 3 Pro, and Whisper.
Devpost: https://devpost.com/software/open-stitch#updates
Screen 0: Select mp4s from Drive → download → Flash Lite summary (2 FPS)
Screen 1: Reorder clips + answer clarifying questions → structured prompt
Screen 2: Watch pipeline progress (ASR + VLM + edit plan + Remotion render)
Screen 3: Review and download final video
Graph-managed agents:
- Planning → intent brief
- Research → evidence-backed findings
- Clarification → user question set
- User Verification → approved structured prompt
- Synthesis → edit spec
- Remotion Synthesis → timeline draft
- Editing Synthesis → composition payload
- Internal Verification → deterministic checks + retry target
- Final QA → render gate (`qa_passed`)
Legacy clarify/edit flows are still available as an immediate fallback.
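The agent flow above can be sketched as a minimal state graph. This is an illustrative sketch in plain Python, not the project's actual LangGraph code; the node names and state keys mirror the list above but are assumptions:

```python
# Illustrative sketch of the graph flow (not the repo's LangGraph code).
# Each node takes and returns a state dict; internal verification picks the
# next node, mirroring the "deterministic checks + retry target" behavior.

def planning(state):
    state["intent_brief"] = f"brief for: {state['request']}"
    return state

def synthesis(state):
    state["edit_spec"] = {"clips": state.get("clips", []),
                          "brief": state["intent_brief"]}
    return state

def internal_verification(state):
    # Deterministic check: an edit spec must reference at least one clip.
    ok = bool(state["edit_spec"]["clips"])
    state["next_node"] = "final_qa" if ok else "synthesis"  # retry target
    return state

def final_qa(state):
    state["qa_passed"] = True  # render gate
    return state

NODES = {"planning": planning, "synthesis": synthesis,
         "internal_verification": internal_verification, "final_qa": final_qa}
ORDER_AFTER = {"planning": "synthesis", "synthesis": "internal_verification"}

def run_graph(state, max_steps=32):
    """Walk the graph under a GRAPH_MAX_STEPS-style recursion budget."""
    node = "planning"
    for _ in range(max_steps):
        state = NODES[node](state)
        if node == "final_qa":
            return state
        # Verification can override the default edge via next_node.
        node = state.pop("next_node", None) or ORDER_AFTER[node]
    raise RuntimeError("graph exceeded step budget")
```

The retry loop is bounded the same way the real graph is: when the budget runs out, execution fails rather than spinning forever.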
- Python 3.11+
- Node.js 20+
- Docker (for Postgres, Redis, MinIO, sandbox)
- FFmpeg installed locally
- Google Cloud project with OAuth credentials and Drive API enabled
- Google API key (for Gemini models) and/or Azure API key (for OpenAI models)
```
cp .env.example .env
```
Fill in `.env`:
```
GOOGLE_API_KEY=your-google-api-key
GOOGLE_BASE_URL=https://...
AZURE_API_KEY=your-azure-api-key
AZURE_BASE_URL=https://...
GOOGLE_CLIENT_ID=your-google-client-id
GOOGLE_CLIENT_SECRET=your-google-client-secret
```
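A quick sanity check before starting the server can catch missing keys early. This is an optional helper sketch, not part of the repo; the required-vs-provider split (either a Google or an Azure key) is an assumption based on the prerequisites above:

```python
import os

# Settings the backend expects (names from the .env template above).
REQUIRED = ("GOOGLE_CLIENT_ID", "GOOGLE_CLIENT_SECRET")
# At least one model-provider key must be set.
PROVIDER_KEYS = ("GOOGLE_API_KEY", "AZURE_API_KEY")

def missing_env(env):
    """Return a list of missing required settings for a given env mapping."""
    missing = [k for k in REQUIRED if not env.get(k)]
    if not any(env.get(k) for k in PROVIDER_KEYS):
        missing.append("GOOGLE_API_KEY or AZURE_API_KEY")
    return missing

if __name__ == "__main__":
    print(missing_env(os.environ) or "env looks complete")
```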
```
docker compose up -d postgres redis minio minio-init phoenix
python3.12 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
```
Run the server:
```
uvicorn server.main:app --port 8080 --reload
```
Verify: `curl http://localhost:8080/health` should return `{"status":"ok"}`.
```
cd client
npm install
npm run dev
```
Build the sandbox image:
```
docker compose build sandbox-build
```
Use this when teammates should call one shared OpenAI-compatible endpoint that forwards to your configured inference base URL.
```
cp .env.relay.example .env.relay
source .venv/bin/activate
uvicorn relay.main:app --host 127.0.0.1 --port 8090
```
Endpoints:
- `GET /health`
- `GET /v1/models` (Bearer token required)
- `POST /v1/chat/completions` (Bearer token required, stream + non-stream)
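Teammates can call the relay with any OpenAI-compatible client. Here is a stdlib-only sketch of building the chat completion request by hand; the token and model name are placeholders, not values from the repo:

```python
import json
import urllib.request

RELAY_URL = "http://127.0.0.1:8090"  # where the relay was started above

def chat_request(token, model, messages, stream=False):
    """Build an OpenAI-compatible chat completion request for the relay."""
    body = json.dumps({"model": model, "messages": messages, "stream": stream})
    return urllib.request.Request(
        f"{RELAY_URL}/v1/chat/completions",
        data=body.encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

# Example (requires the relay to be running):
# resp = urllib.request.urlopen(chat_request("my-token", "some-model",
#     [{"role": "user", "content": "hello"}]))
```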
- Cloudflare relay/backend setup: `TUNNEL_GUIDE.md`
- Teammate env template: `TEAM_CLIENT.env.example`
Environment flags:
- `GRAPH_ENABLED=false`: `false` always uses the legacy clarify/edit route handlers; `true` uses LangGraph orchestration for clarify/edit.
- `GRAPH_FAIL_OPEN=true`: `true` falls back to legacy if graph execution fails; `false` fails closed (no fallback).
- `GRAPH_MAX_STEPS=32`: max recursion budget for graph execution.
- `GRAPH_EMIT_NODE_EVENTS=true`: emit graph gate/state events into the trace stream.
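These flags arrive from the environment as strings, so they need explicit parsing. A small parser sketch, as a hypothetical helper rather than the repo's actual code:

```python
import os

TRUTHY = {"1", "true", "yes", "on"}

def env_flag(name, default=False, env=os.environ):
    """Parse a boolean flag like GRAPH_ENABLED=true (case-insensitive)."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY

def env_int(name, default, env=os.environ):
    """Parse an integer flag like GRAPH_MAX_STEPS=32, falling back on errors."""
    try:
        return int(env.get(name, default))
    except (TypeError, ValueError):
        return default
```

Parsing `"false"` explicitly matters here: `bool(os.environ["GRAPH_ENABLED"])` would be truthy for any non-empty string, silently enabling the graph.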
Immediate rollback:
```
export GRAPH_ENABLED=false
```
No code changes are required for rollback.
Migration path:
- Ship with `GRAPH_ENABLED=false` to keep legacy clarify/edit behavior.
- Enable graph in one environment: `GRAPH_ENABLED=true`, `GRAPH_FAIL_OPEN=true`.
- Validate traces and outputs, then disable fail-open for strict mode: `GRAPH_FAIL_OPEN=false`.
- If issues appear, revert immediately with `GRAPH_ENABLED=false`.
The test script runs the full ASR + VLM + summary pipeline on a local video:
```
source .venv/bin/activate
python tools/test_ingestion.py data/your_video.mp4 --fps 4 --window 5
```
This runs:
- Frame extraction (dense @ 4 FPS + summary @ 2 FPS) + audio extraction — parallel
- Whisper ASR + Gemini VLM + Flash Lite summary — parallel
- Merged ASR+VLM timeline
- Edit plan generation via Gemini 3 Pro
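The parallel stages above can be sketched with `concurrent.futures`. The stage functions here are stand-ins for the real Whisper, Gemini VLM, and Flash Lite calls, and the event fields are assumed for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in stage functions; the real pipeline calls Whisper (ASR), the
# Gemini VLM, and the Flash Lite summary model here.
def run_asr(audio_path):
    return [{"t": 0.0, "text": "hello"}]

def run_vlm(dense_frames):
    return [{"t": 0.0, "desc": "wide shot"}]

def run_summary(summary_frames):
    return "a short clip"

def ingest(audio_path, dense_frames, summary_frames):
    """Run ASR, VLM, and summary concurrently, then merge one timeline."""
    with ThreadPoolExecutor() as pool:
        asr = pool.submit(run_asr, audio_path)
        vlm = pool.submit(run_vlm, dense_frames)
        summary = pool.submit(run_summary, summary_frames)
        # Merge ASR + VLM events into a single time-sorted timeline.
        timeline = sorted(asr.result() + vlm.result(), key=lambda e: e["t"])
    return {"timeline": timeline, "summary": summary.result()}
```

Threads are a reasonable fit because all three stages spend their time waiting on I/O (model APIs, disk), not on Python-level compute.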
One command to validate graph + legacy paths:
```
./tools/test_all.sh
```
Equivalent make target:
```
make test-ci
```
Individual commands:
```
python3 -m ruff check server/graph server/agents/tools.py tests
python3 -m mypy --config-file mypy.graph.ini
python3 -m pytest -q tests
```
Start the server, then:
```
# Health check
curl http://localhost:8080/health

# Auth (requires the Google OAuth flow; use the frontend for this)
# Or test directly with a Bearer token:
curl -H "Authorization: Bearer YOUR_GOOGLE_TOKEN" \
  http://localhost:8080/api/drive/files
```
```
auto-vid/
├── client/                  Vite + React SPA
│   └── src/
│       ├── pages/           Login, Select, Setup, Progress, Review
│       ├── components/      VideoCard, ClarifyChat, ProgressTimeline
│       ├── hooks/           useDrive, useProject, useSocket
│       └── lib/api.ts       REST + WebSocket client
│
├── server/                  FastAPI Python backend
│   ├── main.py              App entry point
│   ├── config.py            Settings + YAML loaders
│   ├── graph/               LangGraph orchestration, nodes, validators
│   ├── routes/              auth, drive, projects, jobs
│   ├── agents/              clarifying.py, editing.py
│   ├── ingestion/           pipeline.py, asr.py, vlm.py, summary.py
│   ├── drive/               Google OAuth + Drive API client
│   ├── sandbox/             Docker container management + HTTP client
│   ├── schemas/             project.py, video.py, composition.py
│   └── storage/             db.py (Postgres), objects.py (MinIO)
│
├── sandbox/                 Remotion Docker image (sandbox-server.js)
├── config/                  YAML configs (models, agents, prompts)
├── docs/                    Graph event taxonomy + developer guide
├── tools/                   Dev scripts (test_ingestion.py)
└── docker-compose.yaml      Postgres, Redis, MinIO, Phoenix
```
| Service | Port |
|---|---|
| Frontend | 5173 |
| Backend | 8080 |
| Postgres | 5432 |
| Redis | 6379 |
| MinIO | 9000 |
| Phoenix | 6006 |
All model and pipeline settings are in .env and config/:
| Setting | File | Purpose |
|---|---|---|
| `GOOGLE_API_KEY` | `.env` | Google API key (Gemini models) |
| `AZURE_API_KEY` | `.env` | Azure API key (OpenAI models) |
| `VLM_MODEL` | `.env` | Dense VLM model (default: Gemini 3 Pro) |
| `SUMMARY_MODEL` | `.env` | Fast summary model (default: Gemini 2.5 Flash Lite) |
| `SUMMARY_FPS` | `.env` | Summary frame rate (default: 2) |
| `ASR_MODEL` | `.env` | Whisper model size |
| `GOOGLE_CLIENT_ID` | `.env` | Google OAuth client ID |
| `config/models.yaml` | config | Model endpoints and parameters |
| `config/agents.yaml` | config | Agent and ingestion settings |
| `config/prompts/vlm_edit_grade.txt` | config | VLM system prompt |
Graph docs:
- `docs/GRAPH_EVENT_TAXONOMY.md`
- `docs/GRAPH_DEVELOPER.md`

- Open `/progress/:projectId` to inspect the trace graph.
- Confirm node lifecycle events (`agent.start`, `agent.end`).
- Inspect gate transitions (`graph.gate`) and state snapshots (`graph.state`).
- If graph execution fails:
  - with `GRAPH_FAIL_OPEN=true`, verify the legacy fallback in the logs.
  - with `GRAPH_FAIL_OPEN=false`, inspect the verification report and raised error.
- For loop issues, inspect the `next_node` chosen by internal verification.
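When digging through a trace, it can help to filter the event stream by type. A sketch over a list of event dicts; the `type`, `node`, and `ts` field names are assumptions loosely based on the event names above, not the repo's actual schema:

```python
def gate_transitions(events):
    """Extract graph.gate events from a trace event stream."""
    return [e for e in events if e.get("type") == "graph.gate"]

def node_spans(events):
    """Pair agent.start/agent.end events into per-node time spans."""
    spans = {}
    for e in events:
        node = e.get("node")
        if e.get("type") == "agent.start":
            spans[node] = {"start": e["ts"], "end": None}
        elif e.get("type") == "agent.end" and node in spans:
            spans[node]["end"] = e["ts"]
    return spans
```

A node whose span ends with `"end": None` never emitted `agent.end`, which is a quick way to spot where a failed or looping run stalled.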