Every personal assistant today has amnesia. You tell it you prefer aisle seats three times. It asks again. You archive the same newsletter every morning. It keeps notifying you. Every interaction starts from scratch.
SkyTwin is different. It builds a structured model of your preferences, risk tolerances, and decision patterns — a digital twin — then uses that model to act on your behalf. When it's confident, it just handles things. When it's not, it asks the right question instead of the wrong one.
The core principle: ask the twin before asking the user.
      Gmail, Calendar, etc.
             │
             ▼
      ┌──────────────┐
      │  Connectors  │  Ingest signals from your accounts
      └──────┬───────┘
             ▼
      ┌──────────────┐
      │   Decision   │  "What's happening? What would
      │    Engine    │   the user want here?"
      └──────┬───────┘
             ▼
      ┌──────────────┐
      │  Twin Model  │  Your preferences, patterns,
      │   + Memory   │  and episodic memory (gbrain default,
      │              │  MemPalace optional)
      └──────┬───────┘
             ▼
      ┌──────────────┐
      │    Policy    │  Spend limits, trust tiers,
      │    Engine    │  safety constraints
      └──────┬───────┘
             ▼
        ┌────┴────┐
        ▼         ▼
      Auto-    Escalate
     execute  with context
        │         │
        ▼         ▼
     Explain  You decide
        │         │
        └────┬────┘
             ▼
      ┌──────────────┐
      │   Feedback   │  Your response trains the twin
      │     Loop     │  to be better next time
      └──────────────┘
Every path produces an explanation. Every outcome feeds back into the twin. The system gets better at predicting what you want over time.
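The flow above can be sketched as a single function. This is a minimal illustration rather than SkyTwin's actual code: the `Signal`, `Candidate`, `Outcome`, `Twin`, and `Policy` types, the `decide` function, and the 0.8 confidence threshold are all hypothetical names and values chosen for the example.

```typescript
// Hypothetical types: none of these names are taken from the SkyTwin codebase.
type Signal = { domain: string; summary: string };
type Candidate = { action: string; confidence: number; costUsd: number };
type Outcome =
  | { kind: "auto_execute"; action: string; explanation: string }
  | { kind: "escalate"; question: string; explanation: string };

interface Twin {
  // Ask the twin what the user would most likely want for this signal.
  predict(signal: Signal): Candidate;
}

interface Policy {
  // Returns a reason when the action is blocked, or null when it is allowed.
  check(signal: Signal, candidate: Candidate): string | null;
}

const CONFIDENCE_THRESHOLD = 0.8; // assumed value, for illustration only

function decide(signal: Signal, twin: Twin, policy: Policy): Outcome {
  const candidate = twin.predict(signal);
  const blockedBy = policy.check(signal, candidate);

  if (blockedBy !== null) {
    return {
      kind: "escalate",
      question: `Should I ${candidate.action}?`,
      explanation: `Escalated: policy blocked the action (${blockedBy}).`,
    };
  }
  if (candidate.confidence < CONFIDENCE_THRESHOLD) {
    return {
      kind: "escalate",
      question: `Should I ${candidate.action}?`,
      explanation: `Escalated: confidence ${candidate.confidence} is below ${CONFIDENCE_THRESHOLD}.`,
    };
  }
  return {
    kind: "auto_execute",
    action: candidate.action,
    explanation: `Executed "${candidate.action}" with confidence ${candidate.confidence}.`,
  };
}

// Usage with stub implementations of the twin and the policy.
const twin: Twin = {
  predict: () => ({ action: "archive newsletter", confidence: 0.95, costUsd: 0 }),
};
const policy: Policy = {
  check: (_signal, candidate) => (candidate.costUsd > 50 ? "spend limit" : null),
};
const outcome = decide({ domain: "email", summary: "Weekly newsletter arrived" }, twin, policy);
console.log(outcome.kind, "-", outcome.explanation);
```

Both branches return an explanation, which matches the rule that every path is explainable; the feedback step would then consume the user's response to the outcome.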
Screenshots: Onboarding, Dashboard, Approvals, Decision History, Setup & Credentials, Settings, My Learnings.
| Scenario | What SkyTwin Does |
|---|---|
| Newsletter arrives | Your twin knows you archive these without reading. Auto-archived. Explanation logged. You never see it. |
| Calendar conflict | You always prioritize skip-level 1:1s over standups. Standup rescheduled with a note to the organizer. |
| Subscription renewal | $15.99/mo streaming service, used 3x this month, 18 months of renewals. Auto-renewed within your spend norms. |
| Grocery reorder | Repeats your last order with your substitution rules. Flags the one item that jumped 15% in price. |
| Flight booking | Finds the United aisle seat, morning departure, direct, $380. At high trust: books it. At low trust: presents top 3 options. |
| Unknown sender email | Low confidence. Escalates with a one-line summary so you can decide in 5 seconds instead of 5 minutes. |
It's not a chatbot. SkyTwin is operational, not conversational. It doesn't wait for you to type a prompt — it watches your connected accounts and acts when opportunities arise.
It earns trust incrementally. New users start at observer — the system only suggests. As you approve and correct, it earns autonomy domain by domain. Trust in email triage doesn't mean trust with your calendar.
Safety constraints are the product. Every action passes through a policy engine with hard spend limits, trust tier gating, reversibility checks, and sensitivity classification. The system can be inspected, overridden, narrowed, and shut off at any time. Read the full safety model →
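As a rough sketch of how such a gate might combine its checks, the example below evaluates a proposed action against a spend limit, a reversibility rule, and a sensitivity flag. The `ProposedAction`, `PolicyLimits`, and `Verdict` shapes and the `evaluatePolicy` function are hypothetical and are not the policy-engine package's real API.

```typescript
// Hypothetical shapes for illustration; not the real policy-engine API.
type ProposedAction = {
  domain: string;
  costUsd: number;
  reversible: boolean;
  sensitive: boolean;
};

type PolicyLimits = {
  maxSpendUsd: number; // hard spend limit per action
  allowIrreversible: boolean; // whether irreversible actions may auto-execute
};

type Verdict = { allowed: true } | { allowed: false; reasons: string[] };

function evaluatePolicy(action: ProposedAction, limits: PolicyLimits): Verdict {
  // Collect every failing check so the explanation can list all of them.
  const reasons: string[] = [];
  if (action.costUsd > limits.maxSpendUsd) {
    reasons.push(`cost ${action.costUsd} exceeds spend limit ${limits.maxSpendUsd}`);
  }
  if (!action.reversible && !limits.allowIrreversible) {
    reasons.push("action is not reversible");
  }
  if (action.sensitive) {
    reasons.push("action touches sensitive content");
  }
  return reasons.length === 0 ? { allowed: true } : { allowed: false, reasons };
}

const limits: PolicyLimits = { maxSpendUsd: 50, allowIrreversible: false };
const renewal: ProposedAction = {
  domain: "subscriptions",
  costUsd: 15.99,
  reversible: true,
  sensitive: false,
};
console.log(evaluatePolicy(renewal, limits));
```

Returning all failing reasons, rather than stopping at the first, keeps the escalation message complete when an action fails more than one check.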
Every action is explainable. No black boxes. Every automated decision produces an explanation record: what happened, what evidence was used, what preferences were invoked, why this action over alternatives, and how to correct it.
Your twin is inspectable. It's not a vector embedding or a bag of keywords. It's a typed, versioned data structure where every preference has a confidence level, supporting evidence, and provenance. Contradictions are tracked, not hidden.
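To make the idea of a typed, versioned preference concrete, here is a minimal sketch. The field names, the `recordObservation` helper, and the smoothing formula used for confidence are assumptions made for this example; the real twin-model package may represent these differently.

```typescript
// Hypothetical data model; field names and scoring are illustrative assumptions.
type Evidence = { observedAt: string; description: string };

type Preference = {
  key: string; // e.g. "travel.seat"
  value: string;
  confidence: number; // 0..1
  evidence: Evidence[]; // observations that support the value
  contradictedBy: Evidence[]; // observations that disagree, kept rather than hidden
  provenance: "explicit" | "inferred";
};

type TwinProfile = { userId: string; version: number; preferences: Preference[] };

// Assumed scoring rule: supporting share with a +1 smoothing term.
function scoreConfidence(supporting: number, contradicting: number): number {
  return supporting / (supporting + contradicting + 1);
}

// Returns a new profile version instead of mutating the old one.
function recordObservation(
  profile: TwinProfile,
  key: string,
  value: string,
  observation: Evidence,
): TwinProfile {
  const existing = profile.preferences.find((p) => p.key === key);
  const base: Preference = existing ?? {
    key,
    value,
    confidence: 0,
    evidence: [],
    contradictedBy: [],
    provenance: "inferred",
  };
  const agrees = base.value === value;
  const evidence = agrees ? [...base.evidence, observation] : base.evidence;
  const contradictedBy = agrees ? base.contradictedBy : [...base.contradictedBy, observation];
  const updated: Preference = {
    ...base,
    evidence,
    contradictedBy,
    confidence: scoreConfidence(evidence.length, contradictedBy.length),
  };
  return {
    ...profile,
    version: profile.version + 1,
    preferences: [...profile.preferences.filter((p) => p.key !== key), updated],
  };
}

let profile: TwinProfile = { userId: "demo", version: 1, preferences: [] };
for (const seat of ["aisle", "aisle", "aisle", "window"]) {
  profile = recordObservation(profile, "travel.seat", seat, {
    observedAt: "2024-01-01",
    description: `User chose a ${seat} seat`,
  });
}
console.log(profile.version, profile.preferences[0]?.confidence);
```

The single window-seat booking lowers confidence and stays visible in `contradictedBy`, which is the property the paragraph above describes.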
Memory knows who said what. Signals from supported connectors arrive stamped with an authoring tier — content you wrote vs. a newsletter vs. an inbound stranger — and tier-weighted retrieval lets self-authored content outrank broadcast noise. The twin feels like it knows you instead of just having read your inbox.
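One way to picture tier-weighted retrieval is reciprocal rank fusion (RRF) with a per-tier multiplier. The sketch below is an assumption about how such weighting could work: the tier names, the weights, the `tierWeightedRrf` function, and the constant `k = 60` are illustrative choices, not values taken from SkyTwin.

```typescript
// Hypothetical tiers and weights; the real AuthoringTier values may differ.
type AuthoringTier = "self" | "stranger" | "broadcast";

const TIER_WEIGHT: Record<AuthoringTier, number> = {
  self: 1.0, // content the user wrote
  stranger: 0.6, // inbound mail from unknown senders
  broadcast: 0.3, // newsletters and similar bulk content
};

const RRF_K = 60; // commonly used RRF smoothing constant

// Fuses several ranked id lists (e.g. vector search and full-text search),
// then scales each fused score by the weight of the document's authoring tier.
function tierWeightedRrf(rankings: string[][], tiers: Map<string, AuthoringTier>): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (RRF_K + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]): [string, number] => [id, score * TIER_WEIGHT[tiers.get(id) ?? "stranger"]])
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

const tiers = new Map<string, AuthoringTier>([
  ["newsletter", "broadcast"],
  ["note", "self"],
]);
// The newsletter ranks first in both raw lists, but the self-authored note wins after weighting.
const fused = tierWeightedRrf(
  [
    ["newsletter", "note"],
    ["newsletter", "note"],
  ],
  tiers,
);
console.log(fused);
```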
⬇ Download the latest release →
Grab the installer for your OS, double-click, and you're in. No terminal, no Docker, no Ollama, no .env. CockroachDB ships inside the bundle as a hash-verified native binary and an embedded llama.cpp model is the default LLM — nothing else to install.
| OS | Installer on the release page |
|---|---|
| macOS (Apple Silicon) | SkyTwin-…-arm64.dmg |
| Windows | SkyTwin.Setup.….exe |
| Linux | SkyTwin-….AppImage, .deb, or .rpm |
⚠ Unsigned builds (for now). Code-signing certs (Apple Developer + Windows EV) are a pending launch step, so your OS warns on first launch:
- macOS: right-click the app → Open → Open (clears Gatekeeper once).
- Windows: SmartScreen → More info → Run anyway.
Signing lands before the public launch; until then this is the expected first-run experience.
curl -fsSL https://raw.githubusercontent.com/jayzalowitz/skytwin/main/install.sh | bash
The installer detects your OS, installs anything missing (Homebrew on mac, Node 20+, pnpm), fetches the official CockroachDB single-node binary (hash-verified), clones the repo to ~/skytwin, runs the bootstrap, starts the services, and opens the dashboard at http://localhost:3200 once it's up. Re-running pulls latest and restarts.
No Docker required. Before v0.6.56 the installer pulled Docker Desktop and ran CockroachDB inside a container, by far the heaviest dependency on the list, with its own EULA and an "open it once after install" gotcha. The default path now installs the CRDB binary directly into ~/.local/share/skytwin/bin/cockroach and spawns it as a child process. Docker remains supported via SKYTWIN_USE_DOCKER=true for users who already have a Docker workflow.
To stop later: cd ~/skytwin && ./bin/skytwin-dev --stop.
The first 60 seconds:
- The dashboard opens. Type any situation into "Ask your twin" — the agent reasons out loud and explains what it would do, with confidence and alternatives. No accounts connected yet, no signals required.
- Click "Try with a sample profile" on the welcome screen to skip the OAuth setup entirely and poke at a fully populated example twin (decisions, learnings, approvals, the whole thing). The button is enabled whenever the seeded demo user is loaded, and tells you exactly what to run (pnpm db:seed) when it isn't, instead of silently disappearing.
- Want to look around first? Press Esc, click the × in the modal corner, or hit Skip for now. The dashboard chrome stays navigable behind the modal, and a "Sign in" button on the placeholder gets you back into the wizard whenever you're ready.
- When you're ready to wire up your own, the in-app walkthrough handles the Google API setup in about 5 minutes — paste your client ID, click "Save and connect now," and you're at Google's sign-in.
The defaults give you a working SkyTwin without any LLM API keys or Docker. Power users can opt into:
| Env var | Effect |
|---|---|
| SKYTWIN_USE_DOCKER=true | Run CockroachDB inside Docker instead of as a native binary. Useful for users who already have Docker and prefer container lifecycle. |
| SKYTWIN_WITH_OLLAMA=true | Install Ollama + pull the gemma4 model (~9.6GB). The default install uses the embedded llama.cpp provider, which doesn't require this. |
| SKYTWIN_DISABLE_EMBEDDED=1 | Skip the embedded LLM provider in the API's provider chain. Pair with hosted-only keys (e.g. ANTHROPIC_API_KEY) for reproducible evaluation runs. |
| SKYTWIN_CRDB_VERSION | Pin a non-default CockroachDB version. Refresh the hash tables in bin/skytwin-db and apps/desktop/scripts/build-single-binary.sh together. |
If you'd rather drive each step yourself:
Prerequisites
- Node.js >= 20
- pnpm >= 9
- That's it. CockroachDB is fetched as a native binary by bin/skytwin-db install. No Docker, no system DB install.
git clone https://github.com/jayzalowitz/skytwin.git && cd skytwin
pnpm install
# Fetch + start CockroachDB (native binary, hash-verified)
./bin/skytwin-db install
./bin/skytwin-db start
./bin/skytwin-db ensure-db
# Configure
cp .env.example .env # edit with your values
# Migrate and seed
pnpm db:migrate
pnpm db:seed
# Build and run
pnpm build
pnpm dev
The API starts on localhost:3100, the web dashboard on localhost:3200.
Before shipping, regression-check the install end-to-end across a matrix of Linux distros:
./bin/validate-installs # Ubuntu 22.04, Debian 12, Fedora 40
./bin/validate-installs ubuntu # one distro
./bin/validate-installs --keep-on-fail ubuntu # leave container alive on failure
Each run spawns a fresh OS container, untars a snapshot of the working
tree, runs install.sh exactly the way a real user would, and asserts
the dashboard responds at localhost:3200. macOS/Windows are exercised
via the same install.sh and bin/skytwin-db codepaths but need a real
machine to verify the platform-specific bits (Homebrew, NSIS, etc.).
pnpm test # ~2,985 tests across 36 workspace packages
SkyTwin is a TypeScript monorepo (pnpm + Turborepo) with 29 packages and 7 apps:
apps/
api/ HTTP API — decisions, user management, webhooks, /api/voice/*
web/ Dashboard — review decisions, manage preferences, configure policies
worker/ Background jobs — async execution, briefing generation, tier backfill
desktop/ Electron app — macOS (.dmg), Windows (.exe), Linux (.AppImage)
mobile/ React Native (Expo) — QR pairing, push notifications, SSE, voice capture
openclaw-bridge/ OpenClaw proxy — bridges local API to OpenClaw execution service
twin-mcp-server/ MCP server exposing the twin's read-only surface to external clients
packages/
shared-types/ TypeScript interfaces — the dependency root for everything
config/ Env var loading and validation
core/ Retry logic, circuit breaker, error types, logging
db/ CockroachDB client, migrations, repositories
twin-model/ Twin profile CRUD, preference learning, confidence scoring
decision-engine/ Event interpretation, candidate generation, action selection
policy-engine/ Trust tiers, spend limits, domain policies, safety checks
policy-prompts/ Versioned LLM prompts with JSON schema validation and deterministic fallbacks
ironclaw-adapter/ Execution adapter with HMAC auth, retries, circuit breaker
execution-router/ Adapter selection, fallback chains, risk modifiers, plugin discovery
llm-client/ Unified LLM client — Anthropic / OpenAI / Google / Ollama / embedded
embedded-llm/ Local-first: llama.cpp text, whisper.cpp STT, Piper TTS — spawn-based
explanations/ Human-readable explanation generation
connectors/ Gmail / Calendar / mock connectors with OAuth, stamps AuthoringTier
assistant/ Stateless chat service wrapping LlmClient with context enrichment
capability-engine/ Infers user app capabilities from signals (keyword v1 + LLM verification)
credential-vault/ Envelope encryption for OAuth tokens (AES-256-GCM + scrypt KDF)
idle-miner/ Filesystem scanner that extracts project metadata during idle time
mcp-host/ Manages MCP servers (stdio/HTTP/SSE) with circuit breakers + telemetry
dxt/ Serializes/deserializes DXT artifacts (packed MCP server configs)
observability/ In-memory metrics + ring-buffered rollup for the capability loop
registry-client/ Loads curated MCP registry entries with OAuth quirks and service lookup
mempalace/ Legacy memory: episodic, knowledge graph, 4-layer retrieval (opt-in backend)
memory-port/ Backend-agnostic MemoryPort interface + capability negotiation
memory-gbrain/ Default memory backend — vector + tsvector RRF on CRDB brain_* tables
memory-gbrain-crdb-adapter/ CRDB driver for gbrain — tier-weighted RRF, pin/hide, embedding providers
memory-hybrid/ Composes any two MemoryPort impls — per-capability read routing
memory-mempalace/ MemoryPort adapter for the legacy mempalace classes
evals/ Decision quality evaluation and regression testing
| Layer | Technology |
|---|---|
| Language | TypeScript (strict, ES2022) |
| Database | CockroachDB (PostgreSQL wire protocol) |
| Runtime | Node.js >= 20 |
| Package Manager | pnpm with workspaces |
| Build | Turborepo |
| Desktop | Electron + electron-builder |
| Mobile | React Native + Expo |
| Testing | Vitest (1,436 tests) |
| CI/CD | GitHub Actions |
| Execution | IronClaw, OpenClaw (via local bridge), and a Direct fallback — trust-ranked with automatic failover |
The API uses req.ip for every IP-keyed check: the session-auth
localhost dev-bypass, the OAuth new-user rate limit, the
/api/v1/demo/preview per-IP bucket, and any future per-client limit.
Behind any reverse proxy, req.ip is the proxy's address by default —
which collapses every per-IP limit into a single shared bucket. You
need TRUST_PROXY_HOPS set to the exact number of trusted hops between
the Node process and the real client.
The number you want is "trusted proxies between this Node process and the
actual client" — count every box that legitimately appends to
X-Forwarded-For on its way in, including any platform-injected router
your provider sits behind.
| Topology | TRUST_PROXY_HOPS |
|---|---|
| Direct (no proxy, or untrusted upstream) | 0 (default) |
| Single reverse proxy (your own nginx, Caddy, ELB target) | 1 |
| Single platform hop (Fly's edge, Render's router, Heroku's app router, an AWS ALB on its own) | 1 |
| CDN → your reverse proxy (Cloudflare → nginx → Node, no platform router) | 2 |
| CDN → platform router → Node (Cloudflare → Fly/Render/Heroku → Node) | 2 |
| CDN → platform router → your reverse proxy → Node (Cloudflare → Fly → nginx → Node) | 3 |
| Multi-hop edge (Cloudflare → AWS WAF → ALB → Node) | 3+ |
If you can't draw the topology from memory, prefer Express's array/CIDR
form for trust proxy (set per-network, not per-hop) — see the
Express docs. Hop
counts are simple but brittle when a platform inserts a hop you didn't
know about.
Setting this too high is a security hole. A client-controlled
X-Forwarded-For becomes req.ip and bypasses every per-IP limit by
header rotation. When in doubt, prefer fewer hops.
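The effect of the hop count can be modelled with a small pure function. This is a simplified sketch of the hop-count form of Express's trust proxy setting, not Express's own code: `resolveClientIp` is a hypothetical helper, and the addresses are documentation examples.

```typescript
// Simplified model of hop-count trust: walk from the socket peer back through
// X-Forwarded-For (right to left) and stop after `trustedHops` trusted proxies.
function resolveClientIp(
  socketAddr: string,
  xForwardedFor: string | undefined,
  trustedHops: number,
): string {
  const forwarded = (xForwardedFor ?? "")
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
  // Nearest hop first: the socket peer, then the header entries in reverse order.
  const chain = [socketAddr, ...forwarded.reverse()];
  const index = Math.min(Math.max(trustedHops, 0), chain.length - 1);
  return chain[index] ?? socketAddr;
}

// Topology: client 203.0.113.7 -> nginx 10.0.0.5 -> Node.
// The client sent a spoofed "X-Forwarded-For: 1.2.3.4"; nginx appended the real address.
const header = "1.2.3.4, 203.0.113.7";

console.log(resolveClientIp("10.0.0.5", header, 0)); // the proxy itself: every client shares one bucket
console.log(resolveClientIp("10.0.0.5", header, 1)); // the real client
console.log(resolveClientIp("10.0.0.5", header, 2)); // the spoofed value: one hop too many
```

With one real proxy, a setting of 1 yields the address the proxy appended, while a setting of 2 reaches into the part of the header the client controls.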
Verify after deploy:
curl -H 'X-Forwarded-For: 1.2.3.4' https://your-api/api/health/live
# response includes {"clientIp": "..."} — should NOT be "1.2.3.4"
# unless 1.2.3.4 is actually a trusted upstream
If clientIp in the response matches the spoofed header, your
TRUST_PROXY_HOPS is too permissive and rate-limit bypass is open.
The public LLM-backed preview endpoint has three layers of protection:
| Env var | Default | Purpose |
|---|---|---|
| DEMO_PREVIEW_DISABLED | unset | Set to 1 to return 503 unconditionally. Operator kill switch for when the endpoint gets abused. |
| DEMO_PREVIEW_GLOBAL_LIMIT_PER_HOUR | 500 | Hard global cap across all callers. Survives misconfigured TRUST_PROXY_HOPS and rotated-IP abuse. |
| Per-IP bucket | 20 / 5 min | Built in. Effectiveness depends on TRUST_PROXY_HOPS resolving the real client IP. |
The per-IP bucket and the global cap are process-local. If you run multiple API replicas, the global cap multiplies by replica count. For unauthenticated public deployments at scale, replace the in-memory counter with Redis or a DB row with atomic increment (tracked in TODOS.md as a P3).
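To illustrate why these counters are process-local, here is a minimal in-memory fixed-window limiter. It is a sketch under assumptions: the `FixedWindowCounter` class and `allowPreview` function are hypothetical, and the real endpoint may use a different algorithm (for example a token bucket). Only the 20 per 5 minutes and 500 per hour figures come from the table above.

```typescript
// Hypothetical in-memory limiter; state lives in this process only.
class FixedWindowCounter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Records the call and returns true when it fits within the current window.
  tryAcquire(key: string, nowMs: number): boolean {
    if (this.limit <= 0) return false;
    const entry = this.windows.get(key);
    if (entry === undefined || nowMs - entry.start >= this.windowMs) {
      this.windows.set(key, { start: nowMs, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

const perIp = new FixedWindowCounter(20, 5 * 60 * 1000); // 20 per 5 minutes per IP
const globalCap = new FixedWindowCounter(500, 60 * 60 * 1000); // 500 per hour overall

// Check the per-IP bucket first so a blocked IP does not consume global budget.
function allowPreview(clientIp: string, nowMs: number): boolean {
  return perIp.tryAcquire(clientIp, nowMs) && globalCap.tryAcquire("global", nowMs);
}

console.log(allowPreview("203.0.113.7", Date.now()));
```

Because each `Map` lives inside one Node process, every replica keeps its own counts, which is why a shared store such as Redis or an atomic DB increment is needed once several replicas serve the endpoint.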
SkyTwin uses a progressive trust model. Autonomy is earned, not assumed.
| Tier | What It Means |
|---|---|
| observer | Default for new users. The twin proposes actions and surfaces them as approval requests; you approve, reject, or edit. Never auto-executes. |
| suggest | Drafts actions for your review. You approve or edit before anything happens. |
| low_autonomy | Auto-executes low-risk, reversible actions in trusted domains. Escalates everything else. |
| moderate_autonomy | Handles most routine decisions. Escalates novel situations and high-cost actions. |
| high_autonomy | Acts on your behalf across domains. Still respects hard limits and irreversibility checks. |
Trust is domain-specific. You might be at moderate_autonomy for email but suggest for calendar. A bad decision in one domain can reduce trust in that domain without affecting others.
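A per-domain trust map could look like the sketch below. The tier names come from the table above; the `DomainTrust` type, the `canAutoExecute` and `demote` helpers, and the simplified gating rules are assumptions for illustration and not the policy engine's actual logic.

```typescript
// Tier names are from the table above; everything else here is hypothetical.
const TIERS = [
  "observer",
  "suggest",
  "low_autonomy",
  "moderate_autonomy",
  "high_autonomy",
] as const;
type TrustTier = (typeof TIERS)[number];
type DomainTrust = Record<string, TrustTier>;

// Domains with no recorded trust start at the default tier for new users.
function tierFor(trust: DomainTrust, domain: string): TrustTier {
  return trust[domain] ?? "observer";
}

function canAutoExecute(
  trust: DomainTrust,
  domain: string,
  action: { lowRisk: boolean; reversible: boolean },
): boolean {
  const tier = tierFor(trust, domain);
  if (tier === "observer" || tier === "suggest") return false;
  if (tier === "low_autonomy") return action.lowRisk && action.reversible;
  // Simplification: the two higher tiers are collapsed into one reversibility check.
  return action.reversible;
}

// Drops one domain by a single tier and leaves every other domain untouched.
function demote(trust: DomainTrust, domain: string): DomainTrust {
  const index = TIERS.indexOf(tierFor(trust, domain));
  const next: TrustTier = TIERS[Math.max(index - 1, 0)] ?? "observer";
  return { ...trust, [domain]: next };
}

const trust: DomainTrust = { email: "moderate_autonomy", calendar: "suggest" };
const routine = { lowRisk: true, reversible: true };
console.log(canAutoExecute(trust, "email", routine), canAutoExecute(trust, "calendar", routine));

const afterMistake = demote(trust, "email");
console.log(afterMistake);
```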
| Document | What's Inside |
|---|---|
| Product Spec | Vision, target user, operating principles, example workflows |
| Technical Spec | Architecture, data flow, API endpoints, database schema |
| Safety Model | Threat model, trust tiers, defense layers, safety philosophy |
| Decision Engine | Situation interpretation, risk assessment, confidence scoring |
| IronClaw Integration | Execution adapter, HMAC auth, failure handling |
| CockroachDB Architecture | Schema design (18+ tables), query patterns, versioning |
| Evals | Evaluation harness, scenario simulation, calibration metrics |
SkyTwin is in Tier 1 launch polish (see docs/launch-plan.md) — Tier 0 (bundled installer, in-app OAuth setup, Gmail wizard) shipped; Tier 1 (cold-load demo, signed binaries, mobile cut, safety + privacy debt) is the active pre-launch sprint tracked under epic #357. The current shipped version is in the badge above and in CHANGELOG.md. Core decision pipeline, twin model, policy engine, and swappable memory layer are functional; Gmail and Google Calendar connectors run with real OAuth; desktop builds ship for all three platforms; the mobile app pairs via QR code and captures voice. v0.5.0.0 brought the one-command installer and a non-technical-user UX overhaul; the v0.6 series added the embedded local LLM (#187), tier-aware memory retrieval (#251), per-Lifebook surfaces (#193), the voice loop (mobile capture + Piper TTS), and Epic A's cold-load demo unblocker (#358).
Free and open-source forever for personal use. Team and hosted tiers are planned for organizations that need shared policies, audit logs, or managed infrastructure — see docs/launch-plan.md for the split.
What works today:
- One-command install (curl | bash) on macOS, Linux, and WSL: installs every dependency, clones the repo, starts the services, opens the dashboard
- "Ask your twin" widget on the dashboard: type any situation, get a predicted action with reasoning and confidence, no accounts required
- Tour mode with a fully populated sample profile so you can poke at decisions, learnings, and approvals before connecting your own accounts
- Full decision pipeline: signal → interpret → decide → policy check → execute/escalate → explain → learn
- LLM-powered decisions via configurable provider chain (Claude, GPT, Gemini, Ollama) with automatic fallback to built-in rules
- Twin model with versioned profiles, confidence scoring, and preference learning
- Policy engine with spend limits, trust tiers, and domain-specific rules
- Swappable memory backend: gbrain (default; vector + tsvector RRF on CRDB) plus optional hybrid mode that adds the legacy spatial Memory Palace (#197). Selectable per-installation via MEMORY_BACKEND and per-user via the dashboard. See docs/memory-swap.md.
- Web dashboard for reviewing decisions, managing preferences, configuring AI providers, and auditing
- Desktop app (macOS, Windows, Linux) with system-browser OAuth for Google accounts
- Mobile app (iOS, Android) with QR pairing, push notifications, and voice capture that ships audio to the paired desktop for transcription
- Embedded local LLM stack: llama.cpp text, whisper.cpp STT, Piper TTS (/api/voice/transcribe and /api/voice/synthesize), running entirely on-device when binaries + models are present
- SSRF-safe URL validation for all LLM provider endpoints, with DNS rebinding protection
- Dynamic adapter discovery for third-party execution plugins
- 1,436 tests with CI/CD on GitHub Actions
What's next:
- More connectors (Slack, Notion, bank feeds)
- Hosted version with multi-tenant support
- Improved preference learning from implicit signals
We welcome contributions. See CONTRIBUTING.md for guidelines on getting started, running tests, and submitting pull requests.
Found a vulnerability? See SECURITY.md for responsible disclosure instructions.
Apache License 2.0 — use it, modify it, build on it.
Free and open source forever for personal use. Future Team and Hosted tiers are planned for organizations that need shared policies, audit logs, or managed infrastructure. Personal features will never be paywalled.
No prices today: we're not ready to commit to numbers, and overpromising on a backlog we haven't shipped is the easiest way to lose trust. This is the shape of the future, not a price list.






