Context
SkyTwin v0.3 shipped a production-grade decision pipeline: signals → interpretation → twin query → candidates → risk → policy → routing → execution → learning. 432 tests, 50+ eval scenarios, 13 domains, 7 safety invariants enforced. The engine works.
What's missing is the product . A user can't install SkyTwin, launch it, connect their accounts, and use it daily without touching a terminal. There's no desktop app, no real-time notifications, no mobile access, no process supervision. The dashboard covers the happy path but can't audit, undo, or search. Connectors crash silently on transient errors. The twin doesn't respect quiet hours.
This epic tracks everything needed to go from "engineer's working prototype" to "download it, launch it, use it every day, post it on Hacker News."
PR Workflow
Every child issue PR must pass `/review` before merge. No exceptions.
Current State (verified 2026-04-04)
What's working well (do not break)
Component
State
Evidence
Decision pipeline (signal → outcome)
Production-ready
432 tests, E2E coverage across 13 domains
Safety invariants (7/7)
Enforced
Integration test suite, regression detection
Trust tier progression
Automated
10/20/50/100 thresholds, rolling-window regression
Spend tracking
Hard-gated
Rolling 24h window, per-action + daily aggregate
Domain autonomy
Enforced
Most-restrictive-wins logic
Gmail + Calendar connectors
Real APIs
OAuth token refresh, multi-user worker
Twin learning (preference evolution)
Full lifecycle
Feedback → preference update → attribution tracking
Eval suite + CI
Automated
GitHub Actions, safety regression gate
What's incomplete
Gap
Severity
Current state
Desktop app
Critical
427 LOC Electron shell, no real UI
Real-time notifications
Critical
30-second badge poll only, no SSE/push
Process supervision
Critical
Crashed processes stay dead
Connector error handling
High
Gmail 429 → signal lost, no retry
First-run experience
High
Onboarding wizard ends at empty dashboard
Audit log
High
Data logged to DB, no viewer
Rollback UI
High
API exists, no button in dashboard
Decision search/filter
High
Domain dropdown only
Mobile access
High
No way to approve from phone
Mobile responsive CSS
High
Basic fixes only, not systematically tested
Quiet hours
Medium
Schema exists, not enforced
Policy CRUD
Medium
No API for action policy management
Type safety
Medium
as-never casts in ask.ts
Structured logging
Medium
console.log only
Temporal profile persistence
Low
In-memory Map, lost on restart
Rate limiter persistence
Low
In-memory Map, lost on restart
Child Issues
#
Title
Priority
Claude Code Est.
Dependencies
#11
Local resilience: retry/backoff, process supervision, graceful degradation
Critical
~2-3h
None
#13
Desktop Electron app: native install, tray, process management
Critical
~4-6h
None
#8
Live notification layer: SSE, approval expiry cron, push alerts
Critical
~1-2h
#11
#9
Guided first-run: connect → see signals → approve → watch twin learn
High
~2-3h
#13
#10
Dashboard completeness: audit log, rollback UI, decision search
High
~3-4h
None
#14
Engine gaps: quiet hours, policy CRUD, type safety, structured logging
Medium
~1-2h
None
#15
Mobile local access: QR pairing, mDNS, responsive dashboard
High
~3-4h
#13
#16
Cross-platform builds: Windows, Linux, native mobile apps
High
~5-8h
#13 , #15
Total serial: ~21-32h Claude Code time
Parallel execution plan (Conductor)
Wave 1 (~4-6h): #11 + #13 + #10 + #14 (all independent)
Wave 2 (~3-4h): #8 + #9 + #15 (deps satisfied)
Wave 3 (~5-8h): #16 (needs #13 + #15)
Wall time with parallelism: ~12-18h across 3 waves
Dependency Graph
#11 Resilience ──────────────> #8 Live Notifications
#13 Desktop App ──┬──────────> #9 First-Run Experience
├──────────> #15 Mobile Access ──> #16 Cross-Platform
└─────────────────────────────────/
#10 Dashboard ────────────── (independent, start anytime)
#14 Engine Gaps ──────────── (independent, start anytime)
Sequencing Rationale
Local resilience: retry/backoff, process supervision, graceful degradation #11 first : retry/backoff and process supervision are prerequisites for everything else. If the API crashes during onboarding or connectors lose signals, nothing else matters.
Desktop Electron app: native install, tray, process management #13 early : the desktop app is the product for local-first. It manages processes, hosts the API, and provides the install experience. Guided first-run: connect → see signals → approve → watch twin learn #9 (first-run) and Mobile local access: QR pairing, mDNS, responsive dashboard #15 (mobile) depend on it.
Live notification layer: SSE, approval expiry cron, push alerts #8 after Local resilience: retry/backoff, process supervision, graceful degradation #11 : SSE connections and expiry cron need a process that stays alive.
Guided first-run: connect → see signals → approve → watch twin learn #9 after Desktop Electron app: native install, tray, process management #13 : first-run experience happens inside the Electron app, not a browser tab.
Dashboard completeness: audit log, rollback UI, decision search #10 and Engine gaps: quiet hours, policy CRUD, type safety, structured logging #14 anytime : dashboard completeness and engine gaps are independent. Good parallel work.
Mobile local access: QR pairing, mDNS, responsive dashboard #15 after Desktop Electron app: native install, tray, process management #13 : mobile access requires the desktop to be serving the API reliably.
Cross-platform builds: Windows, Linux, native mobile apps #16 after Desktop Electron app: native install, tray, process management #13 + Mobile local access: QR pairing, mDNS, responsive dashboard #15 : cross-platform builds extend the macOS desktop app and mobile web to Windows/Linux installers and native iOS/Android apps.
Acceptance Criteria
User downloads DMG (macOS), MSI (Windows), or AppImage (Linux), installs and launches SkyTwin — sees onboarding wizard within 10 seconds
User connects Google account, sees first signals arrive within 30 seconds
User approves 3 actions, sees trust tier progress indicator update
Approval notification appears as native OS notification within 5 seconds of creation
User opens dashboard on phone via QR scan, approves an action from mobile
Native iOS and Android apps connect to desktop via local network
Quiet hours prevent auto-execution and suppress non-urgent notifications
kill -9 on API process → app restarts it within 5 seconds
Gmail returns 429 → connector retries with backoff, signal is not lost
User can search decisions by date range, situation type, and text
User can undo an executed action from the dashboard with structured reasoning
Audit page shows trust tier changes, spend events, and preference evolution
All 432 existing tests continue passing
No safety invariant regressions in eval suite
Every child PR passes `/review` before merge
Out of Scope
Cloud deployment (future epic)
Auto-update mechanism
App store distribution (Apple App Store, Google Play, Microsoft Store)
Multi-device sync beyond local network
Off-network mobile access
Note on cloud epic : A separate epic should be created for cloud deployment. This epic is strictly local-first: everything runs on the user's machine.
Definition of Done
All 15 acceptance criteria pass
A non-technical person can install, onboard, and use SkyTwin daily without opening a terminal — on macOS, Windows, or Linux
The system recovers from transient failures (connector errors, process crashes) without user intervention
Phone approval workflow works on same WiFi network via native app or mobile browser
Eval suite green, no safety regressions
Every PR passed `/review`
Ready to post on Hacker News: install → connect → first decision in under 5 minutes
Rollback Plan
Each child issue is a separate PR. If a child breaks something:
Revert the PR
Existing v0.3 functionality is unaffected (all changes are additive)
Desktop app is a new entrypoint — doesn't modify existing API/worker/web behavior
Issue Archaeology
Issues #2 , #3 , and #4 were audited against the v0.2/v0.3 CHANGELOG on 2026-04-04. All three are now closed:
Make SkyTwin usable by a non-technical person end-to-end #2 (Make SkyTwin usable by a non-technical person): 7/10 sections shipped. Remaining items (confidence dashboard, error handling, mobile responsive, pause twin) migrated to Dashboard completeness: audit log, rollback UI, decision search #10 , Local resilience: retry/backoff, process supervision, graceful degradation #11 , Desktop Electron app: native install, tray, process management #13 , Mobile local access: QR pairing, mDNS, responsive dashboard #15 .
Make SkyTwin actually work end-to-end: adapter layer, execution persistence, safety gaps #3 (Make SkyTwin actually work end-to-end): Phases 1-2 complete. Remaining items (notifications, quiet hours, SSE, policy CRUD, structured logging, type safety) migrated to Live notification layer: SSE, approval expiry cron, push alerts #8 , Engine gaps: quiet hours, policy CRUD, type safety, structured logging #14 .
Scope Expansion: Full Build Plan (CEO Review 2026-04-01) #4 (CEO Review build plan): All 5 build phases executed in v0.2/v0.3. Closed as completed planning artifact.
Context
SkyTwin v0.3 shipped a production-grade decision pipeline: signals → interpretation → twin query → candidates → risk → policy → routing → execution → learning. 432 tests, 50+ eval scenarios, 13 domains, 7 safety invariants enforced. The engine works.
What's missing is the product. A user can't install SkyTwin, launch it, connect their accounts, and use it daily without touching a terminal. There's no desktop app, no real-time notifications, no mobile access, no process supervision. The dashboard covers the happy path but can't audit, undo, or search. Connectors crash silently on transient errors. The twin doesn't respect quiet hours.
This epic tracks everything needed to go from "engineer's working prototype" to "download it, launch it, use it every day, post it on Hacker News."
PR Workflow
Every child issue PR must pass `/review` before merge. No exceptions.
Current State (verified 2026-04-04)
What's working well (do not break)
What's incomplete
Child Issues
Total serial: ~21-32h Claude Code time
Parallel execution plan (Conductor)
Wall time with parallelism: ~12-18h across 3 waves
Dependency Graph
Sequencing Rationale
Acceptance Criteria
Out of Scope
Definition of Done
Rollback Plan
Each child issue is a separate PR. If a child breaks something:
Issue Archaeology
Issues #2, #3, and #4 were audited against the v0.2/v0.3 CHANGELOG on 2026-04-04. All three are now closed: