Epic: SkyTwin v0.4 — Local-First Product Completion (HN Launch Ready) #12

@jayzalowitz

Context

SkyTwin v0.3 shipped a production-grade decision pipeline: signals → interpretation → twin query → candidates → risk → policy → routing → execution → learning. 432 tests, 50+ eval scenarios, 13 domains, 7 safety invariants enforced. The engine works.

What's missing is the product. A user can't install SkyTwin, launch it, connect their accounts, and use it daily without touching a terminal. There's no desktop app, no real-time notifications, no mobile access, no process supervision. The dashboard covers the happy path but can't audit, undo, or search. Connectors crash silently on transient errors. The twin doesn't respect quiet hours.

This epic tracks everything needed to go from "engineer's working prototype" to "download it, launch it, use it every day, post it on Hacker News."

PR Workflow

Every child issue PR must pass `/review` before merge. No exceptions.

Current State (verified 2026-04-04)

What's working well (do not break)

| Component | State | Evidence |
|---|---|---|
| Decision pipeline (signal → outcome) | Production-ready | 432 tests, E2E coverage across 13 domains |
| Safety invariants (7/7) | Enforced | Integration test suite, regression detection |
| Trust tier progression | Automated | 10/20/50/100 thresholds, rolling-window regression |
| Spend tracking | Hard-gated | Rolling 24h window, per-action + daily aggregate |
| Domain autonomy | Enforced | Most-restrictive-wins logic |
| Gmail + Calendar connectors | Real APIs | OAuth token refresh, multi-user worker |
| Twin learning (preference evolution) | Full lifecycle | Feedback → preference update → attribution tracking |
| Eval suite + CI | Automated | GitHub Actions, safety regression gate |
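To make the "Spend tracking" row concrete, here is a minimal sketch of a hard gate over a rolling 24-hour window with both a per-action cap and a daily aggregate cap. The names (`SpendEvent`, `SpendLimits`, `canSpend`) and the use of cents are assumptions for illustration; this issue does not show the actual SkyTwin implementation.

```typescript
// Hypothetical types; the real SkyTwin schema is not shown in this issue.
interface SpendEvent {
  amountCents: number;
  at: number; // epoch milliseconds
}

interface SpendLimits {
  perActionCents: number;
  dailyCents: number;
}

const WINDOW_MS = 24 * 60 * 60 * 1000;

// Returns true only if the proposed spend passes both the per-action cap
// and the aggregate cap over the trailing 24 hours.
function canSpend(
  history: SpendEvent[],
  proposedCents: number,
  now: number,
  limits: SpendLimits,
): boolean {
  if (proposedCents > limits.perActionCents) return false;
  const windowStart = now - WINDOW_MS;
  const spentInWindow = history
    .filter((e) => e.at > windowStart && e.at <= now)
    .reduce((sum, e) => sum + e.amountCents, 0);
  return spentInWindow + proposedCents <= limits.dailyCents;
}
```

Because the window trails `now` rather than resetting at midnight, a spend made 25 hours ago no longer counts against the aggregate, while one made an hour ago does.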

What's incomplete

| Gap | Severity | Current state |
|---|---|---|
| Desktop app | Critical | 427 LOC Electron shell, no real UI |
| Real-time notifications | Critical | 30-second badge poll only, no SSE/push |
| Process supervision | Critical | Crashed processes stay dead |
| Connector error handling | High | Gmail 429 → signal lost, no retry |
| First-run experience | High | Onboarding wizard ends at empty dashboard |
| Audit log | High | Data logged to DB, no viewer |
| Rollback UI | High | API exists, no button in dashboard |
| Decision search/filter | High | Domain dropdown only |
| Mobile access | High | No way to approve from phone |
| Mobile responsive CSS | High | Basic fixes only, not systematically tested |
| Quiet hours | Medium | Schema exists, not enforced |
| Policy CRUD | Medium | No API for action policy management |
| Type safety | Medium | as-never casts in ask.ts |
| Structured logging | Medium | console.log only |
| Temporal profile persistence | Low | In-memory Map, lost on restart |
| Rate limiter persistence | Low | In-memory Map, lost on restart |
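For the "Connector error handling" gap, a retry wrapper with exponential backoff might look roughly like the sketch below. Everything here is an assumption for illustration: the `HttpError` shape, the helper names, the 500 ms base delay, and the choice to also retry 5xx responses. It omits jitter and `Retry-After` handling, both of which a real connector would likely want.

```typescript
// Hypothetical error shape; real connector errors may look different.
interface HttpError {
  status: number;
}

// Delay for a zero-based attempt: base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Treat rate limiting (429) and server errors (5xx) as transient.
function isRetryable(err: unknown): boolean {
  const status = (err as Partial<HttpError> | null)?.status;
  return status === 429 || (typeof status === "number" && status >= 500);
}

async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Rethrow on non-retryable errors or once attempts are exhausted, so
      // the caller can persist the signal for later instead of dropping it.
      if (!isRetryable(err) || attempt + 1 >= maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

The `sleep` parameter is injectable so the wrapper can be exercised in tests without real delays.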

Child Issues

| # | Title | Priority | Claude Code Est. | Dependencies |
|---|---|---|---|---|
| #11 | Local resilience: retry/backoff, process supervision, graceful degradation | Critical | ~2-3h | None |
| #13 | Desktop Electron app: native install, tray, process management | Critical | ~4-6h | None |
| #8 | Live notification layer: SSE, approval expiry cron, push alerts | Critical | ~1-2h | #11 |
| #9 | Guided first-run: connect → see signals → approve → watch twin learn | High | ~2-3h | #13 |
| #10 | Dashboard completeness: audit log, rollback UI, decision search | High | ~3-4h | None |
| #14 | Engine gaps: quiet hours, policy CRUD, type safety, structured logging | Medium | ~1-2h | None |
| #15 | Mobile local access: QR pairing, mDNS, responsive dashboard | High | ~3-4h | #13 |
| #16 | Cross-platform builds: Windows, Linux, native mobile apps | High | ~5-8h | #13, #15 |

Total serial: ~21-32h Claude Code time

Parallel execution plan (Conductor)

Wave 1 (~4-6h):  #11 + #13 + #10 + #14  (all independent)
Wave 2 (~3-4h):  #8 + #9 + #15          (deps satisfied)
Wave 3 (~5-8h):  #16                     (needs #13 + #15)

Wall time with parallelism: ~12-18h across 3 waves
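The serial total and the wave plan can be checked mechanically from the Child Issues estimates and dependencies. The sketch below is only a verification aid, not part of SkyTwin or Conductor; the `Issue` type and `planWaves` helper are hypothetical. It places each issue in the first wave where all its dependencies are done and treats a wave as taking as long as its slowest issue.

```typescript
// Estimates and dependencies copied from the Child Issues table.
interface Issue {
  id: number;
  lowH: number;
  highH: number;
  deps: number[];
}

const issues: Issue[] = [
  { id: 11, lowH: 2, highH: 3, deps: [] },
  { id: 13, lowH: 4, highH: 6, deps: [] },
  { id: 8, lowH: 1, highH: 2, deps: [11] },
  { id: 9, lowH: 2, highH: 3, deps: [13] },
  { id: 10, lowH: 3, highH: 4, deps: [] },
  { id: 14, lowH: 1, highH: 2, deps: [] },
  { id: 15, lowH: 3, highH: 4, deps: [13] },
  { id: 16, lowH: 5, highH: 8, deps: [13, 15] },
];

// Group issues into waves: an issue joins the first wave in which all of
// its dependencies have already completed.
function planWaves(all: Issue[]): Issue[][] {
  const done = new Set<number>();
  const waves: Issue[][] = [];
  let remaining = [...all];
  while (remaining.length > 0) {
    const ready = remaining.filter((i) => i.deps.every((d) => done.has(d)));
    if (ready.length === 0) throw new Error("dependency cycle");
    ready.forEach((i) => done.add(i.id));
    remaining = remaining.filter((i) => !done.has(i.id));
    waves.push(ready);
  }
  return waves;
}

const waves = planWaves(issues);
const serial = {
  low: issues.reduce((s, i) => s + i.lowH, 0),
  high: issues.reduce((s, i) => s + i.highH, 0),
};
// Each wave takes as long as its slowest issue.
const wall = {
  low: waves.reduce((s, w) => s + Math.max(...w.map((i) => i.lowH)), 0),
  high: waves.reduce((s, w) => s + Math.max(...w.map((i) => i.highH)), 0),
};

console.log(waves.map((w) => w.map((i) => `#${i.id}`).join(" + ")));
// [ '#11 + #13 + #10 + #14', '#8 + #9 + #15', '#16' ]
console.log(`serial ~${serial.low}-${serial.high}h, wall ~${wall.low}-${wall.high}h`);
// serial ~21-32h, wall ~12-18h
```

The wall-time figure equals the critical path #13 → #15 → #16 (4-6h + 3-4h + 5-8h), so no reordering within the waves would shorten it.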

Dependency Graph

#11 Resilience ──────────────> #8 Live Notifications

#13 Desktop App ──┬──────────> #9 First-Run Experience
                  ├──────────> #15 Mobile Access ──> #16 Cross-Platform
                  └─────────────────────────────────/

#10 Dashboard ────────────── (independent, start anytime)

#14 Engine Gaps ──────────── (independent, start anytime)

Sequencing Rationale

Acceptance Criteria

  1. User downloads DMG (macOS), MSI (Windows), or AppImage (Linux), installs and launches SkyTwin — sees onboarding wizard within 10 seconds
  2. User connects Google account, sees first signals arrive within 30 seconds
  3. User approves 3 actions, sees trust tier progress indicator update
  4. Approval notification appears as native OS notification within 5 seconds of creation
  5. User opens dashboard on phone via QR scan, approves an action from mobile
  6. Native iOS and Android apps connect to desktop via local network
  7. Quiet hours prevent auto-execution and suppress non-urgent notifications
  8. `kill -9` on API process → app restarts it within 5 seconds
  9. Gmail returns 429 → connector retries with backoff, signal is not lost
  10. User can search decisions by date range, situation type, and text
  11. User can undo an executed action from the dashboard with structured reasoning
  12. Audit page shows trust tier changes, spend events, and preference evolution
  13. All 432 existing tests continue passing
  14. No safety invariant regressions in eval suite
  15. Every child PR passes `/review` before merge
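Criterion 7 depends on a quiet-hours check that handles windows crossing midnight. A minimal sketch follows, assuming a hypothetical `QuietHours` shape expressed in minutes after local midnight (the issue only says a schema exists, not what it looks like). Holding an action for approval during quiet hours is likewise an assumption about what "prevent auto-execution" should mean.

```typescript
// Hypothetical shape; the issue only says a quiet-hours schema exists.
interface QuietHours {
  startMinute: number; // minutes after local midnight, e.g. 22:00 -> 1320
  endMinute: number; // exclusive, e.g. 07:00 -> 420
}

// True when minuteOfDay falls inside the window, including windows that
// wrap past midnight (start > end).
function inQuietHours(minuteOfDay: number, q: QuietHours): boolean {
  if (q.startMinute === q.endMinute) return false; // empty window
  return q.startMinute < q.endMinute
    ? minuteOfDay >= q.startMinute && minuteOfDay < q.endMinute
    : minuteOfDay >= q.startMinute || minuteOfDay < q.endMinute;
}

type Decision = "auto_execute" | "hold_for_approval";

// Assumption: during quiet hours an otherwise auto-executable action is
// held for approval rather than dropped.
function gateAutoExecution(minuteOfDay: number, q: QuietHours): Decision {
  return inQuietHours(minuteOfDay, q) ? "hold_for_approval" : "auto_execute";
}

// Urgent notifications always go through; others wait until quiet hours end.
function shouldNotify(minuteOfDay: number, q: QuietHours, urgent: boolean): boolean {
  return urgent || !inQuietHours(minuteOfDay, q);
}
```

With a 22:00 to 07:00 window, 01:00 (minute 60) is inside quiet hours and 12:00 (minute 720) is not, which is the wraparound case a naive `start <= t < end` comparison gets wrong.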

Out of Scope

  • Cloud deployment (future epic)
  • Auto-update mechanism
  • App store distribution (Apple App Store, Google Play, Microsoft Store)
  • Multi-device sync beyond local network
  • Off-network mobile access

Note on cloud epic: A separate epic should be created for cloud deployment. This epic is strictly local-first: everything runs on the user's machine.

Definition of Done

  1. All 15 acceptance criteria pass
  2. A non-technical person can install, onboard, and use SkyTwin daily without opening a terminal — on macOS, Windows, or Linux
  3. The system recovers from transient failures (connector errors, process crashes) without user intervention
  4. Phone approval workflow works on same WiFi network via native app or mobile browser
  5. Eval suite green, no safety regressions
  6. Every PR passed `/review`
  7. Ready to post on Hacker News: install → connect → first decision in under 5 minutes

Rollback Plan

Each child issue is a separate PR. If a child breaks something:

  • Revert the PR
  • Existing v0.3 functionality is unaffected (all changes are additive)
  • Desktop app is a new entrypoint — doesn't modify existing API/worker/web behavior

Issue Archaeology

Issues #2, #3, and #4 were audited against the v0.2/v0.3 CHANGELOG on 2026-04-04. All three are now closed:
