feat(#194 Child 4): accessibility — high-contrast + text-scale + voice STT route by jayzalowitz · Pull Request #244 · jayzalowitz/skytwin

jayzalowitz · 2026-05-09T05:13:59Z

Summary

Closes #194 Child 4 a11y commitments + ships a real /api/voice/transcribe route backed by the WhisperCppSttBackend from #238. Five concrete additions:

High-contrast theme (4th variant in dropdown): pure B/W, bold yellow/blue accents, 2px borders, 3px focus outlines.
Text-scale slider: 100/125/150/200% via [data-text-scale] on <html> — every rem reflows.
Reduced-motion override: respects prefers-reduced-motion by default, user can force on/off.
Voice-first toggle: persists to localStorage, sets body class for CSS-driven mic affordances.
/api/voice/transcribe + /api/voice/capabilities/:userId: real Whisper STT exposed as an HTTP endpoint. Mobile (#179, Capability loop #G: Mobile parity) will consume it once recording lands.

Plus baseline sweep: skip-link, focus-visible rings, prefers-reduced-motion media query.
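
The preference behaviors listed above (persisted settings, plus a reduced-motion override that defers to the system unless forced) could be modeled roughly as follows. This is a sketch, not the code in apps/web/public/js/a11y.js: the storage key, the `A11yPrefs` shape, the `MotionOverride` values, and the `KeyValueStore` interface are all assumptions made for illustration.

```typescript
// Hypothetical preference shape; the real a11y.js may store these differently.
type MotionOverride = "system" | "reduce" | "allow";

interface A11yPrefs {
  textScale: 100 | 125 | 150 | 200;
  motion: MotionOverride;
  voiceFirst: boolean;
}

// Minimal subset of the Web Storage API so the sketch runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_KEY = "skytwin.a11y"; // assumed key name
const DEFAULTS: A11yPrefs = { textScale: 100, motion: "system", voiceFirst: false };

function loadPrefs(store: KeyValueStore): A11yPrefs {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return { ...DEFAULTS };
  try {
    return { ...DEFAULTS, ...(JSON.parse(raw) as Partial<A11yPrefs>) };
  } catch (err) {
    return { ...DEFAULTS }; // corrupt JSON falls back to defaults
  }
}

function savePrefs(store: KeyValueStore, prefs: A11yPrefs): void {
  store.setItem(STORAGE_KEY, JSON.stringify(prefs));
}

// The system setting applies only when the user has not forced a choice.
function shouldReduceMotion(override: MotionOverride, systemPrefersReduce: boolean): boolean {
  if (override === "reduce") return true;
  if (override === "allow") return false;
  return systemPrefersReduce;
}
```

In a browser, `store` would be `window.localStorage` and `systemPrefersReduce` would come from `window.matchMedia("(prefers-reduced-motion: reduce)").matches`.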

Why this fits the theme

"For everyone" is in the architectural identity. A twin that needs 20/20 vision and steady hands isn't a personal AI for everyone. The voice route is the natural consumer of the embedded Whisper backend in #238 — same prompts, same memory, reachable without a keyboard.

Test plan

pnpm --filter @skytwin/api test — 423 passing, 24 skipped (8 new voice tests)
pnpm build — all 34 packages clean
No real whisper-cli spawn in tests (vi.mock)
CI green

Out of scope

Mobile recording UI (#179, Capability loop #G: Mobile parity; hardware-blocked)
axe-core CI gate (process change, separate)
Multi-language i18n (separate slice; route already accepts language)

🤖 Generated with Claude Code

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds end-to-end accessibility preferences in the web UI (high-contrast theme, text scaling, reduced motion, voice-first affordances) and introduces a new API surface for server-side voice transcription using the embedded Whisper STT backend.

Changes:

Web: new high-contrast theme, global focus/skip-link, text-scale + reduced-motion + voice-first preferences persisted in localStorage
API: new /api/voice/capabilities/:userId and /api/voice/transcribe routes backed by @skytwin/embedded-llm, with tests
Docs/meta: changelog entry and API package dependency update

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 7 comments.


File	Description
apps/web/public/js/theme-switcher.js	Adds “high-contrast” theme variant metadata for the UI theme selector
apps/web/public/js/pages/settings.js	Adds Accessibility settings card and wires preference updates into the settings event delegator
apps/web/public/js/app.js	Boots `initA11y()` early and injects a skip-link on DOMContentLoaded
apps/web/public/js/a11y.js	New module to persist/apply text scale, reduced motion override, and voice-first mode
apps/web/public/css/themes.css	Adds high-contrast theme variables + global focus rings + text-scale + reduced-motion + skip-link styling
apps/api/src/routes/voice.ts	New voice capabilities/transcribe routes with caching and input validation
apps/api/src/index.ts	Mounts the new `/api/voice` router
apps/api/src/tests/voice-routes.test.ts	Adds Vitest coverage for voice capabilities + transcribe behaviors
apps/api/package.json	Adds `@skytwin/embedded-llm` dependency needed by the new voice routes
CHANGELOG.md	Documents the new a11y features and voice endpoints (adds a new unreleased section)

Files not reviewed (1)

pnpm-lock.yaml: Language not supported

Comments suppressed due to low confidence (1)

apps/web/public/js/pages/settings.js:1

The change delegator handles the two <select> controls before the AI-card guard, but does not handle the voice-first checkbox action. If the click handler path changes or if the checkbox is toggled in a way that only emits change, this risks not persisting/applying voice-first. Consider handling a11y-toggle-voice-first here (before the ai-provider-card guard) alongside the select handlers.

import { fetchUser, updateTrustTier, fetchOAuthStatus, getGoogleAuthUrl, disconnectProvider, escapeHtml, fetchSettings, updateAutonomySettings, updateIronClawChannel, upsertDomainPolicy, deleteDomainPolicy, createEscalationTrigger, deleteEscalationTrigger, createSession, fetchSessions, revokeSession, saveAIProviders, testAIProvider, fetchRoutines, deleteRoutine } from '../api-client.js';
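
One way to read the suggestion above is a small dispatcher that the change delegator calls before the AI-provider-card guard. The sketch below is an assumption-heavy illustration, not the code in settings.js: `handleA11yChange`, `A11yHandlers`, and `ChangeTarget` are hypothetical names, and `a11y-set-motion` is an assumed action name. Only `a11y-set-text-scale` and `a11y-toggle-voice-first` appear in this PR's excerpts.

```typescript
// Hypothetical handler bundle; the real settings.js would wire these to a11y.js.
interface A11yHandlers {
  setTextScale(value: string): void;
  setMotion(value: string): void;
  setVoiceFirst(enabled: boolean): void;
}

// The parts of the event target the delegator cares about.
interface ChangeTarget {
  action: string | undefined; // value of the data-action attribute
  value: string;
  checked: boolean;
}

// Returns true when the event was an a11y action and has been handled,
// so the caller can return early before any unrelated guard runs.
function handleA11yChange(target: ChangeTarget, handlers: A11yHandlers): boolean {
  switch (target.action) {
    case "a11y-set-text-scale":
      handlers.setTextScale(target.value);
      return true;
    case "a11y-set-motion": // assumed action name
      handlers.setMotion(target.value);
      return true;
    case "a11y-toggle-voice-first":
      handlers.setVoiceFirst(target.checked);
      return true;
    default:
      return false;
  }
}
```

Handling the checkbox in the same `change` path as the selects means the preference is applied regardless of whether the toggle arrives via click or keyboard.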


+      const userId = body['userId'];
+      const audioBase64 = body['audioBase64'];
+      const language = body['language'];
+
+      if (typeof userId !== 'string' || userId.length === 0) {
+        res.status(400).json({ error: 'userId required' });
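
For readers following the validation in this hunk, here is a minimal sketch of the same checks as a pure function that can be tested without Express. The field names (`userId`, `audioBase64`, `language`) and the `'userId required'` message come from the diff. Everything else is an assumption: `validateTranscribeBody` is a hypothetical name, the other error strings and the 413 status are illustrative, and the 25 MB decoded limit is taken from the docstring correction described in the #254 follow-up rather than from this route's code.

```typescript
const MAX_DECODED_BYTES = 25 * 1024 * 1024; // assumed limit (25 MB decoded)
const BASE64_RE = /^[A-Za-z0-9+/]*={0,2}$/;

type Validation =
  | { ok: true; userId: string; audioBase64: string; language?: string }
  | { ok: false; status: 400 | 413; error: string };

// Decoded size of a padded base64 string, computed without decoding it.
function decodedByteLength(b64: string): number {
  const tail = b64.slice(-2);
  const padding = tail === "==" ? 2 : tail.charAt(1) === "=" ? 1 : 0;
  return (b64.length / 4) * 3 - padding;
}

function validateTranscribeBody(body: Record<string, unknown>): Validation {
  const userId = body["userId"];
  const audioBase64 = body["audioBase64"];
  const language = body["language"];

  if (typeof userId !== "string" || userId.length === 0) {
    return { ok: false, status: 400, error: "userId required" };
  }
  if (
    typeof audioBase64 !== "string" ||
    audioBase64.length === 0 ||
    audioBase64.length % 4 !== 0 ||
    !BASE64_RE.test(audioBase64)
  ) {
    return { ok: false, status: 400, error: "audioBase64 must be padded base64" };
  }
  if (decodedByteLength(audioBase64) > MAX_DECODED_BYTES) {
    return { ok: false, status: 413, error: "audio too large" };
  }
  let lang: string | undefined;
  if (typeof language === "string") {
    lang = language;
  } else if (language !== undefined) {
    return { ok: false, status: 400, error: "language must be a string" };
  }
  return { ok: true, userId, audioBase64, language: lang };
}
```

Checking the size from the string length avoids allocating a buffer for a payload that is about to be rejected.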


+
+let cachedPort: Promise<EmbeddedSttPort> | null = null;
+function getPort(): Promise<EmbeddedSttPort> {
+  if (cachedPort === null) cachedPort = createEmbeddedSttPort();
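
A point worth keeping in mind with a cached promise like `cachedPort`: if the first creation attempt rejects, the rejected promise stays cached and every later caller sees the same failure. Below is a minimal sketch of a memoizer that drops a failed attempt so the next call retries. `memoizeAsync` is a hypothetical helper; whether the merged route resets its cache this way is not visible in this excerpt.

```typescript
// Caches the in-flight or resolved promise; forgets it again on rejection.
function memoizeAsync<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | null = null;
  return () => {
    if (cached === null) {
      const attempt: Promise<T> = factory().catch((err: unknown) => {
        // Only clear the slot if it still holds this failed attempt.
        if (cached === attempt) cached = null;
        throw err;
      });
      cached = attempt;
    }
    return cached;
  };
}
```

Concurrent callers still share one in-flight attempt, so a slow first model load is not started twice.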


+        bytes: audio.length,
+        chars: transcript.length,
+      });
+      res.json({ transcript, durationBytes: audio.length });


+html[data-text-scale="125"] { font-size: 17.5px; }   /* 1.25× of 14px */
+html[data-text-scale="150"] { font-size: 21px; }     /* 1.5× of 14px */
+html[data-text-scale="200"] { font-size: 28px; }     /* 2× of 14px */
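
The pixel values in these rules follow from multiplying the 14px base noted in the comments by the scale percentage. The sketch below reproduces that arithmetic and shows how a module might validate a `<select>` value and set the attribute. `parseTextScale`, `applyTextScale`, and `RootElementLike` are hypothetical names, not taken from a11y.js.

```typescript
const BASE_FONT_PX = 14; // base size stated in the CSS comments
const ALLOWED_SCALES: number[] = [100, 125, 150, 200];

// Root font size the CSS rule for a given scale should produce.
function rootFontSizePx(scale: number): number {
  return (BASE_FONT_PX * scale) / 100;
}

// Untrusted input (a <select> value or a stored string) falls back to 100.
function parseTextScale(raw: string): number {
  const n = Number(raw);
  return ALLOWED_SCALES.indexOf(n) !== -1 ? n : 100;
}

// Minimal element surface so the sketch runs without a DOM.
interface RootElementLike {
  setAttribute(name: string, value: string): void;
}

function applyTextScale(root: RootElementLike, raw: string): number {
  const scale = parseTextScale(raw);
  root.setAttribute("data-text-scale", String(scale));
  return scale;
}

console.log(rootFontSizePx(125), rootFontSizePx(150), rootFontSizePx(200));
// prints: 17.5 21 28
```

In the browser, `root` would be `document.documentElement`; because the scale changes the root font size, every `rem`-based measurement reflows with it.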


@@ -1,5 +1,49 @@
 All notable changes to SkyTwin will be documented in this file.

+## [unreleased] — Accessibility: high-contrast + text-scale + voice STT route (#194 Child 4)


+- Multi-language i18n (separate slice; voice route already accepts
+  `language`, so the audio side is ready).
+
 ## [unreleased] — Embedded LLM as a first-class llm-client provider (#187 AC#7)


+          <label for="a11y-text-scale" style="display: block; font-weight: 500; margin-bottom: 0.25rem;">Text size</label>
+          <div style="display: flex; gap: 0.5rem; align-items: center;">
+            <select class="form-input" id="a11y-text-scale" style="flex: 1;" data-action="a11y-set-text-scale">
+              <option value="100">Default</option>
+              <option value="125">Larger (125%)</option>
+              <option value="150">Much larger (150%)</option>
+              <option value="200">Maximum (200%)</option>
+            </select>
+          </div>


…+ voice STT route

Five real implementations:
- High-contrast theme variant (4th in the dropdown): pure b/w with bold yellow/blue accents, 2px borders, 3px focus rings. WCAG AAA-aimed.
- Text-scale slider: 100/125/150/200% via data-text-scale on <html>; every rem-based size reflows.
- Reduced-motion override: respects prefers-reduced-motion by default, user can force on/off via Settings.
- Voice-first toggle: persists to localStorage, sets body class so CSS surfaces mic affordances.
- /api/voice/transcribe + /api/voice/capabilities/:userId backed by createEmbeddedSttPort() from #238. Real Whisper STT exposed as a reusable HTTP endpoint (mobile #179 will consume it once recording lands; desktop voice-first does today).

Sweep:
- Skip-link injected as first body child (CSS hides until focused).
- Global focus-visible ring across buttons, links, inputs.
- prefers-reduced-motion: reduce honored across all themes.
- New Accessibility card in Settings with text-size, animations, voice.

8 new voice route tests; api total 423 passing (24 skipped).

Out of scope: mobile recording UI (#179 hardware-blocked), axe-core CI gate, multi-language i18n (separate slice).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


* feat(#179 voice): mobile voice recording + transcribe pipeline

Closes the code-bound half of #179. The mobile app can now capture audio and ship it to the paired desktop's `/api/voice/transcribe` (the route landed in PR #244). The remaining work is QA on physical devices, which has always been the actual blocker.

Components:
- New `VoiceScreen` tab with six-state machine (idle, denied, recording, processing, result, error). Recording driven by `useAudioRecorder` from `expo-audio`. Pulse animation + tabular timer while recording; permission-denied state has a "How to fix" affordance.
- New `voice-service.ts` pure helpers: `audioFileToBase64()` reads the recorder's output via `expo-file-system`'s `File.base64()` API; `transcribeRecording()` orchestrates base64 → upload → result mapping with stable error codes the UI branches on.
- New `transcribeVoice(userId, audioBase64, language?)` method on the API client; 60s timeout because whisper's first-run model load can take several seconds on cold start.
- Permissions added to app.json: NSMicrophoneUsageDescription (iOS) + RECORD_AUDIO (Android) + expo-audio plugin entry.
- Deps: expo-audio ~55.0.14, expo-file-system ~55.0.19 (the latter was already transitively installed; declared explicitly so the dep is auditable).

Out of scope for this PR (deliberate follow-ups): TTS playback (pairs with #187 AC#4 desktop Piper), "send to twin" hand-off (waits on mobile assistant surface), physical-device QA on real iOS/Android hardware.

Test plan: 11 new vitest cases mocking `File.base64()` + fetch. Mobile suite: 163 passing + 2 skipped (discovery tests, unrelated). Workspace: 70/70 turbo tasks green; build clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(#254 post-/review): type-only client import, real Linking.openSettings, accurate copy + size label

Six Copilot findings on PR #254 addressed:
1. voice-service.ts now uses `import type { SkyTwinApiClient }`: the client is only a type reference here, so the runtime import was unnecessary coupling.
2. VoiceTranscriptError docstring no longer references a permission_denied code that doesn't exist in the union; mic permission is handled inside VoiceScreen before this layer.
3. transcribeVoice docstring corrected: 25MB DECODED (~33MB base64), not "25MB base64."
4. openSystemSettings now actually opens the OS settings page via Linking.openSettings() with the explanatory alert as fallback. One-tap recovery instead of just an explainer.
5. Permission-denial copy acknowledges the temporary on-device audio file. The earlier "never stored anywhere besides paired desktop" was technically inaccurate (file URI is read back as base64).
6. Result-state UI changed from "{X} of audio" to "Audio size: {X}" so the byte count isn't mislabeled as a duration.

Test plan: mobile 163/163 passing (+2 skipped, unrelated).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
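
The "25MB decoded (~33MB base64)" figure in the follow-up commit comes from base64's 4-characters-per-3-bytes expansion. A short sketch of that arithmetic, assuming "MB" here means 1024 × 1024 bytes (the commit message does not say which unit is meant); `base64EncodedLength` is a hypothetical helper name.

```typescript
// Number of characters in padded base64 for `decodedBytes` bytes of input.
function base64EncodedLength(decodedBytes: number): number {
  return 4 * Math.ceil(decodedBytes / 3);
}

const MIB = 1024 * 1024;
const limitDecoded = 25 * MIB; // 26,214,400 bytes
const limitEncoded = base64EncodedLength(limitDecoded); // 34,952,536 characters

console.log((limitEncoded / MIB).toFixed(1)); // prints "33.3"
```

A client that checks the base64 string length against the encoded figure can reject oversized recordings before uploading them.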

Copilot AI review requested due to automatic review settings May 9, 2026 05:14

Copilot started reviewing on behalf of jayzalowitz May 9, 2026 05:14

Copilot AI reviewed May 9, 2026


jayzalowitz mentioned this pull request May 9, 2026

Epic: SkyTwin for Real Life — federation, family scopes, crisis modes, accessibility #194

Closed

jayzalowitz force-pushed the feat/accessibility-a11y branch from daee7e4 to e6db0e0 Compare May 9, 2026 05:39

jayzalowitz merged commit 1be3f75 into main May 9, 2026
8 checks passed

jayzalowitz deleted the feat/accessibility-a11y branch May 9, 2026 05:54

This was referenced May 9, 2026

Epic: Capability Acquisition Loop — an MCP-native autonomous twin (OSS launch v1) #195

Open

Capability loop #G: Mobile parity (Capabilities + Briefing + voice + push) #179

Closed

feat(#179 voice): mobile voice recording + transcribe pipeline #254

Merged

jayzalowitz merged 1 commit into main from feat/accessibility-a11y
