feat(#194 Child 4): accessibility — high-contrast + text-scale + voice STT route #244

Merged
jayzalowitz merged 1 commit into main from feat/accessibility-a11y
May 9, 2026

Conversation

@jayzalowitz

Owner

Summary

Closes #194 Child 4 a11y commitments + ships a real /api/voice/transcribe route backed by the WhisperCppSttBackend from #238. Five concrete additions:

  1. High-contrast theme (4th variant in dropdown): pure B/W, bold yellow/blue accents, 2px borders, 3px focus outlines.
  2. Text-scale slider: 100/125/150/200% via [data-text-scale] on <html> — every rem reflows.
  3. Reduced-motion override: respects prefers-reduced-motion by default, user can force on/off.
  4. Voice-first toggle: persists to localStorage, sets body class for CSS-driven mic affordances.
  5. /api/voice/transcribe + /api/voice/capabilities/:userId: real Whisper STT exposed as HTTP endpoints. Mobile (#179, "Capability loop #G: Mobile parity (Capabilities + Briefing + voice + push)") will consume them once recording lands.

Plus baseline sweep: skip-link, focus-visible rings, prefers-reduced-motion media query.
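
As an illustration of item 3, the "respect the system default, let the user force on/off" rule can be modelled as a tri-state override. This is a minimal sketch, not the code in a11y.js (which is not shown in this thread): the names `MotionOverride`, `parseMotionOverride`, and `shouldReduceMotion` are hypothetical, and it assumes "on" means "reduce motion" and "off" means "keep animations".

```typescript
// Hypothetical names; the PR's actual a11y.js implementation is not shown here.
type MotionOverride = "system" | "on" | "off";

/** Coerce a stored string (for example a localStorage value) to a known override. */
function parseMotionOverride(stored: string | null): MotionOverride {
  return stored === "on" || stored === "off" ? stored : "system";
}

/**
 * Decide whether motion should be reduced.
 * Assumption: "on" forces reduced motion, "off" forces animations,
 * and "system" defers to the OS-level preference.
 */
function shouldReduceMotion(
  override: MotionOverride,
  systemPrefersReduced: boolean,
): boolean {
  if (override === "on") return true;
  if (override === "off") return false;
  return systemPrefersReduced;
}
```

In a browser, `systemPrefersReduced` would typically come from `window.matchMedia("(prefers-reduced-motion: reduce)").matches`; keeping the decision in a pure function makes it testable without a DOM.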

Why this fits the theme

"For everyone" is in the architectural identity. A twin that needs 20/20 vision and steady hands isn't a personal AI for everyone. The voice route is the natural consumer of the embedded Whisper backend in #238 — same prompts, same memory, reachable without a keyboard.

Test plan

  • pnpm --filter @skytwin/api test — 423 passing, 24 skipped (8 new voice tests)
  • pnpm build — all 34 packages clean
  • No real whisper-cli spawn in tests (vi.mock)
  • CI green

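Out of scope

  • Mobile recording UI (#179, hardware-blocked)
  • axe-core CI gate
  • Multi-language i18n (separate slice)
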
🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 9, 2026 05:14

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds end-to-end accessibility preferences in the web UI (high-contrast theme, text scaling, reduced motion, voice-first affordances) and introduces a new API surface for server-side voice transcription using the embedded Whisper STT backend.

Changes:

  • Web: new high-contrast theme, global focus/skip-link, text-scale + reduced-motion + voice-first preferences persisted in localStorage
  • API: new /api/voice/capabilities/:userId and /api/voice/transcribe routes backed by @skytwin/embedded-llm, with tests
  • Docs/meta: changelog entry and API package dependency update
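
To make the "input validation" on /api/voice/transcribe concrete, here is a minimal sketch of the body check as a pure function. Only the three field names (`userId`, `audioBase64`, `language`) and the `userId required` message appear in the diff excerpt in this thread; `parseTranscribeBody`, the result shape, and the other error strings are hypothetical.

```typescript
// Hypothetical sketch; apart from 'userId required', the error strings are invented.
interface TranscribeRequest {
  userId: string;
  audioBase64: string;
  language?: string;
}

type TranscribeParseResult =
  | { ok: true; value: TranscribeRequest }
  | { ok: false; status: 400; error: string };

function parseTranscribeBody(body: unknown): TranscribeParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, status: 400, error: "JSON body required" };
  }
  const record = body as Record<string, unknown>;

  const userId = record["userId"];
  if (typeof userId !== "string" || userId.length === 0) {
    return { ok: false, status: 400, error: "userId required" };
  }

  const audioBase64 = record["audioBase64"];
  if (typeof audioBase64 !== "string" || audioBase64.length === 0) {
    return { ok: false, status: 400, error: "audioBase64 required" };
  }

  // language is optional, but must be a string when present.
  const value: TranscribeRequest = { userId, audioBase64 };
  const rawLanguage = record["language"];
  if (typeof rawLanguage === "string") {
    value.language = rawLanguage;
  } else if (rawLanguage !== undefined) {
    return { ok: false, status: 400, error: "language must be a string" };
  }

  return { ok: true, value };
}
```

A route handler could then respond with `res.status(result.status).json({ error: result.error })` on failure and pass `result.value` to the STT backend on success.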

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| apps/web/public/js/theme-switcher.js | Adds “high-contrast” theme variant metadata for the UI theme selector |
| apps/web/public/js/pages/settings.js | Adds Accessibility settings card and wires preference updates into the settings event delegator |
| apps/web/public/js/app.js | Boots initA11y() early and injects a skip-link on DOMContentLoaded |
| apps/web/public/js/a11y.js | New module to persist/apply text scale, reduced motion override, and voice-first mode |
| apps/web/public/css/themes.css | Adds high-contrast theme variables + global focus rings + text-scale + reduced-motion + skip-link styling |
| apps/api/src/routes/voice.ts | New voice capabilities/transcribe routes with caching and input validation |
| apps/api/src/index.ts | Mounts the new /api/voice router |
| apps/api/src/tests/voice-routes.test.ts | Adds Vitest coverage for voice capabilities + transcribe behaviors |
| apps/api/package.json | Adds @skytwin/embedded-llm dependency needed by the new voice routes |
| CHANGELOG.md | Documents the new a11y features and voice endpoints (adds a new unreleased section) |
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported
Comments suppressed due to low confidence (1)

apps/web/public/js/pages/settings.js:1

  • The change delegator handles the two <select> controls before the AI-card guard, but does not handle the voice-first checkbox action. If the click handler path changes or if the checkbox is toggled in a way that only emits change, this risks not persisting/applying voice-first. Consider handling a11y-toggle-voice-first here (before the ai-provider-card guard) alongside the select handlers.
import { fetchUser, updateTrustTier, fetchOAuthStatus, getGoogleAuthUrl, disconnectProvider, escapeHtml, fetchSettings, updateAutonomySettings, updateIronClawChannel, upsertDomainPolicy, deleteDomainPolicy, createEscalationTrigger, deleteEscalationTrigger, createSession, fetchSessions, revokeSession, saveAIProviders, testAIProvider, fetchRoutines, deleteRoutine } from '../api-client.js';


Comment on lines +60 to +65
const userId = body['userId'];
const audioBase64 = body['audioBase64'];
const language = body['language'];

if (typeof userId !== 'string' || userId.length === 0) {
res.status(400).json({ error: 'userId required' });

let cachedPort: Promise<EmbeddedSttPort> | null = null;
function getPort(): Promise<EmbeddedSttPort> {
  if (cachedPort === null) cachedPort = createEmbeddedSttPort();

  bytes: audio.length,
  chars: transcript.length,
});
res.json({ transcript, durationBytes: audio.length });
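
The `cachedPort` fragment above caches the promise itself, so concurrent first requests share one backend initialization. A generic sketch of that pattern follows. `lazyAsync` is a hypothetical helper, and clearing the cache after a rejection is an added assumption that the excerpt does not show.

```typescript
// Hypothetical helper illustrating promise memoization; not the PR's code.
function lazyAsync<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | null = null;
  return () => {
    if (cached !== null) return cached;
    const pending = factory();
    cached = pending;
    // Assumption: forget a failed initialization so the next call retries
    // instead of returning the same rejected promise forever.
    pending.catch(() => {
      if (cached === pending) cached = null;
    });
    return pending;
  };
}
```

Whether retrying is desirable depends on the failure mode: a missing model file will fail again, while a transient spawn error may not.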
Comment on lines +632 to +634
html[data-text-scale="125"] { font-size: 17.5px; } /* 1.25× of 14px */
html[data-text-scale="150"] { font-size: 21px; } /* 1.5× of 14px */
html[data-text-scale="200"] { font-size: 28px; } /* 2× of 14px */
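
The three rules above follow from a 14px base multiplied by the scale percentage. A small sketch of that arithmetic, with input normalization for values read back from storage, is below. The 14px base is taken from the CSS comments; `TEXT_SCALES`, `normalizeTextScale`, and `rootFontSizePx` are hypothetical names.

```typescript
// Base size implied by the "of 14px" comments in the CSS excerpt.
const BASE_FONT_PX = 14;

// The four scale steps named in the PR description.
const TEXT_SCALES = [100, 125, 150, 200] as const;
type TextScale = (typeof TEXT_SCALES)[number];

/** Map arbitrary input (e.g. a stored string) to a supported scale, defaulting to 100. */
function normalizeTextScale(raw: unknown): TextScale {
  const parsed = Number(raw);
  for (const scale of TEXT_SCALES) {
    if (scale === parsed) return scale;
  }
  return 100;
}

/** Root font size in px for a given scale step. */
function rootFontSizePx(scale: TextScale): number {
  return (BASE_FONT_PX * scale) / 100;
}
```

For example, `rootFontSizePx(normalizeTextScale("125"))` evaluates to 17.5, matching the first rule.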
Comment thread CHANGELOG.md
@@ -1,5 +1,49 @@
All notable changes to SkyTwin will be documented in this file.

## [unreleased] — Accessibility: high-contrast + text-scale + voice STT route (#194 Child 4)
Comment thread CHANGELOG.md
- Multi-language i18n (separate slice; voice route already accepts
`language`, so the audio side is ready).

## [unreleased] — Embedded LLM as a first-class llm-client provider (#187 AC#7)
Comment on lines +126 to +134
<label for="a11y-text-scale" style="display: block; font-weight: 500; margin-bottom: 0.25rem;">Text size</label>
<div style="display: flex; gap: 0.5rem; align-items: center;">
<select class="form-input" id="a11y-text-scale" style="flex: 1;" data-action="a11y-set-text-scale">
<option value="100">Default</option>
<option value="125">Larger (125%)</option>
<option value="150">Much larger (150%)</option>
<option value="200">Maximum (200%)</option>
</select>
</div>
…+ voice STT route

Five real implementations:
- High-contrast theme variant (4th in the dropdown): pure b/w with bold
  yellow/blue accents, 2px borders, 3px focus rings. WCAG AAA-aimed.
- Text-scale slider: 100/125/150/200% via data-text-scale on <html>;
  every rem-based size reflows.
- Reduced-motion override: respects prefers-reduced-motion by default,
  user can force on/off via Settings.
- Voice-first toggle: persists to localStorage, sets body class so CSS
  surfaces mic affordances.
- /api/voice/transcribe + /api/voice/capabilities/:userId backed by
  createEmbeddedSttPort() from #238. Real Whisper STT exposed as a
  reusable HTTP endpoint (mobile #179 will consume it once recording
  lands; desktop voice-first does today).

Sweep:
- Skip-link injected as first body child (CSS hides until focused).
- Global focus-visible ring across buttons, links, inputs.
- prefers-reduced-motion: reduce honored across all themes.
- New Accessibility card in Settings with text-size, animations, voice.

8 new voice route tests; api total 423 passing (24 skipped).

Out of scope: mobile recording UI (#179 hardware-blocked), axe-core CI
gate, multi-language i18n (separate slice).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jayzalowitz jayzalowitz force-pushed the feat/accessibility-a11y branch from daee7e4 to e6db0e0 Compare May 9, 2026 05:39
@jayzalowitz jayzalowitz merged commit 1be3f75 into main May 9, 2026
8 checks passed
@jayzalowitz jayzalowitz deleted the feat/accessibility-a11y branch May 9, 2026 05:54
jayzalowitz added a commit that referenced this pull request May 12, 2026
Closes the code-bound half of #179. The mobile app can now capture
audio and ship it to the paired desktop's `/api/voice/transcribe` (the
route landed in PR #244). The remaining work is QA on physical devices,
which has always been the actual blocker.

Components:

- New `VoiceScreen` tab with six-state machine (idle, denied, recording,
  processing, result, error). Recording driven by `useAudioRecorder`
  from `expo-audio`. Pulse animation + tabular timer while recording;
  permission-denied state has a "How to fix" affordance.

- New `voice-service.ts` pure helpers: `audioFileToBase64()` reads the
  recorder's output via `expo-file-system`'s `File.base64()` API;
  `transcribeRecording()` orchestrates base64 → upload → result mapping
  with stable error codes the UI branches on.

- New `transcribeVoice(userId, audioBase64, language?)` method on the
  API client; 60s timeout because whisper's first-run model load can
  take several seconds on cold start.

- Permissions added to app.json: NSMicrophoneUsageDescription (iOS) +
  RECORD_AUDIO (Android) + expo-audio plugin entry.

- Deps: expo-audio ~55.0.14, expo-file-system ~55.0.19 (the latter
  was already transitively installed; declared explicitly so the dep
  is auditable).
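
The six-state machine described for `VoiceScreen` can be sketched as a pure transition function. The six state names come from the commit message; the event names, payload fields, and the exact transitions are assumptions made for illustration, not the component's actual code.

```typescript
// Hypothetical sketch: state names are from the commit message,
// events and transitions are invented for illustration.
type VoiceState =
  | { kind: "idle" }
  | { kind: "denied" }
  | { kind: "recording" }
  | { kind: "processing" }
  | { kind: "result"; transcript: string }
  | { kind: "error"; code: string };

type VoiceEvent =
  | { type: "permissionGranted" }
  | { type: "permissionDenied" }
  | { type: "stop" }
  | { type: "transcribed"; transcript: string }
  | { type: "failed"; code: string }
  | { type: "reset" };

function nextVoiceState(state: VoiceState, event: VoiceEvent): VoiceState {
  // Assumption: a reset or a failure is accepted from any state.
  if (event.type === "reset") return { kind: "idle" };
  if (event.type === "failed") return { kind: "error", code: event.code };

  switch (state.kind) {
    case "recording":
      return event.type === "stop" ? { kind: "processing" } : state;
    case "processing":
      return event.type === "transcribed"
        ? { kind: "result", transcript: event.transcript }
        : state;
    default:
      // idle, denied, result, error: a new attempt begins with the permission check.
      if (event.type === "permissionGranted") return { kind: "recording" };
      if (event.type === "permissionDenied") return { kind: "denied" };
      return state;
  }
}
```

Events that do not apply to the current state return it unchanged, so a stray `stop` while idle is a no-op rather than an error.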

Out of scope for this PR (deliberate follow-ups): TTS playback
(pairs with #187 AC#4 desktop Piper), "send to twin" hand-off (waits
on mobile assistant surface), physical-device QA on real iOS/Android
hardware.

Test plan: 11 new vitest cases mocking `File.base64()` + fetch.
Mobile suite: 163 passing + 2 skipped (discovery tests, unrelated).
Workspace: 70/70 turbo tasks green; build clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
jayzalowitz added a commit that referenced this pull request May 12, 2026
* feat(#179 voice): mobile voice recording + transcribe pipeline

* fix(#254 post-/review): type-only client import, real Linking.openSettings, accurate copy + size label

Six Copilot findings on PR #254 addressed:

1. voice-service.ts now uses `import type { SkyTwinApiClient }` —
   the client is only a type reference here, runtime import was
   unnecessary coupling.

2. VoiceTranscriptError docstring no longer references a
   permission_denied code that doesn't exist in the union; mic
   permission is handled inside VoiceScreen before this layer.

3. transcribeVoice docstring corrected: 25MB DECODED (~33MB base64),
   not "25MB base64."

4. openSystemSettings now actually opens the OS settings page via
   Linking.openSettings() with the explanatory alert as fallback.
   One-tap recovery instead of just an explainer.

5. Permission-denial copy acknowledges the temporary on-device audio
   file. The earlier "never stored anywhere besides paired desktop"
   was technically inaccurate (file URI is read back as base64).

6. Result-state UI changed from "{X} of audio" to "Audio size: {X}"
   so the byte count isn't mislabeled as a duration.
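
The size correction in item 3 follows from base64 encoding every 3 input bytes as 4 output characters. A short worked sketch, assuming padded base64 without line breaks and reading "25MB" as 25 MiB (both are assumptions); `base64EncodedLength` is a hypothetical helper name.

```typescript
/** Length of padded base64 output for a given number of decoded bytes. */
function base64EncodedLength(decodedBytes: number): number {
  return 4 * Math.ceil(decodedBytes / 3);
}

const MIB = 1024 * 1024;

// Assumption: the 25MB decoded limit is 25 MiB.
const decodedLimitBytes = 25 * MIB; // 26,214,400 bytes
const encodedLimitChars = base64EncodedLength(decodedLimitBytes); // 34,952,536 chars
const encodedLimitMib = encodedLimitChars / MIB; // about 33.3
```

This is why a limit stated on the decoded audio corresponds to roughly a third more on the wire.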

Test plan: mobile 163/163 passing (+2 skipped, unrelated).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Epic: SkyTwin for Real Life — federation, family scopes, crisis modes, accessibility
