Releases: izwi-ai/izwi
Izwi v0.1.0-beta-12
What's Changed
🚀 Features & Enhancements
- Onboarding Redesign: Streamlined the initial setup with a consistent 3-step onboarding flow designed for a cleaner first-time user experience.
- Studio Workspace Refactor: Updated the /studio layout to use a route-style project history table, ensuring a consistent UI between the "create" and "detail" views.
- TTS Workflow Polish: Refined the Text-to-Speech model selection flow and performed a server-side cleanup to remove dead-code warnings and improve stability.
⚡ Performance
- LFM2 Latency Optimization: Reduced "Time to First Token" (TTFT) for fallback chats by implementing adaptive prefill and optimizing decode hot paths.
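For readers unfamiliar with the metric, TTFT is the delay between sending a request and receiving the first generated token. The sketch below shows one way to measure it against any token stream. It is a minimal illustration, not Izwi's benchmarking code: `measure_ttft` and `fake_stream` are hypothetical names, and the sleep stands in for prefill time.

```python
import time
from typing import Iterable, Iterator, List, Optional, Tuple


def measure_ttft(tokens: Iterable[str]) -> Tuple[Optional[float], List[str]]:
    """Return (seconds until the first token, all tokens) for a token stream.

    Hypothetical helper: returns (None, []) if the stream yields nothing.
    """
    start = time.perf_counter()
    it = iter(tokens)
    sentinel = object()
    first = next(it, sentinel)  # blocks until the first token is produced
    if first is sentinel:
        return None, []
    ttft = time.perf_counter() - start
    return ttft, [first, *it]  # drain the rest of the stream


def fake_stream(prefill_delay: float, n_tokens: int) -> Iterator[str]:
    """Stand-in for a model stream: wait for 'prefill', then emit tokens."""
    time.sleep(prefill_delay)
    for i in range(n_tokens):
        yield f"tok{i}"


if __name__ == "__main__":
    ttft, out = measure_ttft(fake_stream(0.05, 3))
    print(f"TTFT: {ttft * 1000:.1f} ms, tokens: {out}")
```

Because the timer stops as soon as the first token arrives, a shorter prefill lowers the measured value directly, while decode speed only affects the tokens that follow.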
📚 Documentation
- Docs Sync: Updated the user documentation to reflect the latest model catalog, CLI defaults, and current feature support, so the guides match the current build.
Full Changelog: v0.1.0-beta-11...v0.1.0-beta-12
Izwi v0.1.0-beta-11
What's Changed
✨ New Features
- System Tray Integration: Added system tray controls for the desktop app, including settings for startup behavior and visibility.
- In-App Updater: Introduced a cross-platform updater for the beta channel, supported by a signed release-manifest CI pipeline and hardened runtime UI.
- Anonymous Analytics: Implemented opt-in, anonymous desktop analytics using Aptabase, complete with user consent controls.
🎨 UX & Workflow Improvements
- Transcription Overhaul: Refined the transcription index and detail workflow with a cleaner record page, background-safe polling, and a polished streaming upload modal.
- Unified Voice Experience: Merged the /voices and /text-to-speech flows to provide a standardized, model-driven UX and polished table interactions.
- Standardized Actions: Aligned history row actions across the transcription, diarization, and TTS modules for a more consistent user experience.
- Workflow Alignment: Updated the diarization module to align with the new route-based application workflow.
- Settings Refinement: Polished the settings interface and updated the version display for better clarity.
🛠️ Technical Refinements & Fixes
- Performance Optimization: Resolved polling regressions that were causing high CPU usage and page flickering.
- Architecture Update: Migrated application record IDs to UUIDs for improved data integrity and scalability.
- Bug Fixes: Resolved issues with stacked modal behavior in the transcription model selection.
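The release notes do not describe how the UUID migration was carried out. As a rough illustration of the general idea only, the sketch below assigns a random version 4 UUID to each record and keeps an old-to-new mapping so that references can be rewritten afterwards. The record shape and the `migrate_ids` name are assumptions made for this example, not Izwi's schema or code.

```python
import uuid
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def migrate_ids(records: List[Record]) -> Tuple[List[Record], Dict[Any, str]]:
    """Give every record a fresh UUID and return (new records, old->new map).

    Hypothetical helper: assumes each record is a dict with a unique "id" key.
    """
    mapping: Dict[Any, str] = {}
    migrated: List[Record] = []
    for rec in records:
        new_id = str(uuid.uuid4())  # random (version 4) UUID
        mapping[rec["id"]] = new_id
        migrated.append({**rec, "id": new_id})  # copy, do not mutate input
    return migrated, mapping


if __name__ == "__main__":
    old = [{"id": 1, "name": "first"}, {"id": 2, "name": "second"}]
    new, id_map = migrate_ids(old)
    for rec in new:
        print(rec["id"], rec["name"])
```

Keeping the mapping is what lets any other table or file that points at the old integer IDs be updated in a second pass.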
Full Changelog: v0.1.0-beta-10...v0.1.0-beta-11
Izwi v0.1.0-beta-10
What's Changed
- Optimize Qwen3.5 inference end-to-end on Metal with safer SDPA/RoPE paths, adaptive dense→paged decode routing, and generic kernel telemetry compatibility by @zinyando in #106
- Native Qwen3 ASR GGUF integration with Metal hot-path decode/prefill optimizations by @zinyando in #108
- Add diarization summaries with unified Qwen3.5 model and stabilize loading/polling UX by @zinyando in #110
- Add manual segment insertion to Studio with inline add controls by @zinyando in #111
Full Changelog: v0.1.0-beta-9...v0.1.0-beta-10
Izwi v0.1.0-beta-9
What's Changed
- Qwen3.5 inference performance overhaul: batched prefill, DeltaNet/attention optimizations, and decode hot-path reductions by @zinyando in #95
- Redesign transcription and diarization into a shared two-state audio workspace by @zinyando in #96
- Fix diarization mic upload transcoding by @zinyando in #97
- Remove deprecated ASR variants and polish transcription/diarization review UX by @zinyando in #98
- Consolidate voice workflows into /voices, refresh the voices UX, and split TTS Projects into its own route by @zinyando in #99
- Rename Projects to Studio and streamline text-to-speech delivery controls by @zinyando in #100
- Overhaul Studio into a routed project workspace with full TTS project lifecycle support by @zinyando in #101
- feat(studio): add per-segment model/voice controls and harden settings state handling by @zinyando in #102
- Fix TTS voice workflows: route-aware generation dispatch, Qwen3-TTS GQA cache/decode alignment, and explicit designed-voice naming by @zinyando in #103
- Refactor Studio end-to-end naming and API contracts from tts_project to studio (remove legacy paths) by @zinyando in #104
- feat(ui): hide incomplete studio folders ui by @zinyando in #105
Full Changelog: v0.1.0-beta-8...v0.1.0-beta-9
Izwi v0.1.0-beta-8
What's Changed
- End-to-end Qwen3.5 GGUF integration with 4B/9B output-quality fixes across runtime, APIs, and UI by @zinyando in #90
- Add native LFM2.5 Audio GGUF support across the runtime, speech routes, and realtime voice by @zinyando in #91
- Refine Izwi’s UI with consistent selectors, calmer workflows, and route-aligned transcription surfaces by @zinyando in #92
- Improve Qwen3.5 chat performance and streaming reliability by @zinyando in #93
- Add first‑run onboarding modal with model setup and persisted completion state by @zinyando in #94
Full Changelog: v0.1.0-beta-7...v0.1.0-beta-8
Izwi v0.1.0-beta-7
What's Changed
- Stabilize the serve/runtime config contract by @zinyando in #81
- Add end-to-end diarization review workflow with corrections, exports, reruns, and polished history UI by @zinyando in #82
- Normalize persisted history APIs to RESTful /v1 resources and harden API routing under the UI fallback by @zinyando in #83
- Add timestamped transcription workflows with diarization-style review, history, and export UX by @zinyando in #85
- Build reusable voice workflows and redesign the TTS and voices experience by @zinyando in #87
- fix: use rustls instead of native tls by @vigsterkr in #86
- Improve /voice with persistent sessions, editable agent prompts, observational memory, and a cleaner configuration UI by @zinyando in #88
New Contributors
- @vigsterkr made their first contribution in #86
Full Changelog: v0.1.0-beta-6...v0.1.0-beta-7
Izwi v0.1.0-beta-6
What's Changed
- feat(backends): centralize backend routing and device policy across runtime, CLI, and server by @zinyando in #69
- Refactor core GGUF loading to use a shared backend-aware reader policy across Qwen and LFM2 models by @zinyando in #70
- refactor: centralize core runtime layers and extract app entrypoints by @zinyando in #72
- Improve inference latency, memory stability, and streaming backpressure across core runtime paths by @zinyando in #73
- refactor(ui): reorganize the frontend into a production Vite architecture by @zinyando in #74
- Improve Qwen3.5 runtime behavior and unify history UX across the UI by @zinyando in #76
- Refine chat UI, history management, and Qwen chat defaults by @zinyando in #78
- refactor(ui): move history actions into page headers and refresh the chat composer by @zinyando in #80
Full Changelog: v0.1.0-beta-5...v0.1.0-beta-6
Izwi v0.1.0-beta-5
What's Changed
- UI Improvements and Cleanup by @zinyando in #64
- Enhance TranscriptionPlayground UI with improved recording indicator by @zinyando in #65
- Add native Whisper-Large-v3-Turbo ASR end-to-end with upstream-aligned robust decoding and UI integration by @zinyando in #66
- Add LFM2.5 1.2B GGUF (Q4_K_M) support by @zinyando in #67
- Align lfm2.5 audio architecture by @zinyando in #68
Full Changelog: v0.1.0-beta-4...v0.1.0-beta-5
Izwi v0.1.0-beta-4
What's Changed
- Add Qwen 3.5 small model support by @zinyando in #61
- Refactor Qwen3.5 quantized backend with linear-attention, cache, and multimodal support improvements by @zinyando in #62
- refactor: add early media validation for non-Qwen3.5 models by @zinyando in #63
Full Changelog: v0.1.0-beta-3...v0.1.0-beta-4
Izwi v0.1.0-beta-3
What's Changed
- Improve diarization alignment plumbing and refinement defaults by @zinyando in #56
- Align Sortformer v2.1 diarization with NeMo streaming behavior and fix transcript speaker grouping by @zinyando in #57
- Online parakeet transcription by @zinyando in #58
- refactor: add graceful shutdown and model cleanup by @zinyando in #59
Full Changelog: v0.1.0-beta-2...v0.1.0-beta-3