Fixes and refactoring of Swift library and demo app #279

Closed
DePasqualeOrg wants to merge 1 commit into Blaizzy:main from DePasqualeOrg:fixes

@DePasqualeOrg

DePasqualeOrg commented Nov 25, 2025

I've fixed and refactored the Swift parts of this repo:

  • Fixed Orpheus (now ~2.4 RTF on M3)
  • Added OuteTTS (~4.3 RTF on M3)
  • Minor fixes for Kokoro and Marvis
  • Fixed several crashes
  • Fixed MLX usage
  • Consolidated iOS and macOS apps into one multi-platform app
  • Cleaned up the UI
  • Used the latest SwiftUI patterns
  • Reduced code duplication
  • Loaded espeak-ng as a package instead of bundling it
  • Loaded model, tokenizer, and voice files from Hugging Face and GitHub instead of bundling them
  • Separated library files for later publication as a package
  • Migrated to Swift 6.2 using the latest concurrency patterns for thread safety

The final step will be to move the Swift parts into one or more separate repos.

@rudrankriyam
Contributor

Well, awesome work! That's a big PR to review.

> Loading espeak-ng as a package instead of bundling with the package

I have not gone through the changes yet, but we want to move espeak-ng here: https://github.com/Blaizzy/EspeakNG-Swift. The reasons are its licensing and keeping it separate from Marvis, so that developers can use Marvis without having to pull in Kokoro or its dependencies.

@DePasqualeOrg
Author

The espeak-ng organization already has this Swift package, which I'm using in this PR: https://github.com/espeak-ng/espeak-ng-spm

@Blaizzy
Owner

Blaizzy commented Dec 1, 2025

Well done @DePasqualeOrg!

> @Blaizzy, we could also download the Kokoro voices from Hugging Face instead of bundling these heavy JSON files if they are uploaded as .safetensors instead of Pickle files.

I agree, this makes sense, you can implement it 👍🏾

@Blaizzy
Owner

Blaizzy commented Dec 1, 2025

> For LLMs in Swift we currently have a repo called mlx-swift-lm. Following this pattern, we could name the Swift package repo mlx-swift-audio.

Interesting proposal. I don't have a strong preference between mlx-audio-swift and mlx-swift-audio. Either works, but the former seems better for discoverability.

@Blaizzy
Owner

Blaizzy commented Dec 1, 2025

> The espeak-ng organization already has this Swift package, which I'm using in this PR: https://github.com/espeak-ng/espeak-ng-spm

@rudrankriyam what are your thoughts on this?

@DePasqualeOrg
Author

> Well done @DePasqualeOrg!
>
> > @Blaizzy, we could also download the Kokoro voices from Hugging Face instead of bundling these heavy JSON files if they are uploaded as .safetensors instead of Pickle files.
>
> I agree, this makes sense, you can implement it 👍🏾

How would you like to handle the Hugging Face repo with the voices? They're currently in Pickle format, which is not safe to load because unpickling can execute arbitrary code. For Swift we need .safetensors files.

@Blaizzy
Owner

Blaizzy commented Dec 1, 2025

@rudrankriyam could you handle the voices? If you come across any issues, let me know.

@DePasqualeOrg
Author

DePasqualeOrg commented Dec 1, 2025

@Blaizzy, I added the voices to the Hugging Face repo in .safetensors format here: https://huggingface.co/mlx-community/Kokoro-82M-bf16/discussions/1

This will allow them to be downloaded in the Swift app instead of bundling converted files.
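As a rough illustration of what downloading instead of bundling could look like, here is a minimal sketch that builds a direct file URL for a Hugging Face repo using the public `resolve/<revision>/<path>` URL pattern. The helper name `huggingFaceFileURL` and the voice file path are hypothetical and not taken from this PR; the actual file layout in the repo may differ.

```swift
import Foundation

/// Builds a direct download URL for a file in a Hugging Face model repo.
/// Hypothetical helper for illustration; not part of the PR.
func huggingFaceFileURL(repo: String, path: String, revision: String = "main") -> URL? {
    var components = URLComponents()
    components.scheme = "https"
    components.host = "huggingface.co"
    // Hugging Face serves raw files at /<repo>/resolve/<revision>/<path>.
    components.path = "/\(repo)/resolve/\(revision)/\(path)"
    return components.url
}

// The file path below is a placeholder, not a confirmed file name in the repo.
let voiceURL = huggingFaceFileURL(
    repo: "mlx-community/Kokoro-82M-bf16",
    path: "voices/example_voice.safetensors"
)
print(voiceURL?.absoluteString ?? "invalid URL")
```

The resulting URL could then be passed to a download API such as `URLSession` and the file cached on disk, so that voices are fetched on first use rather than shipped inside the app bundle.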

@DePasqualeOrg
Author

Here's what the multi-platform app currently looks like on macOS and iOS:

[Three screenshots of the multi-platform app on macOS and iOS, taken 2025-12-02]

@DePasqualeOrg
Author

The CI build test is failing because I've used some newer Swift features that require iOS 18.4/macOS 15.4 or newer (specifically, Atomic and isolated deinit). These help a lot with resolving concurrency issues.
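To show the kind of code this enables, here is a minimal sketch of a thread-safe counter built on `Atomic` from the standard `Synchronization` module. The `ChunkCounter` type is hypothetical and not taken from this PR, and the availability annotation is an assumption based on the SDKs in which the module ships.

```swift
import Dispatch
import Synchronization

// Hypothetical example type; not taken from the PR.
@available(macOS 15.0, iOS 18.0, *)
final class ChunkCounter: Sendable {
    // Atomic is noncopyable, so it is stored as a `let` property of a class.
    private let count = Atomic<Int>(0)

    /// Atomically adds one and returns the updated total.
    @discardableResult
    func increment() -> Int {
        count.add(1, ordering: .relaxed).newValue
    }

    /// The current total.
    var value: Int {
        count.load(ordering: .relaxed)
    }
}

if #available(macOS 15.0, iOS 18.0, *) {
    let counter = ChunkCounter()
    // Increment from many threads at once; no lock or actor is needed.
    DispatchQueue.concurrentPerform(iterations: 1_000) { _ in
        counter.increment()
    }
    print(counter.value)
}
```

Because the class holds only an immutable `Atomic` property, it can be marked `Sendable` without `@unchecked`, which is one way such types simplify strict concurrency checking.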

What do the other maintainers of this repo think: Is it acceptable to require at least last year's versions of iOS and macOS to run this library? By now around 95% of users are running compatible OS versions. My preference is to prioritize code ergonomics rather than supporting old OS versions that a diminishing fraction of users will be running.

If you're okay with this, we should update the CI settings accordingly.
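For reference, a minimum OS requirement like this is usually also declared in the package manifest. The fragment below is only a sketch of how that might look: the package and target names are hypothetical, and the manifest in the PR may be organized differently.

```swift
// swift-tools-version: 6.2
import PackageDescription

let package = Package(
    name: "MLXAudio", // hypothetical package name
    platforms: [
        // String-based versions allow minor releases that have no enum case.
        .macOS("15.4"),
        .iOS("18.4"),
    ],
    products: [
        .library(name: "MLXAudio", targets: ["MLXAudio"]),
    ],
    targets: [
        .target(name: "MLXAudio"),
    ]
)
```

With such a floor in place, the CI runner would also need an Xcode and SDK version that can build for those deployment targets.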

Cc @Blaizzy @lucasnewman @rudrankriyam

@DePasqualeOrg
Author

DePasqualeOrg commented Dec 3, 2025

I'm planning to do some extensive work on Swift MLX audio tooling and apps over the coming months, which I'll continue in my own repo: https://github.com/DePasqualeOrg/mlx-swift-audio

I've preserved the commit history from this repo for the relevant files there. If you're interested in contributing, let me know so that we can coordinate our efforts.
