This document provides a high-level introduction to the möbius platform: an edge AI model repository designed to bring production-ready inference to mobile, embedded, and edge devices. The platform focuses specifically on Apple platform deployment through CoreML, targeting iOS 17+ and macOS 14 devices.
möbius addresses the fragmentation problem in edge AI deployment. While running models on NVIDIA GPUs is straightforward, deploying to edge devices involves navigating diverse accelerators (Apple Neural Engine, NPUs), platform-specific constraints, and complex conversion pipelines. This platform provides standardized conversion toolkits, validated model implementations, and deployment-ready packages across multiple AI model classes.
For quick start instructions on cloning and running models, see Getting Started. For detailed explanations of the repository structure and conversion pipeline, see Core Concepts.
Sources: README.md1-17 CITATION.cff10-11
möbius is built on three core principles:
Standardized Organization: All models follow a consistent models/{class}/{name}/{destination} hierarchy, making the repository predictable and maintainable.
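The hierarchy above can be navigated programmatically. A minimal sketch (the helper is hypothetical, not part of the repository) that splits a model path into its three tiers:

```python
from pathlib import PurePosixPath

def parse_model_path(path: str) -> dict:
    """Split a models/{class}/{name}/{destination} path into its tiers.

    Hypothetical helper illustrating the layout; not part of möbius.
    """
    parts = PurePosixPath(path).parts
    if len(parts) != 4 or parts[0] != "models":
        raise ValueError(f"expected models/{{class}}/{{name}}/{{destination}}, got {path!r}")
    return {"class": parts[1], "name": parts[2], "destination": parts[3]}

print(parse_model_path("models/tts/pocket_tts/coreml"))
# {'class': 'tts', 'name': 'pocket_tts', 'destination': 'coreml'}
```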
Self-Contained Toolkits: Each destination directory (e.g., coreml, onnx) contains everything needed for conversion—its own pyproject.toml, uv.lock, conversion scripts, test scripts, and documentation. This eliminates dependency conflicts between different models.
Production-Ready Focus: The platform targets real-world deployment scenarios with validated performance metrics, documented edge cases, and solutions to platform-specific issues discovered during iOS/macOS integration.
The platform uses uv for dependency management, ensuring each model's conversion environment is isolated and reproducible. Large model binaries are hosted on Hugging Face rather than stored in the repository.
Sources: README.md18-31 AGENTS.md3-4
The repository follows a strict three-tier hierarchy that separates concerns and enables independent evolution of different model implementations:
Diagram 1: Repository Hierarchy and Code Structure
Each level serves a specific purpose:
| Tier | Path Element | Purpose | Examples |
|---|---|---|---|
| 1 | {class} | Categorizes models by task type | tts, stt, segment-text |
| 2 | {name} | Identifies specific model implementation | pocket_tts, qwen3-asr-0.6b, kokoro |
| 3 | {destination} | Defines target runtime and conversion toolkit | coreml, onnx, openvino |
The destination directory is where conversion work happens. A typical destination contains:
- pyproject.toml - Python dependencies for this specific conversion
- uv.lock - Locked dependency versions
- convert-coreml.py or similar - Conversion script
- test.py or compare-models.py - Validation scripts
- README.md - Model-specific documentation

Sources: README.md18-31 AGENTS.md3-4
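Because every destination follows the same layout, completeness can be checked mechanically. A sketch of such a check (hypothetical helper; conversion-script names vary per model, e.g. convert-coreml.py vs convert_models.py):

```python
import tempfile
from pathlib import Path

REQUIRED = ("pyproject.toml", "uv.lock", "README.md")

def missing_components(destination: Path) -> list[str]:
    """Return required toolkit files absent from a destination directory.

    Hypothetical checker; conversion scripts are matched loosely because
    their names differ between models.
    """
    missing = [name for name in REQUIRED if not (destination / name).exists()]
    if not list(destination.glob("convert*")):
        missing.append("conversion script (convert*)")
    return missing

# Demo against an empty temporary directory: everything is reported missing.
with tempfile.TemporaryDirectory() as tmp:
    print(missing_components(Path(tmp)))
# ['pyproject.toml', 'uv.lock', 'README.md', 'conversion script (convert*)']
```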
The platform currently supports three model classes, each containing multiple implementations optimized for different use cases:
Diagram 2: Model Class Taxonomy with Code Paths
| Model | Class | Architecture | Complexity | Primary Use Case |
|---|---|---|---|---|
| PocketTTS | tts | Flow-matching LM + Mimi decoder | ★★★ High | High-quality voice cloning TTS |
| Kokoro | tts | StyleTTS2 (BERT + vocoder) | ★★ Medium | Fast multi-speaker synthesis |
| Qwen3-ASR | stt | Audio Encoder + LLM decoder | ★★★ High | Cutting-edge ASR with LLM benefits |
| Nemotron | stt | FastConformer RNNT | ★★ Medium | Production streaming ASR |
| Canary | stt | FastConformer + Transformer | ★★ Medium | Multilingual ASR with punctuation |
| Segment-Any-Text | segment-text | 3-layer Transformer | ★ Low | Sentence boundary detection |
Complexity ratings reflect the number of components, state management requirements, and conversion challenges documented in the repository.
For detailed architecture documentation:
Sources: README.md29 Diagram 1 and Diagram 2 from provided high-level diagrams
möbius primarily targets Apple platforms through CoreML conversion, with experimental support for other runtimes:
Diagram 3: Deployment Targets and Conversion Paths
The platform concentrates on CoreML for several strategic reasons:
Target User Base: Most users are on iOS 17+ and macOS 14, providing a large addressable market with modern ML acceleration capabilities.
Hardware Integration: CoreML provides direct access to the Apple Neural Engine (ANE), enabling efficient low-power inference on Apple Silicon.
Deployment Simplicity: CoreML packages (.mlpackage and compiled .mlmodelc) integrate seamlessly with Swift applications through Xcode, as demonstrated in the FluidAudio integration repository.
Model Compatibility: All major model classes (TTS, STT, text processing) have been validated on CoreML with documented performance characteristics.
Apple platforms impose specific constraints that the conversion process must address:
- MLState for stateful models
- Compute-unit selection: .cpuAndNeuralEngine (lower RAM, moderate speed) or .cpuAndGPU (higher RAM, best speed)

For detailed conversion pipeline information, see CoreML Conversion Pipeline. For compute unit strategy, see Compute Unit Strategy.
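The compute-unit tradeoff can be captured as a small decision helper. A sketch under the assumptions stated above (Neural Engine for lower RAM at moderate speed, GPU for best speed at higher RAM); the function name and its logic are illustrative, not part of the platform:

```python
def choose_compute_units(memory_constrained: bool) -> str:
    """Map the documented RAM/speed tradeoff to a CoreML compute-unit setting.

    Hypothetical helper: .cpuAndNeuralEngine is documented as lower-RAM /
    moderate-speed, .cpuAndGPU as higher-RAM / best-speed.
    """
    return ".cpuAndNeuralEngine" if memory_constrained else ".cpuAndGPU"

print(choose_compute_units(memory_constrained=True))   # .cpuAndNeuralEngine
print(choose_compute_units(memory_constrained=False))  # .cpuAndGPU
```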
Sources: README.md38-41 AGENTS.md13-17 Diagram 4 from provided high-level diagrams
Each destination directory functions as an independent conversion toolkit with isolated dependencies. This architecture prevents version conflicts between models that may require different library versions:
Diagram 4: Self-Contained Toolkit Components and Workflow
Each destination directory contains standardized components:
| Component | Purpose | Example File |
|---|---|---|
| pyproject.toml | Python dependencies, project metadata | models/stt/qwen3-asr-0.6b/coreml/pyproject.toml |
| uv.lock | Locked dependency tree for reproducibility | models/stt/qwen3-asr-0.6b/coreml/uv.lock |
| Conversion script | Transforms PyTorch → CoreML | convert-coreml.py, convert_models.py |
| Test/comparison script | Validates conversion accuracy | test.py, compare-models.py |
| README.md | Model documentation, benchmarks, issues | Per-model README |
| Sample assets | Test inputs for validation | Audio files, text samples |
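The test/comparison scripts in the table above typically compare converted outputs against the PyTorch reference. A minimal pure-Python sketch of that pattern (real scripts such as compare-models.py operate on model tensors; the tolerance shown is illustrative):

```python
def max_abs_deviation(reference: list[float], converted: list[float]) -> float:
    """Largest element-wise absolute difference between two output vectors."""
    if len(reference) != len(converted):
        raise ValueError("output shapes differ")
    return max(abs(r - c) for r, c in zip(reference, converted))

# Accept the conversion if outputs agree within a tolerance (value is illustrative).
ref = [0.12, -0.53, 0.98]
conv = [0.12, -0.531, 0.979]
assert max_abs_deviation(ref, conv) < 1e-2
```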
The uv dependency manager solves a critical problem: different models often require incompatible library versions. For example:
- torch==2.4.0 for tracing compatibility
- torch==2.5.1 for newer features
- Different coremltools versions for format compatibility

Each destination's pyproject.toml specifies its exact requirements, and uv sync creates an isolated environment. Developers work in one destination at a time.
This approach is documented in AGENTS.md6-11 and enforced throughout the repository.
For more details on dependency management, see Dependency Management with uv.
Sources: README.md20-31 AGENTS.md6-11 AGENTS.md3-4
The möbius repository provides conversion toolkits and model packages, but runtime integration occurs in separate repositories:
| Repository | Purpose | Platform |
|---|---|---|
| FluidAudio | Swift package for iOS/macOS inference | iOS 17+, macOS 14 |
| fluid-server | Server-side inference runtime | Cross-platform |
The FluidAudio repository (https://github.com/FluidInference/FluidAudio) demonstrates production integration patterns for CoreML models, including:
- Model manager classes (Qwen3AsrManager, KokoroCompleteCoreML)
- Stateful inference through the MLState API

The typical workflow for adding or updating a model:
1. Navigate to the destination directory (e.g., models/tts/pocket_tts/coreml/)
2. Install the isolated environment with uv sync
3. Run the conversion script, e.g. uv run python convert-coreml.py

The platform enforces specific requirements for consistency:
- Self-contained destinations (each with its own pyproject.toml)
- CPU-only execution for torch.jit.trace to ensure compatibility
- uv for all toolkit operations

For step-by-step instructions, see Getting Started. For contribution guidelines, see Contributing to möbius.
Sources: README.md33-42 AGENTS.md6-11 AGENTS.md13-17
The repository is under active development, with models at different maturity levels:
| Model | Status | Deployment | Key Metrics |
|---|---|---|---|
| Nemotron Streaming | ✓ Production-ready | FluidAudio | 1.79% WER, 1.12s chunks |
| PocketTTS | ✓ Production-ready | FluidAudio | 4 models, voice cloning |
| Qwen3-ASR | ◐ Near-production | FluidAudio | 3.3x RTFx, stateful caching |
| Kokoro | ◐ Near-production | FluidAudio | Multiple variants (5s/10s/15s) |
| Canary | ◑ Experimental | Partial | Tokenizer integration issues |
| Segment-Any-Text | ✓ Production-ready | FluidAudio | Lightweight utility |
The platform welcomes community contributions through the Discord channel linked in the README. The project is licensed under Apache 2.0, with requirements for proper attribution when using upstream models.
For contribution guidelines, see CONTRIBUTING.md1-18. For development conventions, see AGENTS.md1-30.
Sources: README.md5-9 CONTRIBUTING.md1-18 LICENSE1-10 Diagram 6 from provided high-level diagrams