This document provides a high-level introduction to the möbius platform: an edge AI model repository designed to bring production-ready inference to mobile, embedded, and edge devices. The platform focuses specifically on Apple platform deployment through CoreML, targeting iOS 17+ and macOS 14 devices.
möbius addresses the fragmentation problem in edge AI deployment. While running models on NVIDIA GPUs is straightforward, deploying to edge devices involves navigating diverse accelerators (Apple Neural Engine, NPUs), platform-specific constraints, and complex conversion pipelines. This platform provides standardized conversion toolkits, validated model implementations, and deployment-ready packages across multiple AI model classes.
For quick start instructions on cloning and running models, see Getting Started. For detailed explanations of the repository structure and conversion pipeline, see Core Concepts.
Sources: README.md1-17 CITATION.cff10-11
möbius is built on three core principles:
Standardized Organization: All models follow a consistent models/{class}/{name}/{destination} hierarchy, making the repository predictable and maintainable.
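The hierarchy above can be navigated programmatically. A minimal sketch (the helper is hypothetical, not part of the repository) that splits a model path into its three tiers:

```python
from pathlib import PurePosixPath

def parse_model_path(path: str) -> dict:
    """Split a models/{class}/{name}/{destination} path into its tiers.

    Hypothetical helper illustrating the layout; not part of möbius.
    """
    parts = PurePosixPath(path).parts
    if len(parts) != 4 or parts[0] != "models":
        raise ValueError(f"expected models/{{class}}/{{name}}/{{destination}}, got {path!r}")
    return {"class": parts[1], "name": parts[2], "destination": parts[3]}

print(parse_model_path("models/tts/pocket_tts/coreml"))
# {'class': 'tts', 'name': 'pocket_tts', 'destination': 'coreml'}
```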
Self-Contained Toolkits: Each destination directory (e.g., coreml, onnx) contains everything needed for conversion—its own pyproject.toml, uv.lock, conversion scripts, test scripts, and documentation. This eliminates dependency conflicts between different models.
Production-Ready Focus: The platform targets real-world deployment scenarios with validated performance metrics, documented edge cases, and solutions to platform-specific issues discovered during iOS/macOS integration.
The platform uses uv for dependency management, ensuring each model's conversion environment is isolated and reproducible. Large model binaries are hosted on Hugging Face rather than stored in the repository.
Sources: README.md18-31 AGENTS.md3-4
The repository follows a strict three-tier hierarchy that separates concerns and enables independent evolution of different model implementations:
Diagram 1: Repository Hierarchy and Code Structure
Each level serves a specific purpose:
| Tier | Path Element | Purpose | Examples |
|---|---|---|---|
| 1 | {class} | Categorizes models by task type | tts, stt, segment-text |
| 2 | {name} | Identifies specific model implementation | pocket_tts, qwen3-asr-0.6b, kokoro |
| 3 | {destination} | Defines target runtime and conversion toolkit | coreml, onnx, openvino |
The destination directory is where conversion work happens. A typical destination contains:
- pyproject.toml - Python dependencies for this specific conversion
- uv.lock - Locked dependency versions
- convert-coreml.py or similar - Conversion script
- test.py or compare-models.py - Validation scripts
- README.md - Model-specific documentation

Sources: README.md18-31 AGENTS.md3-4
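Because every destination follows the same layout, completeness can be checked mechanically. A sketch of such a check (hypothetical helper; conversion-script names vary per model, e.g. convert-coreml.py vs convert_models.py):

```python
import tempfile
from pathlib import Path

REQUIRED = ("pyproject.toml", "uv.lock", "README.md")

def missing_components(destination: Path) -> list[str]:
    """Return required toolkit files absent from a destination directory.

    Hypothetical checker; conversion scripts are matched loosely because
    their names differ between models.
    """
    missing = [name for name in REQUIRED if not (destination / name).exists()]
    if not list(destination.glob("convert*")):
        missing.append("conversion script (convert*)")
    return missing

# Demo against an empty temporary directory: everything is reported missing.
with tempfile.TemporaryDirectory() as tmp:
    print(missing_components(Path(tmp)))
# ['pyproject.toml', 'uv.lock', 'README.md', 'conversion script (convert*)']
```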
The platform currently supports three model classes, each containing multiple implementations optimized for different use cases:
Diagram 2: Model Class Taxonomy with Code Paths
| Model | Class | Architecture | Complexity | Primary Use Case |
|---|---|---|---|---|
| PocketTTS | tts | Flow-matching LM + Mimi decoder | ★★★ High | High-quality voice cloning TTS |
| Kokoro | tts | StyleTTS2 (BERT + vocoder) | ★★ Medium | Fast multi-speaker synthesis |
| Qwen3-ASR | stt | Audio Encoder + LLM decoder | ★★★ High | Cutting-edge ASR with LLM benefits |
| Nemotron | stt | FastConformer RNNT | ★★ Medium | Production streaming ASR |
| Canary | stt | FastConformer + Transformer | ★★ Medium | Multilingual ASR with punctuation |
| Segment-Any-Text | segment-text | 3-layer Transformer | ★ Low | Sentence boundary detection |
Complexity ratings reflect the number of components, state management requirements, and conversion challenges documented in the repository.
For detailed architecture documentation:
Sources: README.md29 Diagram 1 and Diagram 2 from provided high-level diagrams
möbius primarily targets Apple platforms through CoreML conversion, with experimental support for other runtimes:
Diagram 3: Deployment Targets and Conversion Paths
The platform concentrates on CoreML for several strategic reasons:
Target User Base: Most users are on iOS 17+ and macOS 14, providing a large addressable market with modern ML acceleration capabilities.
Hardware Integration: CoreML provides direct access to the Apple Neural Engine (ANE), enabling efficient low-power inference on Apple Silicon.
Deployment Simplicity: CoreML packages (.mlpackage and compiled .mlmodelc) integrate seamlessly with Swift applications through Xcode, as demonstrated in the FluidAudio integration repository.
Model Compatibility: All major model classes (TTS, STT, text processing) have been validated on CoreML with documented performance characteristics.
Apple platforms impose specific constraints that the conversion process must address:
- MLState for stateful models
- Compute-unit selection: .cpuAndNeuralEngine (lower RAM, moderate speed) or .cpuAndGPU (higher RAM, best speed)

For detailed conversion pipeline information, see CoreML Conversion Pipeline. For compute unit strategy, see Compute Unit Strategy.
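The compute-unit tradeoff can be captured as a small decision helper. A sketch under the assumptions stated above (Neural Engine for lower RAM at moderate speed, GPU for best speed at higher RAM); the function name and its logic are illustrative, not part of the platform:

```python
def choose_compute_units(memory_constrained: bool) -> str:
    """Map the documented RAM/speed tradeoff to a CoreML compute-unit setting.

    Hypothetical helper: .cpuAndNeuralEngine is documented as lower-RAM /
    moderate-speed, .cpuAndGPU as higher-RAM / best-speed.
    """
    return ".cpuAndNeuralEngine" if memory_constrained else ".cpuAndGPU"

print(choose_compute_units(memory_constrained=True))   # .cpuAndNeuralEngine
print(choose_compute_units(memory_constrained=False))  # .cpuAndGPU
```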
Sources: README.md38-41 AGENTS.md13-17 Diagram 4 from provided high-level diagrams
Each destination directory functions as an independent conversion toolkit with isolated dependencies. This architecture prevents version conflicts between models that may require different library versions:
Diagram 4: Self-Contained Toolkit Components and Workflow
Each destination directory contains standardized components:
| Component | Purpose | Example File |
|---|---|---|
| pyproject.toml | Python dependencies, project metadata | models/stt/qwen3-asr-0.6b/coreml/pyproject.toml |
| uv.lock | Locked dependency tree for reproducibility | models/stt/qwen3-asr-0.6b/coreml/uv.lock |
| Conversion script | Transforms PyTorch → CoreML | convert-coreml.py, convert_models.py |
| Test/comparison script | Validates conversion accuracy | test.py, compare-models.py |
| README.md | Model documentation, benchmarks, issues | Per-model README |
| Sample assets | Test inputs for validation | Audio files, text samples |
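The test/comparison scripts in the table above typically compare converted outputs against the PyTorch reference. A minimal pure-Python sketch of that pattern (real scripts such as compare-models.py operate on model tensors; the tolerance shown is illustrative):

```python
def max_abs_deviation(reference: list[float], converted: list[float]) -> float:
    """Largest element-wise absolute difference between two output vectors."""
    if len(reference) != len(converted):
        raise ValueError("output shapes differ")
    return max(abs(r - c) for r, c in zip(reference, converted))

# Accept the conversion if outputs agree within a tolerance (value is illustrative).
ref = [0.12, -0.53, 0.98]
conv = [0.12, -0.531, 0.979]
assert max_abs_deviation(ref, conv) < 1e-2
```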
The uv dependency manager solves a critical problem: different models often require incompatible library versions. For example:
- torch==2.4.0 for tracing compatibility
- torch==2.5.1 for newer features
- Different coremltools versions for format compatibility

Each destination's pyproject.toml specifies its exact requirements, and uv sync creates an isolated environment. Developers work in one destination at a time.
This approach is documented in AGENTS.md6-11 and enforced throughout the repository.
For more details on dependency management, see Dependency Management with uv.
Sources: README.md20-31 AGENTS.md6-11 AGENTS.md3-4
The möbius repository provides conversion toolkits and model packages, but runtime integration occurs in separate repositories:
| Repository | Purpose | Platform |
|---|---|---|
| FluidAudio | Swift package for iOS/macOS inference | iOS 17+, macOS 14 |
| fluid-server | Server-side inference runtime | Cross-platform |
The FluidAudio repository (https://github.com/FluidInference/FluidAudio) demonstrates production integration patterns for CoreML models, including:
- Model manager classes (Qwen3AsrManager, KokoroCompleteCoreML)
- Stateful inference through the MLState API

The typical workflow for adding or updating a model:
1. Navigate to the destination directory (e.g., models/tts/pocket_tts/coreml/)
2. Install the isolated environment with uv sync
3. Run the conversion script, e.g. uv run python convert-coreml.py

The platform enforces specific requirements for consistency:
- Self-contained destinations (each with its own pyproject.toml)
- CPU-only execution for torch.jit.trace to ensure compatibility
- uv for all toolkit operations

For step-by-step instructions, see Getting Started. For contribution guidelines, see Contributing to möbius.
Sources: README.md33-42 AGENTS.md6-11 AGENTS.md13-17
The repository is under active development, with models at different maturity levels:
| Model | Status | Deployment | Key Metrics |
|---|---|---|---|
| Nemotron Streaming | ✓ Production-ready | FluidAudio | 1.79% WER, 1.12s chunks |
| PocketTTS | ✓ Production-ready | FluidAudio | 4 models, voice cloning |
| Qwen3-ASR | ◐ Near-production | FluidAudio | 3.3x RTFx, stateful caching |
| Kokoro | ◐ Near-production | FluidAudio | Multiple variants (5s/10s/15s) |
| Canary | ◑ Experimental | Partial | Tokenizer integration issues |
| Segment-Any-Text | ✓ Production-ready | FluidAudio | Lightweight utility |
The platform welcomes community contributions through the Discord channel linked in the README. The project is licensed under Apache 2.0, with requirements for proper attribution when using upstream models.
For contribution guidelines, see CONTRIBUTING.md1-18. For development conventions, see AGENTS.md1-30.
Sources: README.md5-9 CONTRIBUTING.md1-18 LICENSE1-10 Diagram 6 from provided high-level diagrams