One engine. Three languages.

The reliable and efficient on-device inference core, available as a native Rust crate, a Swift Package, and a Dart Flutter plugin.

Rust

The native crate. Zero-copy, Tokio async, runs everywhere Rust does — from macOS and Linux servers to embedded targets.

Swift

UniFFI bindings in an SPM package with a signed XCFramework. iOS, macOS, and tvOS. Local test mode built in.

Dart

Flutter-first plugin on pub.dev. On-device inference for iOS, macOS, and visionOS from a single Dart codebase.

Current scope

Chat-focused: ChatEngine, GgufModelConfig, platform-default model selection, and streaming callbacks. More APIs ship with the engine.
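The streaming callbacks mentioned above might look like the sketch below. This is a hedged illustration only: the document names `ChatEngine` and streaming callbacks but not the method, so `send_message_streaming` and its callback signature are assumptions, not documented API.

```rust
// Hypothetical sketch of a streaming callback. `send_message_streaming`
// and its signature are assumptions -- the shipped API may differ.
use onde::inference::ChatEngine;

async fn stream_reply(engine: &ChatEngine) -> Result<(), Box<dyn std::error::Error>> {
    engine
        .send_message_streaming("Hello!", |token: &str| {
            // Invoked once per generated token as it arrives.
            print!("{token}");
        })
        .await?;
    Ok(())
}
```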

Quick Start

Install.

Add the crate via Cargo. One dependency, no build scripts beyond what Cargo handles automatically.

# via CLI
cargo add onde

# or in Cargo.toml
[dependencies]
onde = "0.1"

Usage

Two calls to infer.


Load a model with the platform default config, send a message, get a result. Everything is async and cancellable.

use onde::inference::{ChatEngine, GgufModelConfig};

// Inside an async context (e.g. a Tokio runtime) that returns a Result:
let engine = ChatEngine::new();
engine
    .load_gguf_model(
        GgufModelConfig::platform_default(),
        Some("You are a helpful assistant.".into()),
        None,
    )
    .await?;

let result = engine.send_message("Hello!").await?;
println!("{}", result.text);
// completed in 85ms — 100% on device
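The "async and cancellable" claim can be sketched with standard Tokio task handles. A hedged sketch, not documented onde API beyond `send_message`: it assumes the engine's futures are `Send + 'static` so they can be driven by `tokio::spawn`; `JoinHandle::abort` is plain Tokio.

```rust
// Sketch: cancelling an in-flight request, assuming the engine's
// futures can be moved into a spawned Tokio task.
use std::sync::Arc;
use onde::inference::ChatEngine;

async fn cancellable(engine: Arc<ChatEngine>) {
    let handle = tokio::spawn({
        let engine = Arc::clone(&engine);
        async move { engine.send_message("Summarize this document.").await }
    });

    // Later, e.g. when the user navigates away:
    handle.abort(); // drops the future, cancelling the inference
}
```

Aborting the task drops the underlying future, which is what makes drop-based cancellation work without any extra API surface.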

Coverage

Platform Matrix.

macOS and iOS first. tvOS and visionOS are in the pipeline.

Platform    Target                           Status
macOS       arm64                            Ready
iOS         arm64 device, arm64 simulator    Ready
tvOS        arm64                            Pending
visionOS    arm64                            Pending