AI for Apple silicon.

Run your AI workflows locally on Apple silicon: iPhone, iPad, and Mac.

Download on the App Store
main.rs
use onde::inference::{ChatEngine, GgufModelConfig};

// `.await` needs an async context; an async runtime (tokio assumed here) provides one.
#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let engine = ChatEngine::new();
    engine
        .load_gguf_model(
            GgufModelConfig::platform_default(),
            Some("You are a helpful assistant.".into()),
            None,
        )
        .await?;

    let result = engine.send_message("Hello!").await?;
    println!("{}", result.text);
    // completed in 85ms — 100% on device
    Ok(())
}

Benchmark

The Edge is the New Cloud.

While others wait for a handshake from a data center in Virginia, Onde is already finished.

| Inference Layer   | Latency (ms) | Server Cost | Privacy         |
|-------------------|--------------|-------------|-----------------|
| Standard Cloud AI | 1,200+       | $$$$        | Public          |
| Generic Mobile AI | 450          | $0          | Local           |
| Onde Engine       | 85           | $0          | Encrypted/Local |
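Latency claims like these are easy to sanity-check on your own hardware. Below is a minimal, standard-library-only sketch of a wall-clock timing harness; the `workload` function is a hypothetical stand-in, and in a real measurement you would call your inference engine there instead:

```rust
use std::time::Instant;

// Stand-in workload; replace with a real inference call when measuring.
fn workload() -> u64 {
    (0..100_000u64).fold(0, |acc, i| acc.wrapping_add(i.wrapping_mul(31)))
}

// Average wall-clock latency in milliseconds over `runs` invocations.
fn average_latency_ms(runs: u32) -> f64 {
    // Warm up once so one-time setup cost does not skew the average.
    std::hint::black_box(workload());

    let start = Instant::now();
    for _ in 0..runs {
        // black_box keeps the optimizer from eliding the call.
        std::hint::black_box(workload());
    }
    start.elapsed().as_secs_f64() * 1000.0 / runs as f64
}

fn main() {
    println!("average latency: {:.2}ms", average_latency_ms(10));
}
```

Averaging over several warm runs matters on mobile hardware, where first-call model loading and thermal state can dominate a single measurement.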

First-class citizen

"Developers, developers, developers, developers, developers, developers, developers, developers, developers, developers, developers, developers, developers, developers!" - Steve

Reliable and efficient

Your app does not stall at one million users: every new device adds more distributed compute at the edge.

Ergonomic API

Two lines to load a model. One line to run it. The SDK gets out of your way so you can ship.

Hardware Optimized

Hand-tuned for the Apple Neural Engine and Apple silicon. Onde speaks closer to the metal than generic wrappers.

Action speaks (c)louder

In Production.

We did not just build an engine; we built a champion. Splitfire uses Onde Inference to deliver studio-grade audio separation 100% offline. It is the fastest audio splitter on the App Store. Period.

One More Thing

The World's Intelligence. On Your Terms.