
Crate onde

§Onde

On-device chat inference for cross-platform apps.

Run LLM chat locally — no cloud, no latency, no data leaving the device.

Onde wraps mistral.rs with a unified API for model discovery, HuggingFace Hub downloads, cache management, and GPU acceleration across every platform.

Built by Onde Inference

§Modules

  • hf_cache — HuggingFace Hub cache inspection, repair, and model download with a framework-agnostic progress-callback API.
  • inference — Chat inference engine, UniFFI FFI wrapper, model metadata, and HuggingFace token resolution.
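The "framework-agnostic progress-callback API" mentioned for hf_cache can be illustrated with a minimal, self-contained sketch. Note that every name below (Progress, download_with_progress) is hypothetical and only demonstrates the pattern; it is not onde's actual API.

```rust
/// Hypothetical progress events a UI layer (SwiftUI bridge, CLI, egui, ...)
/// can render however it likes. No UI-toolkit types appear here.
#[derive(Debug, Clone, Copy)]
enum Progress {
    Started { total_bytes: u64 },
    Chunk { downloaded: u64, total_bytes: u64 },
    Finished,
}

/// Framework-agnostic by construction: the caller supplies any closure,
/// so the download layer never depends on a particular UI framework.
fn download_with_progress<F: FnMut(Progress)>(total_bytes: u64, mut on_progress: F) {
    on_progress(Progress::Started { total_bytes });
    // Simulate a download arriving in four chunks.
    let chunk = (total_bytes / 4).max(1);
    let mut downloaded = 0;
    while downloaded < total_bytes {
        downloaded = (downloaded + chunk).min(total_bytes);
        on_progress(Progress::Chunk { downloaded, total_bytes });
    }
    on_progress(Progress::Finished);
}

fn main() {
    download_with_progress(1024, |p| match p {
        Progress::Chunk { downloaded, total_bytes } => {
            println!("{downloaded}/{total_bytes} bytes");
        }
        other => println!("{other:?}"),
    });
}
```

A GUI app would forward these events across the FFI boundary instead of printing them; the callback shape stays the same.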

§Re-exports

mistralrs, hf_hub, and mistralrs_core are re-exported so that apps depending on onde do not need their own direct dependency on those crates. Access them as onde::mistralrs, onde::hf_hub, and onde::mistralrs_core.
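Concretely, the re-exports mean an app's Cargo.toml needs only onde itself; the version number below is illustrative, not a pinned requirement.

```toml
[dependencies]
# onde re-exports mistralrs, hf_hub, and mistralrs_core, so no direct
# entries for those crates are needed here.
onde = "0.1"
```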

§Example

use onde::inference::{ChatEngine, GgufModelConfig};

// The calls below use `.await` and `?`, so this snippet must run inside
// an async function (e.g. under a tokio runtime) that returns a Result.
let engine = ChatEngine::new();
engine
    .load_gguf_model(
        GgufModelConfig::platform_default(),
        Some("You are a helpful assistant.".into()),
        None,
    )
    .await?;

let result = engine.send_message("Hello!").await?;
println!("{}", result.text);

Re-exports§

pub use mistralrs;
pub use hf_hub;
pub use mistralrs_core;

Modules§

hf_cache
HuggingFace Hub cache inspection, repair, and model download.
inference
On-device LLM inference powered by mistral.rs.