§Onde
On-device chat inference for cross-platform apps.
Run LLM chat locally — no cloud, no latency, no data leaving the device.
Onde wraps mistral.rs with a unified API for model discovery, HuggingFace Hub downloads, cache management, and GPU acceleration across every platform.
Built by Onde Inference
§Modules
hf_cache — HuggingFace Hub cache inspection, repair, and model download with a framework-agnostic progress-callback API.
inference — Chat inference engine, UniFFI FFI wrapper, model metadata, and HuggingFace token resolution.
§Re-exports
mistralrs, hf_hub, and mistralrs_core are re-exported so that apps depending on onde do not need their own direct dependency on those crates. Access them as onde::mistralrs, onde::hf_hub, and onde::mistralrs_core.
§Example
use onde::inference::{ChatEngine, GgufModelConfig};

let engine = ChatEngine::new();
engine
    .load_gguf_model(
        GgufModelConfig::platform_default(),
        Some("You are a helpful assistant.".into()),
        None,
    )
    .await?;
let result = engine.send_message("Hello!").await?;
println!("{}", result.text);

Re-exports§
pub use mistralrs;
Modules§
- hf_cache — HuggingFace Hub cache inspection, repair, and model download.
- inference — On-device LLM inference powered by mistral.rs.