Skip to main content
On-device RAG for Swift

Wax

Documents, embeddings, BM25 and HNSW indexes in a single file. No Docker. No network calls. No cloud dependency.

brain.swift
import Wax

// Hybrid search with on-device embeddings (MiniLM, 384-dim)
let brain = try await MemoryOrchestrator.openMiniLM(
at: URL(fileURLWithPath: "brain.wax")
)

// Remember something
try await brain.remember(
"User prefers dark mode and gets headaches from bright screens",
metadata: ["source": "onboarding"]
)

// Recall with RAG
let context = try await brain.recall(query: "user preferences")
Fast
0.84ms
Vector search @ 10K docs on Metal GPU
Durable
Kill-9
Power-loss safe. WAL-protected writes.
Deterministic
100%
Same query = same context, every time
Portable
1 file
One .wax file — move it, back it up, ship it
Private
0 calls
100% on-device. Zero network dependency.
Benchmarks

Vector Search Latency

10,000 × 384-dim documents on Apple Silicon

Wax Metal (warm)0.84ms
Wax Metal (cold)9.2ms
Wax CPU105ms
SQLite FTS5150ms
17ms
Cold Open to First Query
105ms
Hybrid Search @ 10K docs
How it compares

Comparison

Wax vs. the alternatives for iOS/macOS.

FeatureWaxChromaCore Data + FAISSPinecone
Single file
Works offlinepartial
Crash-safepartialN/A
GPU vector search
No server required
Swift-native
Deterministic RAG