On-device RAG for Swift

Wax

Documents, embeddings, BM25 and HNSW indexes in a single file. No Docker. No network calls. No cloud dependency.

brain.swift

import Wax

// Hybrid search with on-device embeddings (MiniLM, 384-dim)
let brain = try await MemoryOrchestrator.openMiniLM(
    at: URL(fileURLWithPath: "brain.wax")
)

// Remember something
try await brain.remember(
    "User prefers dark mode and gets headaches from bright screens",
    metadata: ["source": "onboarding"]
)

// Recall with RAG
let context = try await brain.recall(query: "user preferences")
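
The snippet above opens a hybrid store, which combines keyword (BM25) and vector results. The page does not say how Wax merges the two ranked lists. One common technique is reciprocal rank fusion (RRF), sketched below under that assumption. `reciprocalRankFusion`, the constant `k = 60`, and the document IDs are all hypothetical and are not part of the Wax API.

```swift
// Hypothetical sketch: reciprocal rank fusion (RRF) of several ranked ID lists.
// Not Wax's API; the source does not state which fusion method Wax uses.
func reciprocalRankFusion(_ rankings: [[String]], k: Double = 60) -> [String] {
    var scores: [String: Double] = [:]
    for ranking in rankings {
        for (index, id) in ranking.enumerated() {
            // Ranks are 1-based; each list contributes 1 / (k + rank).
            scores[id, default: 0] += 1.0 / (k + Double(index + 1))
        }
    }
    // Sort by fused score (descending). Ties fall back to the ID so the
    // result never depends on dictionary iteration order.
    let ordered = scores.sorted { lhs, rhs in
        lhs.value != rhs.value ? lhs.value > rhs.value : lhs.key < rhs.key
    }
    return ordered.map { $0.key }
}

// Made-up result lists standing in for a BM25 query and a vector query.
let bm25Hits = ["doc3", "doc1", "doc7"]
let vectorHits = ["doc1", "doc9", "doc3"]
let fused = reciprocalRankFusion([bm25Hits, vectorHits])
print(fused)
```

Documents that appear near the top of both lists rise to the front, and the explicit tie-break on ID keeps the output identical across runs.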

Fast

0.84ms

Vector search @ 10K docs on Metal GPU

Durable

kill -9

Power-loss safe. WAL-protected writes.
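
To illustrate what WAL-protected writes mean in general, here is a minimal write-ahead log sketch: each record is appended and flushed before the write is reported as done, and replay stops at the first incomplete or corrupt record. The record layout, the `WriteAheadLog` type, and the XOR checksum are assumptions made for this example; the page does not describe Wax's actual on-disk format.

```swift
import Foundation

// Hypothetical write-ahead log sketch; this is not Wax's on-disk format.
// Record layout: [4-byte little-endian length][payload][1-byte XOR checksum].
struct WriteAheadLog {
    let url: URL

    // Appends one record and asks the OS to flush it before returning.
    func append(_ payload: Data) throws {
        if !FileManager.default.fileExists(atPath: url.path) {
            FileManager.default.createFile(atPath: url.path, contents: nil)
        }
        let handle = try FileHandle(forWritingTo: url)
        defer { try? handle.close() }
        try handle.seekToEnd()

        let n = UInt32(payload.count)
        var record = Data([
            UInt8(n & 0xFF), UInt8((n >> 8) & 0xFF),
            UInt8((n >> 16) & 0xFF), UInt8((n >> 24) & 0xFF),
        ])
        let checksum: UInt8 = payload.reduce(0, ^)
        record.append(payload)
        record.append(checksum)

        try handle.write(contentsOf: record)
        try handle.synchronize()
    }

    // Returns every complete record, stopping at a torn or corrupt tail.
    func replay() throws -> [Data] {
        let bytes = [UInt8](try Data(contentsOf: url))
        var records: [Data] = []
        var offset = 0
        while offset + 4 <= bytes.count {
            var length = Int(bytes[offset])
            length |= Int(bytes[offset + 1]) << 8
            length |= Int(bytes[offset + 2]) << 16
            length |= Int(bytes[offset + 3]) << 24
            let end = offset + 4 + length
            guard end < bytes.count else { break }  // record was cut short
            let payload = bytes[(offset + 4)..<end]
            let checksum: UInt8 = payload.reduce(0, ^)
            guard checksum == bytes[end] else { break }  // record is corrupt
            records.append(Data(payload))
            offset = end + 1
        }
        return records
    }
}

// Usage: write two records to a temporary file, then recover them.
let logURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("sketch-\(UUID().uuidString).wal")
let log = WriteAheadLog(url: logURL)
try log.append(Data("remember: dark mode".utf8))
try log.append(Data("remember: headaches".utf8))
let recovered = try log.replay()
print(recovered.count)
```

A process killed halfway through `append` leaves at most one partial record at the end of the file, which `replay` skips. A production log would likely use a stronger checksum such as CRC32, and on Apple platforms strict power-loss guarantees usually call for `F_FULLFSYNC` rather than a plain flush.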

Deterministic

100%

Same query = same context, every time

Portable

1 file

One .wax file — move it, back it up, ship it

Private

0 calls

100% on-device. Zero network dependency.

Benchmarks

Vector Search Latency

10,000 × 384-dim documents on Apple Silicon

Wax Metal (warm): 0.84ms

Wax Metal (cold): 9.2ms

Wax CPU: 105ms

SQLite FTS5: 150ms

Cold Open to First Query: 17ms

Hybrid Search @ 10K docs: 105ms
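
For readers who want a rough CPU baseline of their own, the sketch below times a brute-force top-k cosine search over 10,000 synthetic 384-dimensional vectors. It does not use Wax and does not reproduce the figures above. The data is random, the `SeededGenerator`, `normalized`, and `topK` helpers are hypothetical, and timings will vary with hardware and build settings.

```swift
import Foundation

// Hypothetical sketch: brute-force top-k cosine search on the CPU.
// Synthetic data only; not Wax's index and not the benchmark harness above.
struct SeededGenerator: RandomNumberGenerator {
    var state: UInt64
    mutating func next() -> UInt64 {
        // SplitMix64 step, so the synthetic vectors are identical on every run.
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

// Scales a vector to unit length so cosine similarity becomes a dot product.
func normalized(_ v: [Float]) -> [Float] {
    let sumOfSquares: Float = v.reduce(0) { $0 + $1 * $1 }
    let norm = sumOfSquares.squareRoot()
    return norm > 0 ? v.map { $0 / norm } : v
}

// Scores every document against the query and returns the k best indices.
func topK(query: [Float], in docs: [[Float]], k: Int) -> [Int] {
    let scores: [Float] = docs.map { doc in
        var dot: Float = 0
        for i in 0..<min(doc.count, query.count) {
            dot += doc[i] * query[i]
        }
        return dot
    }
    // Highest score first; equal scores fall back to the lower index.
    let order = scores.indices.sorted { a, b in
        scores[a] != scores[b] ? scores[a] > scores[b] : a < b
    }
    return Array(order.prefix(k))
}

var rng = SeededGenerator(state: 42)
let dimension = 384
let docs: [[Float]] = (0..<10_000).map { _ in
    normalized((0..<dimension).map { _ in Float.random(in: -1...1, using: &rng) })
}
let query = docs[123]  // querying with a stored vector should return it first

let start = Date()
let hits = topK(query: query, in: docs, k: 5)
let elapsedMs = Date().timeIntervalSince(start) * 1000
print("top-5 indices: \(hits), elapsed: \(String(format: "%.2f", elapsedMs)) ms")
```

Build in release mode before comparing numbers, and run the query several times to separate cold from warm timings.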

How it compares

Comparison

Wax vs. the alternatives for iOS/macOS.

Feature	Wax	Chroma	Core Data + FAISS	Pinecone
Single file	✓	✕	✕	✕
Works offline	✓	partial	✓	✕
Crash-safe	✓	✕	partial	N/A
GPU vector search	✓	✕	✕	✕
No server required	✓	✓	✓	✕
Swift-native	✓	✕	✓	✕
Deterministic RAG	✓	✕	✕	✕