Open Source · Apache 2.0

Private AI on your Mac.

Linux: already tested in a VM. Windows: coming soon.

A local AI server with persistent memory.
No cloud. No telemetry. No subscription. Free forever.
It works. Most of the time.

Download for macOS · Download for Linux · Source code

macOS · Linux · Apple Silicon · v1.0.6

See it in action.

No mockups. No renders.
Real screenshots of server.nexe running on a Mac.

localhost:9119

Screenshot: server.nexe web interface, light mode

Screenshot: server.nexe web interface, dark mode

Chat interface with collapsible sidebar. Pick a model, toggle RAG collections, manage and rename sessions.
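If you want to confirm that something is answering on that address before opening the browser, a quick check can be sketched in Python. This is only an illustration: it assumes the web interface listens on port 9119 as shown in the screenshots, the function name `is_listening` is made up for this example, and it only tests that a TCP port accepts connections, not that the process behind it is server.nexe.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 9119, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Port 9119 is an assumption taken from the screenshot address bar.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_listening():
        print("Something is listening on localhost:9119")
    else:
        print("Nothing is listening on localhost:9119")
```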

Your AI. Your machine.

Everything runs on your computer. Your data never leaves.

100% Local & Private

Conversations, documents and embeddings stay on your machine. Optional AES-256 encryption at rest. Zero cloud. Zero telemetry.

Persistent Memory

RAG-powered vector memory that keeps your context across sessions. Index documents, toggle collections individually, and ask the AI to delete facts on request.

Multi-Engine

Native MLX for Apple Silicon. llama.cpp for GGUF. Ollama if you prefer. Pick your engine.

Modular Plugins

Web UI, security, RAG engine — everything is a plugin. Add what you need, skip the rest.

Multilingual

Use any LLM you want. If the model speaks your language, so does server.nexe.

Free. For real.

Open source. Apache 2.0 license. Contribute to keep it that way.

Install in minutes.

Native macOS installer. Double-click the DMG, follow the wizard. No terminal needed.
From 2 to 30 minutes depending on your internet connection — the only time you need to be online.

Welcome

Pick your language. The installer detects your hardware.

Location

Choose where to install. See available space.

AI Model

14 verified models. Recommends the best one for your Mac.

Summary

Everything that will be downloaded. One click to start.

Done!

Private AI in 8-30 minutes (depending on the model).

14 models. You choose.

From 1.5 GB models to 70B. The installer only shows what your hardware can handle.
These are the defaults — you can install any model you want afterwards.

Small · 8 GB RAM

Qwen3.5 4B '26 · Alibaba · 3.4 GB

Medium · 16 GB RAM

Gemma 4 E4B '26 · Google · 4.5 GB
Salamandra 7B '25 · BSC/AINA · 4.9 GB
Qwen3.5 9B '26 · Alibaba · 6.6 GB
Mistral Nemo 12B '24 · Mistral · 7.1 GB

Large · 32 GB RAM

Gemma 4 31B '26 · Google · 18.5 GB
Qwen3.5 27B '26 · Alibaba · 18.0 GB
GPT-OSS 20B '25 · OpenAI · 22.2 GB

All models run locally via MLX, llama.cpp, or Ollama. Nothing is downloaded until you choose.
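To make the RAM tiers concrete, here is a minimal Python sketch of how a list like this could be filtered by available memory. It is an illustration, not the installer's actual logic: the `Model` type, the `models_for_ram` function, and the rule "show a model when the machine has at least the tier's RAM" are assumptions. The names, vendors, sizes and tiers are copied from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    vendor: str
    size_gb: float   # download size as listed above
    min_ram_gb: int  # RAM tier the model is listed under


# Data copied from the model list above.
CATALOG = [
    Model("Qwen3.5 4B", "Alibaba", 3.4, 8),
    Model("Gemma 4 E4B", "Google", 4.5, 16),
    Model("Salamandra 7B", "BSC/AINA", 4.9, 16),
    Model("Qwen3.5 9B", "Alibaba", 6.6, 16),
    Model("Mistral Nemo 12B", "Mistral", 7.1, 16),
    Model("Gemma 4 31B", "Google", 18.5, 32),
    Model("Qwen3.5 27B", "Alibaba", 18.0, 32),
    Model("GPT-OSS 20B", "OpenAI", 22.2, 32),
]


def models_for_ram(ram_gb: int, catalog=CATALOG) -> list[Model]:
    """Return the models whose RAM tier fits the machine (hypothetical rule)."""
    return [m for m in catalog if m.min_ram_gb <= ram_gb]


if __name__ == "__main__":
    # Example: what a 16 GB machine would be offered under this rule.
    for m in models_for_ram(16):
        print(f"{m.name} ({m.vendor}, {m.size_gb} GB)")
```

Under this rule a 16 GB machine sees the Small and Medium entries, and a 32 GB machine sees all eight models listed here.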

Before you download.

It's an OSS project.

Built by one person in Barcelona. Not a company. Not a startup. Open source, eager to learn, and trying to do things right.

It has bugs.

Like all software, it has bugs, and every release fixes more of them. If you find one, report it on the community forum. Every report helps.

Local models ≠ cloud models.

A local model isn't a cloud model. But for many tasks it's more than enough — and it doesn't need the internet to work.

It won't break your Mac.

Tested from M1 Air to Mac Studio. Respects RAM, disk and battery. 99% safe. Linux already works (tested in an Ubuntu 24.04 ARM64 VM); Windows is next.

Have your say.

Bugs, ideas, questions or just words of encouragement — everything helps us improve.

🛡 Moderated by AI and a human. No data tracked, just your name and message. See all messages ↗

Are you a developer? GitHub Discussions ↗

Support the project

server.nexe is free and open source. Your support helps keep it alive.

Ready?

Download. Double-click. Follow the wizard.
Private AI in minutes.

Download for macOS · Download for Linux