AI server running 100% locally.
Persistent memory across conversations.
Zero data in the cloud.
Minimum viable product for the real world. Open to community feedback. 🚀
chmod +x nexe-app_*.AppImage && ./nexe-app_*.AppImage
Why NEXE
Six pillars
Local & Private
Runs entirely on your computer. No conversations, data or documents ever leave your device. Privacy is a consequence of the architecture, not a setting.
RAG Memory
Remembers information across sessions with 768-dimensional embeddings in Qdrant. Indexes MD, PDF and TXT documents. Toggle individual collections on/off from the sidebar.
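As a rough illustration of what embedding-based recall means, the sketch below ranks stored notes by cosine similarity to a query vector. It is not NEXE's code: the `cosine` and `search` helpers, the sample notes and the toy 3-dimensional vectors are all assumptions made for readability. NEXE itself uses 768 dimensions and delegates storage and search to Qdrant, and cosine is only one of the metrics such a store can use.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, memory, top_k=2):
    """Return the top_k stored texts most similar to query_vec."""
    ranked = sorted(memory, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

# Toy memory of (text, embedding) pairs. Real embeddings would have 768 dimensions.
MEMORY = [
    ("User prefers dark mode", [0.9, 0.1, 0.0]),
    ("Project deadline is Friday", [0.0, 0.8, 0.6]),
    ("User's editor is Vim", [0.8, 0.3, 0.1]),
]
```

Calling `search([1.0, 0.2, 0.0], MEMORY)` returns the two notes whose vectors point in nearly the same direction as the query, which is the idea behind retrieving relevant memories for a new prompt.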
Multi-backend
Native MLX for Apple Silicon, universal llama.cpp, or Ollama bridge. Switch model and backend without rewriting anything. Unified API.
Modular
Each backend is an independent plugin. Add new features without touching the core. Architecture designed to grow and experiment.
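One common way to get "plugins without touching the core" is a registry that backends add themselves to. The sketch below shows that pattern under stated assumptions: `BACKENDS`, `register`, `generate` and the two toy backends are hypothetical names, not NEXE's actual plugin API.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a backend name to a function that generates text.
BACKENDS: Dict[str, Callable[[str], str]] = {}

def register(name: str):
    """Decorator that adds a backend to the registry under `name`."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        BACKENDS[name] = fn
        return fn
    return wrap

@register("echo")
def echo_backend(prompt: str) -> str:
    # Stand-in for a real model call (MLX, llama.cpp or Ollama).
    return f"echo: {prompt}"

@register("upper")
def upper_backend(prompt: str) -> str:
    # A second toy backend, added without changing generate().
    return prompt.upper()

def generate(backend: str, prompt: str) -> str:
    """Single entry point: callers never import a backend directly."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend}")
    return BACKENDS[backend](prompt)
```

Adding a third backend here means writing one decorated function. The `generate` entry point stays the same, which is what a unified API buys you.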
Automatic Memory
The server auto-saves relevant information from conversations with trilingual intent detection, intelligent deduplication and automatic pruning. Delete facts with MEM_DELETE and see each save as a collapsible blue block.
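To make deduplication and pruning concrete, here is a minimal sketch of a fact store. It is a simplification under assumptions: the `MemoryStore` class is hypothetical, duplicates are detected by plain text normalization (NEXE's own deduplication may well be semantic), pruning simply drops the oldest fact once a size limit is reached, and `delete` only stands in for what `MEM_DELETE` does.

```python
from collections import OrderedDict

class MemoryStore:
    """Toy fact store with deduplication and size-based pruning."""

    def __init__(self, max_facts=3):
        self.max_facts = max_facts
        self._facts = OrderedDict()  # normalized key -> original text

    @staticmethod
    def _normalize(text):
        # Facts count as duplicates if they match after lowercasing
        # and collapsing whitespace.
        return " ".join(text.lower().split())

    def save(self, text):
        """Store a fact; return False if it was a duplicate."""
        key = self._normalize(text)
        if key in self._facts:
            self._facts.move_to_end(key)  # refresh recency instead of re-adding
            return False
        self._facts[key] = text
        while len(self._facts) > self.max_facts:
            self._facts.popitem(last=False)  # prune the oldest fact
        return True

    def delete(self, text):
        """Remove a fact if present; return whether anything was deleted."""
        return self._facts.pop(self._normalize(text), None) is not None

    def facts(self):
        """Return stored facts, oldest first."""
        return list(self._facts.values())
```

With `max_facts=2`, saving "Likes tea", then " likes   TEA ", then two new facts leaves only the two newest: the second save is rejected as a duplicate and the first fact is pruned when the limit is exceeded.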
Multilingual
Full i18n system in CA/ES/EN for the interface, system prompts, RAG labels and error messages. Switch language without restarting.
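Switching language without a restart usually comes down to resolving each string at the moment it is shown. A minimal sketch, assuming a hypothetical in-memory catalog (`MESSAGES`, its keys and the `translate` helper are illustrative, not NEXE's i18n module):

```python
# Hypothetical message catalog; the keys and strings are illustrative only.
MESSAGES = {
    "en": {"greeting": "Hello", "error.not_found": "Not found"},
    "es": {"greeting": "Hola", "error.not_found": "No encontrado"},
    "ca": {"greeting": "Hola"},
}

def translate(key, lang, default_lang="en"):
    """Look up `key` in `lang`, fall back to default_lang, then to the key itself."""
    for code in (lang, default_lang):
        text = MESSAGES.get(code, {}).get(key)
        if text is not None:
            return text
    return key
```

Because the lookup happens on every call, changing the active language only means passing a different `lang` value. Missing translations fall back to English rather than failing.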
Let's start
Four commands
$ git clone https://github.com/jgoy-labs/server-nexe
$ cd server-nexe
# Detects hardware, picks backend and model
$ ./setup.sh
$ ./nexe go
# → http://localhost:9119
# → http://localhost:9119/ui
$ ./nexe chat --rag
# Store information:
$ ./nexe memory store "..."
Available backends
Choose your engine
MLX
Native for Apple Silicon. Maximum performance on your M1/M2/M3. Runs inference on the Apple GPU through Metal, sharing unified memory with the CPU. Best option if you have a modern Mac.
llama.cpp
Compatible with models in GGUF format. Works on Mac (Metal GPU), Linux and Windows. Lightweight, flexible and backed by a very active community.
Ollama
If you already have Ollama installed, NEXE can use it directly as a backend. Reuse all the models you already have downloaded.
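The choice between the three engines can be sketched as a small heuristic. This is an assumption about how such a decision might look, not the logic in `setup.sh`: the `pick_backend` and `detect` functions and the priority order are hypothetical.

```python
import platform
import shutil

def pick_backend(system, machine, has_ollama):
    """Choose a backend name from basic host facts (illustrative heuristic)."""
    if system == "Darwin" and machine == "arm64":
        return "mlx"        # Apple Silicon Mac
    if has_ollama:
        return "ollama"     # reuse an existing Ollama install
    return "llama.cpp"      # portable default for everything else

def detect():
    """Apply pick_backend to the current machine."""
    return pick_backend(
        platform.system(),
        platform.machine(),
        shutil.which("ollama") is not None,  # is an `ollama` binary on PATH?
    )
```

Keeping `pick_backend` free of system calls makes the rule easy to test: `pick_backend("Linux", "x86_64", True)` yields `"ollama"`, while the same host without Ollama falls back to `"llama.cpp"`.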
Documentation
Explore the project
Start now
Download it. Break it. Experiment.
NEXE is your local assistant. Ask it how it works, how to create plugins or how to extend it. It remembers context. Always local.