Releases: jgoy-labs/server-nexe
v1.0.6 — Security hardening
Security hardening release.
- Memory at rest is now encrypted with SQLCipher
- macOS Keychain ACL hang fixed (sidecar mode skips the OS keyring at boot)
- Hardened key endpoints and bootstrap-token handling; memory/RAG bug fixes
- MLX and Llama.cpp streaming errors surfaced to the client (no silent failures)
- Audit P3 batch — 66 fixes with tests across memory, plugins, personality, installer, docs and CI
See CHANGELOG.md for the full list.
Downloads
| Platform | File |
|---|---|
| macOS (Apple Silicon, 14+) | nexe-app_1.0.6_aarch64.dmg (notarized) |
| Linux (ARM64) | nexe-app_1.0.6_aarch64.AppImage |
v1.0.5-beta — Desktop App (macOS DMG + Linux AppImage)
Desktop app installers for server-nexe v1.0.5-beta — macOS DMG + Linux AppImage.
The desktop app bundles the server-nexe engine and runs it as a local sidecar (chat, persistent memory, RAG). Everything runs on your machine; no data is sent to external services. App source: https://github.com/jgoy-labs/nexe-app
Downloads
| Platform | File | Requirements |
|---|---|---|
| macOS (Apple Silicon) | nexe-app_1.0.5_aarch64.dmg | macOS 14 (Sonoma) or later · signed & notarized |
| Linux (ARM64) | nexe-app_1.0.5_aarch64.AppImage | — |
SHA256
nexe-app_1.0.5_aarch64.dmg 1ab1da945982be827f9e6d0703ce588db12566a37480b952d791d62e61115819
nexe-app_1.0.5_aarch64.AppImage 7d9eeb8eba36cac9e5358fbace08782fe83ff8e18e514caffd550260a287dead
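A downloaded installer can be checked against the digests above before it is opened. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `matches_published` are illustrative and not part of server-nexe.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Chunked reads keep memory flat even for multi-gigabyte installers.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with the published value (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published(Path("nexe-app_1.0.5_aarch64.dmg"), "1ab1da94...")` with the full digest from the list above should return `True` for an intact download.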
Beta release — feedback and issues welcome.
v1.0.4 — Desktop App Release
Server Nexe now ships as a Tauri v2 desktop application with onboarding wizard, system tray, and automatic sidecar management.
Downloads
| Platform | Package | Size |
|---|---|---|
| macOS (Apple Silicon) | nexe-app_1.0.4_aarch64.dmg | ~1.3 GB |
| Linux (ARM64) | nexe-app_1.0.4_aarch64.AppImage | ~1.2 GB |
Highlights
- Desktop app — Tauri v2 shell wrapping server-nexe as a sidecar process
- Onboarding wizard — hardware detection, backend selection, model catalog, API key setup
- System tray — native menus, logs, quit confirmation
- Monorepo — server-nexe core + desktop app (`app/`) in a single repository
- Security hardening — SHA-256 plugin integrity with atomic snapshot verify+load, isolation pattern, strict CSP
- Sidecar mode — dedicated path layer, Pydantic config, fail-fast startup
- Linux support — AppImage for ARM64, Ollama backend
- Version 1.0.4 — removes -beta tag
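One way to read "atomic snapshot verify+load" from the highlights is that the plugin file is read once, and the same bytes are both hashed and loaded, so a file swapped on disk after the check cannot change what runs. The sketch below illustrates that idea under this assumption; `load_verified_plugin` and `PluginIntegrityError` are hypothetical names, not the project's actual API.

```python
import hashlib


class PluginIntegrityError(Exception):
    """Raised when a plugin file does not match its pinned digest."""


def load_verified_plugin(path: str, expected_sha256: str):
    """Read the file once, verify that snapshot, and compile the same bytes."""
    with open(path, "rb") as fh:
        snapshot = fh.read()  # single read: the bytes we hash are the bytes we load
    actual = hashlib.sha256(snapshot).hexdigest()
    if actual != expected_sha256.lower():
        raise PluginIntegrityError(f"digest mismatch for {path}")
    # Compile from the in-memory snapshot, never by re-opening the file,
    # so a swap on disk after the check cannot change what runs.
    return compile(snapshot, path, "exec")
```

The returned code object can then be executed in a fresh namespace with `exec(code, namespace)`.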
Linux notice
Tested on Ubuntu 24.04 ARM64 virtual machines (UTM). CPU inference via Ollama verified. GPU acceleration not yet tested — if you test on native hardware, please open an issue with your results.
For developers
The CLI and headless installation remain available:
git clone https://github.com/jgoy-labs/server-nexe.git
cd server-nexe
./setup.sh
nexe go
v1.0.4-beta — security hardening, MLX reliability, Qwen3-VL family
Sprint after 1.0.3-beta with 389 commits focused on security hardening, MLX
engine reliability, vision model support (Qwen3-VL family), observability,
and a comprehensive type-safety + complexity reduction pass.
Security
- NFKC unicode normalization on RAG injection filter and memory tag stripper — defends against unicode-confusable bypass of `[MEM_SAVE]` and injection patterns.
- API key now required on info/health endpoints — previously some metadata endpoints leaked version/build info without auth.
- Rate-limit on `/v1/memory/search` (60/min) — prevents search enumeration.
- Web UI graceful degradation when security plugin missing — clean 503 instead of stack trace.
- SSE error message sanitization — stream errors no longer leak internal paths; non-streaming completion content also sanitized.
- AES-GCM AAD bound to session_id — prevents session swap attacks on encrypted `.enc` files.
- `chmod 600` on session `.enc` writes + refuse plaintext `.json` sessions in production when crypto missing.
- Ollama bundle SHA256 pinning — first install verifies the bundle hash before extraction.
- Stop logging partial API keys in auth failure logs.
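The NFKC item above can be illustrated with a short sketch: normalizing before pattern matching folds look-alike characters (for example fullwidth brackets) into their ASCII forms, so the tag filter sees the same text the model would. The regex and the function name `strip_memory_tags` are assumptions for illustration; the real tag grammar is not described in these notes.

```python
import re
import unicodedata

# Hypothetical tag pattern: "[MEM_SAVE" followed by anything up to "]".
MEM_TAG = re.compile(r"\[MEM_SAVE[^\]]*\]", re.IGNORECASE)


def strip_memory_tags(text: str) -> str:
    """Normalize to NFKC first, then remove memory tags from the result."""
    normalized = unicodedata.normalize("NFKC", text)
    return MEM_TAG.sub("", normalized)
```

Without the normalization step, a tag written with fullwidth brackets (U+FF3B and U+FF3D) would not match the ASCII pattern and would pass through the stripper.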
Added
- SHA256 pinning of installer downloads — integrity check infrastructure for MLX snapshots, GGUF files, and Ollama manifests. DMG-bundled fastembed model gets a manifest with three digests. Catalog pins remain `None` at this release; pin population is roadmapped for the next sprint.
- Live test suite (`tests/test_live/`, `dev-tools/run_live.py`) — 53 tests across all backends (Ollama, MLX, llama.cpp), MEM_SAVE, prompt injection, fail-closed, input validation, rate limit. Auto-starts the server if down.
- Qwen3-VL family in MLX catalog — 4B / 8B / 30B-A3B with vision capability detection.
- PyTorch + torchvision bundled in installer — vision/multimodal models run on first install without manual setup.
- MLX hardware tier detection (`low/mid/high/ultra`) for adaptive defaults per Apple Silicon variant.
- Rotating `rag.log` — daily rotation, 14-day retention.
- Recall@N evaluation (real, not synthetic) + bracket support for CJK and mathematical brackets.
- Web UI thinking-state polish — Mexican-wave per-letter animation, orange NEXE avatar with traffic-light cycle, placeholder border pulse, MODEL_LOADING banner guaranteed visible ≥ 700 ms.
- Shared engine helpers (`_common.py`, `_streaming.py`) — deduplicated Ollama/MLX/llama.cpp request and stream code.
- `THREAT_MODEL.md` plus per-language versions in `knowledge/`.
- 138 new docstrings + `interrogate` configured to enforce coverage.
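The rotating `rag.log` policy listed above (daily rotation, 14-day retention) maps directly onto the standard library's `TimedRotatingFileHandler`. This is a sketch of one way to configure it, not necessarily how server-nexe does; the function name `build_rag_handler` and the log format are assumptions.

```python
import logging
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path


def build_rag_handler(log_dir: Path) -> TimedRotatingFileHandler:
    """Create a handler that rotates rag.log at midnight and keeps 14 old files."""
    log_dir.mkdir(parents=True, exist_ok=True)
    handler = TimedRotatingFileHandler(
        log_dir / "rag.log",
        when="midnight",   # rotate once per day
        backupCount=14,    # older rotated files are deleted
        encoding="utf-8",
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    return handler
```

The handler would be attached with something like `logging.getLogger("rag").addHandler(build_rag_handler(Path("logs")))`.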
Fixed
- MLX cancel propagation — HTTP client cancel now reaches the MLX streaming loop (no more zombie generation after disconnect).
- MLX stream affinity — single-worker executor pinning preserves per-thread default_stream, fixing intermittent stream corruption.
- Qwen3.5 thinking on MLX — directive prepended with critical tags (append failed for prompt-length reasons), synthetic `<think>` opener re-emitted when the chat template injects it, `thinking_enabled` forwarded through the VLM branch and `apply_chat_template`.
- VL model loading without PyTorch — clear `MissingDependencyError`, auto-fallback to text-only mode, auto-disable stale safetensors index pointing at non-existent shards.
- RAG recall: `MemoryAPI not available` on every chat request — broken singleton (assigned before `initialize()`), permanent failure flag (no retry), and silent debug fallback all fixed. New 60-second retry window.
- `GCDaemon` was never invoked — score-based episodic pruning, budget enforcement, and tombstones existed but were dead code. Now wired into `DreamingCycle.run_cycle()` per active user.
- `DreamingCycle` ran without an embedder — `_sync_vector_index` was a no-op in production; episodic memories never reached the vector store.
- Streaming: `data: [DONE]` always emitted after post-processing (some clients hung waiting for the marker).
- Installer: int8 quantized ONNX variant for the embedding bundle (fp16 was incompatible with modern ONNX Runtime); Ollama bundle pinned to v0.22.1.
- `cancel_event` scoped to MLX only — Ollama and llama.cpp cancel natively via async transport.
- Web UI: 21 fixes including footer thinking-badge alignment, image MIME persisted in sessions, `NEXE_LLAMA_CPP_MODEL` honored in backend scan, vision icon for MLX Qwen3.5.
- Dependency CVEs patched: `pypdf`, `python-dotenv`, `python-multipart`, `filelock` (9 advisories from `osv-scanner`).
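The streaming fix above (always emitting `data: [DONE]`) can be sketched as a generator that ends every stream with the marker, including streams that fail partway. This is an illustrative shape only; `sse_stream` and the generic error payload are assumptions, not the project's code.

```python
import json
from typing import Iterable, Iterator

DONE_EVENT = "data: [DONE]\n\n"


def sse_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Wrap text chunks as SSE events and always finish with the DONE marker."""
    try:
        for chunk in chunks:
            yield f"data: {json.dumps({'content': chunk})}\n\n"
    except Exception:
        # Send a generic message so internal details (paths, stack traces)
        # never reach the client.
        yield f"data: {json.dumps({'error': 'stream failed'})}\n\n"
    # Reached after normal completion and after a handled error, so clients
    # waiting for the marker do not hang.
    yield DONE_EVENT
```

Catching `Exception` rather than using `finally` is deliberate: a client disconnect closes the generator with `GeneratorExit`, and yielding again in that case would raise a `RuntimeError`.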
Changed
- Documentation honesty pass — `README.md`, `SECURITY.md`, `IDENTITY.md` aligned with actual behavior on telemetry ("scoped to runtime"), encryption defaults (`auto` is not fail-closed), CSP (`style-src 'unsafe-inline'` is allowed for Web UI), and "agnostic" scope (backend choice only, not platform).
- 297 test files migrated from packages to `tests/` root for unified discovery.
- Multiple complexity reductions across the chat handler, response generator, web UI helpers, and runner — facade helpers extracted, no behavior change.
- Comprehensive type-safety pass across plugins, memory, and core.
Removed
- Internal one-shot scripts and personal tooling (kept locally via `git rm --cached`); all four added to `.gitignore`.
- Personal path references from `COMMANDS.md` and four test/docs files (anonymized).
- Stale `type: ignore` annotations and orphan whitelist entries in lint configs.
v1.0.2-beta — post-beta fixes (security, UI, encrypted sessions, knowledge embeddings)
What's new
Small but meaningful fixes after v1.0.1-beta. No breaking changes.
Fixed
- Security filter: SQL detector false positives on natural text. The pattern `r'--\s'` triggered on legitimate user input such as email visual separators (`----------`), RFC 3676 signature delimiters (`-- \n`), em-dashes in prose and dash-separated enumerations. Chat messages containing any of these returned HTTP 400 "SQL detected" at `/ui/chat`. Replaced with `r'[\'"]\s*--'`, which only matches the quote-dashdash signature of real SQL comment injection attacks. All 5 real SQL attacks remain blocked via other patterns. 4 regression tests added.
- UI: LaTeX math notation in chat output. Some models (notably Gemma-4-31B-8bit) emit LaTeX like `$\rightarrow$`, `$\times 2$`, `$\sqrt{x}$` and `\pi` in normal chat answers; the web UI renders Markdown via `marked.js` with no LaTeX engine, so users saw literal strings. Fixed server-side at the streaming boundary so web UI + future clients all benefit, without shipping KaTeX or any JS dependency. New `latex_to_unicode()` (two-pass inline-span + bare-command substitution, ~35 commands covered) and `LatexStreamBuffer` for chunked streams. Currency (`$24.50`), shell variables (`$HOME`) and bare dollars survive untouched. FORMAT instruction added to all 6 system prompts (ca/es/en × small/full). 35 new pytest cases.
- Session manager: double-init race hid encrypted sessions and silently wrote unencrypted ones. After a restart the sidebar only listed `.json` sessions (`.enc` invisible though decryptable), new sessions were persisted unencrypted, and a reboot's `.json → .enc` migration could overwrite existing `.enc` files belonging to a different conversation with the same session id (collision observed in the wild). Root cause was a three-bug chain (loader early-init + plugin double-create + router local reference). Fixed with a `_SessionManagerProxy` that re-reads `module_instance.session_manager` on every attribute access (late-binding), and a single `SessionManager(crypto_provider=crypto)` construction in `initialize()`. 4 new regression tests pin the contract.
- Installer wizard tier mismatch on 48+ GB Macs. `HardwareDetector.ramTier` still returned `tier_64`/`tier_48` strings that the Python backend had removed from `models.json` when the catalog narrowed to 4 tiers in v1.0.0-beta. On 48 GB+ Macs the wizard proposed a tier with no corresponding model set, so defaults fell through. Trimmed the Swift branches to the four tiers actually shipped.
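The difference between the old and new SQL-comment patterns in the security filter fix can be checked directly. The two regular expressions below are the ones quoted in the notes; the sample strings are made up for illustration.

```python
import re

# Old pattern: any "--" followed by whitespace.
OLD_PATTERN = re.compile(r"--\s")
# New pattern: a quote, optional whitespace, then "--".
NEW_PATTERN = re.compile(r"""['"]\s*--""")

SAMPLES = {
    "email separator": "Thanks!\n----------\nJordi",
    "signature delimiter": "Regards\n-- \nJordi",
    "sql comment injection": "admin' -- ",
}

for label, text in SAMPLES.items():
    old_hit = OLD_PATTERN.search(text) is not None
    new_hit = NEW_PATTERN.search(text) is not None
    print(f"{label}: old={old_hit} new={new_hit}")
```

The old pattern flags all three samples, while the new one flags only the injection string, which is the behavior the fix describes.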
Changed
- Knowledge-base embeddings regenerated (`knowledge/.embeddings/`) to close two accumulated stale cases (FUNDING.yml ko-fi URL update from 2026-04-17 and the `[IMAGEN ADJUNTA] → [IMATGE ADJUNTA]` marker fix from 2026-04-20). Version references inside the knowledge base bumped to `1.0.2-beta` and re-embedded in the same pass. 778 chunks total across ca/en/es.
Install
Download Install Nexe.dmg below and double-click. The installer is a 100% offline bundle — wheels + embedding model are shipped inside, no download at install time.
Upgrading from v1.0.1-beta: the installer keeps your existing memory + sessions (.enc files at ~/Library/Application Support/nexe/).
Apple Silicon + macOS 14 Sonoma or later. Linux ARM64 supported via install_nexe.py (Ubuntu 24.04 tested).
Full changelog
See CHANGELOG.md.
v1.0.1-beta
v1.0.1-beta — 2026-04-20
Added
- Memory delete confirmation flow: MEM_DELETE now requires user confirmation before deleting (PENDING_DELETE token + `POST /memory/confirm-delete`)
- Atomic fact splitter: compound MEM_SAVE facts automatically split into atomic facts (LLM + regex)
- Fallback extractor for models that omit MEM_SAVE tags (e.g. Gemma-3 VLM)
- Tray icon documentation added to knowledge base (ca/en/es)
- Linux ARM64 tested via UTM (Ubuntu 24.04, Apple Silicon VM)
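The delete confirmation flow in the list above is a two-step pattern: a delete request only records a pending token, and the memory is removed when that token is confirmed. The sketch below shows the general idea; the class `PendingDeletes`, its methods, and the dict-based store are hypothetical, since the notes name only the PENDING_DELETE token and the confirm endpoint.

```python
import secrets
from typing import Dict


class PendingDeletes:
    """Track delete requests that still need explicit user confirmation."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}  # token -> memory id

    def request(self, memory_id: str) -> str:
        """Step 1: register the request and return a one-time token."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = memory_id
        return token

    def confirm(self, token: str, store: Dict[str, str]) -> bool:
        """Step 2: delete only if the token is known; each token works once."""
        memory_id = self._pending.pop(token, None)
        if memory_id is None:
            return False
        store.pop(memory_id, None)
        return True
```

In a server, `request` would run when the model emits MEM_DELETE and `confirm` would back the confirmation endpoint.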
Fixed
- UI: `responseEl` undefined crash when saving memory spinner appeared
- Security: user chat input was HTML-escaped (`'` → HTML entity) before storage
- MLX: `_is_vlm` singleton stale when switching VLM→text model in same session
- `session_manager`: enforce user/assistant role alternation to prevent VLM errors
- Installer: copy to `~/.local/share/nexe/` on Linux if source is in Downloads dir
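Role alternation, as mentioned in the `session_manager` fix above, can be enforced in more than one way. The notes do not say which strategy is used, so the sketch below assumes the simplest one: consecutive messages with the same role are merged into one. The function name `enforce_alternation` is illustrative.

```python
from typing import Dict, List

Message = Dict[str, str]


def enforce_alternation(messages: List[Message]) -> List[Message]:
    """Merge consecutive same-role messages so roles strictly alternate."""
    fixed: List[Message] = []
    for msg in messages:
        if fixed and fixed[-1]["role"] == msg["role"]:
            # Same speaker twice in a row: fold the text into the previous turn.
            previous = fixed[-1]
            fixed[-1] = {**previous, "content": previous["content"] + "\n" + msg["content"]}
        else:
            fixed.append(dict(msg))  # copy so the caller's list is not mutated
    return fixed
```

Dropping the duplicate turn or inserting a placeholder reply would be alternative strategies with different trade-offs.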
Dependencies (security)
- fastapi 0.115 → 0.128 (CVE-2025-54121, CVE-2025-62727)
- python-multipart 0.0.22 → 0.0.26 (CVE-2026-40347)
- pytest 8 → 9, pytest-asyncio 0.23 → 1.0
Install
macOS 14+ Apple Silicon — download Install Nexe.dmg, open and run InstallNexe.app.
v1.0.0-beta — pre-1.0 public release
Summary
First public pre-1.0 release. Confidence bump from 0.9.9 after the final documentation coherence audit — no functional code changes beyond what 0.9.9 already shipped. The project is now considered a minimum viable product for the real world, open to community feedback.
Changed
- Version metadata bumped to `1.0.0-beta` across the codebase (pyproject, plugins, installer, knowledge base).
- Knowledge base consolidated (13 thematic documents × 3 languages = 39 files), with Table of Contents and "In 30 seconds" quick intros added to user-facing docs (IDENTITY, INSTALLATION, USAGE, RAG).
- New document: `USE_CASES.md` (ca/en/es) covering 6 practical use cases and "when server-nexe is NOT the best tool".
- New section: `ERRORS.md` "How to report an error" with privacy warning for logs.
- Honest coverage figure (~85%) replaces inflated historical badges (97.4%/91.1%/93%).
- Security audits attribution expanded: Claude + Gemini + Codex + cross-model reviews (not just Claude).
- `AI collaboration` credit in author metadata: `"Jordi Goy with AI collaboration"`.
- Stripe / Ko-fi / GitHub Sponsors URLs corrected (Ko-fi was wrong: `/jgoylabs` → `/servernexe`).
- Root READMEs synchronised across CA/EN/ES with screenshots (`.github/screenshots/`) and the "giant spaghetti monster → minimal core" story framing.
Unchanged (still true from 0.9.9)
- All functional fixes from 0.9.9 remain (Bug #18 MEM_DELETE, Bug #19 crypto/memory/session/installer, offline install 100%, macOS 14+ Apple Silicon target, `llama-cpp-python==0.3.19` pin).
- 4842 tests collected, ~85% global coverage.
Known limits
- AI-only audits: no external human security audit yet.
- Single-user by design.
- Community feedback welcome via GitHub Issues and the forum at `server-nexe.com`.
Download
Install Nexe.dmg — macOS 14 Sonoma+ / Apple Silicon (M1/M2/M3/M4).
- Size: 1388 MB
- SHA-256: `427c615a5935c0d9cd35272c55ca63a6b38df13ed21c03e8512ee29add513d71`
- Verify: `shasum -a 256 "Install Nexe.dmg"`
Open the DMG and run InstallNexe.app inside. The installer wizard will set up Nexe.app + tray app in /Applications.
v0.9.2 — P1 Security fixes
Security fixes (P1 — post mega-consultoria 2026-04-11)
- P1-A — Rate limit UI auth failures per IP: dict in-memory, 60s window, max 20 attempts → 429. Prevents brute-force attempts on `/ui/chat`, which previously had no rate limiting.
- P1-B — Auth failure logging from web UI routes to security log. Failures on `/ui/*` now appear in security log (previously only `/v1/*` were logged).
- P1-C — Symlink upload rejection: `os.path.realpath()` check post-save rejects uploads whose resolved path falls outside the upload directory.
- P1-D — Encryption default changed from `false` to `auto`: enables SQLCipher automatically if `sqlcipher3` is available; logs a warning if not. Guard added to skip encryption on pre-existing plain-text databases.
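The per-IP limiter in P1-A (in-memory dict, 60-second window, 20 attempts) can be sketched as a sliding window of failure timestamps. The class `AuthFailureLimiter` and the injectable clock are assumptions made for illustration and testability; only the window and threshold come from the notes.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict

WINDOW_SECONDS = 60
MAX_FAILURES = 20


class AuthFailureLimiter:
    """Count recent auth failures per IP in memory."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._failures: Dict[str, Deque[float]] = {}

    def record_failure(self, ip: str) -> None:
        self._failures.setdefault(ip, deque()).append(self._clock())

    def is_blocked(self, ip: str) -> bool:
        """True when the IP has reached the limit; the caller would answer 429."""
        attempts = self._failures.get(ip)
        if not attempts:
            return False
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while attempts and now - attempts[0] >= WINDOW_SECONDS:
            attempts.popleft()
        return len(attempts) >= MAX_FAILURES
```

A route handler would call `is_blocked(ip)` before checking credentials and `record_failure(ip)` after each failed check. State is lost on restart, which matches the in-memory design described above.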
Other
- `installer/make_dmg_ds_store.py`: removed hardcoded `/Users/jgoy/` path, replaced with `os.path.expanduser()`
Tests
24 new tests (RED gate verified). 4581 total, 0 regressions.
v0.9.1 — Security hardening + Knowledge coherence
What's new in v0.9.1
Security hardening (mega-consultoria)
- Jailbreak speed-bump detector (47 patterns, multilingual)
- Upload content denylist (API tokens, PEM keys)
- Memory injection protection (tag stripping on all paths)
- Pipeline enforcement (removed bypass endpoints)
- SQLCIPHER fail-closed behavior
- Ollama timeout split (connect=5s, read=600s)
Quality
- 4572 tests, 0 failures
- Knowledge base coherence audit (36 files updated)
- Docker files removed (untested, bare-metal only)
- CHANGELOG, README, SECURITY, CONTRIBUTING updated
macOS installer
- Signed + notarized DMG (24 MB)
- 6 model tiers (8GB → 64GB), 14+ models
- Hardware detection, guided wizard
Full changelog: CHANGELOG.md