Skip to content

Releases: jgoy-labs/server-nexe

v1.0.6 — Security hardening

07 Jun 23:29

Choose a tag to compare

Security hardening release.

  • Memory at rest is now encrypted with SQLCipher
  • macOS Keychain ACL hang fixed (sidecar mode skips the OS keyring at boot)
  • Hardened key endpoints and bootstrap-token handling; memory/RAG bug fixes
  • MLX and Llama.cpp streaming errors surfaced to the client (no silent failures)
  • Audit P3 batch — 66 fixes with tests across memory, plugins, personality, installer, docs and CI

See CHANGELOG.md for the full list.

Downloads

Platform File
macOS (Apple Silicon, 14+) nexe-app_1.0.6_aarch64.dmg (notarized)
Linux (ARM64) nexe-app_1.0.6_aarch64.AppImage

v1.0.5-beta — Desktop App (macOS DMG + Linux AppImage)

01 Jun 10:37

Choose a tag to compare

Desktop app installers for server-nexe v1.0.5-beta — macOS DMG + Linux AppImage.

The desktop app bundles the server-nexe engine and runs it as a local sidecar (chat, persistent memory, RAG). Everything runs on your machine; no data is sent to external services. App source: https://github.com/jgoy-labs/nexe-app

Downloads

Platform File Requirements
macOS (Apple Silicon) nexe-app_1.0.5_aarch64.dmg macOS 14 (Sonoma) or later · signed & notarized
Linux (ARM64) nexe-app_1.0.5_aarch64.AppImage

SHA256

nexe-app_1.0.5_aarch64.dmg       1ab1da945982be827f9e6d0703ce588db12566a37480b952d791d62e61115819
nexe-app_1.0.5_aarch64.AppImage  7d9eeb8eba36cac9e5358fbace08782fe83ff8e18e514caffd550260a287dead

Beta release — feedback and issues welcome.

v1.0.4 — Desktop App Release

26 May 01:22

Choose a tag to compare

v1.0.4 — Desktop App Release

Server Nexe now ships as a Tauri v2 desktop application with onboarding wizard, system tray, and automatic sidecar management.

Downloads

Platform Package Size
macOS (Apple Silicon) nexe-app_1.0.4_aarch64.dmg ~1.3 GB
Linux (ARM64) nexe-app_1.0.4_aarch64.AppImage ~1.2 GB

Highlights

  • Desktop app — Tauri v2 shell wrapping server-nexe as a sidecar process
  • Onboarding wizard — hardware detection, backend selection, model catalog, API key setup
  • System tray — native menus, logs, quit confirmation
  • Monorepo — server-nexe core + desktop app (app/) in a single repository
  • Security hardening — SHA-256 plugin integrity with atomic snapshot verify+load, isolation pattern, strict CSP
  • Sidecar mode — dedicated path layer, Pydantic config, fail-fast startup
  • Linux support — AppImage for ARM64, Ollama backend
  • Version 1.0.4 — removes -beta tag

Linux notice

Tested on Ubuntu 24.04 ARM64 virtual machines (UTM). CPU inference via Ollama verified. GPU acceleration not yet tested — if you test on native hardware, please open an issue with your results.

For developers

The CLI and headless installation remain available:

git clone https://github.com/jgoy-labs/server-nexe.git
cd server-nexe
./setup.sh
nexe go

v1.0.4-beta — security hardening, MLX reliability, Qwen3-VL family

14 May 20:02

Choose a tag to compare

Sprint after 1.0.3-beta with 389 commits focused on security hardening, MLX
engine reliability, vision model support (Qwen3-VL family), observability,
and a comprehensive type-safety + complexity reduction pass.

Security

  • NFKC unicode normalization on RAG injection filter and memory tag
    stripper — defends against unicode-confusable bypass of [MEM_SAVE] and
    injection patterns.
  • API key now required on info/health endpoints — previously some
    metadata endpoints leaked version/build info without auth.
  • Rate-limit on /v1/memory/search (60/min) — prevents search
    enumeration.
  • Web UI graceful degradation when security plugin missing — clean 503
    instead of stack trace.
  • SSE error message sanitization — stream errors no longer leak
    internal paths; non-streaming completion content also sanitized.
  • AES-GCM AAD bound to session_id — prevents session swap attacks on
    encrypted .enc files.
  • chmod 600 on session .enc writes + refuse plaintext .json
    sessions in production when crypto missing.
  • Ollama bundle SHA256 pinning — first install verifies the bundle hash
    before extraction.
  • Stop logging partial API keys in auth failure logs.

Added

  • SHA256 pinning of installer downloads — integrity check infrastructure
    for MLX snapshots, GGUF files, and Ollama manifests. DMG-bundled fastembed
    model gets a manifest with three digests. Catalog pins remain None at
    this release; pin population is roadmapped for the next sprint.
  • Live test suite (tests/test_live/, dev-tools/run_live.py) — 53
    tests across all backends (Ollama, MLX, llama.cpp), MEM_SAVE, prompt
    injection, fail-closed, input validation, rate limit. Auto-starts the
    server if down.
  • Qwen3-VL family in MLX catalog — 4B / 8B / 30B-A3B with vision
    capability detection.
  • PyTorch + torchvision bundled in installer — vision/multimodal models
    run on first install without manual setup.
  • MLX hardware tier detection (low/mid/high/ultra) for adaptive
    defaults per Apple Silicon variant.
  • Rotating rag.log — daily rotation, 14-day retention.
  • Recall@N evaluation (real, not synthetic) + bracket support for CJK
    and mathematical brackets.
  • Web UI thinking-state polish — Mexican-wave per-letter animation,
    orange NEXE avatar with traffic-light cycle, placeholder border pulse,
    MODEL_LOADING banner guaranteed visible ≥ 700 ms.
  • Shared engine helpers (_common.py, _streaming.py) — deduplicated
    Ollama/MLX/llama.cpp request and stream code.
  • THREAT_MODEL.md plus per-language versions in knowledge/.
  • 138 new docstrings + interrogate configured to enforce coverage.

Fixed

  • MLX cancel propagation — HTTP client cancel now reaches the MLX
    streaming loop (no more zombie generation after disconnect).
  • MLX stream affinity — single-worker executor pinning preserves
    per-thread default_stream, fixing intermittent stream corruption.
  • Qwen3.5 thinking on MLX — directive prepended with critical tags
    (append failed for prompt-length reasons), synthetic <think> opener
    re-emitted when the chat template injects it, thinking_enabled
    forwarded through the VLM branch and apply_chat_template.
  • VL model loading without PyTorch — clear MissingDependencyError,
    auto-fallback to text-only mode, auto-disable stale safetensors index
    pointing at non-existent shards.
  • RAG recall: MemoryAPI not available on every chat request — broken
    singleton (assigned before initialize()), permanent failure flag (no
    retry), and silent debug fallback all fixed. New 60-second retry window.
  • GCDaemon was never invoked — score-based episodic pruning, budget
    enforcement, and tombstones existed but were dead code. Now wired into
    DreamingCycle.run_cycle() per active user.
  • DreamingCycle ran without an embedder_sync_vector_index was a
    no-op in production; episodic memories never reached the vector store.
  • Streaming: data: [DONE] always emitted after post-processing (some
    clients hung waiting for the marker).
  • Installer: int8 quantized ONNX variant for the embedding bundle
    (fp16 was incompatible with modern ONNX Runtime); Ollama bundle pinned to
    v0.22.1.
  • cancel_event scoped to MLX only — Ollama and llama.cpp cancel
    natively via async transport.
  • Web UI: 21 fixes including footer thinking-badge alignment, image
    MIME persisted in sessions, NEXE_LLAMA_CPP_MODEL honored in backend
    scan, vision icon for MLX Qwen3.5.
  • Dependency CVEs patched: pypdf, python-dotenv, python-multipart,
    filelock (9 advisories from osv-scanner).

Changed

  • Documentation honesty passREADME.md, SECURITY.md,
    IDENTITY.md aligned with actual behavior on telemetry ("scoped to
    runtime"), encryption defaults (auto is not fail-closed), CSP
    (style-src 'unsafe-inline' is allowed for Web UI), and "agnostic"
    scope (backend choice only, not platform).
  • 297 test files migrated from packages to tests/ root for unified
    discovery.
  • Multiple complexity reductions across the chat handler, response
    generator, web UI helpers, and runner — facade helpers extracted, no
    behavior change.
  • Comprehensive type-safety pass across plugins, memory, and core.

Removed

  • Internal one-shot scripts and personal tooling (kept locally via
    git rm --cached); all four added to .gitignore.
  • Personal path references from COMMANDS.md and four test/docs files
    (anonymized).
  • Stale type: ignore annotations and orphan whitelist entries in lint
    configs.

v1.0.2-beta — post-beta fixes (security, UI, encrypted sessions, knowledge embeddings)

21 Apr 19:31

Choose a tag to compare

What's new

Small but meaningful fixes after v1.0.1-beta. No breaking changes.

Fixed

  • Security filter: SQL detector false positives on natural text. The pattern r'--\s' triggered on legitimate user input such as email visual separators (----------), RFC 3676 signature delimiters (-- \n), em-dashes in prose and dash-separated enumerations. Chat messages containing any of these returned HTTP 400 "SQL detected" at /ui/chat. Replaced with r'[\'"]\s*--, which only matches the quote-dashdash signature of real SQL comment injection attacks. All 5 real SQL attacks remain blocked via other patterns. 4 regression tests added.

  • UI: LaTeX math notation in chat output. Some models (notably Gemma-4-31B-8bit) emit LaTeX like $\rightarrow$, $\times 2$, $\sqrt{x}$ and \pi in normal chat answers; the web UI renders Markdown via marked.js with no LaTeX engine, so users saw literal strings. Fixed server-side at the streaming boundary so web UI + future clients all benefit, without shipping KaTeX or any JS dependency. New latex_to_unicode() (two-pass inline-span + bare-command substitution, ~35 commands covered) and LatexStreamBuffer for chunked streams. Currency ($24.50), shell variables ($HOME) and bare dollars survive untouched. FORMAT instruction added to all 6 system prompts (ca/es/en × small/full). 35 new pytest cases.

  • Session manager: double-init race hid encrypted sessions and silently wrote unencrypted ones. After a restart the sidebar only listed .json sessions (.enc invisible though decryptable), new sessions were persisted unencrypted, and a reboot's .json → .enc migration could overwrite existing .enc files belonging to a different conversation with the same session id (collision observed in the wild). Root cause was a three-bug chain (loader early-init + plugin double-create + router local reference). Fixed with a _SessionManagerProxy that re-reads module_instance.session_manager on every attribute access (late-binding), and a single SessionManager(crypto_provider=crypto) construction in initialize(). 4 new regression tests pin the contract.

  • Installer wizard tier mismatch on 48+ GB Macs. HardwareDetector.ramTier still returned tier_64 / tier_48 strings that the Python backend had removed from models.json when the catalog narrowed to 4 tiers in v1.0.0-beta. On 48 GB+ Macs the wizard proposed a tier with no corresponding model set, so defaults fell through. Trimmed the Swift branches to the four tiers actually shipped.

Changed

  • Knowledge-base embeddings regenerated (knowledge/.embeddings/) to close two accumulated stale cases (FUNDING.yml ko-fi URL update from 2026-04-17 and the [IMAGEN ADJUNTA] → [IMATGE ADJUNTA] marker fix from 2026-04-20). Version references inside the knowledge base bumped to 1.0.2-beta and re-embedded in the same pass. 778 chunks total across ca/en/es.

Install

Download Install Nexe.dmg below and double-click. The installer is a 100% offline bundle — wheels + embedding model are shipped inside, no download at install time.

Upgrading from v1.0.1-beta: the installer keeps your existing memory + sessions (.enc files at ~/Library/Application Support/nexe/).

Apple Silicon + macOS 14 Sonoma or later. Linux ARM64 supported via install_nexe.py (Ubuntu 24.04 tested).

Full changelog

See CHANGELOG.md.

v1.0.1-beta

19 Apr 23:25

Choose a tag to compare

v1.0.1-beta Pre-release
Pre-release

v1.0.1-beta — 2026-04-20

Added

  • Memory delete confirmation flow: MEM_DELETE now requires user confirmation before deleting (PENDING_DELETE token + POST /memory/confirm-delete)
  • Atomic fact splitter: compound MEM_SAVE facts automatically split into atomic facts (LLM + regex)
  • Fallback extractor for models that omit MEM_SAVE tags (e.g. Gemma-3 VLM)
  • Tray icon documentation added to knowledge base (ca/en/es)
  • Linux ARM64 tested via UTM (Ubuntu 24.04, Apple Silicon VM)

Fixed

  • UI: responseEl undefined crash when saving memory spinner appeared
  • Security: user chat input was HTML-escaped (&#x27;') before storage
  • MLX: _is_vlm singleton stale when switching VLM→text model in same session
  • session_manager: enforce user/assistant role alternation to prevent VLM errors
  • Installer: copy to ~/.local/share/nexe/ on Linux if source is in Downloads dir

Dependencies (security)

Install

macOS 14+ Apple Silicon — download Install Nexe.dmg, open and run InstallNexe.app.

v1.0.0-beta — pre-1.0 public release

17 Apr 00:08

Choose a tag to compare

Summary

First public pre-1.0 release. Confidence bump from 0.9.9 after the final documentation coherence audit — no functional code changes beyond what 0.9.9 already shipped. The project is now considered a minimum viable product for the real world, open to community feedback.

Changed

  • Version metadata bumped to 1.0.0-beta across the codebase (pyproject, plugins, installer, knowledge base).
  • Knowledge base consolidated (13 thematic documents × 3 languages = 39 files), with Table of Contents and "In 30 seconds" quick intros added to user-facing docs (IDENTITY, INSTALLATION, USAGE, RAG).
  • New document: USE_CASES.md (ca/en/es) covering 6 practical use cases and "when server-nexe is NOT the best tool".
  • New section: ERRORS.md "How to report an error" with privacy warning for logs.
  • Honest coverage figure (~85%) replaces inflated historical badges (97.4%/91.1%/93%).
  • Security audits attribution expanded: Claude + Gemini + Codex + cross-model reviews (not just Claude).
  • AI collaboration credit in author metadata: "Jordi Goy with AI collaboration".
  • Stripe / Ko-fi / GitHub Sponsors URLs corrected (Ko-fi was wrong: /jgoylabs/servernexe).
  • Root READMEs synchronised across CA/EN/ES with screenshots (.github/screenshots/) and the "giant spaghetti monster → minimal core" story framing.

Unchanged (still true from 0.9.9)

  • All functional fixes from 0.9.9 remain (Bug #18 MEM_DELETE, Bug #19 crypto/memory/session/installer, offline install 100%, macOS 14+ Apple Silicon target, llama-cpp-python==0.3.19 pin).
  • 4842 tests collected, ~85% global coverage.

Known limits

  • AI-only audits: no external human security audit yet.
  • Single-user by design.
  • Community feedback welcome via GitHub Issues and the forum at server-nexe.com.

Download

Install Nexe.dmg — macOS 14 Sonoma+ / Apple Silicon (M1/M2/M3/M4).

  • Size: 1388 MB
  • SHA-256: 427c615a5935c0d9cd35272c55ca63a6b38df13ed21c03e8512ee29add513d71
  • Verify: shasum -a 256 "Install Nexe.dmg"

Open the DMG and run InstallNexe.app inside. The installer wizard will set up Nexe.app + tray app in /Applications.

v0.9.2 — P1 Security fixes

12 Apr 14:01

Choose a tag to compare

Security fixes (P1 — post mega-consultoria 2026-04-11)

  • P1-A — Rate limit UI auth failures per IP: dict in-memory, 60s window, max 20 attempts → 429. Prevents brute-force on /ui/chat without rate limiting.
  • P1-B — Auth failure logging from web UI routes to security log. Failures on /ui/* now appear in security log (previously only /v1/* were logged).
  • P1-C — Symlink upload rejection: os.path.realpath() check post-save rejects uploads whose resolved path falls outside the upload directory.
  • P1-D — Encryption default changed from false to auto: enables SQLCipher automatically if sqlcipher3 is available; logs a warning if not. Guard added to skip encryption on pre-existing plain-text databases.

Other

  • installer/make_dmg_ds_store.py: removed hardcoded /Users/jgoy/ path, replaced with os.path.expanduser()

Tests

24 new tests (RED gate verified). 4581 total, 0 regressions.

v0.9.1 — Security hardening + Knowledge coherence

12 Apr 12:47

Choose a tag to compare

What's new in v0.9.1

Security hardening (mega-consultoria)

  • Jailbreak speed-bump detector (47 patterns, multilingual)
  • Upload content denylist (API tokens, PEM keys)
  • Memory injection protection (tag stripping on all paths)
  • Pipeline enforcement (removed bypass endpoints)
  • SQLCIPHER fail-closed behavior
  • Ollama timeout split (connect=5s, read=600s)

Quality

  • 4572 tests, 0 failures
  • Knowledge base coherence audit (36 files updated)
  • Docker files removed (untested, bare-metal only)
  • CHANGELOG, README, SECURITY, CONTRIBUTING updated

macOS installer

  • Signed + notarized DMG (24 MB)
  • 6 model tiers (8GB → 64GB), 14+ models
  • Hardware detection, guided wizard

Full changelog: CHANGELOG.md