
[BUG] - nebi-desktop crashes on Linux/WebKitGTK (SA_ONSTACK fatal) since v0.10.5 #350

@Adam-D-Lewis

Description

nebi-desktop crashes on Linux (Ubuntu 26.04 + libwebkit2gtk-4.1 v2.52) shortly after launch with a Go runtime fatal error:

Overriding existing handler for signal 10. Set JSC_SIGNAL_FOR_GC if you want WebKit to use a different signal
signal 11 received but handler not on signal stack
fatal error: non-Go code set up signal handler without SA_ONSTACK flag

The process exits with code 2. The UI never becomes usable: it typically dies within ~1-2 seconds of the first frontend HTTP request landing.

Steps to reproduce

  1. On Ubuntu 26.04 (or any distro with libwebkit2gtk-4.1 ≥ 2.46), pixi global install nebi==0.11
  2. Run nebi-desktop
  3. Observe the crash within a few seconds.

Also reproduces from source:

git checkout v0.10.5  # or any later tag, or main
make build-desktop
./build/bin/Nebi

Does not reproduce on v0.10.4.

Expected behavior

The desktop app comes up and stays running.

Environment

OS:           Ubuntu 26.04 LTS (after recent dist-upgrade)
Webkit:       libwebkit2gtk-4.1-0 2.52.3-0ubuntu0.26.04.2 (also libwebkitgtk-6.0 present)
Nebi version: 0.10.5 through 0.11 (main)
Wails:        v2.11.0 (also reproduces on v2.12.0 — see below)
Go:           1.24

Root cause

This is a latent WebKitGTK/JavaScriptCore bug that Wails 2.x does not fully mitigate. Nebi v0.10.5 simply removed a workload that was incidentally masking it.

Mechanism: WebKitGTK's JSC installs SIGSEGV handlers (used for generational-GC write barriers and thread suspension) without SA_ONSTACK, which Go's runtime refuses to tolerate. Wails 2.12 attempts to fix this by re-applying SA_ONSTACK once via g_idle_add after gtk_init — but JSC installs its handlers lazily on first JS context creation, often after that one-shot fix has already run. So Wails 2.12 alone is not sufficient.

What kept this latent before v0.10.5: Go's runtime calls sigaction (with SA_ONSTACK) every time it spawns a new OS thread to host a goroutine. Heavy startup work — DB queries, regex compilation, etc. — incidentally caused enough thread churn that SA_ONSTACK got re-applied on top of WebKit's bad handler before any JSC GC signal fired.

Bisect

Bisected between v0.10.4 (good) and v0.10.5 (bad). 16 commits, 4 iterations. First bad commit: 1546d97 — "test(e2e): bundle import via API round-trips assets in local mode".

Despite the test-y name, the commit ships an internal/api/router.go change that skips rbac.InitEnforcer in local mode (nebi-desktop always runs in local mode):

-	if err := rbac.InitEnforcer(db, logger); err != nil {
-		logger.Error("Failed to initialize RBAC", "error", err)
-		panic(err)
+	if !cfg.IsLocalMode() {
+		if err := rbac.InitEnforcer(db, logger); err != nil {
+			logger.Error("Failed to initialize RBAC", "error", err)
+			panic(err)
+		}
 	}

Verification experiments at 1546d97:

Patch                                                  Crash?
Revert just router.go (always call InitEnforcer)       no
Keep skip, replace with time.Sleep(2 * time.Second)    yes

So it's not a nil-deref (RequireAdmin / RequireWorkspaceAccess middleware properly short-circuits in local mode and never touches the enforcer), and it's not pure wall-clock delay. It's specifically the OS-thread churn that InitEnforcer's DB + casbin work induces.

Proposed fix

Two options for nebi. Either alone is sufficient; I'd suggest doing both.

1. Re-apply SA_ONSTACK on a short interval in the desktop binary. Linux-only cgo, ~25 lines (a separate PR will follow). That PR also bumps wails/v2 to v2.12.0, which improves the situation but does not fix it on its own; a Wails-side issue will follow with the same evidence.

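//go:build linux

package main

/*
#include <signal.h>

// fix re-adds SA_ONSTACK to the handler currently installed for signal s,
// leaving the handler itself and its other flags untouched.
static void fix(int s) {
    struct sigaction st;
    if (sigaction(s, NULL, &st) < 0) return;
    if (!(st.sa_flags & SA_ONSTACK)) {
        st.sa_flags |= SA_ONSTACK;
        sigaction(s, &st, NULL);
    }
}

// fix_all covers the synchronous fault signals plus SIGABRT.
static void fix_all(void) {
    fix(SIGSEGV); fix(SIGBUS); fix(SIGFPE); fix(SIGILL); fix(SIGABRT);
}
*/
import "C"
import "time"

// init starts a background goroutine that re-applies SA_ONSTACK every
// 50 ms for the lifetime of the process.
func init() {
    go func() {
        t := time.NewTicker(50 * time.Millisecond)
        defer t.Stop()
        for range t.C { C.fix_all() }
    }()
}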

2. Track and pick up the upstream Wails fix. Wails 2.12's g_idle_add is one-shot; a recurring g_timeout_add or a hook on the WebView's load-changed signal would resolve the underlying race. A Wails-side issue will be filed with the same evidence.

Additional context

  • nebi serve + browser remains a working alternative in the meantime (the embedded HTTP path doesn't involve WebKit at all).
  • The same crash signature appears in Wails issues #2134 and #3965. Wails 2.12.0's changelog says #3965 was fixed, but our empirical evidence shows the fix is insufficient for cases where the Go runtime doesn't churn enough OS threads at startup.
