
ollama proxy token diverges from stored token after re-onboard, causing persistent HTTP 401 on inference #2553

@adrian-santos

Description

Summary

When nemoclaw onboard is re-run (e.g. to fix a config issue or add messaging channels), the Ollama auth proxy can end up running with a different token than the one stored in ~/.nemoclaw/ollama-proxy-token. Any subsequent rebuild wires the sandbox to the stored token, which the proxy rejects, so every inference call returns HTTP 401 Unauthorized.

Root Cause

ensureOllamaAuthProxy() in src/lib/onboard-ollama-proxy.ts checks whether the PID from ollama-auth-proxy.pid belongs to a running proxy process, and if so, loads the persisted token and returns:

if (isOllamaProxyProcess(pid)) {
  ollamaProxyToken = token;  // assumes file token == running token
  return;
}

It never verifies that the running process was actually started with the persisted token. The two diverge in the following scenario:

  1. Full onboard (run 1): the proxy starts with token A. Token A is persisted to the file and to the provider credential.
  2. Re-onboard (run 2): startOllamaAuthProxy() calls killStaleProxy() to kill the proxy, generates token B, and starts the proxy with token B. Token B is persisted to the file and to the provider credential.
  3. Re-onboard (run 3, e.g. in resume/rebuild mode): inference setup is skipped ([resume] Skipping inference) and ensureOllamaAuthProxy() is called instead. It reads token B from the file, sees that the proxy PID is alive, and returns. If run 2 really did replace the proxy, the proxy is running with token B and everything is consistent.
  4. If, however, run 2 was a full interactive onboard that generated token B but failed to kill the proxy from run 1 (e.g. the PID file was stale or lsof returned nothing), the proxy is still running with token A while the file holds token B, and every sandbox request returns 401.

In practice, this happens when multiple onboard runs occur before the sandbox is rebuilt — a common workflow when users are troubleshooting initial setup.
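The divergence can be illustrated with a toy model. This is only a sketch: ProxyState, fullOnboard and sandboxRequestStatus are hypothetical names that do not exist in NemoClaw, and the model assumes that a surviving old proxy keeps serving requests on the port with its original token.

```typescript
// Toy model of the token divergence. All names are hypothetical.
interface ProxyState {
  runningToken: string | null; // token the live proxy process was started with
  storedToken: string | null;  // contents of ~/.nemoclaw/ollama-proxy-token
}

// Simulates one full onboard run. `killSucceeds` models whether
// killStaleProxy() actually terminated the previous proxy.
function fullOnboard(
  state: ProxyState,
  newToken: string,
  killSucceeds: boolean,
): ProxyState {
  const oldProxySurvives = state.runningToken !== null && !killSucceeds;
  return {
    // Assumption: a surviving proxy keeps the port, so its token stays in effect.
    runningToken: oldProxySurvives ? state.runningToken : newToken,
    // The freshly generated token is always written to the file.
    storedToken: newToken,
  };
}

// The sandbox is wired to the stored token; the proxy accepts only its own.
function sandboxRequestStatus(state: ProxyState): number {
  return state.runningToken === state.storedToken ? 200 : 401;
}

const run1 = fullOnboard({ runningToken: null, storedToken: null }, "A", true);
const run2 = fullOnboard(run1, "B", false); // kill failed: proxy keeps token A
console.log(sandboxRequestStatus(run1), sandboxRequestStatus(run2));
```

In this model a resume-mode run never changes the state, which is why the mismatch persists until the proxy is restarted by hand.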

Reproduction Steps

  1. Run nemoclaw onboard and select Local Ollama.
  2. Run nemoclaw onboard a second time (re-onboard for any reason — adding Telegram, fixing a setting, etc.), completing the flow.
  3. Run nemoclaw <name> rebuild.
  4. Message your bot → HTTP 401: Unauthorized.

You can confirm the divergence by comparing:

# Token the proxy is running with (Linux; head -n 1 guards against multiple matching PIDs)
tr '\0' '\n' < "/proc/$(pgrep -f ollama-auth-proxy | head -n 1)/environ" | grep OLLAMA_PROXY_TOKEN

# Token stored in file
cat ~/.nemoclaw/ollama-proxy-token
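The same comparison could be scripted. The sketch below assumes the proxy receives its token through the OLLAMA_PROXY_TOKEN environment variable, as the grep above implies. parseEnviron and tokensDiverge are hypothetical helpers, not part of NemoClaw. Both take strings, so the caller is expected to read /proc/<pid>/environ and the token file itself.

```typescript
// Parses the NUL-separated KEY=VALUE format used by /proc/<pid>/environ.
function parseEnviron(raw: string): Map<string, string> {
  const env = new Map<string, string>();
  for (const entry of raw.split("\0")) {
    const eq = entry.indexOf("=");
    if (eq <= 0) continue; // skip empty entries and entries without a key
    // Split on the first "=" only, since values may contain "=" themselves.
    env.set(entry.slice(0, eq), entry.slice(eq + 1));
  }
  return env;
}

// Returns true when the running proxy's token differs from the stored one.
// Returns false when the variable is absent, since nothing can be concluded.
function tokensDiverge(environRaw: string, storedTokenFile: string): boolean {
  const running = parseEnviron(environRaw).get("OLLAMA_PROXY_TOKEN");
  return running !== undefined && running !== storedTokenFile.trim();
}
```

The stored value is trimmed because the token file may end with a trailing newline.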

Impact

  • Every inference call from the sandbox returns HTTP 401 Unauthorized.
  • The bot appears to receive messages (Telegram bridge works) but always responds with an error.
  • No error in nemoclaw status — everything looks healthy from the outside.
  • The only fix is to manually read the running process's token and sync it everywhere, which requires knowledge of the internal architecture.

Proposed Fix

Add a lightweight token validation step in ensureOllamaAuthProxy(). If the running proxy rejects the stored token, kill it and restart with the stored token:

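import { spawnSync } from "node:child_process"; // only if not already imported in this module

function isProxyTokenValid(token: string): boolean {
  // /v1/models is an authenticated endpoint (unlike /api/tags which is exempt).
  // -s silences progress output; -f makes curl exit non-zero on HTTP errors such as 401.
  const result = spawnSync("curl", [
    "-sf", "--max-time", "3",
    "-H", `Authorization: Bearer ${token}`,
    `http://localhost:${OLLAMA_PROXY_PORT}/v1/models`,
  ], { encoding: "utf8" });
  // Exit status 0 means curl received a successful response within the timeout.
  return result.status === 0;
}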

function ensureOllamaAuthProxy(): void {
  const token = loadPersistedProxyToken();
  if (!token) return;

  const pid = loadPersistedProxyPid();
  if (isOllamaProxyProcess(pid) && isProxyTokenValid(token)) {
    ollamaProxyToken = token;
    return;
  }

  // Proxy not running, or running with a different token — restart with persisted token.
  killStaleProxy();
  ollamaProxyToken = token;
  spawnOllamaAuthProxy(token);
  sleep(1);
}

This is a small, targeted change with no behavior change in the happy path (first run or reboot recovery). It adds one extra curl call, bounded by the 3-second --max-time, only when the proxy is already running, which is negligible.
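One possible refinement, offered as a sketch rather than as part of the proposed patch: curl -f exits non-zero for any HTTP status of 400 or above and for connection failures, so isProxyTokenValid() cannot distinguish a rejected token from, say, an unreachable upstream. If curl were instead invoked with -o /dev/null -w "%{http_code}", the status code could be classified explicitly. ProbeResult and classifyProbe are hypothetical names.

```typescript
type ProbeResult = "valid" | "token-mismatch" | "inconclusive";

// Classifies the text printed by curl's -w "%{http_code}".
// curl prints "000" when no HTTP response was received at all.
function classifyProbe(httpCodeText: string): ProbeResult {
  const code = Number.parseInt(httpCodeText.trim(), 10);
  if (code === 401) return "token-mismatch";        // proxy rejected the stored token
  if (code >= 200 && code < 300) return "valid";    // proxy accepted the stored token
  return "inconclusive";                            // timeout, 5xx, unparsable output, etc.
}
```

With this split, only "token-mismatch" would need to trigger a kill and restart. How to treat "inconclusive" is a design choice left open here.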

Alternatively, killStaleProxy() could use pgrep -f ollama-auth-proxy.js in addition to the PID file + lsof approach, to catch orphaned proxy processes more reliably across all platforms.
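A minimal sketch of how the pgrep output might be consumed, assuming pgrep -f prints one PID per line. parsePgrepOutput and findOrphanedProxyPids are hypothetical helpers, and how killStaleProxy() would call them is not specified by this issue.

```typescript
// Parses `pgrep -f <pattern>` output (one PID per line) into numeric PIDs.
function parsePgrepOutput(stdout: string): number[] {
  return stdout
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => /^\d+$/.test(line)) // drop blank or malformed lines
    .map((line) => Number.parseInt(line, 10));
}

// Returns PIDs that match the proxy script but are not the one
// recorded in the PID file, i.e. candidates for orphaned proxies.
function findOrphanedProxyPids(
  pgrepStdout: string,
  pidFromFile: number | null,
): number[] {
  return parsePgrepOutput(pgrepStdout).filter((pid) => pid !== pidFromFile);
}
```

For example, findOrphanedProxyPids("1234\n5678\n", 5678) yields [1234], the process that the PID file no longer tracks.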

Environment

  • Hardware: NVIDIA DGX Spark (GB10 Grace Blackwell, ARM64, 128GB unified memory)
  • OS: Ubuntu (ARM64)
  • NemoClaw: v0.0.25
  • OpenClaw: v2026.4.9
  • Model: nemotron-3-super:120b via Ollama

Metadata

Assignees

No one assigned

    Labels

    04-25-regression (Issues raised from the Apr 25 weekend regression analysis)
    area: e2e (End-to-end tests, nightly failures, or validation infrastructure)
