ollama proxy token diverges from stored token after re-onboard, causing persistent HTTP 401 on inference #2553

## Summary

When `nemoclaw onboard` is re-run (e.g. to fix a config issue or add messaging channels), the Ollama auth proxy can end up running with a **different token** than what is stored in `~/.nemoclaw/ollama-proxy-token`. Any subsequent rebuild wires the sandbox to use the stored token, while the proxy rejects it — every inference call returns `HTTP 401 Unauthorized`.

## Root Cause

`ensureOllamaAuthProxy()` in `src/lib/onboard-ollama-proxy.ts` checks whether the PID from `ollama-auth-proxy.pid` belongs to a running proxy process, and if so, loads the persisted token and returns:

```ts
if (isOllamaProxyProcess(pid)) {
  ollamaProxyToken = token;  // assumes file token == running token
  return;
}
```

It **never verifies** that the running process was actually started with the persisted token. The two diverge in the following scenario:

1. Full onboard (run 1): proxy starts with **token A**, token A is persisted to file and to the provider credential.
2. Re-onboard (run 2): `startOllamaAuthProxy()` calls `killStaleProxy()`, kills the proxy, generates **token B**, starts proxy with token B. Token B is persisted to file and provider credential.
3. Re-onboard (run 3, e.g. in resume/rebuild mode): inference setup is **skipped** (`[resume] Skipping inference`). `ensureOllamaAuthProxy()` is called instead — it reads **token B** from the file, sees the proxy PID is alive (still running with token B from run 2), and returns. Everything appears consistent.
4. The failure case is different: if run 2 was a full interactive onboard that generated token B, but the proxy from run 1 was not actually killed (e.g. the PID file was stale or `lsof` returned nothing), the old proxy keeps running with **token A** while the file now holds **token B**, and every sandbox request returns 401.

In practice, this happens when multiple onboard runs occur before the sandbox is rebuilt — a common workflow when users are troubleshooting initial setup.
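The divergence can be illustrated with a toy state model. This is only a sketch: `ProxyState`, `onboard`, `ensureWithoutCheck`, and `requestStatus` are hypothetical names, not NemoClaw functions, and it assumes that a failed kill leaves the old process running with its old token and that the resume path still finds a live proxy process.

```typescript
// Toy model of the two tokens involved; none of these names come from the NemoClaw source.
interface ProxyState {
  runningToken: string | null; // token the live proxy process was started with
  fileToken: string | null;    // contents of ~/.nemoclaw/ollama-proxy-token
}

// Full onboard: tries to kill the old proxy, then persists a fresh token.
// Assumption: if the kill fails, the old process survives with its old token.
function onboard(state: ProxyState, newToken: string, killSucceeds: boolean): ProxyState {
  const runningToken =
    state.runningToken === null || killSucceeds ? newToken : state.runningToken;
  return { runningToken, fileToken: newToken };
}

// Mirrors the current check: trust the file token whenever some proxy process is alive.
function ensureWithoutCheck(state: ProxyState): string | null {
  return state.runningToken !== null ? state.fileToken : null;
}

// A request succeeds only if the token it sends matches the running proxy's token.
function requestStatus(state: ProxyState, sentToken: string | null): number {
  return sentToken !== null && sentToken === state.runningToken ? 200 : 401;
}

let state: ProxyState = { runningToken: null, fileToken: null };
state = onboard(state, "token-A", true);  // run 1: proxy starts with token A
state = onboard(state, "token-B", false); // run 2: stale proxy survives, file gets token B
const sandboxToken = ensureWithoutCheck(state); // run 3: resume path
console.log(sandboxToken, requestStatus(state, sandboxToken)); // token-B 401
```

With `killSucceeds` set to `true` in run 2, the same sequence ends with a 200, which matches the consistent case described in step 3.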

## Reproduction Steps

1. Run `nemoclaw onboard` and select Local Ollama.
2. Run `nemoclaw onboard` a second time (re-onboard for any reason — adding Telegram, fixing a setting, etc.), completing the flow.
3. Run `nemoclaw <name> rebuild`.
4. Message your bot → `HTTP 401: Unauthorized`.

You can confirm the divergence by comparing:
```bash
# Token the proxy is running with (Linux; -o picks the oldest match if several proxies are running)
tr '\0' '\n' < /proc/$(pgrep -of ollama-auth-proxy)/environ | grep OLLAMA_PROXY_TOKEN

# Token stored in file
cat ~/.nemoclaw/ollama-proxy-token
```
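The same comparison could be scripted. The sketch below only covers the parsing and comparison step on strings that have already been read from `/proc/<pid>/environ` and the token file; `readEnvVar` and `tokensDiverge` are hypothetical helper names, and it assumes the token is passed to the proxy in the `OLLAMA_PROXY_TOKEN` environment variable, as the shell check above does.

```typescript
// Look up one variable in the NUL-separated KEY=VALUE content of /proc/<pid>/environ.
function readEnvVar(raw: string, key: string): string | undefined {
  for (const entry of raw.split("\0")) {
    const eq = entry.indexOf("=");
    if (eq > 0 && entry.slice(0, eq) === key) return entry.slice(eq + 1);
  }
  return undefined;
}

// True when the running proxy's token is known and differs from the stored one.
function tokensDiverge(environRaw: string, storedTokenFile: string): boolean {
  const running = readEnvVar(environRaw, "OLLAMA_PROXY_TOKEN");
  return running !== undefined && running !== storedTokenFile.trim();
}

const sampleEnviron = "PATH=/usr/bin\0OLLAMA_PROXY_TOKEN=token-A\0HOME=/home/user\0";
console.log(tokensDiverge(sampleEnviron, "token-B\n")); // true
console.log(tokensDiverge(sampleEnviron, "token-A\n")); // false
```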

## Impact

- Every inference call from the sandbox returns `HTTP 401 Unauthorized`.
- The bot appears to receive messages (Telegram bridge works) but always responds with an error.
- No error in `nemoclaw status` — everything looks healthy from the outside.
- The only fix is to manually read the running process's token and sync it everywhere, which requires knowledge of the internal architecture.

## Proposed Fix

Add a lightweight token validation step in `ensureOllamaAuthProxy()`. If the running proxy rejects the stored token, kill it and restart with the stored token:

```ts
function isProxyTokenValid(token: string): boolean {
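  // /v1/models is an authenticated endpoint (unlike /api/tags which is exempt).
  // Note: `curl -f` exits non-zero on any HTTP status >= 400, and curl also fails on
  // connection errors and timeouts, so `false` means "could not confirm", not strictly "401".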
  const result = spawnSync("curl", [
    "-sf", "--max-time", "3",
    "-H", `Authorization: Bearer ${token}`,
    `http://localhost:${OLLAMA_PROXY_PORT}/v1/models`,
  ], { encoding: "utf8" });
  return result.status === 0;
}

function ensureOllamaAuthProxy(): void {
  const token = loadPersistedProxyToken();
  if (!token) return;

  const pid = loadPersistedProxyPid();
  if (isOllamaProxyProcess(pid) && isProxyTokenValid(token)) {
    ollamaProxyToken = token;
    return;
  }

  // Proxy not running, or running with a different token — restart with persisted token.
  killStaleProxy();
  ollamaProxyToken = token;
  spawnOllamaAuthProxy(token);
  sleep(1);
}
```

This is a small, targeted change with no behavior change in the happy path (first-run or reboot recovery). The only added cost is one local `curl` request, bounded by the 3-second `--max-time`, and it is made only when a proxy process is already running.
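One possible refinement, offered as a sketch rather than as part of the proposal above: have `curl` print the HTTP status code so that an explicit 401 can be told apart from a timeout or an upstream error. `buildProbeArgs`, `classifyProbe`, and `ProbeResult` are hypothetical names. The `curl` flags used (`-s`, `-o`, `-w "%{http_code}"`, `--max-time`, `-H`) are standard, but how the proxy responds when Ollama itself is down is an assumption.

```typescript
type ProbeResult = "valid" | "rejected" | "unknown";

// Arguments for a curl probe that discards the body and prints only the HTTP status code.
function buildProbeArgs(token: string, port: number): string[] {
  return [
    "-s", "-o", "/dev/null", "-w", "%{http_code}",
    "--max-time", "3",
    "-H", `Authorization: Bearer ${token}`,
    `http://localhost:${port}/v1/models`,
  ];
}

// Interpret curl's exit status and printed status code.
// Without -f, curl exits 0 for any HTTP response; on connection failure it
// exits non-zero and prints "000".
function classifyProbe(exitStatus: number | null, stdout: string): ProbeResult {
  const code = stdout.trim();
  if (code === "401") return "rejected"; // proxy is up but does not accept this token
  if (exitStatus === 0 && code.length === 3 && code.charAt(0) === "2") return "valid";
  return "unknown"; // timeout, connection refused, 5xx, etc.
}

console.log(classifyProbe(0, "200")); // valid
console.log(classifyProbe(0, "401")); // rejected
console.log(classifyProbe(7, "000")); // unknown
```

The arguments would be passed to `spawnSync("curl", ...)` as in `isProxyTokenValid()`. Whether `"unknown"` should trigger a restart is a judgment call: restarting only on `"rejected"` avoids killing a healthy proxy during a transient Ollama outage.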

Alternatively, `killStaleProxy()` could use `pgrep -f ollama-auth-proxy.js` in addition to the PID file + `lsof` approach, to catch orphaned proxy processes more reliably across all platforms.
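A rough sketch of how the `pgrep` results might be merged with the PID file follows. `collectProxyPids` is a hypothetical helper; it only parses text that `pgrep -f ollama-auth-proxy.js` is assumed to have produced (one PID per line) and does not spawn or kill anything.

```typescript
// Merge the PID from the PID file with PIDs listed by pgrep, dropping duplicates,
// non-numeric lines, and the caller's own PID.
function collectProxyPids(
  pidFromFile: number | null,
  pgrepStdout: string,
  selfPid: number,
): number[] {
  const pids: number[] = [];
  const add = (pid: number): void => {
    if (pid > 0 && pid !== selfPid && pids.indexOf(pid) === -1) pids.push(pid);
  };
  if (pidFromFile !== null) add(pidFromFile);
  for (const line of pgrepStdout.split("\n")) {
    const text = line.trim();
    if (/^\d+$/.test(text)) add(parseInt(text, 10));
  }
  return pids;
}

// PID file says 4242; pgrep also finds an orphan at 5151.
const candidates = collectProxyPids(4242, "4242\n5151\n", 9000);
console.log(candidates.join(",")); // 4242,5151
```

Each candidate should still pass a check such as `isOllamaProxyProcess(pid)` before being killed, since a `pgrep -f` pattern can match unrelated command lines.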

## Environment

- Hardware: NVIDIA DGX Spark (GB10 Grace Blackwell, ARM64, 128GB unified memory)
- OS: Ubuntu (ARM64)
- NemoClaw: v0.0.25
- OpenClaw: v2026.4.9
- Model: nemotron-3-super:120b via Ollama
