
[Station][Inference] TUI token counter shows ? after successful ollama-local inference instead of actual token usage #2747

@hulynn

Description

When using the ollama-local provider, the OpenClaw TUI status bar always shows "tokens ?/131k". The used-token count never updates from ?, even after a confirmed successful inference round-trip. Cloud providers (openai, anthropic, nvidia) update the counter correctly after each response.
Environment
Device:        Station
OS:            Ubuntu 24.04.4 LTS
Architecture:  aarch64
Node.js:       v22.22.2
npm:           10.9.7
Docker:        Docker version 29.2.1, build a5c7197
OpenShell CLI: openshell 0.0.36
NemoClaw:      v0.0.31
OpenClaw:      2026.4.24 (cbcfdf6)
Ollama:        latest (installed via https://ollama.com/install.sh)
Model:         qwen2.5:7b
Steps to Reproduce
1. Install NemoClaw and onboard with ollama-local provider (ollama must be running on host)
2. nemoclaw  connect
3. Inside sandbox: openclaw tui
4. Send any prompt, e.g. "say hello in one sentence"
5. Wait for inference to complete (status bar shows "idle")
6. Observe the token count in the status bar
Expected Result
Status bar shows actual token usage after inference completes, e.g.:
  agent main | session main | inference/qwen2.5:7b | tokens 142/131k
Actual Result
Status bar shows ? indefinitely regardless of how many prompts are sent:
  agent main | session main | inference/qwen2.5:7b | tokens ?/131k

The ? never resolves to a number for ollama-local. Inference itself succeeds, as the gateway log confirms:
  [ws] res chat.history 8983ms
  [sandbox] NET:OPEN ALLOWED inference.local:443
  [sandbox] routing proxy inference request endpoint=http://host.openshell.internal:11435/v1 path=/v1/chat/completions
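
For triage, it may help to check whether the response from the /v1/chat/completions endpoint carries a usage object at all. The following is a minimal sketch, not OpenClaw's actual code: it assumes an OpenAI-style response body with a usage field (prompt_tokens, completion_tokens, total_tokens) and a context limit of 131072 tokens. The function names total_tokens and token_segment are hypothetical.

```python
from typing import Any, Optional


def total_tokens(response: dict[str, Any]) -> Optional[int]:
    """Return the total token count from a chat-completions response, or None if absent."""
    usage = response.get("usage")
    if not isinstance(usage, dict):
        return None
    total = usage.get("total_tokens")
    if isinstance(total, int) and not isinstance(total, bool):
        return total
    # Fall back to summing the two parts when only those are reported.
    prompt = usage.get("prompt_tokens")
    completion = usage.get("completion_tokens")
    if isinstance(prompt, int) and isinstance(completion, int):
        return prompt + completion
    return None


def token_segment(used: Optional[int], context_limit: int) -> str:
    """Format a status-bar segment such as 'tokens 142/131k' or 'tokens ?/131k'."""
    shown = "?" if used is None else str(used)
    return f"tokens {shown}/{context_limit // 1000}k"
```

If the response body captured from ollama-local has no usage field, a formatter written this way would print "tokens ?/131k", which matches the observed symptom. If the field is present, the problem is more likely in how the TUI or gateway consumes it.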
Logs
Captured from: nemoclaw my-assistant logs + openclaw tui session
Host: galaxy-sku2-018, NemoClaw v0.0.31, provider: ollama-local, model: qwen2.5:7b

Note: "tokens ?" also appears transiently before the first prompt is sent (expected / separate known state).
This bug is specifically about the count remaining ? after inference completes successfully.

Note: This issue does not affect cloud providers. Related upstream context: an Ollama token count
inconsistency for specific models is tracked in Bug 5667866 (Gemma 3n), but the issue here is
broader: the count stays at ? for all models via ollama-local, rather than being a mismatched number.
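
One hypothesis worth ruling out, offered here as an assumption rather than a confirmed cause: in the OpenAI streaming format, token usage is only sent when the request sets stream_options.include_usage, and it arrives in a final chunk just before "data: [DONE]". Whether OpenClaw streams from ollama-local, and whether the Ollama endpoint honors that option, has not been verified in this report. The sketch below scans captured server-sent-event lines for a usage object; the function name usage_from_stream is hypothetical.

```python
import json
from typing import Iterable, Optional


def usage_from_stream(lines: Iterable[str]) -> Optional[dict]:
    """Return the last non-null `usage` object seen in an SSE chat-completions stream."""
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore malformed chunks rather than failing the scan
        if isinstance(chunk, dict) and isinstance(chunk.get("usage"), dict):
            usage = chunk["usage"]
    return usage
```

Running this over a stream captured from a cloud provider and one captured from ollama-local would show whether only the latter returns None, which would point at the request options or the provider response rather than the TUI rendering.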

Bug Details

Field:        Value
Priority:     Unprioritized
Action:       Dev - Open - To fix
Disposition:  Open issue
Module:       Machine Learning - NemoClaw
Keyword:      NemoClaw, NemoClaw_Agent&Skills, NEMOCLAW_GH_SYNC_APPROVAL, NemoClaw_Inference

[NVB#6130030]

Labels

NV QA (Bugs found by the NVIDIA QA Team)
area: inference (Inference routing, serving, model selection, or outputs)
area: local-models (Local model providers, downloads, launch, or connectivity)
area: providers (Inference provider integrations and provider behavior)
provider: ollama (Ollama local model provider behavior)