feat(credits): usage-aware credits — in-session notices, /usage view, dev readout#40011
Conversation
…sage-aware credits) L0 of the usage-aware-credits feature: a dev-only, env-gated tracer that exercises the real header -> CreditsState -> TUI pipe end-to-end behind HERMES_DEV_CREDITS, de-risking the L1/L5 build before the notice policy exists. - agent/credits_tracker.py: CreditsState + parse_credits_headers (headers are strings -> paid_access via == "true", never bool(); retain-last-known; only subscription_micros may be negative; *_usd kept verbatim). - run_agent.py: _capture_credits / get_credits_state / get_credits_spent_micros, session-start baseline latch, + dev-gated "credits" capture log. - agent/chat_completion_helpers.py: capture on the streaming response. - agent/agent_init.py: init _credits_state + _credits_session_start_micros. - tui_gateway/server.py: _get_usage emits dev_credits_spent_micros only when flagged. - ui-tui appChrome.tsx / types.ts: cents delta status segment + "(dev credits)" banner. Off by default; silent for normal users. Validated live against staging (capture log delta matches the TUI segment). Throwaway consumer (readout/log/ banner); credits_tracker + the capture plumbing are the real feature foundation.
Add tests/agent/test_credits_tracker.py with 92 tests covering the 9-state
matrix (healthy, sub_90pct, grant_exhausted, purchased_only, tool_pool_free,
depleted, debt, missing, no_org) plus validation edge cases: version strict==1
with warn-once latch for v>1, bool-string trap (paid_access/tool_pool_gated_off
== "true"/"false", never bool()), half-pair subscription limit treated as
both-absent while parse succeeds, USD regex ^-?\d+\.\d{2}$, non-int micros
→ None, negative non-subscription micros → None, as_of_ms junk → None, zero
limit ZeroDivision guard.
Harden agent/credits_tracker.py to match the spec:
- Add tool_pool_micros/tool_pool_gated_off/from_header fields to CreditsState
- Add depleted property (== not paid_access, never remaining==0)
- Change used_fraction guard to key off subscription_limit_micros (the actual
denominator) not denominator_kind (metadata)
- Replace fail-soft _safe_int with a sentinel-returning variant; full validation
now returns None on any malformed field rather than silently defaulting
- Add module-level warn-once latch for version > 1
- Add USD regex validation; add denominator_kind allow-list check
- Parse x-nous-tool-pool-* prefix headers (not x-nous-credits-tool-pool-*)
…ear_callback + TUI binding (L1) L1 of usage-aware credits: the driver-agnostic notice delivery spine that L4's policy will fire through and L5's TUI render will consume. - agent/credits_tracker.py: AgentNotice dataclass (text/level/kind/ttl_ms/key/id; kind defaults "sticky", kept TTL-expressive for a future config seam). - run_agent.py: AIAgent gains notice_callback + notice_clear_callback slots and _emit_notice / _emit_notice_clear emitters (swallow all callback errors — a notice must never break the agent loop; no-op when unbound). - agent/agent_init.py: thread both callbacks through init_agent. - tui_gateway/server.py: bind both in _agent_cbs → notification.show / notification.clear WS events (snake_case payload, matching the existing gateway-event convention). - ui-tui/src/gatewayTypes.ts: notification.show / notification.clear arms on GatewayEvent. - tests/run_agent/test_notice_spine.py: 15 tests (emitter fire + fail-open + no-op, signature threading, TUI binding payload shape). Messaging push is out of v1 (binds neither callback). CLI binding + the TUI render/ decode land with L4 (firing) and L5 (render) so turn-end flush is wired correctly.
After a fresh header parse, _capture_credits runs evaluate_credits_notices against
the agent's _credits_latch and emits the result — clears first, then shows (so a
recovered depletion clears before the "restored" success lands, and depleted wins
the latest-wins slot). Gated on a bound notice_callback: messaging (no callbacks)
still caches state for /usage but runs no policy. Parse stays fail-open (miss →
keep last-known); the eval/emit path warns on failure rather than swallowing, so a
depletion-notice bug can't vanish silently.
- run_agent.py: _capture_credits split into parse (swallow→miss) + policy (warn);
latch lazy-guarded (object.__new__ safety).
- agent/agent_init.py: init agent._credits_latch = {"active": set(), "seen_below_90": False}.
The TUI now renders the notification.show / notification.clear gateway events the agent emits — a level-colored notice overrides the status/verb slot when not busy. - Notice state machine on turnController (pendingNotice + dedicated noticeTimer + show/clear/applyNotice/flushPendingNotice/clearNoticeState). createGatewayEventHandler decodes the events and delegates. - Render priority busy > notice > status (appChrome StatusRule); notice text rendered verbatim (its glyph comes from the policy), shrinkable so it never clips model│ctx; dev-credits banner + Δ segment preserved. UiState.notice is snake_case (matches wire). - Busy-wins: a notice arriving mid-turn is held and flushed at the THREE turn-end sites (recordMessageComplete / interruptTurn / recordError) — never idle(), which reset() also calls (would leak across sessions); reset() clears instead. - Dedicated noticeTimer (never statusTimer); TTL starts on visibility with an id-guard; latest-wins cancels the prior timer; clear is key-matched (no-op on mismatch); a sticky survives a turn (flush no-ops with no pending); session reset clears (no cross-session leak). - 20 tests (handler/turnController logic incl. R3-C2 timer isolation + render priority).
A genuinely-new Nous session has no inference header yet, so seed credits state from the authoritative GET /api/oauth/account snapshot at session start (in the new-session branch of _restore_or_build_system_prompt — inline, since the on_session_start plugin hook gets no agent reference). The seed runs the shared notice policy, so a session that opens already depleted warns IMMEDIATELY rather than only after the first turn. - Maps the nested account fields (paid_service_access → paid_access; total_usable / subscription / purchased on paid_service_access_info; rollover on subscription), each None-guarded; float dollars → micros via round(d*1e6), *_usd left "" (render formats from micros — never synthesize a verbatim usd from a float). - Magnitudes-only: no monthlyCredits on the endpoint → subscription_limit_* unset → used_fraction None → no warn90 from the seed (% only once a header lands, per D-E). - Provider-guarded to Nous; fail-open (any error leaves _credits_state None, never blocks startup); paid_access unknown ⇒ True (never falsely depleted). - run_agent.py: extracted the warm-path policy/emit block into a shared _emit_credits_notices() so capture and the seed fire notices identically.
… (L6) Add Nous credit dollar magnitudes to /usage (subscription / top-up / total + rollover + renewal + portal CTA), magnitudes-only per v1 (no % until the account endpoint exposes a denominator). Reuses the existing account-usage render machinery via a new pure build_nous_credits_snapshot() that maps a NousPortalAccountInfo to an AccountUsageSnapshot; no nous branch is added to fetch_account_usage (keeps the per-provider boundary intact). CLI /usage also doubles as a depletion-recovery trigger: a force_fresh account fetch, kept in a SEPARATE local so it never clobbers the header-sourced agent._credits_state (which alone carries used_fraction). If paid access recovered while credits.depleted is latched and a notice consumer is bound, it reuses agent._emit_credits_notices() to clear it. Gateway /usage displays magnitudes only — messaging binds no notice consumer, so it performs no recovery emit. Fail-open throughout: any portal hiccup leaves /usage unaffected.
…lpers The dev-flag truthy check was inlined in three places. Replace with the shared utils.is_truthy_value (run_agent.py, tui_gateway/server.py — also drops a redundant inline `import os`) and a hoisted DEV_CREDITS_MODE export in ui-tui/src/config/env.ts (consumed by appChrome, which also stops recomputing the env check on every render). Behaviour-preserving; identical truthy set.
… (L6 review) Adversarial review found the /usage depletion-recovery trigger dead AND broken: the CLI binds no notice_clear_callback, the TUI runs /usage in a separate slash-worker subprocess (its own agent/latch), and the no-clobber rule made it evaluate stale paid_access anyway. Recovery already happens on the next inference (warm path), so the trigger was redundant — remove it and stop the depleted notice over-promising. - cli.py: remove the dead recovery block; bound the /usage portal fetch with a 10s wall-clock timeout (ThreadPoolExecutor) like the per-provider fetch — urllib's per-socket timeout is not a wall-clock guarantee. - agent/credits_tracker.py: reword the depleted CTA to "run /usage for balance" (no false recovery promise; /usage shows fresh magnitudes, sticky clears next turn). - agent/conversation_loop.py: same wall-clock timeout on the cold-start seed fetch so a stalled portal can't hang session startup; tidy its time import.
🔎 Lint report:
|
| Rule | Count |
|---|---|
unresolved-attribute |
28 |
invalid-argument-type |
5 |
unresolved-import |
4 |
invalid-assignment |
2 |
unsupported-operator |
2 |
not-subscriptable |
1 |
First entries
tests/agent/test_credits_tracker.py:498: [unresolved-attribute] unresolved-attribute: Attribute `tool_pool_gated_off` is not defined on `None` in union `CreditsState | None`
tests/agent/test_credits_tracker.py:12: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
agent/credits_tracker.py:607: [invalid-argument-type] invalid-argument-type: Argument is incorrect: Expected `str | None`, found `int | str | Unknown`
tests/gateway/test_notice_rendering.py:60: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/run_agent/test_notice_spine.py:15: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
gateway/run.py:18016: [invalid-assignment] invalid-assignment: Object of type `None` is not assignable to attribute `notice_clear_callback` on type `(Any & ~None) | AIAgent`
tests/agent/test_credits_tracker.py:171: [unresolved-attribute] unresolved-attribute: Attribute `remaining_usd` is not defined on `None` in union `CreditsState | None`
tests/agent/test_credits_tracker.py:494: [unresolved-attribute] unresolved-attribute: Attribute `tool_pool_micros` is not defined on `None` in union `CreditsState | None`
run_agent.py:819: [unresolved-attribute] unresolved-attribute: Object of type `Self@_emit_notice_clear` has no attribute `notice_clear_callback`
tests/agent/test_credits_tracker.py:166: [unresolved-attribute] unresolved-attribute: Attribute `captured_at` is not defined on `None` in union `CreditsState | None`
tests/run_agent/test_notice_spine.py:81: [unresolved-attribute] unresolved-attribute: Unresolved attribute `notice_callback` on type `AIAgent`
tests/agent/test_credits_tracker.py:339: [unresolved-attribute] unresolved-attribute: Attribute `paid_access` is not defined on `None` in union `CreditsState | None`
tests/agent/test_nous_credits_gauge.py:35: [unresolved-attribute] unresolved-attribute: Attribute `credits_remaining` is not defined on `None` in union `NousPortalSubscriptionInfo | None`
agent/credits_tracker.py:607: [invalid-argument-type] invalid-argument-type: Argument is incorrect: Expected `int | None`, found `int | str | Unknown`
tests/agent/test_credits_tracker.py:335: [unresolved-attribute] unresolved-attribute: Attribute `subscription_usd` is not defined on `None` in union `CreditsState | None`
tests/agent/test_credits_tracker.py:261: [unresolved-attribute] unresolved-attribute: Attribute `denominator_kind` is not defined on `None` in union `CreditsState | None`
tests/agent/test_credits_cold_start.py:36: [unsupported-operator] unsupported-operator: Operator `-` is not supported between objects of type `int | float | None` and `float`
tests/agent/test_credits_tracker.py:204: [unresolved-attribute] unresolved-attribute: Attribute `has_data` is not defined on `None` in union `CreditsState | None`
tests/run_agent/test_notice_spine.py:87: [unresolved-attribute] unresolved-attribute: Unresolved attribute `notice_clear_callback` on type `AIAgent`
run_agent.py:2838: [unresolved-attribute] unresolved-attribute: Object of type `Self@get_credits_spent_micros` has no attribute `_credits_session_start_micros`
agent/credits_tracker.py:607: [invalid-argument-type] invalid-argument-type: Argument is incorrect: Expected `bool`, found `int | str | Unknown`
tests/agent/test_credits_tracker.py:331: [unresolved-attribute] unresolved-attribute: Attribute `subscription_micros` is not defined on `None` in union `CreditsState | None`
run_agent.py:811: [unresolved-attribute] unresolved-attribute: Object of type `Self@_emit_notice` has no attribute `notice_callback`
tests/agent/test_credits_tracker.py:313: [unresolved-attribute] unresolved-attribute: Attribute `disabled_reason` is not defined on `None` in union `CreditsState | None`
tests/agent/test_credits_tracker.py:160: [unresolved-attribute] unresolved-attribute: Attribute `from_header` is not defined on `None` in union `CreditsState | None`
... and 17 more
✅ Fixed issues: none
Unchanged: 5120 pre-existing issues carried over.
Diagnostics are surfaced as warnings — this check never fails the build.
|
@daimon-nous can u review this feature? |
Review — feat(credits): usage-aware creditsVerdict: solid. Display/notify-only, fail-open, and it honors every invariant that matters here. No blockers; the open items below are tradeoffs to confirm before landing, not defects. What I verified
Open items to confirm before landing
Nits (non-blocking)
Nice work — the fail-open discipline and the busy-wins/timer-isolation handling in the TUI are the parts that usually go wrong, and both are handled carefully here. |
Throwaway dev scaffolding to exercise the notice pipeline without real spend or Redis seeding. Set HERMES_DEV_CREDITS_FIXTURE to a state name (healthy / sub_90pct / grant_exhausted / depleted / clear) or a file path whose contents name a state (re-read each turn → flip states live for recovery testing). _capture_credits injects the chosen CreditsState instead of parsing real headers and runs the shared notice policy. Deletable with the rest of the HERMES_DEV_CREDITS scaffolding.
The portal /api/oauth/account subscription block now carries monthly_credits (the per-period grant allowance, the % denominator). The consumer parsed monthly_charge but dropped monthly_credits, so /usage stayed magnitudes-only. Capture monthly_credits into NousPortalSubscriptionInfo + _subscription_from_payload. build_nous_credits_snapshot emits a Subscription usage window (real % used, routed through the existing render machinery) when monthly_credits is a finite positive denominator and credits_remaining is finite and <= cap; otherwise it degrades to magnitudes-only (older portals, rollover-over-cap, or non-finite payloads). Guards (adversarial-review-driven): reject non-finite operands (json.loads parses bare NaN/Infinity by default → would render $nan + a false 100% used), reject bools, guard div-by-zero (cap>0), and suppress the gauge when remaining > cap (rollover spanning the period makes the cap a nonsensical denominator → the $X-of-$Y detail would read as a contradiction). Debt (remaining<0) clamps to 100%. Money rule preserved: the ratio + magnitudes are computed from numeric float account fields via display formatting, never by parsing a server *_usd string (there are none on these dataclasses). 13 gauge tests added (tests/agent/test_nous_credits_gauge.py).
/usage runs in a slash-worker subprocess whose resolved inference provider is
often not "nous" even when the user has a Nous account, so gating the Nous
credits block on (provider == "nous") hid it entirely — the account data was
fully available but never rendered.
Gate instead on "a Nous account is logged in": a cheap local auth-state lookup
(get_provider_auth_state('nous') has an access_token) decides whether to attempt
the portal fetch, regardless of which provider inference runs on. In the gateway
the block is also lifted out of the 'if provider:' scope so a Nous-credentialled
user with another (or no) resident inference provider still sees their balance.
Fail-open and the per-fetch wall-clock timeout are preserved.
…slash-worker) In the TUI, /usage runs in a slash-worker subprocess that resumes the session WITHOUT building an agent (self.agent is None), so _show_usage early-returned "(._.) No active agent" before ever reaching the Nous credits block — which is agent-independent (a portal fetch gated on Nous auth-state). Extract the block into _print_nous_credits_block() and run it at the no-agent / no-calls early-returns too (returns True if it printed, so the fallback message only shows when there's genuinely nothing). Verified live against staging: the block + monthly-grant gauge now render in the slash-worker /usage path (previously hidden). The plain CLI REPL + messaging paths are unchanged (they have a live agent).
Replace the lone 90%-used warning with three escalating bands (50 info, 75 warn, 90 warn) shown as ONE status-bar line: it displays the highest band the subscription grant has crossed, replaces the line as usage climbs, steps back down on recovery, and clears below 50%. No stacking, no per-turn churn. Bands live in a tunable CREDITS_USAGE_BANDS list; the policy derives everything from it. Single notice key (credits.usage) with a usage_band latch field so the notice only re-emits when the band actually changes. The crossing gate (seen_below_90) is preserved so a fresh live session that opens mid-range stays quiet until it has been observed below the lowest band (cold-start primes it when it wants an open-high warning). Denominator math unchanged: % = subscription grant burn (cap - grant_remaining)/cap, clamped [0,1]; top-up never moves the %. Migrated test_credits_policy.py to the new key + added TestUsageBands (climb, step-down, recovery-clear, idempotent, inclusive boundaries).
… first-turn) Notices previously only fired inside a conversation turn (first message), so a session that opened already depleted / past a usage band showed nothing at 'ready'. Extract the cold-start seed into a shared seed_credits_at_session_start() and call it (a) in the TUI/desktop agent build right after the notice callback is wired (fires at 'ready', before any message) and (b) as the first-turn fallback in conversation_loop. Idempotent (skips once _credits_state exists) and fail-open. The seed now maps monthly_credits -> subscription_limit_micros + denominator_kind='subscription_cap', so used_fraction is computable at seed time and usage-band warnings (not just depletion) hydrate on open. Primes the crossing latch so a session opening already in a band warns immediately. Degrades to depletion-only when monthly_credits is absent (older portals). Adds test_credits_cold_start.py covering open-at-band, depletion, debt, no-cap degradation, and the shared seed (fires/idempotent/skips-non-nous).
…urfacing agent/account_usage.py: build_nous_credits_snapshot emits a subscription %% gauge when the portal supplies a positive, finite monthly_credits denominator with remaining <= cap (guards reject NaN/Infinity and rollover-over-cap, which would render $nan or a contradictory $X-of-$Y); degrades to magnitudes-only otherwise. Adds shared nous_credits_lines() (auth-gated, wall-clock-bounded portal fetch) so the CLI and TUI /usage render the same block, and _snapshot_from_credits_state() so HERMES_DEV_CREDITS_FIXTURE drives /usage offline too. TUI: session.usage RPC carries credits_lines (agent-independent) and the /usage panel renders them regardless of API-call count or resume state — previously the TUI's separate /usage implementation only showed token counts. Money rule preserved: %% and magnitudes come from numeric float account fields via display formatting, never by parsing a server *_usd string.
The plain CLI agent bound no notice callbacks, so credit notices were TUI-only. Bind notice_callback/notice_clear_callback on the CLI AIAgent; _on_notice renders a single level-colored line above the prompt (error red / warn yellow / success green / info dim) via _cprint, and seed credits at session open so a depletion or usage-band warning shows before the first message — the same hydration the TUI got. _on_notice_clear is a no-op (the REPL prints lines, no persistent slot).
…ge bands The fixture set jumped 10%% -> 90%%; add sub_50pct (uf 0.5 -> band 50 info) and sub_75pct (uf 0.75 -> band 75 warn) so the new escalating bands are exercisable via HERMES_DEV_CREDITS_FIXTURE across all three surfaces (notice, session-open seed, /usage gauge).
…ever) A 50/75/90 usage heads-up was sticky and camped the status bar indefinitely. Clear the visible credits.usage notice when a new turn starts (startMessage), so it shows until your next prompt then yields. The server latch is unchanged, so it won't re-nag at the same band — it only re-shows when the band actually changes (climb) or clears when usage drops below the lowest band. Depletion stays sticky.
…redits_lines() The CLI (_print_nous_credits_block) and the messaging gateway (_handle_usage_command) each re-implemented the auth-gate + portal fetch + render, and both bypassed the dev-fixture short-circuit that only the TUI honored — so /usage ignored HERMES_DEV_CREDITS_FIXTURE on the CLI and in chat. Route both through the shared agent.account_usage.nous_credits_lines() helper: one fetch/render path, one auth gate, and the fixture works on every surface (~60 fewer duplicated lines). The gateway usage test recorded only the last asyncio.to_thread call; /usage now dispatches both the account fetch and the credits fetch, so it records every call and matches the account fetch by its provider arg.
_is_finite_num is now a TypeGuard[float], so the type checker narrows the gauge operands (monthly_credits / credits_remaining) and the magnitudes passed to _fmt_usd through it — no more None-operand warnings on the arithmetic. Add a debug breadcrumb on the nous_credits_lines portal-fetch fail-open so a dead /usage block is diagnosable in agent.log without a dev flag.
…obe, fire-and-forget seed - Prod-leak guard: dev fixtures (HERMES_DEV_CREDITS_FIXTURE) now also require HERMES_DEV_CREDITS, so a stray fixture var can't surface fabricated balances on a real account. Matches the documented run workflow (both vars set together). - Hot-path probe: parse_credits_headers checks for the version sentinel header before allocating a lowercased copy of the response headers — skips that work on every non-Nous API call. Behaviour-identical and still case-insensitive. - Fire-and-forget seed: the real portal fetch in seed_credits_at_session_start now runs in a daemon thread, so a slow/unreachable portal never delays session "ready" (previously blocked up to 10s). The dev-fixture path stays synchronous; the thread re-checks idempotency before hydrating (a live header may land first). - Diagnostics: debug breadcrumbs on the parse and seed fail-open paths so a crashed parser / dead seed is distinguishable from a legitimate no-headers miss. Cold-start tests set HERMES_DEV_CREDITS alongside the fixture to match the gate.
DEV_CREDITS_MODE is read once at module load (config/env), so mutating process.env.HERMES_DEV_CREDITS inside the test couldn't flip it — the dev-banner assertion only passed if the env was exported before vitest started, and failed in a normal run. Move that assertion to a sibling file that mocks config/env with DEV_CREDITS_MODE: true (scoped, no module-reset / React-identity hazard).
…ar-on-prompt - _snapshot_from_credits_state (the offline /usage renderer) had no direct test: lock the gauge math, the verbatim *_usd magnitudes, the depletion line and the fixture marker, plus the no-cap (no gauge) and None-state cases. - turnController.startMessage had no test for clearing the credits.usage notice on the next prompt while leaving credits.depleted sticky.
Bind notice_callback/notice_clear_callback on the per-turn gateway agent
so usage-band / depletion / restored notices reach Telegram/Discord/Slack/
etc. Previously the messaging gateway bound neither callback, so the agent's
_emit_credits_notices early-returned and a chat user crossing a band got
nothing unless they ran /usage manually.
- render_notice_line(): AgentNotice -> single plaintext line (level glyph +
text), plaintext-only so it renders uniformly without per-platform escaping.
Fail-soft on malformed/empty notices.
- Standalone push for every notice (messaging has no persistent status bar):
route through the shared _deliver_platform_notice rail (honors private/
public delivery + thread metadata), scheduled onto the gateway loop via
safe_schedule_threadsafe from the agent's sync worker thread — same pattern
as _status_callback_sync.
- The fired-once latch lives on the cached (reused-in-place) agent and
persists across turns, so a band crosses once -> one push, no per-turn
re-nag. Re-fires only after idle-eviction rebuilds the agent (a reminder).
- Recovery ('Credit access restored') rides the show path (emitted as a
success notice, not a clear). notice_clear_callback is a no-op: a sent
platform message can't be cleanly retracted.
Tests: render glyph/levels/fail-soft + public/private delivery seam through
_deliver_platform_notice + no-adapter no-op.
|
@daimon-nous -- another review pass pls. session search to find ur prev review |
Re-review — second pass (10 new commits since L6)Verdict: ready. Both open items from my last pass are resolved, and the new gateway-delivery path honors every invariant that matters. No blockers. One cosmetic nit carried over. Both prior open items — closed
New gateway delivery (commit 907d8de) — verified
Money rule still held
Prod-leak gate — verified
Tests176 targeted Python tests green ( Notes (non-blocking)
Nice iteration — the fire-and-forget seed and reusing |
render_notice_line prepended a per-level glyph, but the notice policy already
bakes the glyph into the text (and the TUI + CLI render it verbatim) — so every
credit notice over messaging came out doubled ("⚠ ⚠ Credits 90% used",
"⛔ ✕ Credit access paused"). Emit the text verbatim instead; drop the now-dead
level→glyph map.
The render tests fed glyph-less text (and the success case only checked
startswith), so the doubling slipped through. Rework them around the verbatim
contract and add an end-to-end regression that runs real evaluate_credits_notices
output through render_notice_line and asserts the line is returned unchanged.
… dev readout (NousResearch#40011) * feat(tui): HERMES_DEV_CREDITS live-spend dev readout (L0 tracer for usage-aware credits) L0 of the usage-aware-credits feature: a dev-only, env-gated tracer that exercises the real header -> CreditsState -> TUI pipe end-to-end behind HERMES_DEV_CREDITS, de-risking the L1/L5 build before the notice policy exists. - agent/credits_tracker.py: CreditsState + parse_credits_headers (headers are strings -> paid_access via == "true", never bool(); retain-last-known; only subscription_micros may be negative; *_usd kept verbatim). - run_agent.py: _capture_credits / get_credits_state / get_credits_spent_micros, session-start baseline latch, + dev-gated "credits" capture log. - agent/chat_completion_helpers.py: capture on the streaming response. - agent/agent_init.py: init _credits_state + _credits_session_start_micros. - tui_gateway/server.py: _get_usage emits dev_credits_spent_micros only when flagged. - ui-tui appChrome.tsx / types.ts: cents delta status segment + "(dev credits)" banner. Off by default; silent for normal users. Validated live against staging (capture log delta matches the TUI segment). Throwaway consumer (readout/log/ banner); credits_tracker + the capture plumbing are the real feature foundation. * test(credits): lock parser under 9-state matrix + harden validation (L2) Add tests/agent/test_credits_tracker.py with 92 tests covering the 9-state matrix (healthy, sub_90pct, grant_exhausted, purchased_only, tool_pool_free, depleted, debt, missing, no_org) plus validation edge cases: version strict==1 with warn-once latch for v>1, bool-string trap (paid_access/tool_pool_gated_off == "true"/"false", never bool()), half-pair subscription limit treated as both-absent while parse succeeds, USD regex ^-?\d+\.\d{2}$, non-int micros → None, negative non-subscription micros → None, as_of_ms junk → None, zero limit ZeroDivision guard. Harden agent/credits_tracker.py to match the spec: - Add tool_pool_micros/tool_pool_gated_off/from_header fields to CreditsState - Add depleted property (== not paid_access, never remaining==0) - Change used_fraction guard to key off subscription_limit_micros (the actual denominator) not denominator_kind (metadata) - Replace fail-soft _safe_int with a sentinel-returning variant; full validation now returns None on any malformed field rather than silently defaulting - Add module-level warn-once latch for version > 1 - Add USD regex validation; add denominator_kind allow-list check - Parse x-nous-tool-pool-* prefix headers (not x-nous-credits-tool-pool-*) * feat(credits): notice spine — AgentNotice + notice_callback/notice_clear_callback + TUI binding (L1) L1 of usage-aware credits: the driver-agnostic notice delivery spine that L4's policy will fire through and L5's TUI render will consume. - agent/credits_tracker.py: AgentNotice dataclass (text/level/kind/ttl_ms/key/id; kind defaults "sticky", kept TTL-expressive for a future config seam). - run_agent.py: AIAgent gains notice_callback + notice_clear_callback slots and _emit_notice / _emit_notice_clear emitters (swallow all callback errors — a notice must never break the agent loop; no-op when unbound). - agent/agent_init.py: thread both callbacks through init_agent. - tui_gateway/server.py: bind both in _agent_cbs → notification.show / notification.clear WS events (snake_case payload, matching the existing gateway-event convention). - ui-tui/src/gatewayTypes.ts: notification.show / notification.clear arms on GatewayEvent. - tests/run_agent/test_notice_spine.py: 15 tests (emitter fire + fail-open + no-op, signature threading, TUI binding payload shape). Messaging push is out of v1 (binds neither callback). CLI binding + the TUI render/ decode land with L4 (firing) and L5 (render) so turn-end flush is wired correctly. * feat(credits): threshold reconciliation policy + tests (L4.1) * feat(credits): wire threshold policy into capture + latch (L4.2) After a fresh header parse, _capture_credits runs evaluate_credits_notices against the agent's _credits_latch and emits the result — clears first, then shows (so a recovered depletion clears before the "restored" success lands, and depleted wins the latest-wins slot). Gated on a bound notice_callback: messaging (no callbacks) still caches state for /usage but runs no policy. Parse stays fail-open (miss → keep last-known); the eval/emit path warns on failure rather than swallowing, so a depletion-notice bug can't vanish silently. - run_agent.py: _capture_credits split into parse (swallow→miss) + policy (warn); latch lazy-guarded (object.__new__ safety). - agent/agent_init.py: init agent._credits_latch = {"active": set(), "seen_below_90": False}. * feat(tui): render credits notices in the status bar (L5, Strategy B) The TUI now renders the notification.show / notification.clear gateway events the agent emits — a level-colored notice overrides the status/verb slot when not busy. - Notice state machine on turnController (pendingNotice + dedicated noticeTimer + show/clear/applyNotice/flushPendingNotice/clearNoticeState). createGatewayEventHandler decodes the events and delegates. - Render priority busy > notice > status (appChrome StatusRule); notice text rendered verbatim (its glyph comes from the policy), shrinkable so it never clips model│ctx; dev-credits banner + Δ segment preserved. UiState.notice is snake_case (matches wire). - Busy-wins: a notice arriving mid-turn is held and flushed at the THREE turn-end sites (recordMessageComplete / interruptTurn / recordError) — never idle(), which reset() also calls (would leak across sessions); reset() clears instead. - Dedicated noticeTimer (never statusTimer); TTL starts on visibility with an id-guard; latest-wins cancels the prior timer; clear is key-matched (no-op on mismatch); a sticky survives a turn (flush no-ops with no pending); session reset clears (no cross-session leak). - 20 tests (handler/turnController logic incl. R3-C2 timer isolation + render priority). * feat(credits): cold-start seed for new Nous sessions (L3) A genuinely-new Nous session has no inference header yet, so seed credits state from the authoritative GET /api/oauth/account snapshot at session start (in the new-session branch of _restore_or_build_system_prompt — inline, since the on_session_start plugin hook gets no agent reference). The seed runs the shared notice policy, so a session that opens already depleted warns IMMEDIATELY rather than only after the first turn. - Maps the nested account fields (paid_service_access → paid_access; total_usable / subscription / purchased on paid_service_access_info; rollover on subscription), each None-guarded; float dollars → micros via round(d*1e6), *_usd left "" (render formats from micros — never synthesize a verbatim usd from a float). - Magnitudes-only: no monthlyCredits on the endpoint → subscription_limit_* unset → used_fraction None → no warn90 from the seed (% only once a header lands, per D-E). - Provider-guarded to Nous; fail-open (any error leaves _credits_state None, never blocks startup); paid_access unknown ⇒ True (never falsely depleted). - run_agent.py: extracted the warm-path policy/emit block into a shared _emit_credits_notices() so capture and the seed fire notices identically. * feat(credits): /usage Nous credits magnitudes view + recovery trigger (L6) Add Nous credit dollar magnitudes to /usage (subscription / top-up / total + rollover + renewal + portal CTA), magnitudes-only per v1 (no % until the account endpoint exposes a denominator). Reuses the existing account-usage render machinery via a new pure build_nous_credits_snapshot() that maps a NousPortalAccountInfo to an AccountUsageSnapshot; no nous branch is added to fetch_account_usage (keeps the per-provider boundary intact). CLI /usage also doubles as a depletion-recovery trigger: a force_fresh account fetch, kept in a SEPARATE local so it never clobbers the header-sourced agent._credits_state (which alone carries used_fraction). If paid access recovered while credits.depleted is latched and a notice consumer is bound, it reuses agent._emit_credits_notices() to clear it. Gateway /usage displays magnitudes only — messaging binds no notice consumer, so it performs no recovery emit. Fail-open throughout: any portal hiccup leaves /usage unaffected. * refactor(credits): dedupe HERMES_DEV_CREDITS flag parse via shared helpers The dev-flag truthy check was inlined in three places. Replace with the shared utils.is_truthy_value (run_agent.py, tui_gateway/server.py — also drops a redundant inline `import os`) and a hoisted DEV_CREDITS_MODE export in ui-tui/src/config/env.ts (consumed by appChrome, which also stops recomputing the env check on every render). Behaviour-preserving; identical truthy set. * fix(credits): cut dead /usage recovery trigger + bound portal fetches (L6 review) Adversarial review found the /usage depletion-recovery trigger dead AND broken: the CLI binds no notice_clear_callback, the TUI runs /usage in a separate slash-worker subprocess (its own agent/latch), and the no-clobber rule made it evaluate stale paid_access anyway. Recovery already happens on the next inference (warm path), so the trigger was redundant — remove it and stop the depleted notice over-promising. - cli.py: remove the dead recovery block; bound the /usage portal fetch with a 10s wall-clock timeout (ThreadPoolExecutor) like the per-provider fetch — urllib's per-socket timeout is not a wall-clock guarantee. - agent/credits_tracker.py: reword the depleted CTA to "run /usage for balance" (no false recovery promise; /usage shows fresh magnitudes, sticky clears next turn). - agent/conversation_loop.py: same wall-clock timeout on the cold-start seed fetch so a stalled portal can't hang session startup; tidy its time import. * chore(credits): dev notice-state fixtures (HERMES_DEV_CREDITS_FIXTURE) Throwaway dev scaffolding to exercise the notice pipeline without real spend or Redis seeding. Set HERMES_DEV_CREDITS_FIXTURE to a state name (healthy / sub_90pct / grant_exhausted / depleted / clear) or a file path whose contents name a state (re-read each turn → flip states live for recovery testing). _capture_credits injects the chosen CreditsState instead of parsing real headers and runs the shared notice policy. Deletable with the rest of the HERMES_DEV_CREDITS scaffolding. * feat(credits): /usage monthly-grant % gauge The portal /api/oauth/account subscription block now carries monthly_credits (the per-period grant allowance, the % denominator). The consumer parsed monthly_charge but dropped monthly_credits, so /usage stayed magnitudes-only. Capture monthly_credits into NousPortalSubscriptionInfo + _subscription_from_payload. build_nous_credits_snapshot emits a Subscription usage window (real % used, routed through the existing render machinery) when monthly_credits is a finite positive denominator and credits_remaining is finite and <= cap; otherwise it degrades to magnitudes-only (older portals, rollover-over-cap, or non-finite payloads). Guards (adversarial-review-driven): reject non-finite operands (json.loads parses bare NaN/Infinity by default → would render $nan + a false 100% used), reject bools, guard div-by-zero (cap>0), and suppress the gauge when remaining > cap (rollover spanning the period makes the cap a nonsensical denominator → the $X-of-$Y detail would read as a contradiction). Debt (remaining<0) clamps to 100%. Money rule preserved: the ratio + magnitudes are computed from numeric float account fields via display formatting, never by parsing a server *_usd string (there are none on these dataclasses). 13 gauge tests added (tests/agent/test_nous_credits_gauge.py). * fix(credits): show /usage Nous block whenever a Nous account is present /usage runs in a slash-worker subprocess whose resolved inference provider is often not "nous" even when the user has a Nous account, so gating the Nous credits block on (provider == "nous") hid it entirely — the account data was fully available but never rendered. Gate instead on "a Nous account is logged in": a cheap local auth-state lookup (get_provider_auth_state('nous') has an access_token) decides whether to attempt the portal fetch, regardless of which provider inference runs on. In the gateway the block is also lifted out of the 'if provider:' scope so a Nous-credentialled user with another (or no) resident inference provider still sees their balance. Fail-open and the per-fetch wall-clock timeout are preserved. * fix(credits): show /usage Nous block when there's no live agent (TUI slash-worker) In the TUI, /usage runs in a slash-worker subprocess that resumes the session WITHOUT building an agent (self.agent is None), so _show_usage early-returned "(._.) No active agent" before ever reaching the Nous credits block — which is agent-independent (a portal fetch gated on Nous auth-state). Extract the block into _print_nous_credits_block() and run it at the no-agent / no-calls early-returns too (returns True if it printed, so the fallback message only shows when there's genuinely nothing). Verified live against staging: the block + monthly-grant gauge now render in the slash-worker /usage path (previously hidden). The plain CLI REPL + messaging paths are unchanged (they have a live agent). * feat(credits): escalating 50/75/90 usage bands (single status line) Replace the lone 90%-used warning with three escalating bands (50 info, 75 warn, 90 warn) shown as ONE status-bar line: it displays the highest band the subscription grant has crossed, replaces the line as usage climbs, steps back down on recovery, and clears below 50%. No stacking, no per-turn churn. Bands live in a tunable CREDITS_USAGE_BANDS list; the policy derives everything from it. Single notice key (credits.usage) with a usage_band latch field so the notice only re-emits when the band actually changes. The crossing gate (seen_below_90) is preserved so a fresh live session that opens mid-range stays quiet until it has been observed below the lowest band (cold-start primes it when it wants an open-high warning). Denominator math unchanged: % = subscription grant burn (cap - grant_remaining)/cap, clamped [0,1]; top-up never moves the %. Migrated test_credits_policy.py to the new key + added TestUsageBands (climb, step-down, recovery-clear, idempotent, inclusive boundaries). * feat(credits): hydrate notices at session OPEN via shared seed (TUI + first-turn) Notices previously only fired inside a conversation turn (first message), so a session that opened already depleted / past a usage band showed nothing at 'ready'. Extract the cold-start seed into a shared seed_credits_at_session_start() and call it (a) in the TUI/desktop agent build right after the notice callback is wired (fires at 'ready', before any message) and (b) as the first-turn fallback in conversation_loop. Idempotent (skips once _credits_state exists) and fail-open. The seed now maps monthly_credits -> subscription_limit_micros + denominator_kind='subscription_cap', so used_fraction is computable at seed time and usage-band warnings (not just depletion) hydrate on open. Primes the crossing latch so a session opening already in a band warns immediately. Degrades to depletion-only when monthly_credits is absent (older portals). Adds test_credits_cold_start.py covering open-at-band, depletion, debt, no-cap degradation, and the shared seed (fires/idempotent/skips-non-nous). * feat(credits): /usage monthly-grant % gauge + fixture support + TUI surfacing agent/account_usage.py: build_nous_credits_snapshot emits a subscription %% gauge when the portal supplies a positive, finite monthly_credits denominator with remaining <= cap (guards reject NaN/Infinity and rollover-over-cap, which would render $nan or a contradictory $X-of-$Y); degrades to magnitudes-only otherwise. Adds shared nous_credits_lines() (auth-gated, wall-clock-bounded portal fetch) so the CLI and TUI /usage render the same block, and _snapshot_from_credits_state() so HERMES_DEV_CREDITS_FIXTURE drives /usage offline too. TUI: session.usage RPC carries credits_lines (agent-independent) and the /usage panel renders them regardless of API-call count or resume state — previously the TUI's separate /usage implementation only showed token counts. Money rule preserved: %% and magnitudes come from numeric float account fields via display formatting, never by parsing a server *_usd string. * feat(credits): CLI REPL inline notices (parity with TUI) The plain CLI agent bound no notice callbacks, so credit notices were TUI-only. Bind notice_callback/notice_clear_callback on the CLI AIAgent; _on_notice renders a single level-colored line above the prompt (error red / warn yellow / success green / info dim) via _cprint, and seed credits at session open so a depletion or usage-band warning shows before the first message — the same hydration the TUI got. _on_notice_clear is a no-op (the REPL prints lines, no persistent slot). * test(credits): add sub_50pct + sub_75pct dev fixtures for the new usage bands The fixture set jumped 10%% -> 90%%; add sub_50pct (uf 0.5 -> band 50 info) and sub_75pct (uf 0.75 -> band 75 warn) so the new escalating bands are exercisable via HERMES_DEV_CREDITS_FIXTURE across all three surfaces (notice, session-open seed, /usage gauge). * fix(credits): usage-band notice clears on next prompt (not sticky-forever) A 50/75/90 usage heads-up was sticky and camped the status bar indefinitely. Clear the visible credits.usage notice when a new turn starts (startMessage), so it shows until your next prompt then yields. The server latch is unchanged, so it won't re-nag at the same band — it only re-shows when the band actually changes (climb) or clears when usage drops below the lowest band. Depletion stays sticky. * refactor(credits): consolidate the /usage credits block behind nous_credits_lines() The CLI (_print_nous_credits_block) and the messaging gateway (_handle_usage_command) each re-implemented the auth-gate + portal fetch + render, and both bypassed the dev-fixture short-circuit that only the TUI honored — so /usage ignored HERMES_DEV_CREDITS_FIXTURE on the CLI and in chat. Route both through the shared agent.account_usage.nous_credits_lines() helper: one fetch/render path, one auth gate, and the fixture works on every surface (~60 fewer duplicated lines). The gateway usage test recorded only the last asyncio.to_thread call; /usage now dispatches both the account fetch and the credits fetch, so it records every call and matches the account fetch by its provider arg. * fix(credits): keep the /usage gauge type-safe and log its fail-open path _is_finite_num is now a TypeGuard[float], so the type checker narrows the gauge operands (monthly_credits / credits_remaining) and the magnitudes passed to _fmt_usd through it — no more None-operand warnings on the arithmetic. Add a debug breadcrumb on the nous_credits_lines portal-fetch fail-open so a dead /usage block is diagnosable in agent.log without a dev flag. * fix(credits): harden the header tracker — prod-leak gate, hot-path probe, fire-and-forget seed - Prod-leak guard: dev fixtures (HERMES_DEV_CREDITS_FIXTURE) now also require HERMES_DEV_CREDITS, so a stray fixture var can't surface fabricated balances on a real account. Matches the documented run workflow (both vars set together). - Hot-path probe: parse_credits_headers checks for the version sentinel header before allocating a lowercased copy of the response headers — skips that work on every non-Nous API call. Behaviour-identical and still case-insensitive. - Fire-and-forget seed: the real portal fetch in seed_credits_at_session_start now runs in a daemon thread, so a slow/unreachable portal never delays session "ready" (previously blocked up to 10s). The dev-fixture path stays synchronous; the thread re-checks idempotency before hydrating (a live header may land first). - Diagnostics: debug breadcrumbs on the parse and seed fail-open paths so a crashed parser / dead seed is distinguishable from a legitimate no-headers miss. Cold-start tests set HERMES_DEV_CREDITS alongside the fixture to match the gate. * test(tui): fix env-timing in the StatusRule dev-credits assertion DEV_CREDITS_MODE is read once at module load (config/env), so mutating process.env.HERMES_DEV_CREDITS inside the test couldn't flip it — the dev-banner assertion only passed if the env was exported before vitest started, and failed in a normal run. Move that assertion to a sibling file that mocks config/env with DEV_CREDITS_MODE: true (scoped, no module-reset / React-identity hazard). * test(credits): cover the dev-fixture /usage render and usage-band clear-on-prompt - _snapshot_from_credits_state (the offline /usage renderer) had no direct test: lock the gauge math, the verbatim *_usd magnitudes, the depletion line and the fixture marker, plus the no-cap (no gauge) and None-state cases. - turnController.startMessage had no test for clearing the credits.usage notice on the next prompt while leaving credits.depleted sticky. * feat(credits): deliver credit notices over messaging gateways Bind notice_callback/notice_clear_callback on the per-turn gateway agent so usage-band / depletion / restored notices reach Telegram/Discord/Slack/ etc. Previously the messaging gateway bound neither callback, so the agent's _emit_credits_notices early-returned and a chat user crossing a band got nothing unless they ran /usage manually. - render_notice_line(): AgentNotice -> single plaintext line (level glyph + text), plaintext-only so it renders uniformly without per-platform escaping. Fail-soft on malformed/empty notices. - Standalone push for every notice (messaging has no persistent status bar): route through the shared _deliver_platform_notice rail (honors private/ public delivery + thread metadata), scheduled onto the gateway loop via safe_schedule_threadsafe from the agent's sync worker thread — same pattern as _status_callback_sync. - The fired-once latch lives on the cached (reused-in-place) agent and persists across turns, so a band crosses once -> one push, no per-turn re-nag. Re-fires only after idle-eviction rebuilds the agent (a reminder). - Recovery ('Credit access restored') rides the show path (emitted as a success notice, not a clear). notice_clear_callback is a no-op: a sent platform message can't be cleanly retracted. Tests: render glyph/levels/fail-soft + public/private delivery seam through _deliver_platform_notice + no-adapter no-op. * fix(credits): don't double the glyph on messaging notices render_notice_line prepended a per-level glyph, but the notice policy already bakes the glyph into the text (and the TUI + CLI render it verbatim) — so every credit notice over messaging came out doubled ("⚠ ⚠ Credits 90% used", "⛔ ✕ Credit access paused"). Emit the text verbatim instead; drop the now-dead level→glyph map. The render tests fed glyph-less text (and the success case only checked startswith), so the doubling slipped through. Rework them around the verbatim contract and add an end-to-end regression that runs real evaluate_credits_notices output through render_notice_line and asserts the line is returned unchanged.
… dev readout (NousResearch#40011) * feat(tui): HERMES_DEV_CREDITS live-spend dev readout (L0 tracer for usage-aware credits) L0 of the usage-aware-credits feature: a dev-only, env-gated tracer that exercises the real header -> CreditsState -> TUI pipe end-to-end behind HERMES_DEV_CREDITS, de-risking the L1/L5 build before the notice policy exists. - agent/credits_tracker.py: CreditsState + parse_credits_headers (headers are strings -> paid_access via == "true", never bool(); retain-last-known; only subscription_micros may be negative; *_usd kept verbatim). - run_agent.py: _capture_credits / get_credits_state / get_credits_spent_micros, session-start baseline latch, + dev-gated "credits" capture log. - agent/chat_completion_helpers.py: capture on the streaming response. - agent/agent_init.py: init _credits_state + _credits_session_start_micros. - tui_gateway/server.py: _get_usage emits dev_credits_spent_micros only when flagged. - ui-tui appChrome.tsx / types.ts: cents delta status segment + "(dev credits)" banner. Off by default; silent for normal users. Validated live against staging (capture log delta matches the TUI segment). Throwaway consumer (readout/log/ banner); credits_tracker + the capture plumbing are the real feature foundation. * test(credits): lock parser under 9-state matrix + harden validation (L2) Add tests/agent/test_credits_tracker.py with 92 tests covering the 9-state matrix (healthy, sub_90pct, grant_exhausted, purchased_only, tool_pool_free, depleted, debt, missing, no_org) plus validation edge cases: version strict==1 with warn-once latch for v>1, bool-string trap (paid_access/tool_pool_gated_off == "true"/"false", never bool()), half-pair subscription limit treated as both-absent while parse succeeds, USD regex ^-?\d+\.\d{2}$, non-int micros → None, negative non-subscription micros → None, as_of_ms junk → None, zero limit ZeroDivision guard. Harden agent/credits_tracker.py to match the spec: - Add tool_pool_micros/tool_pool_gated_off/from_header fields to CreditsState - Add depleted property (== not paid_access, never remaining==0) - Change used_fraction guard to key off subscription_limit_micros (the actual denominator) not denominator_kind (metadata) - Replace fail-soft _safe_int with a sentinel-returning variant; full validation now returns None on any malformed field rather than silently defaulting - Add module-level warn-once latch for version > 1 - Add USD regex validation; add denominator_kind allow-list check - Parse x-nous-tool-pool-* prefix headers (not x-nous-credits-tool-pool-*) * feat(credits): notice spine — AgentNotice + notice_callback/notice_clear_callback + TUI binding (L1) L1 of usage-aware credits: the driver-agnostic notice delivery spine that L4's policy will fire through and L5's TUI render will consume. - agent/credits_tracker.py: AgentNotice dataclass (text/level/kind/ttl_ms/key/id; kind defaults "sticky", kept TTL-expressive for a future config seam). - run_agent.py: AIAgent gains notice_callback + notice_clear_callback slots and _emit_notice / _emit_notice_clear emitters (swallow all callback errors — a notice must never break the agent loop; no-op when unbound). - agent/agent_init.py: thread both callbacks through init_agent. - tui_gateway/server.py: bind both in _agent_cbs → notification.show / notification.clear WS events (snake_case payload, matching the existing gateway-event convention). - ui-tui/src/gatewayTypes.ts: notification.show / notification.clear arms on GatewayEvent. - tests/run_agent/test_notice_spine.py: 15 tests (emitter fire + fail-open + no-op, signature threading, TUI binding payload shape). Messaging push is out of v1 (binds neither callback). CLI binding + the TUI render/ decode land with L4 (firing) and L5 (render) so turn-end flush is wired correctly. * feat(credits): threshold reconciliation policy + tests (L4.1) * feat(credits): wire threshold policy into capture + latch (L4.2) After a fresh header parse, _capture_credits runs evaluate_credits_notices against the agent's _credits_latch and emits the result — clears first, then shows (so a recovered depletion clears before the "restored" success lands, and depleted wins the latest-wins slot). Gated on a bound notice_callback: messaging (no callbacks) still caches state for /usage but runs no policy. Parse stays fail-open (miss → keep last-known); the eval/emit path warns on failure rather than swallowing, so a depletion-notice bug can't vanish silently. - run_agent.py: _capture_credits split into parse (swallow→miss) + policy (warn); latch lazy-guarded (object.__new__ safety). - agent/agent_init.py: init agent._credits_latch = {"active": set(), "seen_below_90": False}. * feat(tui): render credits notices in the status bar (L5, Strategy B) The TUI now renders the notification.show / notification.clear gateway events the agent emits — a level-colored notice overrides the status/verb slot when not busy. - Notice state machine on turnController (pendingNotice + dedicated noticeTimer + show/clear/applyNotice/flushPendingNotice/clearNoticeState). createGatewayEventHandler decodes the events and delegates. - Render priority busy > notice > status (appChrome StatusRule); notice text rendered verbatim (its glyph comes from the policy), shrinkable so it never clips model│ctx; dev-credits banner + Δ segment preserved. UiState.notice is snake_case (matches wire). - Busy-wins: a notice arriving mid-turn is held and flushed at the THREE turn-end sites (recordMessageComplete / interruptTurn / recordError) — never idle(), which reset() also calls (would leak across sessions); reset() clears instead. - Dedicated noticeTimer (never statusTimer); TTL starts on visibility with an id-guard; latest-wins cancels the prior timer; clear is key-matched (no-op on mismatch); a sticky survives a turn (flush no-ops with no pending); session reset clears (no cross-session leak). - 20 tests (handler/turnController logic incl. R3-C2 timer isolation + render priority). * feat(credits): cold-start seed for new Nous sessions (L3) A genuinely-new Nous session has no inference header yet, so seed credits state from the authoritative GET /api/oauth/account snapshot at session start (in the new-session branch of _restore_or_build_system_prompt — inline, since the on_session_start plugin hook gets no agent reference). The seed runs the shared notice policy, so a session that opens already depleted warns IMMEDIATELY rather than only after the first turn. - Maps the nested account fields (paid_service_access → paid_access; total_usable / subscription / purchased on paid_service_access_info; rollover on subscription), each None-guarded; float dollars → micros via round(d*1e6), *_usd left "" (render formats from micros — never synthesize a verbatim usd from a float). - Magnitudes-only: no monthlyCredits on the endpoint → subscription_limit_* unset → used_fraction None → no warn90 from the seed (% only once a header lands, per D-E). - Provider-guarded to Nous; fail-open (any error leaves _credits_state None, never blocks startup); paid_access unknown ⇒ True (never falsely depleted). - run_agent.py: extracted the warm-path policy/emit block into a shared _emit_credits_notices() so capture and the seed fire notices identically. * feat(credits): /usage Nous credits magnitudes view + recovery trigger (L6) Add Nous credit dollar magnitudes to /usage (subscription / top-up / total + rollover + renewal + portal CTA), magnitudes-only per v1 (no % until the account endpoint exposes a denominator). Reuses the existing account-usage render machinery via a new pure build_nous_credits_snapshot() that maps a NousPortalAccountInfo to an AccountUsageSnapshot; no nous branch is added to fetch_account_usage (keeps the per-provider boundary intact). CLI /usage also doubles as a depletion-recovery trigger: a force_fresh account fetch, kept in a SEPARATE local so it never clobbers the header-sourced agent._credits_state (which alone carries used_fraction). If paid access recovered while credits.depleted is latched and a notice consumer is bound, it reuses agent._emit_credits_notices() to clear it. Gateway /usage displays magnitudes only — messaging binds no notice consumer, so it performs no recovery emit. Fail-open throughout: any portal hiccup leaves /usage unaffected. * refactor(credits): dedupe HERMES_DEV_CREDITS flag parse via shared helpers The dev-flag truthy check was inlined in three places. Replace with the shared utils.is_truthy_value (run_agent.py, tui_gateway/server.py — also drops a redundant inline `import os`) and a hoisted DEV_CREDITS_MODE export in ui-tui/src/config/env.ts (consumed by appChrome, which also stops recomputing the env check on every render). Behaviour-preserving; identical truthy set. * fix(credits): cut dead /usage recovery trigger + bound portal fetches (L6 review) Adversarial review found the /usage depletion-recovery trigger dead AND broken: the CLI binds no notice_clear_callback, the TUI runs /usage in a separate slash-worker subprocess (its own agent/latch), and the no-clobber rule made it evaluate stale paid_access anyway. Recovery already happens on the next inference (warm path), so the trigger was redundant — remove it and stop the depleted notice over-promising. - cli.py: remove the dead recovery block; bound the /usage portal fetch with a 10s wall-clock timeout (ThreadPoolExecutor) like the per-provider fetch — urllib's per-socket timeout is not a wall-clock guarantee. - agent/credits_tracker.py: reword the depleted CTA to "run /usage for balance" (no false recovery promise; /usage shows fresh magnitudes, sticky clears next turn). - agent/conversation_loop.py: same wall-clock timeout on the cold-start seed fetch so a stalled portal can't hang session startup; tidy its time import. * chore(credits): dev notice-state fixtures (HERMES_DEV_CREDITS_FIXTURE) Throwaway dev scaffolding to exercise the notice pipeline without real spend or Redis seeding. Set HERMES_DEV_CREDITS_FIXTURE to a state name (healthy / sub_90pct / grant_exhausted / depleted / clear) or a file path whose contents name a state (re-read each turn → flip states live for recovery testing). _capture_credits injects the chosen CreditsState instead of parsing real headers and runs the shared notice policy. Deletable with the rest of the HERMES_DEV_CREDITS scaffolding. * feat(credits): /usage monthly-grant % gauge The portal /api/oauth/account subscription block now carries monthly_credits (the per-period grant allowance, the % denominator). The consumer parsed monthly_charge but dropped monthly_credits, so /usage stayed magnitudes-only. Capture monthly_credits into NousPortalSubscriptionInfo + _subscription_from_payload. build_nous_credits_snapshot emits a Subscription usage window (real % used, routed through the existing render machinery) when monthly_credits is a finite positive denominator and credits_remaining is finite and <= cap; otherwise it degrades to magnitudes-only (older portals, rollover-over-cap, or non-finite payloads). Guards (adversarial-review-driven): reject non-finite operands (json.loads parses bare NaN/Infinity by default → would render $nan + a false 100% used), reject bools, guard div-by-zero (cap>0), and suppress the gauge when remaining > cap (rollover spanning the period makes the cap a nonsensical denominator → the $X-of-$Y detail would read as a contradiction). Debt (remaining<0) clamps to 100%. Money rule preserved: the ratio + magnitudes are computed from numeric float account fields via display formatting, never by parsing a server *_usd string (there are none on these dataclasses). 13 gauge tests added (tests/agent/test_nous_credits_gauge.py). * fix(credits): show /usage Nous block whenever a Nous account is present /usage runs in a slash-worker subprocess whose resolved inference provider is often not "nous" even when the user has a Nous account, so gating the Nous credits block on (provider == "nous") hid it entirely — the account data was fully available but never rendered. Gate instead on "a Nous account is logged in": a cheap local auth-state lookup (get_provider_auth_state('nous') has an access_token) decides whether to attempt the portal fetch, regardless of which provider inference runs on. In the gateway the block is also lifted out of the 'if provider:' scope so a Nous-credentialled user with another (or no) resident inference provider still sees their balance. Fail-open and the per-fetch wall-clock timeout are preserved. * fix(credits): show /usage Nous block when there's no live agent (TUI slash-worker) In the TUI, /usage runs in a slash-worker subprocess that resumes the session WITHOUT building an agent (self.agent is None), so _show_usage early-returned "(._.) No active agent" before ever reaching the Nous credits block — which is agent-independent (a portal fetch gated on Nous auth-state). Extract the block into _print_nous_credits_block() and run it at the no-agent / no-calls early-returns too (returns True if it printed, so the fallback message only shows when there's genuinely nothing). Verified live against staging: the block + monthly-grant gauge now render in the slash-worker /usage path (previously hidden). The plain CLI REPL + messaging paths are unchanged (they have a live agent). * feat(credits): escalating 50/75/90 usage bands (single status line) Replace the lone 90%-used warning with three escalating bands (50 info, 75 warn, 90 warn) shown as ONE status-bar line: it displays the highest band the subscription grant has crossed, replaces the line as usage climbs, steps back down on recovery, and clears below 50%. No stacking, no per-turn churn. Bands live in a tunable CREDITS_USAGE_BANDS list; the policy derives everything from it. Single notice key (credits.usage) with a usage_band latch field so the notice only re-emits when the band actually changes. The crossing gate (seen_below_90) is preserved so a fresh live session that opens mid-range stays quiet until it has been observed below the lowest band (cold-start primes it when it wants an open-high warning). Denominator math unchanged: % = subscription grant burn (cap - grant_remaining)/cap, clamped [0,1]; top-up never moves the %. Migrated test_credits_policy.py to the new key + added TestUsageBands (climb, step-down, recovery-clear, idempotent, inclusive boundaries). * feat(credits): hydrate notices at session OPEN via shared seed (TUI + first-turn) Notices previously only fired inside a conversation turn (first message), so a session that opened already depleted / past a usage band showed nothing at 'ready'. Extract the cold-start seed into a shared seed_credits_at_session_start() and call it (a) in the TUI/desktop agent build right after the notice callback is wired (fires at 'ready', before any message) and (b) as the first-turn fallback in conversation_loop. Idempotent (skips once _credits_state exists) and fail-open. The seed now maps monthly_credits -> subscription_limit_micros + denominator_kind='subscription_cap', so used_fraction is computable at seed time and usage-band warnings (not just depletion) hydrate on open. Primes the crossing latch so a session opening already in a band warns immediately. Degrades to depletion-only when monthly_credits is absent (older portals). Adds test_credits_cold_start.py covering open-at-band, depletion, debt, no-cap degradation, and the shared seed (fires/idempotent/skips-non-nous). * feat(credits): /usage monthly-grant % gauge + fixture support + TUI surfacing agent/account_usage.py: build_nous_credits_snapshot emits a subscription %% gauge when the portal supplies a positive, finite monthly_credits denominator with remaining <= cap (guards reject NaN/Infinity and rollover-over-cap, which would render $nan or a contradictory $X-of-$Y); degrades to magnitudes-only otherwise. Adds shared nous_credits_lines() (auth-gated, wall-clock-bounded portal fetch) so the CLI and TUI /usage render the same block, and _snapshot_from_credits_state() so HERMES_DEV_CREDITS_FIXTURE drives /usage offline too. TUI: session.usage RPC carries credits_lines (agent-independent) and the /usage panel renders them regardless of API-call count or resume state — previously the TUI's separate /usage implementation only showed token counts. Money rule preserved: %% and magnitudes come from numeric float account fields via display formatting, never by parsing a server *_usd string. * feat(credits): CLI REPL inline notices (parity with TUI) The plain CLI agent bound no notice callbacks, so credit notices were TUI-only. Bind notice_callback/notice_clear_callback on the CLI AIAgent; _on_notice renders a single level-colored line above the prompt (error red / warn yellow / success green / info dim) via _cprint, and seed credits at session open so a depletion or usage-band warning shows before the first message — the same hydration the TUI got. _on_notice_clear is a no-op (the REPL prints lines, no persistent slot). * test(credits): add sub_50pct + sub_75pct dev fixtures for the new usage bands The fixture set jumped 10%% -> 90%%; add sub_50pct (uf 0.5 -> band 50 info) and sub_75pct (uf 0.75 -> band 75 warn) so the new escalating bands are exercisable via HERMES_DEV_CREDITS_FIXTURE across all three surfaces (notice, session-open seed, /usage gauge). * fix(credits): usage-band notice clears on next prompt (not sticky-forever) A 50/75/90 usage heads-up was sticky and camped the status bar indefinitely. Clear the visible credits.usage notice when a new turn starts (startMessage), so it shows until your next prompt then yields. The server latch is unchanged, so it won't re-nag at the same band — it only re-shows when the band actually changes (climb) or clears when usage drops below the lowest band. Depletion stays sticky. * refactor(credits): consolidate the /usage credits block behind nous_credits_lines() The CLI (_print_nous_credits_block) and the messaging gateway (_handle_usage_command) each re-implemented the auth-gate + portal fetch + render, and both bypassed the dev-fixture short-circuit that only the TUI honored — so /usage ignored HERMES_DEV_CREDITS_FIXTURE on the CLI and in chat. Route both through the shared agent.account_usage.nous_credits_lines() helper: one fetch/render path, one auth gate, and the fixture works on every surface (~60 fewer duplicated lines). The gateway usage test recorded only the last asyncio.to_thread call; /usage now dispatches both the account fetch and the credits fetch, so it records every call and matches the account fetch by its provider arg. * fix(credits): keep the /usage gauge type-safe and log its fail-open path _is_finite_num is now a TypeGuard[float], so the type checker narrows the gauge operands (monthly_credits / credits_remaining) and the magnitudes passed to _fmt_usd through it — no more None-operand warnings on the arithmetic. Add a debug breadcrumb on the nous_credits_lines portal-fetch fail-open so a dead /usage block is diagnosable in agent.log without a dev flag. * fix(credits): harden the header tracker — prod-leak gate, hot-path probe, fire-and-forget seed - Prod-leak guard: dev fixtures (HERMES_DEV_CREDITS_FIXTURE) now also require HERMES_DEV_CREDITS, so a stray fixture var can't surface fabricated balances on a real account. Matches the documented run workflow (both vars set together). - Hot-path probe: parse_credits_headers checks for the version sentinel header before allocating a lowercased copy of the response headers — skips that work on every non-Nous API call. Behaviour-identical and still case-insensitive. - Fire-and-forget seed: the real portal fetch in seed_credits_at_session_start now runs in a daemon thread, so a slow/unreachable portal never delays session "ready" (previously blocked up to 10s). The dev-fixture path stays synchronous; the thread re-checks idempotency before hydrating (a live header may land first). - Diagnostics: debug breadcrumbs on the parse and seed fail-open paths so a crashed parser / dead seed is distinguishable from a legitimate no-headers miss. Cold-start tests set HERMES_DEV_CREDITS alongside the fixture to match the gate. * test(tui): fix env-timing in the StatusRule dev-credits assertion DEV_CREDITS_MODE is read once at module load (config/env), so mutating process.env.HERMES_DEV_CREDITS inside the test couldn't flip it — the dev-banner assertion only passed if the env was exported before vitest started, and failed in a normal run. Move that assertion to a sibling file that mocks config/env with DEV_CREDITS_MODE: true (scoped, no module-reset / React-identity hazard). * test(credits): cover the dev-fixture /usage render and usage-band clear-on-prompt - _snapshot_from_credits_state (the offline /usage renderer) had no direct test: lock the gauge math, the verbatim *_usd magnitudes, the depletion line and the fixture marker, plus the no-cap (no gauge) and None-state cases. - turnController.startMessage had no test for clearing the credits.usage notice on the next prompt while leaving credits.depleted sticky. * feat(credits): deliver credit notices over messaging gateways Bind notice_callback/notice_clear_callback on the per-turn gateway agent so usage-band / depletion / restored notices reach Telegram/Discord/Slack/ etc. Previously the messaging gateway bound neither callback, so the agent's _emit_credits_notices early-returned and a chat user crossing a band got nothing unless they ran /usage manually. - render_notice_line(): AgentNotice -> single plaintext line (level glyph + text), plaintext-only so it renders uniformly without per-platform escaping. Fail-soft on malformed/empty notices. - Standalone push for every notice (messaging has no persistent status bar): route through the shared _deliver_platform_notice rail (honors private/ public delivery + thread metadata), scheduled onto the gateway loop via safe_schedule_threadsafe from the agent's sync worker thread — same pattern as _status_callback_sync. - The fired-once latch lives on the cached (reused-in-place) agent and persists across turns, so a band crosses once -> one push, no per-turn re-nag. Re-fires only after idle-eviction rebuilds the agent (a reminder). - Recovery ('Credit access restored') rides the show path (emitted as a success notice, not a clear). notice_clear_callback is a no-op: a sent platform message can't be cleanly retracted. Tests: render glyph/levels/fail-soft + public/private delivery seam through _deliver_platform_notice + no-adapter no-op. * fix(credits): don't double the glyph on messaging notices render_notice_line prepended a per-level glyph, but the notice policy already bakes the glyph into the text (and the TUI + CLI render it verbatim) — so every credit notice over messaging came out doubled ("⚠ ⚠ Credits 90% used", "⛔ ✕ Credit access paused"). Emit the text verbatim instead; drop the now-dead level→glyph map. The render tests fed glyph-less text (and the success case only checked startswith), so the doubling slipped through. Rework them around the verbatim contract and add an end-to-end regression that runs real evaluate_credits_notices output through render_notice_line and asserts the line is returned unchanged.
What this adds
Hermes now keeps an eye on your Nous credit balance while you work, so you find out you're running low before a request gets cut off — not after.
There are three parts:
1. Live warnings. As you spend your monthly credits, a short notice tells you where you stand — in the TUI status bar, and as a one-off message in chat (Telegram / Discord / Slack / WhatsApp):
In the TUI the notice escalates as you get closer to empty and clears itself on recovery. In chat each crossing is a single push — it won't repeat-nag you every turn.
2. A credit balance in
/usage. Run/usageand you'll see a Nous credits section: how much of your monthly subscription is left (with a "% used" gauge), any top-up credits, your total, when it renews, and a link to manage or top up. It works on the CLI, in chat, and in the TUI — and it shows even before you've sent a message and on resumed sessions.3. It's display-only. This never blocks a request, never charges you, and never changes what Hermes does. If the balance can't be fetched for any reason, Hermes simply shows nothing and carries on as normal.
How to run and test it
Once you're on this version (works against production — no special setup):
/usage— you'll see the Nous credits section with your subscription balance and the % gauge. You don't have to be low on credits, or have sent any messages, to see it.Only Nous accounts are affected; if you're on another provider nothing changes.
Quality
Around 220 automated tests cover the balance parsing, the warning thresholds and recovery logic, the status-bar rendering, the messaging delivery path, and the
/usageviews across the CLI, chat, and TUI surfaces.