feat(#187 AC#6 + AC#8): Smart/Smarter mode toggle + zero-cost helper #253

Merged
jayzalowitz merged 3 commits into main from jayzalowitz/issue-187-smart-smarter-toggle
May 12, 2026

Conversation

@jayzalowitz

Owner

Summary

Closes the user-visible side of the embedded LLM story. Two pieces:

  • AC#6: a two-pill Smart/Smarter mode toggle at the top of Settings → AI brain. Active mode is computed from the user's provider chain; clicking the inactive pill reorders priorities and auto-saves.
  • AC#8: an estimateLlmCostCents() helper in @skytwin/llm-client that returns 0 for embedded/ollama and a per-provider rate for hosted APIs. The future spend-recording call site can drop this in without an embedded-special-case branch.

Together these take #187 from 5/8 → 7/8 closed. Remaining: AC#1 bundling (distribution work paired with #188) and AC#4 (Piper TTS, blocked on piper on PATH).

AC#6 — Smart/Smarter mode toggle

detectAIMode(chain) returns 'smart' | 'smarter' | 'none' based on the top-priority enabled provider:

  • embedded at top → Smart
  • anthropic / openai / google / ollama at top → Smarter
  • no enabled providers → none

Click the inactive pill → priorities reorder + auto-save:

  • Smart click: applySmartMode(chain) promotes embedded to priority 0 (adds a fresh embedded entry with model: 'auto' if missing, re-enables a disabled one).
  • Smarter click: applySmarterMode(chain) promotes the first hosted/Ollama entry to priority 0. Returns null if no candidate exists; in that case the action routes to switch-to-smarter-blocked which focuses the "+ Add a provider…" dropdown so the user's eye is drawn to the next step instead of failing silently.

After the save round-trip, renderSettings re-runs so the pill and the provider chain agree on the persisted state (handles cases like server-side normalization adding fields the optimistic copy didn't have).

Pure helpers exported from apps/web/public/js/pages/settings.js so the pill, the action handler, and any future audit route all agree on one definition. Same module-level export function style other helpers in this file use.
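
To make the reorder semantics concrete, here is a condensed sketch of the three helpers run against a sample chain. The bodies are simplified from the versions in this PR (the fresh embedded entry carries only the fields needed here, and `promote` is a helper introduced just for this sketch), and the sample chain and its model name are made up for illustration.

```javascript
// Providers that count as "Smarter" in this PR (hosted APIs plus Ollama).
const SMARTER_PROVIDERS = new Set(['anthropic', 'openai', 'google', 'ollama']);

const byPriority = (a, b) => (a.priority ?? 0) - (b.priority ?? 0);

// Mode is decided by the top-priority *enabled* provider.
function detectAIMode(chain) {
  const top = chain.filter((p) => p.enabled !== false).sort(byPriority)[0];
  if (!top) return 'none';
  if (top.provider === 'embedded') return 'smart';
  return SMARTER_PROVIDERS.has(top.provider) ? 'smarter' : 'none';
}

// Sketch-only helper: move `target` to priority 0 and renumber the rest
// in their existing relative order.
function promote(chain, target) {
  const others = chain.filter((p) => p !== target).sort(byPriority);
  return [target, ...others].map((p, i) => ({ ...p, priority: i }));
}

function applySmartMode(chain) {
  const next = chain.map((p) => ({ ...p })); // never mutate the input
  let embedded = next.find((p) => p.provider === 'embedded');
  if (embedded) {
    embedded.enabled = true; // re-enable a disabled entry
  } else {
    embedded = { provider: 'embedded', model: 'auto', priority: 0, enabled: true };
    next.push(embedded);
  }
  return promote(next, embedded);
}

function applySmarterMode(chain) {
  const next = chain.map((p) => ({ ...p })).sort(byPriority);
  const target = next.find((p) => SMARTER_PROVIDERS.has(p.provider));
  if (!target) return null; // caller routes to switch-to-smarter-blocked
  target.enabled = true;
  return promote(next, target);
}

// Hypothetical chain: Anthropic on top, a disabled embedded entry below it.
const chain = [
  { provider: 'anthropic', model: 'example-model', priority: 0, enabled: true },
  { provider: 'embedded', model: 'auto', priority: 1, enabled: false },
];

const smart = applySmartMode(chain);
console.log(detectAIMode(chain), '->', detectAIMode(smart)); // smarter -> smart
console.log(smart.map((p) => `${p.priority}:${p.provider}`).join(' ')); // 0:embedded 1:anthropic
console.log(applySmarterMode([{ provider: 'embedded', priority: 0, enabled: true }])); // null
```

Because both apply helpers copy every entry before touching it, the input `chain` still has its embedded entry disabled after the call, which is what lets the caller keep the old chain around if the save fails.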

Screenshot (Smart active, Anthropic configured as fallback)

✓ Smart (free, on-device) highlighted; Smarter (paid API) → is the call-to-action; helper text under each pill explains what they do.

AC#8 — estimateLlmCostCents() helper

packages/llm-client/src/cost.ts exports:

estimateLlmCostCents(provider: AIProviderName, tokensIn?: number, tokensOut?: number): number
isZeroCostProvider(provider: AIProviderName): boolean

Rate table is keyed by provider with deci-cents per million tokens — list-price of the cheapest model we expose in PROVIDER_MODELS for each family. Local-runtime providers (embedded, ollama) carry zero per-token cost; that's the load-bearing piece of AC#8.

Rounding is ceil everywhere so spend-cap enforcement stays conservative — the failure direction is "approval required," never "silently past the cap."
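
A small sketch of why the rounding direction matters for cap enforcement. The per-call cost, call count, and cap below are illustrative numbers, not values from this PR; the point is only that per-call floor rounding can hide real spend while per-call ceil rounding over-reports it.

```javascript
// Illustrative numbers only: 1,000 calls that each cost a quarter of a cent.
const exactCentsPerCall = 0.25;
const calls = 1000;
const capCents = 200; // hypothetical spend cap

const exactTotal = exactCentsPerCall * calls;               // true spend: 250
const flooredTotal = Math.floor(exactCentsPerCall) * calls; // each call records 0
const ceiledTotal = Math.ceil(exactCentsPerCall) * calls;   // each call records 1

console.log({ exactTotal, flooredTotal, ceiledTotal });
console.log('floor trips the cap:', flooredTotal >= capCents); // false, although real spend is past it
console.log('ceil trips the cap:', ceiledTotal >= capCents);   // true, earlier than strictly needed
```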

No call site wired yet. The current spend-recording path (spendRepository.create) isn't called from any LLM code path today, so AC#8 is trivially satisfied at runtime. This PR adds the helper as a single source of truth so when LLM-cost recording does land (separate issue), it can compute estimateLlmCostCents(response.provider, tokensIn, tokensOut) and trust that local-runtime calls record zero.

Test plan

  • 10 new vitest cases for cost.ts:
    • embedded returns 0 for any token volume (incl. Number.MAX_SAFE_INTEGER).
    • ollama returns 0.
    • Hosted providers compute expected integer cents with round-up.
    • Pure-function determinism check.
    • isZeroCostProvider distinguishes local-runtime from hosted.
  • 16 cases smoke-tested for the JS helpers via Node ESM import (detectAIMode, applySmartMode, applySmarterMode across empty/embedded-top/anthropic-top/ollama-top/all-disabled/lower-priority-embedded/skip-disabled scenarios + null-return when no candidate).
  • Toggle visually verified in Chrome at http://localhost:3201/#/settings across three scenarios:
    • Smart active (embedded at priority 0): Smart pill highlighted, Smarter pill is the CTA.
    • Smarter active (anthropic at priority 0): Smarter pill highlighted, Smart pill is the CTA.
    • No paid provider in chain: Smart pill clickable, Smarter pill routes to switch-to-smarter-blocked.
  • pnpm --filter @skytwin/llm-client build clean.
  • pnpm --filter @skytwin/llm-client test — 136/136 green.

Notes for reviewers

  • The mode toggle is a layer over the existing provider chain, not a replacement. Users who prefer the explicit chain editor still have it; the toggle is a single-click shortcut for the common cases.
  • applySmartMode adding an embedded entry with model: 'auto' mirrors what embedded-llm-card.js uses for fresh installs — the runtime resolves the first GGUF in the detected modelDir.
  • Cost rates are list prices reviewed at commit time. If pricing shifts and the table goes stale, the failure mode is over-estimating the bill (cap fires earlier) — never under-estimating. The unit test enforces embedded === 0 AND ollama === 0 as an invariant.
  • No JS test infrastructure exists in apps/web, so the helpers were smoke-tested via Node ESM import. If/when Playwright lands, the toggle's three-scenario behavior deserves an E2E.

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 11, 2026 11:34

Copilot AI left a comment

Pull request overview

Implements two user-facing pieces of the embedded-LLM rollout: (1) a Smart/Smarter mode toggle in the web Settings “AI brain” section that reorders the provider chain and auto-saves, and (2) a shared @skytwin/llm-client helper to estimate per-call LLM cost in integer cents (including a zero-cost invariant for local providers).

Changes:

  • Add estimateLlmCostCents() / isZeroCostProvider() with vitest coverage in @skytwin/llm-client.
  • Add Smart/Smarter mode UI + pure chain-manipulation helpers to apps/web Settings, including a blocked Smarter path that focuses the “Add a provider” control.
  • Document both features in CHANGELOG.md.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| packages/llm-client/src/index.ts | Re-export cost helpers from the package entrypoint. |
| packages/llm-client/src/cost.ts | New cost estimation + zero-cost provider helper (rate table + rounding). |
| packages/llm-client/src/__tests__/cost.test.ts | New vitest suite covering cost estimation and zero-cost invariants. |
| apps/web/public/js/pages/settings.js | Adds Smart/Smarter toggle UI, helper functions, and auto-save mode switching. |
| CHANGELOG.md | Adds an unreleased entry describing the new toggle and cost helper. |

Comment thread packages/llm-client/src/cost.ts Outdated
Comment on lines +26 to +37
const RATE_DECICENTS_PER_M_TOKENS: Record<AIProviderName, { input: number; output: number }> = {
  // Anthropic — based on Claude 3.5 Haiku list price ($0.80/$4.00 per
  // 1M). Deci-cents/M: input 8, output 40.
  anthropic: { input: 8, output: 40 },
  // OpenAI — based on GPT-4o-mini list price ($0.15/$0.60 per 1M).
  // Deci-cents/M: input 1.5 → rounded up to 2 (we use integers and
  // round UP everywhere so the cap stays conservative), output 6.
  openai: { input: 2, output: 6 },
  // Google — based on Gemini 1.5 Flash list price ($0.075/$0.30 per
  // 1M for prompts <128k tokens). Deci-cents/M: input 1 (rounded up
  // from 0.75), output 3.
  google: { input: 1, output: 3 },
Comment on lines +27 to +46
    // 1M input + 1M output at 8/40 deci-cents per million → 4.8¢ → 5¢.
    expect(estimateLlmCostCents('anthropic', 1_000_000, 1_000_000)).toBe(5);
  });

  it('estimates openai cost in integer cents', () => {
    // 1M output at 6 deci-cents per million → 0.6¢ → 1¢ (rounded up).
    expect(estimateLlmCostCents('openai', 0, 1_000_000)).toBe(1);
  });

  it('estimates google cost in integer cents', () => {
    // 1M output at 3 deci-cents per million → 0.3¢ → 1¢ (rounded up).
    expect(estimateLlmCostCents('google', 0, 1_000_000)).toBe(1);
  });

  it('rounds up so the cap-enforcement direction is always safe', () => {
    // Anthropic at small token counts (10k input, 10k output) →
    // (8*10_000 + 40*10_000) / 1_000_000 = 0.48 deci-cents → ceil = 1 →
    // ceil(1/10) = 1¢. The exact float would be 0.048¢; rounding up
    // means cap checks see 1¢ instead of zeroing out tiny usage.
    expect(estimateLlmCostCents('anthropic', 10_000, 10_000)).toBe(1);
Comment on lines +1516 to +1523
await saveAIProviders(userId, _aiChain.map((p, i) => ({
  provider: p.provider,
  apiKey: p.apiKey || '',
  model: p.model,
  baseUrl: p.baseUrl,
  priority: i,
  enabled: p.enabled !== false,
})));
Comment on lines +1510 to +1535
// Re-render the pill + provider chain optimistically so the click
// produces an immediate visual change while the save round-trips.
document.getElementById('ai-mode-toggle').innerHTML = renderModeToggle(_aiChain);
document.getElementById('ai-provider-chain').innerHTML = renderProviderChain(_aiChain);

try {
  await saveAIProviders(userId, _aiChain.map((p, i) => ({
    provider: p.provider,
    apiKey: p.apiKey || '',
    model: p.model,
    baseUrl: p.baseUrl,
    priority: i,
    enabled: p.enabled !== false,
  })));
  // Re-fetch from the server so the pill reflects the persisted state
  // (handles edge cases like an existing-but-disabled embedded entry
  // that was re-enabled by applySmartMode, where the server response
  // may carry extra fields not in our optimistic copy).
  const { renderSettings } = await import('./settings.js');
  await renderSettings(document.getElementById('page-content'), userId);
} catch (err) {
  document.getElementById('page-content').insertAdjacentHTML(
    'afterbegin',
    `<div class="error-banner">Failed to switch mode: ${escapeHtml(err.message)}</div>`,
  );
}
Comment on lines +1224 to +1357
// #187 AC#6: providers that count as "Smarter" — i.e. external paid APIs
// the user is choosing to delegate the harder thinking to. `ollama` lives
// on a third rail: it's local like `embedded` but the user installed it
// themselves, so we treat it as Smarter too (the operator chose it
// deliberately and may have a beefier model than the embedded default).
const SMARTER_PROVIDERS = new Set(['anthropic', 'openai', 'google', 'ollama']);

/**
 * Determine the user's current AI mode from their provider chain.
 *
 *   'smart'   — top enabled provider is `embedded` (Smart mode default
 *               per #187 AC#6).
 *   'smarter' — top enabled provider is hosted / Ollama (BYO API path).
 *   'none'    — no enabled providers; the LlmClient will return null and
 *               callers fall back to local AI + built-in rules.
 *
 * Pure helper so the mode pill, the action handler, and any future audit
 * route all agree on one definition.
 */
export function detectAIMode(chain) {
  const enabled = chain.filter((p) => p.enabled !== false);
  if (enabled.length === 0) return 'none';
  const top = enabled.slice().sort((a, b) => (a.priority ?? 0) - (b.priority ?? 0))[0];
  if (!top) return 'none';
  if (top.provider === 'embedded') return 'smart';
  if (SMARTER_PROVIDERS.has(top.provider)) return 'smarter';
  return 'none';
}

/**
 * Reorder the chain so `embedded` is the top-priority enabled provider.
 * Adds an `embedded` entry if one doesn't exist yet so first-time-Smart
 * users get a working configuration in one click. Returns the new chain
 * (does not mutate the input).
 *
 * The embedded entry uses `'auto'` as the model so the runtime resolves
 * the first GGUF in the detected modelDir — matching the convention
 * `apps/web/public/js/components/embedded-llm-card.js` uses for fresh
 * installs.
 */
export function applySmartMode(chain) {
  const next = chain.map((p) => ({ ...p }));
  let embedded = next.find((p) => p.provider === 'embedded');
  if (!embedded) {
    embedded = {
      provider: 'embedded',
      model: 'auto',
      apiKey: '',
      baseUrl: undefined,
      priority: 0,
      enabled: true,
      hasApiKey: false,
      apiKeyPreview: '',
    };
    next.push(embedded);
  } else {
    embedded.enabled = true;
  }
  // Rebuild priorities so embedded is at 0 and the rest preserve their
  // relative order. This is the contract the API expects (priorities are
  // unique sequential integers).
  const others = next.filter((p) => p !== embedded);
  others.sort((a, b) => (a.priority ?? 0) - (b.priority ?? 0));
  return [embedded, ...others].map((p, i) => ({ ...p, priority: i }));
}

/**
 * Reorder the chain so the first non-embedded enabled provider becomes
 * top-priority. Returns null if there's nothing to switch to (caller
 * should surface a "configure a paid provider first" message).
 */
export function applySmarterMode(chain) {
  const next = chain.map((p) => ({ ...p }));
  next.sort((a, b) => (a.priority ?? 0) - (b.priority ?? 0));
  const smarterIdx = next.findIndex((p) => SMARTER_PROVIDERS.has(p.provider));
  if (smarterIdx === -1) return null;
  const target = next[smarterIdx];
  target.enabled = true;
  const others = next.filter((p) => p !== target);
  return [target, ...others].map((p, i) => ({ ...p, priority: i }));
}

// In-memory state for the current chain being edited
let _aiChain = [];

/**
 * #187 AC#6: render the Smart / Smarter mode pill above the provider
 * chain. Active mode is highlighted; clicking the inactive pill reorders
 * priorities and auto-saves.
 *
 * Disabled states (rendered as helper text under the inactive pill):
 *   - Switch-to-Smarter is disabled when no hosted/Ollama provider exists
 *     in the chain (we don't auto-add one because the user has to supply
 *     an API key).
 *   - Switch-to-Smart is always available — if no embedded entry exists
 *     yet, `applySmartMode` adds one with `model: 'auto'` so the runtime
 *     picks up the first GGUF in the detected model directory.
 */
function renderModeToggle(providers) {
  const mode = detectAIMode(providers);
  const hasSmarterCandidate = providers.some((p) => SMARTER_PROVIDERS.has(p.provider));

  const pill = (label, isActive, action, helperText) => `
    <div style="flex: 1; min-width: 0;">
      <button class="btn ${isActive ? 'btn-primary' : 'btn-outline'} btn-sm"
              style="width: 100%; padding: 0.5rem 0.75rem; font-size: 0.85rem;"
              data-action="${action}"
              ${isActive ? 'disabled' : ''}>
        ${isActive ? '✓ ' : ''}${label}${isActive ? '' : ' →'}
      </button>
      ${helperText ? `<div style="font-size: 0.7rem; color: var(--text-dim); margin-top: 0.25rem;">${helperText}</div>` : ''}
    </div>
  `;

  return `
    <div style="display: flex; gap: 0.5rem; margin-bottom: 0.75rem;">
      ${pill(
        'Smart (free, on-device)',
        mode === 'smart',
        'switch-to-smart',
        mode === 'smart'
          ? 'Embedded model is your top choice.'
          : 'No API costs, runs offline.',
      )}
      ${pill(
        'Smarter (paid API)',
        mode === 'smarter',
        hasSmarterCandidate ? 'switch-to-smarter' : 'switch-to-smarter-blocked',
        mode === 'smarter'
          ? 'Your paid provider is the top choice.'
          : hasSmarterCandidate
            ? 'Sharper reasoning on tricky calls.'
            : 'Add a paid provider below first.',
      )}
Comment thread CHANGELOG.md
Comment on lines +3 to +6
## [unreleased] — Smart / Smarter mode toggle + zero-cost helper (#187 AC#6 + AC#8)

Two pieces close out the user-visible side of the embedded LLM story:

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Comment thread packages/llm-client/src/cost.ts Outdated
Comment on lines +28 to +37
// 1M). Deci-cents/M: input 8, output 40.
anthropic: { input: 8, output: 40 },
// OpenAI — based on GPT-4o-mini list price ($0.15/$0.60 per 1M).
// Deci-cents/M: input 1.5 → rounded up to 2 (we use integers and
// round UP everywhere so the cap stays conservative), output 6.
openai: { input: 2, output: 6 },
// Google — based on Gemini 1.5 Flash list price ($0.075/$0.30 per
// 1M for prompts <128k tokens). Deci-cents/M: input 1 (rounded up
// from 0.75), output 3.
google: { input: 1, output: 3 },
Comment thread packages/llm-client/src/cost.ts Outdated
Comment on lines +62 to +64
const rate = RATE_DECICENTS_PER_M_TOKENS[provider];
if (!rate) return 0;
// Deci-cents per million tokens × tokens / 1M = deci-cents.
Comment thread packages/llm-client/src/cost.ts Outdated
Comment on lines +64 to +69
// Deci-cents per million tokens × tokens / 1M = deci-cents.
// Cents = ceil(deci-cents / 10). Integer math throughout.
const deciCents = (rate.input * tokensIn + rate.output * tokensOut);
const tokenScale = 1_000_000;
const scaledDeciCents = Math.ceil(deciCents / tokenScale);
return Math.ceil(scaledDeciCents / 10);
Comment on lines +1290 to +1302
/**
 * Reorder the chain so the first non-embedded enabled provider becomes
 * top-priority. Returns null if there's nothing to switch to (caller
 * should surface a "configure a paid provider first" message).
 */
export function applySmarterMode(chain) {
  const next = chain.map((p) => ({ ...p }));
  next.sort((a, b) => (a.priority ?? 0) - (b.priority ?? 0));
  const smarterIdx = next.findIndex((p) => SMARTER_PROVIDERS.has(p.provider));
  if (smarterIdx === -1) return null;
  const target = next[smarterIdx];
  target.enabled = true;
  const others = next.filter((p) => p !== target);
Comment thread apps/web/public/js/pages/settings.js Outdated
Comment on lines +1349 to +1356
'Smarter (paid API)',
mode === 'smarter',
hasSmarterCandidate ? 'switch-to-smarter' : 'switch-to-smarter-blocked',
mode === 'smarter'
  ? 'Your paid provider is the top choice.'
  : hasSmarterCandidate
    ? 'Sharper reasoning on tricky calls.'
    : 'Add a paid provider below first.',
Comment on lines +1530 to +1534
} catch (err) {
  document.getElementById('page-content').insertAdjacentHTML(
    'afterbegin',
    `<div class="error-banner">Failed to switch mode: ${escapeHtml(err.message)}</div>`,
  );
Comment on lines +27 to +46
    // 1M input + 1M output at 8/40 deci-cents per million → 4.8¢ → 5¢.
    expect(estimateLlmCostCents('anthropic', 1_000_000, 1_000_000)).toBe(5);
  });

  it('estimates openai cost in integer cents', () => {
    // 1M output at 6 deci-cents per million → 0.6¢ → 1¢ (rounded up).
    expect(estimateLlmCostCents('openai', 0, 1_000_000)).toBe(1);
  });

  it('estimates google cost in integer cents', () => {
    // 1M output at 3 deci-cents per million → 0.3¢ → 1¢ (rounded up).
    expect(estimateLlmCostCents('google', 0, 1_000_000)).toBe(1);
  });

  it('rounds up so the cap-enforcement direction is always safe', () => {
    // Anthropic at small token counts (10k input, 10k output) →
    // (8*10_000 + 40*10_000) / 1_000_000 = 0.48 deci-cents → ceil = 1 →
    // ceil(1/10) = 1¢. The exact float would be 0.048¢; rounding up
    // means cap checks see 1¢ instead of zeroing out tiny usage.
    expect(estimateLlmCostCents('anthropic', 10_000, 10_000)).toBe(1);
jayzalowitz added a commit that referenced this pull request May 11, 2026
…UT /ai, rollback optimistic state on save failure, copy fix for Ollama

Copilot's review of PR #253 caught four substantive issues:

1. Cost rate table off by 100×. The original draft stored
   { input: 8, output: 40 } for Anthropic and called the unit
   "deci-cents per 1M" — but $0.80 = 80¢ = 800 deci-cents, so the
   table needed 800/4000. Same conversion error on OpenAI (now
   150/600) and Google (now 75/300). Tests updated to pin the
   expected dollar-equivalent cents outputs ($4.80 for 1M+1M
   Anthropic, $0.60 OpenAI, $0.30 Google) so the regression can't
   recur. Embedded + Ollama still return 0¢ at any volume.

2. PUT /api/settings/:userId/ai rejected `embedded` as an invalid
   provider. Smart mode toggle inserts an `embedded` entry via
   applySmartMode; the pre-existing validation set didn't include
   it, so the round-trip 400'd at the API. Added `embedded` to
   validProviders.

3. switchAIBrainMode now snapshots the previous chain before the
   optimistic re-render and rolls back on save failure. Previously
   the UI would show the reordered state with only an error banner
   on top — visually implying success when the server rejected.

4. Smarter pill copy now says "paid API or Ollama" — SMARTER_PROVIDERS
   includes Ollama (local, free), and the earlier copy would have
   confused Ollama users into thinking the option didn't apply.

Test plan: llm-client 137/137 (was 136, +1 new rounding-up regression
test); api 544/544. Build clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
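
The snapshot-and-rollback flow from item 3 above can be sketched roughly as follows. This is not the code from settings.js: `createChainStore`, `switchMode`, and the `save` / `render` / `showError` callbacks are hypothetical names standing in for the real module state, saveAIProviders, and DOM updates. The `toErrorMessage` guard is the `err instanceof Error` pattern that a later commit on this PR adopts.

```javascript
// Hypothetical in-memory holder for the chain being edited.
function createChainStore(initialChain) {
  let chain = initialChain;
  return {
    get: () => chain,
    // Apply `next` immediately; the returned function restores the snapshot.
    applyOptimistic(next) {
      const snapshot = chain;
      chain = next;
      return () => {
        chain = snapshot;
      };
    },
  };
}

// A rejection is not guaranteed to be an Error, so never read .message blindly.
function toErrorMessage(err) {
  return err instanceof Error ? err.message : String(err);
}

// Optimistic update: render first, then save, and undo the render on failure.
async function switchMode(store, nextChain, save, render, showError) {
  const rollback = store.applyOptimistic(nextChain);
  render(store.get());
  try {
    await save(nextChain);
  } catch (err) {
    rollback();
    render(store.get()); // UI goes back to the last persisted chain
    showError(`Failed to switch mode: ${toErrorMessage(err)}`);
  }
}

// Demo with a save that always fails (a stand-in for a rejected PUT).
const demoStore = createChainStore([{ provider: 'anthropic', priority: 0 }]);
switchMode(
  demoStore,
  [{ provider: 'embedded', priority: 0 }, { provider: 'anthropic', priority: 1 }],
  () => Promise.reject('server rejected the chain'),
  (chain) => console.log('render:', chain.map((p) => p.provider).join(', ')),
  (message) => console.log(message),
);
```

In the demo the chain is rendered twice, first as `embedded, anthropic` and then as `anthropic` again after the rollback, followed by the error banner text. The banner therefore never sits on top of a state the server refused.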
jayzalowitz added a commit that referenced this pull request May 12, 2026
…low, fix smarter-mode doc + err-message defensiveness

Four substantive Copilot round-2 findings on PR #253 addressed:

1. estimateLlmCostCents now THROWS when given a provider name not in
   the rate table, instead of silently returning 0. The
   AIProviderName type is the source of truth at compile time; this
   adds a runtime guard for paths that cast a DB string. Silent-zero
   would have hidden a real data/config bug as fake-free usage.

2. Token-count safe-integer guard: throws on non-finite, negative,
   or > 2e12 token counts. Prevents IEEE-754 rounding from silently
   producing a wrong cents value if an untrusted aggregator passes
   a bogus number. 2e12 is far beyond any real prompt; well below
   the 2^53 safe-integer ceiling.

3. applySmarterMode docstring corrected to match behavior: it scans
   the full chain by priority regardless of enabled state, then
   force-enables the chosen entry. The prior doc claimed "first
   non-embedded *enabled* provider" — that implied a filter we
   never applied.

4. switchAIBrainMode catch block now uses the
   `err instanceof Error ? err.message : String(err)` defensive
   pattern. A non-Error rejection (string, object, undefined) would
   otherwise produce "Failed to switch mode: undefined" on the
   banner.

3 new test cases cover the throw paths (unknown provider, bad token
counts, overflow). llm-client 140/140 green; api builds clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jayzalowitz

Owner Author

Round-2 review reply:

Already fixed in round 1 (efc0822):

  • Cost rates 100×-too-low → corrected to 800/4000 (Anthropic), 150/600 (OpenAI), 75/300 (Google) deci-cents per 1M tokens.
  • Test expectations updated to pin the dollar-equivalent integer cents ($4.80 for 1M+1M Anthropic, etc.).
  • Smarter pill copy updated to mention Ollama ("paid API or Ollama").

Round-2 findings fixed in 0d20331:

  • estimateLlmCostCents throws on unknown provider (was: silent 0).
  • Token-count safe-integer guard added (throws on non-finite, negative, or > 2e12).
  • applySmarterMode docstring corrected to match behavior.
  • switchAIBrainMode catch block now uses defensive err instanceof Error ? err.message : String(err).
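
For readers following the two rounds of fixes, a minimal sketch of what the corrected helper could look like, using the rate values quoted above. It is written in plain JavaScript rather than the package's TypeScript, and several details are assumptions: the Map-based table, the default of 0 for omitted token counts, the order of the checks, the error types, and the single final `Math.ceil`. The actual cost.ts may differ on all of these.

```javascript
// Deci-cents per 1M tokens. $1 = 100 cents = 1000 deci-cents, so a list
// price of $0.80 per 1M tokens becomes 800.
const RATES = new Map([
  ['anthropic', { input: 800, output: 4000 }],
  ['openai', { input: 150, output: 600 }],
  ['google', { input: 75, output: 300 }],
  ['embedded', { input: 0, output: 0 }], // local runtime: zero per-token cost
  ['ollama', { input: 0, output: 0 }],   // local runtime: zero per-token cost
]);

const MAX_TOKENS = 2e12; // guard threshold named in the round-2 fix

function lookupRate(provider) {
  const rate = RATES.get(provider);
  // Throw instead of returning 0 so a bad provider string never looks free.
  if (!rate) throw new Error(`Unknown LLM provider: ${String(provider)}`);
  return rate;
}

function assertTokenCount(name, value) {
  if (!Number.isFinite(value) || value < 0 || value > MAX_TOKENS) {
    throw new RangeError(`${name} must be a finite number between 0 and ${MAX_TOKENS}`);
  }
}

function estimateLlmCostCents(provider, tokensIn = 0, tokensOut = 0) {
  const rate = lookupRate(provider);
  assertTokenCount('tokensIn', tokensIn);
  assertTokenCount('tokensOut', tokensOut);
  // rate * tokens is deci-cents scaled by 1e6; dividing by 1e6 gives
  // deci-cents and by a further 10 gives cents. Round up once at the end.
  const scaled = rate.input * tokensIn + rate.output * tokensOut;
  return Math.ceil(scaled / 10_000_000);
}

function isZeroCostProvider(provider) {
  const rate = lookupRate(provider);
  return rate.input === 0 && rate.output === 0;
}

console.log(estimateLlmCostCents('anthropic', 1_000_000, 1_000_000)); // 480 ($4.80)
console.log(estimateLlmCostCents('openai', 0, 1_000_000));            // 60  ($0.60)
console.log(estimateLlmCostCents('google', 0, 1_000_000));            // 30  ($0.30)
console.log(estimateLlmCostCents('anthropic', 10_000, 10_000));       // 5   (4.8 rounded up)
console.log(estimateLlmCostCents('embedded', 5_000_000, 5_000_000));  // 0
```

Keeping the zero-cost providers in the same table, with explicit zero rates, means the recording site needs no special case for them while unknown names still fail loudly.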

jayzalowitz and others added 3 commits May 12, 2026 00:23
Closes the user-visible side of the embedded LLM story.

AC#6: Settings → AI brain now leads with a two-pill Smart/Smarter mode
toggle. Active mode is computed from the user's provider chain
(top-priority enabled = embedded → Smart; hosted/Ollama → Smarter).
Clicking the inactive pill reorders priorities and auto-saves through
`PUT /api/settings/:userId/ai`.

- Switching to Smart adds an `embedded` entry with `model: 'auto'`
  if the chain doesn't have one — first-time-Smart users get a working
  configuration in one click.
- Switching to Smarter when no paid provider exists routes to a
  `switch-to-smarter-blocked` action that focuses the "+ Add a
  provider…" dropdown rather than failing silently.

Pure helpers (`detectAIMode`, `applySmartMode`, `applySmarterMode`)
factored out at the top of `settings.js` with module exports so the
mode pill, the action handler, and any future audit route all agree on
one definition.

AC#8: New `estimateLlmCostCents(provider, tokensIn, tokensOut)` helper
in @skytwin/llm-client. Per-provider rate table for hosted APIs;
absolute zero for `embedded` and `ollama`. Rounds up to the nearest
cent so spend-cap enforcement stays conservative — failure direction
is "approval required," never "silently past the cap." Also exports
`isZeroCostProvider(provider)` for callers that want to render a free
badge.

The future spend-recording call site can compute
`estimateLlmCostCents(response.provider, tokensIn, tokensOut)` and
trust local-runtime calls record zero — no embedded-special-case
branch needed at the recording site.

Test plan: 10 new vitest cases for `cost.ts` (load-bearing one:
embedded/ollama return 0 regardless of token volume). 16 cases
smoke-tested for the JS helpers via Node ESM import. Toggle visually
verified in Chrome across three provider-chain scenarios
(Smart-active, Smarter-active, no-paid-provider). Full llm-client
suite: 136/136 green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jayzalowitz jayzalowitz force-pushed the jayzalowitz/issue-187-smart-smarter-toggle branch from 0d20331 to 3505fe5 Compare May 12, 2026 04:23
@jayzalowitz jayzalowitz merged commit f58a6b3 into main May 12, 2026
7 checks passed