This repository was archived by the owner on May 26, 2026. It is now read-only.

feat(kora): KR-FEAT-AI-RESPONSE-LOOP ST2 — handler swap + cred priority flip + anthropic runtime dep#129

Closed
rafe-walker wants to merge 1 commit into feature/phase2-upgrades from feat/kora-KR-FEAT-AI-RESPONSE-LOOP-ST2

Conversation

@rafe-walker

Owner

Summary

Closes Phase 2 Feature 5's biggest user-facing milestone. Joshua DMs Kora → Kora reasons (real LLM call, system-prompt-driven, context-aware, cost-ladder-aware) → Kora replies with the reasoning output. Echo path is gone.

Bucket spec: `kora_docs/17_cc_bucket_prompts/KR-FEAT-AI-RESPONSE-LOOP_kora_thinks.md`.

Base: `feature/phase2-upgrades` — NOT main.

Three PM rulings baked in

Ruling 1 — Credential cascade FLIPPED to OAuth-first.
ST1 shipped `KORA_ANTHROPIC_API_KEY → CLAUDE_CODE_OAUTH_TOKEN`. ST2 flips to `CLAUDE_CODE_OAUTH_TOKEN → KORA_ANTHROPIC_API_KEY → fail-CLOSED`. Joshua's Max 20x + post-May-15 SDK billing split route the $200/mo Agent SDK pool via OAuth. OAuth = production; API key = test/dev. Test renamed to `test_oauth_wins_over_api_key_when_both_set`.

Ruling 2 — `anthropic` to runtime deps. Same lesson as KR-SLOWAPI-DEP-FIX + aiosmtplib promotion. The reasoning_engine listener imports anthropic unconditionally at boot. Both prior locations (`[anthropic]` standalone extra + ST1's `[web]` addition) removed; single declaration in core dependencies block.

Ruling 3 — §4 PM-opens locked from ST1. Q1 system prompt / Q2 10-turn context / Q3 NO retry / Q4 PAUSED gating all carry through.

New module

`kora_cli/listeners/reasoning_engine_listener.py` (~135 lines):

  • `ReasoningEngineListener` wrapping the engine in the DaemonCoordinator lifecycle
  • Construction failure RE-RAISES so coordinator aborts boot per the spec's "engine startup failure → daemon fails-CLOSED" requirement (daemon that can't reason can't fulfill its primary purpose)
  • Module-level `current_reasoning_engine()` accessor mirroring `current_pool()` pattern from KR-MCP-CONSUMPTION ST1
  • Shutdown closes engine HTTP client best-effort
  • Registered via `register_daemon_listener("reasoning_engine", _factory)` at import time; wired into `kora_cli/listeners/__init__.py`

Handler swap

`kora_cli/handlers/slack_dm_handler.py` (+350 lines):

  • New optional `reasoning_engine` ctor arg; production resolves daemon singleton at reply time
  • Method body swapped: build `ConversationContext` via `load_slack_dm_context()`, construct `IncomingMessage`, call `engine.respond(...)`, project `ResponseResult`
  • Canned fallback `"Kora is currently unable to respond; operator notified."` sent on 5 distinct failure paths:
    • No engine resolvable (test path / partial daemon boot) → `engine_unavailable`
    • `ResponseResult.error` set → records error code (cost_ladder_halted / operational_state_paused / sdk_5xx / ...)
    • Engine itself raises → `engine_exception:<class>`
    • Engine returns empty text on success → `empty_response_text`
  • Reply failure NEVER crashes inbound handler — Slack always 200 OK

JSONL outbound schema extended

5 new optional fields, backwards-compatible (pre-ST2 entries omit them):

  • `model_used` (e.g. `"claude-opus-4-7"`)
  • `input_tokens` / `output_tokens` (from SDK usage)
  • `reasoning_duration_ms` (engine-side wall-clock)
  • `reasoning_error` (stable code; None on success)

Each field written only when non-None — canned-fallback entries stay lean.

Cost-ladder write integration

After successful reasoning (NOT canned), handler builds `CanonicalUsage(input_tokens, output_tokens)` and calls `holder.record_inference(canonical_usage, model_name=..., provider="anthropic")` (`model_name` and `provider` are keyword-only). Fail-soft layers:

  • No cost holder → silently skip (test paths)
  • Zero tokens → skip
  • `record_inference` itself is fail-soft per its docstring

K-DG verified literal API: `record_inference(canonical_usage, *, model_name, provider, base_url)` at `agent/cost_state_holder.py:272`; `CanonicalUsage` shape at `agent/usage_pricing.py:30`; accessor is `get_cost_holder()` not `get_holder()`; `_reset_cost_holder_for_tests()` helper used by the new write-integration tests.

Tests (57 new, 332 total all passing)

`test_reasoning_engine_listener.py` (8 tests):

  • Lifecycle (pre-startup None / startup-sets / shutdown-clears / close-raises-safe / shutdown-without-startup)
  • Production startup re-raises on misconfig (fail-CLOSED asserted)
  • Registered in LISTENER_REGISTRY as "reasoning_engine"
  • _set/_clear singleton helpers

`test_slack_dm_reply.py` updated/extended (15 ST2 tests):

  • ST1 echo-format tests rewritten to assert engine's response text
  • Engine unavailable → canned fallback + `engine_unavailable`
  • Engine returns error (cost_ladder_halted / operational_state_paused) → canned + recorded error
  • Engine raises exception → canned + `engine_exception:<class>`
  • Empty engine text on success → canned + `empty_response_text`
  • IncomingMessage metadata (source / channel_id / thread_ts / user_id / event_ts) verified
  • Cost-ladder write called on success with correct CanonicalUsage + provider="anthropic"
  • Cost-ladder SKIPPED on canned fallback
  • Cost-ladder gracefully skips when holder uninitialized

`test_anthropic_engine.py` — `test_oauth_wins_over_api_key_when_both_set` (Ruling 1 verification).

§5 ship checklist

  • Base `feature/phase2-upgrades`
  • Title per format
  • §4 PM-opens carried (all defaults locked)
  • Credential cascade flipped OAuth-first
  • anthropic moved to runtime deps
  • API key NEVER logged (ST1 diverse-failure test still holds)
  • Cost-ladder write integration verified
  • PAUSED state respected
  • Engine startup failure → daemon fails-CLOSED (re-raise asserted)
  • Engine failure paths → canned fallback (5 distinct paths tested)
  • K-DG: literal field names pasted from grep
  • Tests pass locally (332/332)

Phase 2 Feature 5 closes

After this merges, Joshua DMs Kora and Kora actually thinks before responding — biggest user-facing milestone of Phase 2. Cost-ladder-aware model downshift (opus→sonnet→haiku→halt), operational-state-respecting (PAUSED→canned), thread-context-aware (last 10 turns), system-prompt-shaped voice.

PM picks next bucket from: KR-FEAT-AGENTIC-REASONING (tool use inside reasoning) / KR-MCP-SEND-TOOLS (kora__send_slack_dm via /mcp) / substrate-backed conversation memory (persist context to IsoKron).

🤖 Generated with Claude Code

…ty flip + anthropic runtime dep

Closes Phase 2 Feature 5's biggest user-facing milestone. Joshua
DMs Kora → Kora reasons (real LLM call, system-prompt-driven,
context-aware, cost-ladder-aware) → Kora replies with the
reasoning output. Echo path is gone.

## Three PM rulings baked in

**Ruling 1 — Credential cascade FLIPPED to OAuth-first.**

ST1 shipped `KORA_ANTHROPIC_API_KEY → CLAUDE_CODE_OAUTH_TOKEN`
order. ST2 flips to `CLAUDE_CODE_OAUTH_TOKEN → KORA_ANTHROPIC_API_KEY
→ fail-CLOSED`. Reasoning: Joshua's Max 20x + post-May-15 SDK
billing split route the $200/mo Agent SDK pool via the OAuth path.
OAuth = production; API key = test/dev escape hatch. Engine's
`_auth_mode` test updated to assert OAuth wins when both are set.
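
A minimal sketch of the flipped cascade, for illustration only. The two environment variable names come from this PR; `resolve_credentials`, `CredentialsMissingError`, and the `("oauth" | "api_key", secret)` return shape are hypothetical names, not the engine's actual API.

```python
import os
from typing import Mapping, Tuple


class CredentialsMissingError(RuntimeError):
    """Hypothetical error raised when neither credential is set (fail-CLOSED)."""


def resolve_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (auth_mode, secret), preferring the OAuth token over the API key."""
    oauth = env.get("CLAUDE_CODE_OAUTH_TOKEN", "").strip()
    if oauth:
        return ("oauth", oauth)  # production path
    api_key = env.get("KORA_ANTHROPIC_API_KEY", "").strip()
    if api_key:
        return ("api_key", api_key)  # test/dev escape hatch
    # Name the variables, never their values, so secrets stay out of logs.
    raise CredentialsMissingError(
        "set CLAUDE_CODE_OAUTH_TOKEN or KORA_ANTHROPIC_API_KEY"
    )
```

With both variables set, this sketch returns the OAuth token, which is the behaviour `test_oauth_wins_over_api_key_when_both_set` is described as asserting.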

**Ruling 2 — `anthropic` to runtime deps.**

Same lesson as KR-SLOWAPI-DEP-FIX + aiosmtplib promotion: the
reasoning_engine listener imports anthropic unconditionally at
boot, so it belongs in the core dependencies block, not under
`[anthropic]` or `[web]` extras. Both prior locations removed;
single declaration in core deps.

**Ruling 3 — §4 PM-opens locked from ST1.**

Q1 system prompt first-draft (composed in ST1); Q2 10-turn context;
Q3 NO retry on 5xx; Q4 PAUSED state gating. All carry through ST2.

## New module

**`kora_cli/listeners/reasoning_engine_listener.py`** (~135 lines):

- `ReasoningEngineListener` wrapping the engine in the
  DaemonCoordinator lifecycle. Startup constructs
  `AnthropicReasoningEngine`; **construction failure RE-RAISES**
  so the coordinator aborts boot per the spec's "engine startup
  failure → daemon fails-CLOSED" requirement (a daemon that
  can't reason can't fulfill its primary purpose; better to abort
  loudly than ship a fallback-only daemon).
- Module-level `current_reasoning_engine()` accessor mirroring
  `current_pool()` from KR-MCP-CONSUMPTION ST1 — cross-cutting
  read from `SlackDMHandler` without import coupling.
- `_set_singleton` / `_clear_singleton` private helpers (same
  shape as the MCP pool listener).
- Shutdown closes the engine's HTTP client best-effort; per-listener
  timeout (10s) caps the wait.
- Registered via `register_daemon_listener("reasoning_engine", _factory)`
  at import time; wired into `kora_cli/listeners/__init__.py`.
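
The lifecycle described above can be sketched roughly as follows. This is an assumption-laden outline, not the module itself: the `engine_factory` constructor argument, the async `close()` on the engine, and the use of `asyncio.wait_for` to cap shutdown at 10 seconds are all hypothetical (in the real daemon the coordinator may enforce the timeout). Registration with `register_daemon_listener` is omitted.

```python
import asyncio
from typing import Any, Callable, Optional

_ENGINE: Optional[Any] = None  # module-level singleton


def current_reasoning_engine() -> Optional[Any]:
    """Return the live engine, or None before startup / after shutdown."""
    return _ENGINE


def _set_singleton(engine: Any) -> None:
    global _ENGINE
    _ENGINE = engine


def _clear_singleton() -> None:
    global _ENGINE
    _ENGINE = None


class ReasoningEngineListener:
    def __init__(self, engine_factory: Callable[[], Any]) -> None:
        self._engine_factory = engine_factory  # hypothetical injection point
        self._engine: Optional[Any] = None

    async def startup(self) -> None:
        # No try/except on purpose: a construction failure propagates so the
        # coordinator can abort boot (fail-CLOSED).
        self._engine = self._engine_factory()
        _set_singleton(self._engine)

    async def shutdown(self) -> None:
        engine, self._engine = self._engine, None
        _clear_singleton()
        if engine is None:
            return  # shutdown without prior startup is a no-op
        try:
            await asyncio.wait_for(engine.close(), timeout=10.0)
        except Exception:
            pass  # best-effort: a failing close must not break shutdown
```

Clearing the singleton before closing the client means a concurrent reader sees `None` instead of a half-closed engine.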

## Handler swap

**`kora_cli/handlers/slack_dm_handler.py`** (+350 lines):

- New optional `reasoning_engine` constructor arg (test injection).
  Production resolves the daemon singleton via
  `current_reasoning_engine()` at reply time.
- Method `_send_echo_reply` (name kept for diff hygiene) body
  swapped: builds `ConversationContext` via
  `load_slack_dm_context()`, constructs `IncomingMessage`, calls
  `engine.respond(...)`, projects `ResponseResult` into reply
  text + JSONL meta.
- **Canned fallback** `"Kora is currently unable to respond;
  operator notified."` sent when:
  - No engine resolvable (test path / partial daemon boot)
  - `ResponseResult.error` set (cost_ladder_halted /
    operational_state_paused / sdk_5xx / ...)
  - Engine itself raises (defensive — recorded as
    `engine_exception:<class>`)
  - Engine returns empty text on success (defensive —
    `empty_response_text`)
- Reply failure does NOT crash the inbound handler — Slack
  always gets 200 OK (preserves the KR-FEAT-SLACK-DM ST2
  belt-and-suspenders guarantee).
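
As an illustration of how the fallback conditions listed above might map to error codes, here is a small sketch. The canned text and the error-code strings are taken from this PR; `reply_text_for`, the stand-in `ResponseResult` dataclass, and the assumption that `engine.respond(...)` is awaitable are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

CANNED_FALLBACK = "Kora is currently unable to respond; operator notified."


@dataclass
class ResponseResult:
    # Stand-in with only the two fields this sketch needs.
    text: str = ""
    error: Optional[str] = None


async def reply_text_for(
    engine: Optional[Any], message: Any, context: Any
) -> Tuple[str, Optional[str]]:
    """Return (reply_text, reasoning_error); never raises."""
    if engine is None:
        return CANNED_FALLBACK, "engine_unavailable"
    try:
        result = await engine.respond(message, context)
    except Exception as exc:
        # Record only the exception class, not its message.
        return CANNED_FALLBACK, f"engine_exception:{type(exc).__name__}"
    if result.error:
        return CANNED_FALLBACK, result.error
    if not result.text.strip():
        return CANNED_FALLBACK, "empty_response_text"
    return result.text, None
```

Because every branch returns a string to send, the caller can always post a reply and acknowledge Slack with 200 OK.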

## JSONL outbound schema extended

5 new optional fields on outbound entries (backwards-compatible —
pre-ST2 entries simply omit them):

- `model_used`: e.g. `"claude-opus-4-7"`
- `input_tokens` / `output_tokens`: from SDK usage
- `reasoning_duration_ms`: engine-side wall-clock
- `reasoning_error`: stable error code (None on success)

Each field is written only when non-None, so canned-fallback
entries that lack reasoning data stay lean.
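
One way to get the "written only when non-None" behaviour is to filter the optional fields before merging them into the entry. The five field names below are from this PR; `build_outbound_entry` and the contents of `base` are hypothetical.

```python
import json
from typing import Any, Dict, Optional


def build_outbound_entry(
    base: Dict[str, Any],
    *,
    model_used: Optional[str] = None,
    input_tokens: Optional[int] = None,
    output_tokens: Optional[int] = None,
    reasoning_duration_ms: Optional[int] = None,
    reasoning_error: Optional[str] = None,
) -> Dict[str, Any]:
    """Copy `base` and add only the reasoning fields that are not None."""
    optional = {
        "model_used": model_used,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "reasoning_duration_ms": reasoning_duration_ms,
        "reasoning_error": reasoning_error,
    }
    entry = dict(base)
    entry.update({k: v for k, v in optional.items() if v is not None})
    return entry


# A canned-fallback entry carries just the error code.
line = json.dumps(
    build_outbound_entry({"direction": "outbound"}, reasoning_error="engine_unavailable")
)
```

Readers of older JSONL files should use `entry.get("model_used")` rather than indexing, since pre-ST2 entries omit these keys.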

## Cost-ladder write integration

After a successful reasoning call (NOT canned fallback), the
handler builds `CanonicalUsage(input_tokens, output_tokens)`
and calls `holder.record_inference(canonical_usage,
model_name=..., provider="anthropic")`. Fail-soft layers:

- No cost holder initialized → silently skip (test paths)
- Zero tokens → skip (no real inference happened)
- `record_inference` itself is fail-soft per its docstring
  (pricing-lookup miss accumulates 0)

K-DG verified literal API: `record_inference(canonical_usage, *,
model_name, provider, base_url)` per
`agent/cost_state_holder.py:272`; `CanonicalUsage` shape per
`agent/usage_pricing.py:30` (input_tokens / output_tokens /
cache_read_tokens / cache_write_tokens / reasoning_tokens /
request_count / raw_usage).
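
A hedged sketch of the write path and its skip conditions follows. `record_reasoning_cost` is a hypothetical helper name, the `CanonicalUsage` dataclass is a two-field stand-in for the real type, "zero tokens" is read here as both counts being zero, and `base_url` is assumed to be optional on `record_inference`.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class CanonicalUsage:
    # Stand-in: the real shape also has cache, reasoning, and request fields.
    input_tokens: int = 0
    output_tokens: int = 0


def record_reasoning_cost(
    holder: Optional[Any],
    *,
    input_tokens: int,
    output_tokens: int,
    model_name: str,
) -> bool:
    """Record usage on the cost holder; return True only if a write happened."""
    if holder is None:
        return False  # no cost holder initialized (e.g. test paths)
    if input_tokens == 0 and output_tokens == 0:
        return False  # no real inference happened
    usage = CanonicalUsage(input_tokens=input_tokens, output_tokens=output_tokens)
    # model_name and provider are keyword-only in the verified signature.
    holder.record_inference(usage, model_name=model_name, provider="anthropic")
    return True
```

In the handler, `holder` would come from `get_cost_holder()`, and the helper would be called only after a successful (non-canned) reasoning result.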

## Tests (57 new, 332 total all passing)

**`test_reasoning_engine_listener.py`** (8 tests):
- current_reasoning_engine() None pre-startup
- Startup with injected engine sets singleton
- Shutdown clears singleton + closes engine
- Shutdown safe when engine.close() raises
- Shutdown safe without prior startup (no-op)
- **Production startup re-raises on misconfig** (fail-CLOSED
  asserted — both credential envs unset → coordinator boot abort)
- Registered in LISTENER_REGISTRY as "reasoning_engine"
- _set_singleton / _clear_singleton helpers tested

**`test_slack_dm_reply.py`** updated/extended (15 ST2 tests):
- ST1 echo-format tests rewritten to assert engine's response
  text (not echo construction)
- `test_oauth_wins_over_api_key_when_both_set` — Ruling 1 verified
- Engine unavailable → canned fallback + reasoning_error="engine_unavailable"
- Engine returns error (cost_ladder_halted / operational_state_paused)
  → canned fallback + recorded error code
- Engine raises exception → canned fallback +
  reasoning_error="engine_exception:<class>"
- Empty engine text on success → canned fallback + "empty_response_text"
- Engine receives correct IncomingMessage metadata (source /
  channel_id / thread_ts / user_id / event_ts)
- **Cost-ladder write called on success** with correct
  CanonicalUsage + model_name + provider="anthropic"
- Cost-ladder write SKIPPED on canned fallback (no real inference)
- Cost-ladder gracefully skips when holder uninitialized

## §5 ship checklist

- [x] Base `feature/phase2-upgrades`
- [x] Title `feat(kora): KR-FEAT-AI-RESPONSE-LOOP ST2 — handler swap + cred priority flip + anthropic runtime dep`
- [x] §4 PM-opens carried from ST1 (all defaults locked)
- [x] Credential cascade flipped OAuth-first per Ruling 1
- [x] anthropic moved to runtime deps per Ruling 2
- [x] API key NEVER logged (existing ST1 diverse-failure test
      still holds — credential value never in any error / log path)
- [x] Cost-ladder write integration verified (new test)
- [x] PAUSED state respected (engine refuses; handler sends canned)
- [x] Engine startup failure → daemon fails-CLOSED (re-raise asserted)
- [x] Engine failure paths → canned fallback (5 distinct paths tested)
- [x] K-DG: literal field names pasted from grep:
      `record_inference(canonical_usage, *, model_name, provider, base_url)`,
      `CanonicalUsage` shape, `get_cost_holder()` accessor,
      `_HOLDER` reset helper
- [x] Tests pass locally (**332/332** across full suite)

## What's next

After ST2 merges, **Phase 2 Feature 5 closes**. Joshua DMs Kora,
Kora reasons (real Anthropic API call against the Max plan pool,
cost-ladder-aware model selection, operational-state-respecting,
10-turn-context-aware), Kora replies intelligently.

PM picks next bucket — candidates:
- **KR-FEAT-AGENTIC-REASONING**: Kora can call kora__* tools inside
  her reasoning (look up ledger, check sea_tickets) for richer
  responses
- **KR-MCP-SEND-TOOLS**: expose kora__send_slack_dm + kora__send_email
  via /mcp for agent-driven sends
- **Substrate-backed conversation memory**: persist context to
  IsoKron substrate, not just JSONL

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rafe-walker rafe-walker deleted the feat/kora-KR-FEAT-AI-RESPONSE-LOOP-ST2 branch May 23, 2026 00:12
rafe-walker added a commit that referenced this pull request May 23, 2026
…ty flip + anthropic runtime dep (#131)

Re-opened replacement for PR #129 (closed unmerged due to pyproject conflict + PM-tooling race; see feedback_pm_merge_then_delete_race memory).

Phase 2 Feature 5 CLOSES. Joshua DMs Kora → Kora reasons → Kora replies intelligently.

Three rulings landed:
1. Cascade flipped OAuth-first — CLAUDE_CODE_OAUTH_TOKEN → KORA_ANTHROPIC_API_KEY → fail-CLOSED. Test test_oauth_wins_over_api_key_when_both_set asserts priority.
2. anthropic==0.86.0 moved to core deps. Same lesson as slowapi promotion + aiosmtplib runtime placement.
3. §4 PM-opens carried from ST1.

Wired:
- NEW reasoning_engine_listener.py with current_reasoning_engine() accessor.
- Startup failure RE-RAISES so coordinator aborts boot per fail-CLOSED spec.
- Handler swaps echo for engine.respond(IncomingMessage, ConversationContext).
- 5 distinct canned-fallback paths.
- JSONL extended with model_used + input_tokens + output_tokens + reasoning_duration_ms + reasoning_error.
- Cost-ladder write integration verified.

332/332 tests.