feat(kora): KR-FEAT-AI-RESPONSE-LOOP ST2 — handler swap + cred priority flip + anthropic runtime dep by rafe-walker · Pull Request #129 · rafe-walker/kora

rafe-walker · 2026-05-23T00:10:02Z

Summary

Closes Phase 2 Feature 5's biggest user-facing milestone. Joshua DMs Kora → Kora reasons (real LLM call, system-prompt-driven, context-aware, cost-ladder-aware) → Kora replies with the reasoning output. Echo path is gone.

Bucket spec: `kora_docs/17_cc_bucket_prompts/KR-FEAT-AI-RESPONSE-LOOP_kora_thinks.md`.

Base: `feature/phase2-upgrades` — NOT main.

Three PM rulings baked in

Ruling 1 — Credential cascade FLIPPED to OAuth-first.
ST1 shipped the order `KORA_ANTHROPIC_API_KEY → CLAUDE_CODE_OAUTH_TOKEN`. ST2 flips it to `CLAUDE_CODE_OAUTH_TOKEN → KORA_ANTHROPIC_API_KEY → fail-CLOSED`. Reasoning: Joshua's Max 20x plan plus the post-May-15 SDK billing split route the $200/mo Agent SDK pool via the OAuth path. OAuth = production; API key = test/dev escape hatch. Test renamed to `test_oauth_wins_over_api_key_when_both_set`.
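
The OAuth-first cascade can be pictured with a minimal sketch. Only the two environment variable names and the priority order come from this PR; the function name, the exception type, and the `"oauth"` / `"api_key"` mode labels are hypothetical and are not the engine's actual `_auth_mode` code.

```python
from typing import Mapping, Tuple

# Variable names are taken from the PR text; everything else is illustrative.
OAUTH_ENV = "CLAUDE_CODE_OAUTH_TOKEN"
API_KEY_ENV = "KORA_ANTHROPIC_API_KEY"


class CredentialMissingError(RuntimeError):
    """Hypothetical error type: neither credential is set (fail-CLOSED)."""


def resolve_credential(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (auth_mode, secret), preferring OAuth over the API key."""
    for mode, name in (("oauth", OAUTH_ENV), ("api_key", API_KEY_ENV)):
        value = env.get(name, "").strip()
        if value:
            return mode, value
    # The message names the variables, never their values, so no secret
    # can leak into an error or log path.
    raise CredentialMissingError(f"neither {OAUTH_ENV} nor {API_KEY_ENV} is set")
```

In a real process the mapping would presumably be `os.environ`; passing it in as an argument keeps the priority rule easy to test.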

Ruling 2 — `anthropic` to runtime deps. Same lesson as KR-SLOWAPI-DEP-FIX + aiosmtplib promotion. The reasoning_engine listener imports anthropic unconditionally at boot. Both prior locations (`[anthropic]` standalone extra + ST1's `[web]` addition) removed; single declaration in core dependencies block.

Ruling 3 — §4 PM-opens locked from ST1. Q1 system prompt / Q2 10-turn context / Q3 NO retry / Q4 PAUSED gating all carry through.

New module

`kora_cli/listeners/reasoning_engine_listener.py` (~135 lines):

`ReasoningEngineListener` wrapping the engine in the DaemonCoordinator lifecycle
Construction failure RE-RAISES so coordinator aborts boot per the spec's "engine startup failure → daemon fails-CLOSED" requirement (daemon that can't reason can't fulfill its primary purpose)
Module-level `current_reasoning_engine()` accessor mirroring `current_pool()` pattern from KR-MCP-CONSUMPTION ST1
Shutdown closes engine HTTP client best-effort
Registered via `register_daemon_listener("reasoning_engine", _factory)` at import time; wired into `kora_cli/listeners/__init__.py`
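
The lifecycle described above can be sketched as follows. This is an illustration under assumptions, not the module itself: the class name, the `engine_factory` argument, and the async `startup` / `shutdown` / `close` signatures are hypothetical; only `current_reasoning_engine()`, the re-raise on construction failure, and the best-effort close come from the PR.

```python
import asyncio
from typing import Any, Callable, Optional

# Module-level singleton read by other components through the accessor.
_ENGINE: Optional[Any] = None


def current_reasoning_engine() -> Optional[Any]:
    """Return the running engine, or None before startup / after shutdown."""
    return _ENGINE


class ReasoningEngineListenerSketch:
    """Hypothetical stand-in for the real listener class."""

    def __init__(self, engine_factory: Callable[[], Any]) -> None:
        self._engine_factory = engine_factory

    async def startup(self) -> None:
        global _ENGINE
        # Deliberately no try/except: a construction error propagates to
        # the caller, which is what lets a coordinator abort boot.
        _ENGINE = self._engine_factory()

    async def shutdown(self) -> None:
        global _ENGINE
        engine, _ENGINE = _ENGINE, None
        if engine is None:
            return  # shutdown without startup is a no-op
        try:
            await engine.close()
        except Exception:
            pass  # best-effort: a failing close must not break shutdown
```

Clearing the singleton before calling `close()` means a reader never sees a half-closed engine, even if `close()` raises.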

Handler swap

`kora_cli/handlers/slack_dm_handler.py` (+350 lines):

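New optional `reasoning_engine` constructor arg (test injection); production resolves the daemon singleton via `current_reasoning_engine()` at reply time
Method `_send_echo_reply` (name kept for diff hygiene) body swapped: builds `ConversationContext` via `load_slack_dm_context()`, constructs `IncomingMessage`, calls `engine.respond(...)`, projects `ResponseResult` into reply text + JSONL meta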
Canned fallback `"Kora is currently unable to respond; operator notified."` sent on 5 distinct failure paths:
- No engine resolvable (test path / partial daemon boot) → `engine_unavailable`
- `ResponseResult.error` set → records error code (cost_ladder_halted / operational_state_paused / sdk_5xx / ...)
- Engine itself raises → `engine_exception:<class>`
- Engine returns empty text on success → `empty_response_text`
Reply failure NEVER crashes inbound handler — Slack always 200 OK
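
A minimal sketch of how the failure paths listed above could collapse into one canned reply plus a stable error code. The canned text and the error codes are taken from this PR; the function name, the `ResponseResultSketch` dataclass, and the assumption that `engine.respond(...)` is awaitable are hypothetical.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Optional, Tuple

CANNED_FALLBACK = "Kora is currently unable to respond; operator notified."


@dataclass
class ResponseResultSketch:
    """Hypothetical stand-in exposing only the two fields used here."""
    text: str = ""
    error: Optional[str] = None


async def project_reply(
    engine: Optional[Any], message: Any, context: Any
) -> Tuple[str, Optional[str]]:
    """Return (reply_text, reasoning_error); never raises to the caller."""
    if engine is None:
        return CANNED_FALLBACK, "engine_unavailable"
    try:
        result = await engine.respond(message, context)
    except Exception as exc:
        # Record only the exception class, not its message.
        return CANNED_FALLBACK, f"engine_exception:{type(exc).__name__}"
    if result.error:
        return CANNED_FALLBACK, result.error
    if not result.text.strip():
        return CANNED_FALLBACK, "empty_response_text"
    return result.text, None
```

Because every branch returns a tuple, the caller can always send something to Slack and log the error code, which matches the "reply failure never crashes the inbound handler" guarantee.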

JSONL outbound schema extended

5 new optional fields, backwards-compatible (pre-ST2 entries omit them):

`model_used` (e.g. `"claude-opus-4-7"`)
`input_tokens` / `output_tokens` (from SDK usage)
`reasoning_duration_ms` (engine-side wall-clock)
`reasoning_error` (stable code; None on success)

Each field written only when non-None — canned-fallback entries stay lean.
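
The "written only when non-None" rule might look like the sketch below. The five field names come from this PR; the helper name and the shape of the base entry are assumptions made for illustration.

```python
import json
from typing import Any, Dict, Optional


def build_outbound_entry(
    base: Dict[str, Any],
    *,
    model_used: Optional[str] = None,
    input_tokens: Optional[int] = None,
    output_tokens: Optional[int] = None,
    reasoning_duration_ms: Optional[int] = None,
    reasoning_error: Optional[str] = None,
) -> Dict[str, Any]:
    """Copy the base entry and add only the reasoning fields that are set."""
    optional = {
        "model_used": model_used,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "reasoning_duration_ms": reasoning_duration_ms,
        "reasoning_error": reasoning_error,
    }
    entry = dict(base)
    # Filter on "is not None" so a legitimate 0 is still written.
    entry.update({k: v for k, v in optional.items() if v is not None})
    return entry


def to_jsonl_line(entry: Dict[str, Any]) -> str:
    """Serialize one entry as a single JSONL line."""
    return json.dumps(entry, sort_keys=True) + "\n"
```

A reader of the log can then treat a missing key and a pre-ST2 entry identically, for example with `entry.get("reasoning_error")`.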

Cost-ladder write integration

After successful reasoning (NOT canned), handler builds `CanonicalUsage(input_tokens, output_tokens)` and calls `holder.record_inference(canonical_usage, model_name=model_name, provider="anthropic")` (`model_name` is keyword-only per the verified signature below). Fail-soft layers:

No cost holder → silently skip (test paths)
Zero tokens → skip
`record_inference` itself is fail-soft per its docstring

K-DG verified literal API: `record_inference(canonical_usage, *, model_name, provider, base_url)` at `agent/cost_state_holder.py:272`; `CanonicalUsage` shape at `agent/usage_pricing.py:30`; accessor is `get_cost_holder()` not `get_holder()`; `_reset_cost_holder_for_tests()` helper used by the new write-integration tests.
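
The fail-soft layers can be sketched as below. The keyword-only call shape follows the signature quoted above; `CanonicalUsageSketch` is a hypothetical two-field stand-in for the real `CanonicalUsage`, the helper name is invented, and two points are assumptions: that "zero tokens" means both counts are zero or missing, and that `base_url` may be omitted.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class CanonicalUsageSketch:
    """Hypothetical stand-in carrying only the two token counts used here."""
    input_tokens: int = 0
    output_tokens: int = 0


def record_reasoning_cost(
    holder: Optional[Any],
    *,
    input_tokens: Optional[int],
    output_tokens: Optional[int],
    model_name: str,
) -> bool:
    """Record usage on the holder; return True only if a write was attempted."""
    if holder is None:
        return False  # no cost holder initialized (e.g. test paths)
    if not input_tokens and not output_tokens:
        return False  # assumed meaning of "zero tokens": nothing to record
    usage = CanonicalUsageSketch(
        input_tokens=input_tokens or 0,
        output_tokens=output_tokens or 0,
    )
    # model_name and provider are passed by keyword, matching the
    # keyword-only parameters in the quoted signature.
    holder.record_inference(usage, model_name=model_name, provider="anthropic")
    return True
```

In the handler, the `holder` argument would presumably come from `get_cost_holder()`, and the helper would only be called on the success path, never after a canned fallback.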

Tests (57 new, 332 total all passing)

`test_reasoning_engine_listener.py` (8 tests):

Lifecycle (pre-startup None / startup-sets / shutdown-clears / close-raises-safe / shutdown-without-startup)
Production startup re-raises on misconfig (fail-CLOSED asserted)
Registered in LISTENER_REGISTRY as "reasoning_engine"
`_set_singleton` / `_clear_singleton` helpers

`test_slack_dm_reply.py` updated/extended (15 ST2 tests):

ST1 echo-format tests rewritten to assert engine's response text
Engine unavailable → canned fallback + `engine_unavailable`
Engine returns error (cost_ladder_halted / operational_state_paused) → canned + recorded error
Engine raises exception → canned + `engine_exception:<class>`
Empty engine text on success → canned + `empty_response_text`
IncomingMessage metadata (source / channel_id / thread_ts / user_id / event_ts) verified
Cost-ladder write called on success with correct CanonicalUsage + provider="anthropic"
Cost-ladder SKIPPED on canned fallback
Cost-ladder gracefully skips when holder uninitialized

`test_anthropic_engine.py` — `test_oauth_wins_over_api_key_when_both_set` (Ruling 1 verification).

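§5 ship checklist

- [x] Base `feature/phase2-upgrades`
- [x] Title `feat(kora): KR-FEAT-AI-RESPONSE-LOOP ST2 — handler swap + cred priority flip + anthropic runtime dep`
- [x] §4 PM-opens carried from ST1 (all defaults locked)
- [x] Credential cascade flipped OAuth-first per Ruling 1
- [x] anthropic moved to runtime deps per Ruling 2
- [x] API key NEVER logged (existing ST1 diverse-failure test still holds; credential value never in any error / log path)
- [x] Cost-ladder write integration verified (new test)
- [x] PAUSED state respected (engine refuses; handler sends canned)
- [x] Engine startup failure → daemon fails-CLOSED (re-raise asserted)
- [x] Engine failure paths → canned fallback (5 distinct paths tested)
- [x] K-DG: literal field names pasted from grep: `record_inference(canonical_usage, *, model_name, provider, base_url)`, `CanonicalUsage` shape, `get_cost_holder()` accessor, `_HOLDER` reset helper
- [x] Tests pass locally (332/332 across full suite)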

Phase 2 Feature 5 closes

After this merges, Joshua DMs Kora and Kora actually thinks before responding — biggest user-facing milestone of Phase 2. Cost-ladder-aware model downshift (opus→sonnet→haiku→halt), operational-state-respecting (PAUSED→canned), thread-context-aware (last 10 turns), system-prompt-shaped voice.

PM picks next bucket from: KR-FEAT-AGENTIC-REASONING (tool use inside reasoning) / KR-MCP-SEND-TOOLS (kora__send_slack_dm via /mcp) / substrate-backed conversation memory (persist context to IsoKron).

🤖 Generated with Claude Code

…ty flip + anthropic runtime dep

Closes Phase 2 Feature 5's biggest user-facing milestone. Joshua DMs Kora → Kora reasons (real LLM call, system-prompt-driven, context-aware, cost-ladder-aware) → Kora replies with the reasoning output. Echo path is gone.

## Three PM rulings baked in

**Ruling 1: Credential cascade FLIPPED to OAuth-first.** ST1 shipped `KORA_ANTHROPIC_API_KEY → CLAUDE_CODE_OAUTH_TOKEN` order. ST2 flips to `CLAUDE_CODE_OAUTH_TOKEN → KORA_ANTHROPIC_API_KEY → fail-CLOSED`. Reasoning: Joshua's Max 20x + post-May-15 SDK billing split route the $200/mo Agent SDK pool via the OAuth path. OAuth = production; API key = test/dev escape hatch. Engine's `_auth_mode` test updated to assert OAuth wins when both are set.

**Ruling 2: `anthropic` to runtime deps.** Same lesson as KR-SLOWAPI-DEP-FIX + aiosmtplib promotion: the reasoning_engine listener imports anthropic unconditionally at boot, so it belongs in the core dependencies block, not under `[anthropic]` or `[web]` extras. Both prior locations removed; single declaration in core deps.

**Ruling 3: §4 PM-opens locked from ST1.** Q1 system prompt first-draft (composed in ST1); Q2 10-turn context; Q3 NO retry on 5xx; Q4 PAUSED state gating. All carry through ST2.

## New module

**`kora_cli/listeners/reasoning_engine_listener.py`** (~135 lines):

- `ReasoningEngineListener` wrapping the engine in the DaemonCoordinator lifecycle. Startup constructs `AnthropicReasoningEngine`; **construction failure RE-RAISES** so the coordinator aborts boot per the spec's "engine startup failure → daemon fails-CLOSED" requirement (a daemon that can't reason can't fulfill its primary purpose; better to abort loudly than ship a fallback-only daemon).
- Module-level `current_reasoning_engine()` accessor mirroring `current_pool()` from KR-MCP-CONSUMPTION ST1: cross-cutting read from `SlackDMHandler` without import coupling.
- `_set_singleton` / `_clear_singleton` private helpers (same shape as the MCP pool listener).
- Shutdown closes the engine's HTTP client best-effort; per-listener timeout (10s) caps the wait.
- Registered via `register_daemon_listener("reasoning_engine", _factory)` at import time; wired into `kora_cli/listeners/__init__.py`.

## Handler swap

**`kora_cli/handlers/slack_dm_handler.py`** (+350 lines):

- New optional `reasoning_engine` constructor arg (test injection). Production resolves the daemon singleton via `current_reasoning_engine()` at reply time.
- Method `_send_echo_reply` (name kept for diff hygiene) body swapped: builds `ConversationContext` via `load_slack_dm_context()`, constructs `IncomingMessage`, calls `engine.respond(...)`, projects `ResponseResult` into reply text + JSONL meta.
- **Canned fallback** `"Kora is currently unable to respond; operator notified."` sent when:
  - No engine resolvable (test path / partial daemon boot)
  - `ResponseResult.error` set (cost_ladder_halted / operational_state_paused / sdk_5xx / ...)
  - Engine itself raises (defensive; recorded as `engine_exception:<class>`)
  - Engine returns empty text on success (defensive; `empty_response_text`)
- Reply failure does NOT crash the inbound handler; Slack always gets 200 OK (preserves the KR-FEAT-SLACK-DM ST2 belt-and-suspenders guarantee).

## JSONL outbound schema extended

5 new optional fields on outbound entries (backwards-compatible: pre-ST2 entries simply omit them):

- `model_used`: e.g. `"claude-opus-4-7"`
- `input_tokens` / `output_tokens`: from SDK usage
- `reasoning_duration_ms`: engine-side wall-clock
- `reasoning_error`: stable error code (None on success)

Each field is written only when non-None, so canned-fallback entries that lack reasoning data stay lean.

## Cost-ladder write integration

After a successful reasoning call (NOT canned fallback), the handler builds `CanonicalUsage(input_tokens, output_tokens)` and calls `holder.record_inference(canonical_usage, model_name, provider="anthropic")`. Fail-soft layers:

- No cost holder initialized → silently skip (test paths)
- Zero tokens → skip (no real inference happened)
- `record_inference` itself is fail-soft per its docstring (pricing-lookup miss accumulates 0)

K-DG verified literal API: `record_inference(canonical_usage, *, model_name, provider, base_url)` per `agent/cost_state_holder.py:272`; `CanonicalUsage` shape per `agent/usage_pricing.py:30` (input_tokens / output_tokens / cache_read_tokens / cache_write_tokens / reasoning_tokens / request_count / raw_usage).

## Tests (57 new, 332 total all passing)

**`test_reasoning_engine_listener.py`** (8 tests):

- current_reasoning_engine() None pre-startup
- Startup with injected engine sets singleton
- Shutdown clears singleton + closes engine
- Shutdown safe when engine.close() raises
- Shutdown safe without prior startup (no-op)
- **Production startup re-raises on misconfig** (fail-CLOSED asserted; both credential envs unset → coordinator boot abort)
- Registered in LISTENER_REGISTRY as "reasoning_engine"
- _set_singleton / _clear_singleton helpers tested

**`test_slack_dm_reply.py`** updated/extended (15 ST2 tests):

- ST1 echo-format tests rewritten to assert engine's response text (not echo construction)
- `test_oauth_wins_over_api_key_when_both_set`: Ruling 1 verified
- Engine unavailable → canned fallback + reasoning_error="engine_unavailable"
- Engine returns error (cost_ladder_halted / operational_state_paused) → canned fallback + recorded error code
- Engine raises exception → canned fallback + reasoning_error="engine_exception:<class>"
- Empty engine text on success → canned fallback + "empty_response_text"
- Engine receives correct IncomingMessage metadata (source / channel_id / thread_ts / user_id / event_ts)
- **Cost-ladder write called on success** with correct CanonicalUsage + model_name + provider="anthropic"
- Cost-ladder write SKIPPED on canned fallback (no real inference)
- Cost-ladder gracefully skips when holder uninitialized

## §5 ship checklist

- [x] Base `feature/phase2-upgrades`
- [x] Title `feat(kora): KR-FEAT-AI-RESPONSE-LOOP ST2 — handler swap + cred priority flip + anthropic runtime dep`
- [x] §4 PM-opens carried from ST1 (all defaults locked)
- [x] Credential cascade flipped OAuth-first per Ruling 1
- [x] anthropic moved to runtime deps per Ruling 2
- [x] API key NEVER logged (existing ST1 diverse-failure test still holds; credential value never in any error / log path)
- [x] Cost-ladder write integration verified (new test)
- [x] PAUSED state respected (engine refuses; handler sends canned)
- [x] Engine startup failure → daemon fails-CLOSED (re-raise asserted)
- [x] Engine failure paths → canned fallback (5 distinct paths tested)
- [x] K-DG: literal field names pasted from grep: `record_inference(canonical_usage, *, model_name, provider, base_url)`, `CanonicalUsage` shape, `get_cost_holder()` accessor, `_HOLDER` reset helper
- [x] Tests pass locally (**332/332** across full suite)

## What's next

After ST2 merges, **Phase 2 Feature 5 closes**. Joshua DMs Kora, Kora reasons (real Anthropic API call against the Max plan pool, cost-ladder-aware model selection, operational-state-respecting, 10-turn-context-aware), Kora replies intelligently.

PM picks next bucket; candidates:

- **KR-FEAT-AGENTIC-REASONING**: Kora can call kora__* tools inside her reasoning (look up ledger, check sea_tickets) for richer responses
- **KR-MCP-SEND-TOOLS**: expose kora__send_slack_dm + kora__send_email via /mcp for agent-driven sends
- **Substrate-backed conversation memory**: persist context to IsoKron substrate, not just JSONL

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ty flip + anthropic runtime dep (#131)

Re-opened replacement for PR #129 (closed unmerged due to pyproject conflict + PM-tooling race; see feedback_pm_merge_then_delete_race memory).

Phase 2 Feature 5 CLOSES. Joshua DMs Kora → Kora reasons → Kora replies intelligently.

Three rulings landed:

1. Cascade flipped OAuth-first: CLAUDE_CODE_OAUTH_TOKEN → KORA_ANTHROPIC_API_KEY → fail-CLOSED. Test test_oauth_wins_over_api_key_when_both_set asserts priority.
2. anthropic==0.86.0 moved to core deps. Same lesson as slowapi promotion + aiosmtplib runtime placement.
3. §4 PM-opens carried from ST1.

Wired:

- NEW reasoning_engine_listener.py with current_reasoning_engine() accessor.
- Startup failure RE-RAISES so coordinator aborts boot per fail-CLOSED spec.
- Handler swaps echo for engine.respond(IncomingMessage, ConversationContext).
- 5 distinct canned-fallback paths.
- JSONL extended with model_used + input_tokens + output_tokens + reasoning_duration_ms + reasoning_error.
- Cost-ladder write integration verified.

332/332 tests.

rafe-walker closed this May 23, 2026

rafe-walker deleted the feat/kora-KR-FEAT-AI-RESPONSE-LOOP-ST2 branch May 23, 2026 00:12

rafe-walker mentioned this pull request May 23, 2026

feat(kora): KR-FEAT-AI-RESPONSE-LOOP ST2 — handler swap + cred priority flip + anthropic runtime dep (re-opened post #129) #131

Merged

rafe-walker wants to merge 1 commit into `feature/phase2-upgrades` from `feat/kora-KR-FEAT-AI-RESPONSE-LOOP-ST2`

rafe-walker commented May 23, 2026
