This repository was archived by the owner on May 26, 2026. It is now read-only.
feat(kora): KR-FEAT-AI-RESPONSE-LOOP ST2 — handler swap + cred priority flip + anthropic runtime dep #129
Closed
rafe-walker wants to merge 1 commit into
Closes Phase 2 Feature 5's biggest user-facing milestone. Joshua
DMs Kora → Kora reasons (real LLM call, system-prompt-driven,
context-aware, cost-ladder-aware) → Kora replies with the
reasoning output. Echo path is gone.
## Three PM rulings baked in
**Ruling 1 — Credential cascade FLIPPED to OAuth-first.**
ST1 shipped `KORA_ANTHROPIC_API_KEY → CLAUDE_CODE_OAUTH_TOKEN`
order. ST2 flips to `CLAUDE_CODE_OAUTH_TOKEN → KORA_ANTHROPIC_API_KEY
→ fail-CLOSED`. Reasoning: Joshua's Max 20x + post-May-15 SDK
billing split route the $200/mo Agent SDK pool via the OAuth path.
OAuth = production; API key = test/dev escape hatch. Engine's
`_auth_mode` test updated to assert OAuth wins when both are set.
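
A minimal sketch of the OAuth-first cascade from Ruling 1. Only the two environment variable names come from this PR; `resolve_credential`, `CredentialError`, and the `"oauth"` / `"api_key"` mode strings are hypothetical names chosen for illustration, not the engine's actual code.

```python
import os
from typing import Mapping, Optional, Tuple


class CredentialError(RuntimeError):
    """Raised when neither credential is configured (fail-CLOSED)."""


def resolve_credential(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (auth_mode, secret), preferring OAuth over the API key.

    Hypothetical helper: the name and return shape are assumptions.
    """
    env = os.environ if env is None else env
    # Checked in priority order: OAuth first, API key as the test/dev escape hatch.
    cascade = (
        ("oauth", "CLAUDE_CODE_OAUTH_TOKEN"),
        ("api_key", "KORA_ANTHROPIC_API_KEY"),
    )
    for mode, var in cascade:
        value = env.get(var, "").strip()
        if value:
            return mode, value
    # Name the variables, never their values, so secrets stay out of logs.
    raise CredentialError(
        "no credential set; expected CLAUDE_CODE_OAUTH_TOKEN or KORA_ANTHROPIC_API_KEY"
    )
```

Passing the environment as a mapping keeps the priority order testable without touching `os.environ`.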
**Ruling 2 — `anthropic` to runtime deps.**
Same lesson as KR-SLOWAPI-DEP-FIX + aiosmtplib promotion: the
reasoning_engine listener imports anthropic unconditionally at
boot, so it belongs in the core dependencies block, not under
`[anthropic]` or `[web]` extras. Both prior locations removed;
single declaration in core deps.
**Ruling 3 — §4 PM-opens locked from ST1.**
Q1 system prompt first-draft (composed in ST1); Q2 10-turn context;
Q3 NO retry on 5xx; Q4 PAUSED state gating. All carry through ST2.
## New module
**`kora_cli/listeners/reasoning_engine_listener.py`** (~135 lines):
- `ReasoningEngineListener` wrapping the engine in the
DaemonCoordinator lifecycle. Startup constructs
`AnthropicReasoningEngine`; **construction failure RE-RAISES**
so the coordinator aborts boot per the spec's "engine startup
failure → daemon fails-CLOSED" requirement (a daemon that
can't reason can't fulfill its primary purpose; better to abort
loudly than ship a fallback-only daemon).
- Module-level `current_reasoning_engine()` accessor mirroring
`current_pool()` from KR-MCP-CONSUMPTION ST1 — cross-cutting
read from `SlackDMHandler` without import coupling.
- `_set_singleton` / `_clear_singleton` private helpers (same
shape as the MCP pool listener).
- Shutdown closes the engine's HTTP client best-effort; per-listener
timeout (10s) caps the wait.
- Registered via `register_daemon_listener("reasoning_engine", _factory)`
at import time; wired into `kora_cli/listeners/__init__.py`.
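
The listener shape described above can be sketched roughly as follows. This is an illustration under assumptions, not the module itself: the `engine_factory` argument stands in for constructing `AnthropicReasoningEngine`, the engine's `close()` is assumed to be a coroutine, and the 10s cap is applied inline here although the PR describes it as a per-listener timeout.

```python
import asyncio
from typing import Any, Callable, Optional

_ENGINE: Optional[Any] = None  # module-level singleton


def current_reasoning_engine() -> Optional[Any]:
    """Return the running engine, or None before startup / after shutdown."""
    return _ENGINE


def _set_singleton(engine: Any) -> None:
    global _ENGINE
    _ENGINE = engine


def _clear_singleton() -> None:
    global _ENGINE
    _ENGINE = None


class ReasoningEngineListener:
    """Sketch only: the real class plugs into the DaemonCoordinator lifecycle."""

    def __init__(self, engine_factory: Callable[[], Any]) -> None:
        # Hypothetical injection point standing in for engine construction.
        self._engine_factory = engine_factory

    async def startup(self) -> None:
        # Deliberately no try/except: a construction failure propagates
        # so the coordinator can abort boot (fail-CLOSED).
        _set_singleton(self._engine_factory())

    async def shutdown(self) -> None:
        engine = current_reasoning_engine()
        _clear_singleton()
        if engine is None:
            return  # shutdown without prior startup is a no-op
        try:
            await asyncio.wait_for(engine.close(), timeout=10)
        except Exception:
            pass  # best-effort close; never block daemon shutdown
```

Clearing the singleton before closing means a reader calling `current_reasoning_engine()` during shutdown sees `None` rather than a half-closed engine.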
## Handler swap
**`kora_cli/handlers/slack_dm_handler.py`** (+350 lines):
- New optional `reasoning_engine` constructor arg (test injection).
Production resolves the daemon singleton via
`current_reasoning_engine()` at reply time.
- The body of `_send_echo_reply` (name kept for diff hygiene) is
swapped: it builds a `ConversationContext` via
`load_slack_dm_context()`, constructs an `IncomingMessage`, calls
`engine.respond(...)`, and projects the `ResponseResult` into reply
text + JSONL meta.
- **Canned fallback** `"Kora is currently unable to respond;
  operator notified."` sent when:
  - No engine resolvable (test path / partial daemon boot)
  - `ResponseResult.error` set (cost_ladder_halted /
    operational_state_paused / sdk_5xx / ...)
  - Engine itself raises (defensive; recorded as
    `engine_exception:<class>`)
  - Engine returns empty text on success (defensive;
    `empty_response_text`)
- Reply failure does NOT crash the inbound handler — Slack
always gets 200 OK (preserves the KR-FEAT-SLACK-DM ST2
belt-and-suspenders guarantee).
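
The fallback decision above can be sketched as a single function that never raises. The fallback string and the error codes are taken from this PR; `choose_reply` and the reduced `ResponseResult` dataclass are hypothetical stand-ins, and `respond` is shown as a synchronous call for brevity even though the real engine may be asynchronous.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

CANNED_FALLBACK = "Kora is currently unable to respond; operator notified."


@dataclass
class ResponseResult:
    """Stand-in for the engine's result type; these fields are assumptions."""
    text: str = ""
    error: Optional[str] = None


def choose_reply(engine: Optional[Any], message: Any, context: Any) -> Tuple[str, Optional[str]]:
    """Return (reply_text, reasoning_error) without ever raising."""
    if engine is None:
        return CANNED_FALLBACK, "engine_unavailable"
    try:
        result = engine.respond(message, context)
    except Exception as exc:  # defensive: the inbound handler must not crash
        return CANNED_FALLBACK, f"engine_exception:{type(exc).__name__}"
    if result.error:
        # e.g. cost_ladder_halted / operational_state_paused / sdk_5xx
        return CANNED_FALLBACK, result.error
    if not result.text.strip():
        return CANNED_FALLBACK, "empty_response_text"
    return result.text, None
```

Returning the error code alongside the text lets the caller write `reasoning_error` to the JSONL entry without re-deriving why the fallback fired.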
## JSONL outbound schema extended
5 new optional fields on outbound entries (backwards-compatible —
pre-ST2 entries simply omit them):
- `model_used`: e.g. `"claude-opus-4-7"`
- `input_tokens` / `output_tokens`: from SDK usage
- `reasoning_duration_ms`: engine-side wall-clock
- `reasoning_error`: stable error code (None on success)
Each field is written only when non-None, so canned-fallback
entries that lack reasoning data stay lean.
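
One way to get the "written only when non-None" behaviour is to filter the optional fields before serializing. The five field names below are from this PR; `build_outbound_entry` and the `base` dictionary are hypothetical, shown only to illustrate the omission rule.

```python
import json
from typing import Any, Dict, Optional


def build_outbound_entry(
    base: Dict[str, Any],
    *,
    model_used: Optional[str] = None,
    input_tokens: Optional[int] = None,
    output_tokens: Optional[int] = None,
    reasoning_duration_ms: Optional[int] = None,
    reasoning_error: Optional[str] = None,
) -> str:
    """Serialize one JSONL line, omitting reasoning fields that are None."""
    optional = {
        "model_used": model_used,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "reasoning_duration_ms": reasoning_duration_ms,
        "reasoning_error": reasoning_error,
    }
    entry = dict(base)  # copy so the caller's dict is not mutated
    # Test against None rather than truthiness so a legitimate 0 is kept.
    entry.update({k: v for k, v in optional.items() if v is not None})
    return json.dumps(entry)
```

Readers of older entries can use `entry.get("model_used")` and treat a missing key the same as a pre-ST2 line.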
## Cost-ladder write integration
After a successful reasoning call (NOT canned fallback), the
handler builds `CanonicalUsage(input_tokens, output_tokens)`
and calls `holder.record_inference(canonical_usage,
model_name=model_name, provider="anthropic")`; `model_name` and
`provider` are keyword-only per the signature below. Fail-soft layers:
- No cost holder initialized → silently skip (test paths)
- Zero tokens → skip (no real inference happened)
- `record_inference` itself is fail-soft per its docstring
(pricing-lookup miss accumulates 0)
K-DG verified literal API: `record_inference(canonical_usage, *,
model_name, provider, base_url)` per
`agent/cost_state_holder.py:272`; `CanonicalUsage` shape per
`agent/usage_pricing.py:30` (input_tokens / output_tokens /
cache_read_tokens / cache_write_tokens / reasoning_tokens /
request_count / raw_usage).
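
A rough sketch of the fail-soft write path, assuming the keyword-only signature quoted above. `record_usage` is a hypothetical wrapper, the `CanonicalUsage` dataclass here is a reduced stand-in for the real one in `agent/usage_pricing.py`, and "zero tokens" is read as both counts being zero, which is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class CanonicalUsage:
    """Reduced stand-in; the real class carries more fields."""
    input_tokens: int = 0
    output_tokens: int = 0


def record_usage(
    holder: Optional[Any],
    *,
    input_tokens: Optional[int],
    output_tokens: Optional[int],
    model_name: str,
) -> bool:
    """Record one inference on the cost holder; return True if a write happened."""
    if holder is None:
        return False  # no cost holder initialized (test paths)
    in_tok = input_tokens or 0
    out_tok = output_tokens or 0
    if in_tok == 0 and out_tok == 0:
        return False  # no real inference happened
    usage = CanonicalUsage(input_tokens=in_tok, output_tokens=out_tok)
    # model_name and provider are passed by keyword to match the quoted signature.
    holder.record_inference(usage, model_name=model_name, provider="anthropic")
    return True
```

The boolean return is only a convenience for tests; the handler can ignore it.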
## Tests (57 new, 332 total all passing)
**`test_reasoning_engine_listener.py`** (8 tests):
- current_reasoning_engine() None pre-startup
- Startup with injected engine sets singleton
- Shutdown clears singleton + closes engine
- Shutdown safe when engine.close() raises
- Shutdown safe without prior startup (no-op)
- **Production startup re-raises on misconfig** (fail-CLOSED
asserted — both credential envs unset → coordinator boot abort)
- Registered in LISTENER_REGISTRY as "reasoning_engine"
- _set_singleton / _clear_singleton helpers tested
**`test_slack_dm_reply.py`** updated/extended (15 ST2 tests):
- ST1 echo-format tests rewritten to assert engine's response
text (not echo construction)
- `test_oauth_wins_over_api_key_when_both_set` — Ruling 1 verified
- Engine unavailable → canned fallback + reasoning_error="engine_unavailable"
- Engine returns error (cost_ladder_halted / operational_state_paused)
→ canned fallback + recorded error code
- Engine raises exception → canned fallback +
reasoning_error="engine_exception:<class>"
- Empty engine text on success → canned fallback + "empty_response_text"
- Engine receives correct IncomingMessage metadata (source /
channel_id / thread_ts / user_id / event_ts)
- **Cost-ladder write called on success** with correct
CanonicalUsage + model_name + provider="anthropic"
- Cost-ladder write SKIPPED on canned fallback (no real inference)
- Cost-ladder gracefully skips when holder uninitialized
## §5 ship checklist
- [x] Base `feature/phase2-upgrades`
- [x] Title `feat(kora): KR-FEAT-AI-RESPONSE-LOOP ST2 — handler swap + cred priority flip + anthropic runtime dep`
- [x] §4 PM-opens carried from ST1 (all defaults locked)
- [x] Credential cascade flipped OAuth-first per Ruling 1
- [x] anthropic moved to runtime deps per Ruling 2
- [x] API key NEVER logged (existing ST1 diverse-failure test
still holds — credential value never in any error / log path)
- [x] Cost-ladder write integration verified (new test)
- [x] PAUSED state respected (engine refuses; handler sends canned)
- [x] Engine startup failure → daemon fails-CLOSED (re-raise asserted)
- [x] Engine failure paths → canned fallback (5 distinct paths tested)
- [x] K-DG: literal field names pasted from grep:
`record_inference(canonical_usage, *, model_name, provider, base_url)`,
`CanonicalUsage` shape, `get_cost_holder()` accessor,
`_HOLDER` reset helper
- [x] Tests pass locally (**332/332** across full suite)
## What's next
After ST2 merges, **Phase 2 Feature 5 closes**. Joshua DMs Kora,
Kora reasons (real Anthropic API call against the Max plan pool,
cost-ladder-aware model selection, operational-state-respecting,
10-turn-context-aware), Kora replies intelligently.
PM picks next bucket — candidates:
- **KR-FEAT-AGENTIC-REASONING**: Kora can call kora__* tools inside
her reasoning (look up ledger, check sea_tickets) for richer
responses
- **KR-MCP-SEND-TOOLS**: expose kora__send_slack_dm + kora__send_email
via /mcp for agent-driven sends
- **Substrate-backed conversation memory**: persist context to
IsoKron substrate, not just JSONL
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rafe-walker added a commit that referenced this pull request on May 23, 2026:
…ty flip + anthropic runtime dep (#131)

Re-opened replacement for PR #129 (closed unmerged due to pyproject conflict + PM-tooling race; see feedback_pm_merge_then_delete_race memory).

Phase 2 Feature 5 CLOSES. Joshua DMs Kora → Kora reasons → Kora replies intelligently.

Three rulings landed:
1. Cascade flipped OAuth-first: CLAUDE_CODE_OAUTH_TOKEN → KORA_ANTHROPIC_API_KEY → fail-CLOSED. Test test_oauth_wins_over_api_key_when_both_set asserts priority.
2. anthropic==0.86.0 moved to core deps. Same lesson as slowapi promotion + aiosmtplib runtime placement.
3. §4 PM-opens carried from ST1.

Wired:
- NEW reasoning_engine_listener.py with current_reasoning_engine() accessor.
- Startup failure RE-RAISES so coordinator aborts boot per fail-CLOSED spec.
- Handler swaps echo for engine.respond(IncomingMessage, ConversationContext).
- 5 distinct canned-fallback paths.
- JSONL extended with model_used + input_tokens + output_tokens + reasoning_duration_ms + reasoning_error.
- Cost-ladder write integration verified.

332/332 tests.
## Summary
Closes Phase 2 Feature 5's biggest user-facing milestone. Joshua DMs Kora → Kora reasons (real LLM call, system-prompt-driven, context-aware, cost-ladder-aware) → Kora replies with the reasoning output. Echo path is gone.
Bucket spec: `kora_docs/17_cc_bucket_prompts/KR-FEAT-AI-RESPONSE-LOOP_kora_thinks.md`.
Base: `feature/phase2-upgrades` — NOT main.
## Three PM rulings baked in
**Ruling 1: Credential cascade FLIPPED to OAuth-first.**
ST1 shipped `KORA_ANTHROPIC_API_KEY → CLAUDE_CODE_OAUTH_TOKEN`. ST2 flips to `CLAUDE_CODE_OAUTH_TOKEN → KORA_ANTHROPIC_API_KEY → fail-CLOSED`. Joshua's Max 20x + post-May-15 SDK billing split route the $200/mo Agent SDK pool via OAuth. OAuth = production; API key = test/dev. Test renamed to `test_oauth_wins_over_api_key_when_both_set`.
**Ruling 2: `anthropic` to runtime deps.** Same lesson as KR-SLOWAPI-DEP-FIX + aiosmtplib promotion. The reasoning_engine listener imports anthropic unconditionally at boot. Both prior locations (`[anthropic]` standalone extra + ST1's `[web]` addition) removed; single declaration in core dependencies block.
**Ruling 3: §4 PM-opens locked from ST1.** Q1 system prompt / Q2 10-turn context / Q3 NO retry / Q4 PAUSED gating all carry through.
## New module
`kora_cli/listeners/reasoning_engine_listener.py` (~135 lines).
## Handler swap
`kora_cli/handlers/slack_dm_handler.py` (+350 lines).
## JSONL outbound schema extended
5 new optional fields, backwards-compatible (pre-ST2 entries omit them). Each field is written only when non-None, so canned-fallback entries stay lean.
## Cost-ladder write integration
After successful reasoning (NOT canned), the handler builds `CanonicalUsage(input_tokens, output_tokens)` and calls `holder.record_inference(canonical_usage, model_name=model_name, provider="anthropic")`; `model_name` and `provider` are keyword-only per the signature below. The write is fail-soft.
K-DG verified literal API: `record_inference(canonical_usage, *, model_name, provider, base_url)` at `agent/cost_state_holder.py:272`; `CanonicalUsage` shape at `agent/usage_pricing.py:30`; accessor is `get_cost_holder()` not `get_holder()`; `_reset_cost_holder_for_tests()` helper used by the new write-integration tests.
## Tests (57 new, 332 total all passing)
- `test_reasoning_engine_listener.py` (8 tests)
- `test_slack_dm_reply.py` updated/extended (15 ST2 tests)
- `test_anthropic_engine.py`: `test_oauth_wins_over_api_key_when_both_set` (Ruling 1 verification)
## §5 ship checklist
## Phase 2 Feature 5 closes
After this merges, Joshua DMs Kora and Kora actually thinks before responding — biggest user-facing milestone of Phase 2. Cost-ladder-aware model downshift (opus→sonnet→haiku→halt), operational-state-respecting (PAUSED→canned), thread-context-aware (last 10 turns), system-prompt-shaped voice.
PM picks next bucket from: KR-FEAT-AGENTIC-REASONING (tool use inside reasoning) / KR-MCP-SEND-TOOLS (kora__send_slack_dm via /mcp) / substrate-backed conversation memory (persist context to IsoKron).
🤖 Generated with Claude Code