feat(kora): KR-FEAT-AI-RESPONSE-LOOP ST1 — ReasoningEngine + Anthropic impl + context loader by rafe-walker · Pull Request #126 · rafe-walker/kora

rafe-walker · 2026-05-22T17:32:23Z

Summary

First ST of Phase 2's biggest user-facing milestone. Lays the clean reasoning surface (Protocol + value classes + Anthropic impl + JSONL context loader + operator-editable system prompt) that ST2 will wire into the Slack DM handler to replace the echo.

NOT yet plugged into any handler — that's ST2.

Bucket spec: `kora_docs/17_cc_bucket_prompts/KR-FEAT-AI-RESPONSE-LOOP_kora_thinks.md`.

Base: `feature/phase2-upgrades` — NOT main.

New package: `kora_cli/reasoning/`

`engine.py` (~180 lines) — `ReasoningEngine` Protocol + `IncomingMessage` / `ConversationContext` / `ConversationTurn` / `ResponseResult` value classes. Protocol over ABC for test-seam composition.
`anthropic_engine.py` (~350 lines) — `AnthropicReasoningEngine` implementing the protocol against `anthropic==0.86.0`. Two-mode credential cascade (see K-DG drift below). Cost-ladder-aware model selection. Operational-state gating. 60s per-call timeout. NO retry on 5xx per PM Q3. Credential-sanitized error mapping.
`context_loader.py` (~200 lines) — `load_slack_dm_context()` reads `${HERMES_HOME}/slack_dm_log.jsonl`, filters to (channel, thread), skips filtered/failed entries, projects to `ConversationTurn` list (oldest→newest, capped at max_turns). Resolves operational + cost state strings from holders. Defensive malformed-line handling.
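
To illustrate why a Protocol gives a convenient test seam, here is a minimal sketch. The class names come from this PR, and `current_cost_ladder_rung` is named in Drift 3 below; every other field name and the exact `respond()` signature are assumptions for illustration, not the contents of `engine.py`.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, Tuple, runtime_checkable


@dataclass(frozen=True)
class ConversationTurn:
    role: str  # assumed values: "user" or "assistant"
    text: str


@dataclass(frozen=True)
class IncomingMessage:
    channel: str
    text: str
    thread_ts: Optional[str] = None  # assumed field name


@dataclass(frozen=True)
class ConversationContext:
    turns: Tuple[ConversationTurn, ...] = ()
    operational_state: str = "unknown"          # assumed field name
    current_cost_ladder_rung: str = "unknown"   # named in the PR text


@dataclass(frozen=True)
class ResponseResult:
    text: str = ""
    model_used: Optional[str] = None
    error: Optional[str] = None


@runtime_checkable
class ReasoningEngine(Protocol):
    def respond(
        self, message: IncomingMessage, context: ConversationContext
    ) -> ResponseResult: ...


class EchoEngine:
    """Hypothetical test double: satisfies the Protocol structurally,
    with no inheritance from ReasoningEngine."""

    def respond(self, message, context):
        return ResponseResult(text=f"echo: {message.text}", model_used="fake")
```

Because the check is structural, a handler test can pass in `EchoEngine()` wherever a `ReasoningEngine` is expected without touching the Anthropic SDK.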

New doc: `kora_docs/00_canonical_current_state/kora_system_prompt.md`

~150 lines, first-draft Kora identity per PM Q1 YES. Composed from project_kora memory: digital extension framing / brevity by default / no preamble or sycophancy / technical fluency assumption / operational-state respect / cost-ladder awareness / NOT-Claude-the-public-assistant boundary. Operator-editable; fail-CLOSED if missing at daemon boot.
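
A fail-CLOSED prompt load might look roughly like the sketch below. The exception name `SystemPromptUnavailable` and the function name are hypothetical; only the behavior (refuse to start on a missing, empty, or unreadable prompt) comes from this PR.

```python
from pathlib import Path


class SystemPromptUnavailable(RuntimeError):
    """Hypothetical exception name, used only in this sketch."""


def load_system_prompt(path: Path) -> str:
    """Return the prompt text, or raise so the daemon refuses to boot."""
    try:
        text = path.read_text(encoding="utf-8")
    except OSError as exc:
        # Covers both a missing file and a permission problem.
        raise SystemPromptUnavailable(f"system prompt unreadable: {path}") from exc
    if not text.strip():
        # An empty or whitespace-only prompt is treated like a missing one.
        raise SystemPromptUnavailable(f"system prompt empty: {path}")
    return text
```

Raising at load time rather than substituting a default keeps an operator edit that accidentally blanks the file from silently changing Kora's identity.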

K-DG drift findings (surfaced + acted on)

Drift 1 — credential env name conflict. Bucket spec said `KORA_ANTHROPIC_API_KEY` only. Existing env-mapping (PR #110) + gate-2 anti-secrets + "Max plan via Agent SDK billing" framing all point to `CLAUDE_CODE_OAUTH_TOKEN`. Implementation supports BOTH via cascade:

`KORA_ANTHROPIC_API_KEY` (if set) → SDK `api_key=...` (Console billing path)
`CLAUDE_CODE_OAUTH_TOKEN` (fallback) → SDK `auth_token=...` (Max plan path)
Both unset → `ReasoningEngineNotConfigured` fail-CLOSED

Operator chooses; existing OAuth secret in `kora-runtime-anthropic` works without any new provisioning. PM can tighten to single env path in a follow-up if preferred.
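
The cascade above can be sketched as a small pure function. The env var names, the `api_key` / `auth_token` keyword names, and the exception name come from this PR; the function name `resolve_client_kwargs` is hypothetical, and whitespace-only values are treated as unset to match the construction tests listed further down.

```python
from typing import Dict, Mapping


class ReasoningEngineNotConfigured(RuntimeError):
    """Raised when neither credential is available (fail-CLOSED)."""


def resolve_client_kwargs(env: Mapping[str, str]) -> Dict[str, str]:
    """Pick the SDK constructor kwargs from the environment.

    Hypothetical helper: the real engine may structure this differently.
    """
    api_key = (env.get("KORA_ANTHROPIC_API_KEY") or "").strip()
    if api_key:
        return {"api_key": api_key}  # Console billing path wins when set
    token = (env.get("CLAUDE_CODE_OAUTH_TOKEN") or "").strip()
    if token:
        return {"auth_token": token}  # Max plan path
    # The message deliberately contains no credential material.
    raise ReasoningEngineNotConfigured("no Anthropic credential configured")
```

Under these assumptions the engine would construct its client with something like `anthropic.Anthropic(**resolve_client_kwargs(os.environ))`.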

Drift 2 — `anthropic` lib was extra-only. Moved to `[web]` extra alongside fastapi/uvicorn/slowapi so daemon deploys don't need a separate `pip install -e ".[anthropic]"` invocation.

Drift 3 — CostRung enum value naming. Bucket spec used `normal / warned / constrained / halted`; actual enum is `NORMAL / WARN_75 / DOWNSHIFT_90 / HARD_STOP_100` per `agent/cost_state_holder.py:114-117`. Engine consumes canonical `.value` strings via `ConversationContext.current_cost_ladder_rung` literal type.

Drift 4 — accessor shapes. `holder.active_rung()` is a method, not a property (`agent/cost_state_holder.py:259`); the accessor is `get_cost_holder()` (`agent/cost_state_holder.py:502`), not `get_holder()`. Verified by grep before implementation.

Cost-ladder model selection

| CostRung | Model |
|---|---|
| `normal` | `claude-opus-4-7` |
| `warn_75` | `claude-sonnet-4-6` |
| `downshift_90` | `claude-haiku-4-5-20251001` |
| `hard_stop_100` | refuse (no API call; `error="cost_ladder_halted"`) |
| `unknown` (defensive) | opus + WARN log |
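
The mapping above can be expressed as a lookup with an explicit refusal case. This is a sketch: the rung strings and model ids are taken from the table, while `select_model`, `MODEL_BY_RUNG`, and the logger name are hypothetical.

```python
import logging
from typing import Optional

logger = logging.getLogger("kora.reasoning.sketch")  # hypothetical logger name

MODEL_BY_RUNG = {
    "normal": "claude-opus-4-7",
    "warn_75": "claude-sonnet-4-6",
    "downshift_90": "claude-haiku-4-5-20251001",
}
DEFAULT_MODEL = "claude-opus-4-7"


def select_model(rung: str) -> Optional[str]:
    """Return the model id for a rung, or None when the engine must refuse."""
    if rung == "hard_stop_100":
        return None  # caller reports error="cost_ladder_halted", no API call
    model = MODEL_BY_RUNG.get(rung)
    if model is None:
        # Covers the literal "unknown" rung and any unrecognized string.
        logger.warning("unrecognized cost rung %r; using %s", rung, DEFAULT_MODEL)
        return DEFAULT_MODEL
    return model
```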

Error code taxonomy

`sdk_auth` (401/403) / `sdk_rate_limited` (429) / `sdk_5xx` / `sdk_4xx_<code>` / `sdk_timeout` / `sdk_transport` / `sdk_unknown_<class>` / `cost_ladder_halted` / `operational_state_paused` / `response_projection_failed`. Stable across the codebase.
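
One way to produce these codes without leaking credentials is to derive them only from the HTTP status and the exception class name, never from the exception message. The sketch below uses Python built-in exceptions as stand-ins for the SDK's error classes, which is an assumption; `classify_failure` is a hypothetical name.

```python
from typing import Optional


def classify_failure(exc: BaseException, status: Optional[int] = None) -> str:
    """Map a failure to a stable error code.

    Only the status code and the exception *class name* are used, so any
    secret that ends up in str(exc) cannot reach the returned code.
    """
    if status is not None:
        if status in (401, 403):
            return "sdk_auth"
        if status == 429:
            return "sdk_rate_limited"
        if 500 <= status <= 599:
            return "sdk_5xx"
        if 400 <= status <= 499:
            return f"sdk_4xx_{status}"
    if isinstance(exc, TimeoutError):
        return "sdk_timeout"
    if isinstance(exc, ConnectionError):
        return "sdk_transport"
    return f"sdk_unknown_{type(exc).__name__}"
```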


§4 PM-opens — all defaults accepted

Q1: YES first-draft system prompt — composed
Q2: 10-turn context — `DEFAULT_MAX_TURNS=10` in loader
Q3: NO retry on 5xx — single SDK call (asserted by test)
Q4: PAUSED/STOPPED gating — implemented + tested
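
Q3 and Q4 together amount to "gate first, then call exactly once". A minimal sketch, assuming the gates run in the order shown and that a non-paused state is spelled `"running"` (both assumptions); `respond_once` and `Result` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Result:
    text: str = ""
    error: Optional[str] = None


def respond_once(
    operational_state: str, rung: str, call_model: Callable[[], str]
) -> Result:
    """Apply both gates, then make a single attempt with no retry loop."""
    if operational_state.lower() in {"paused", "stopped"}:
        return Result(error="operational_state_paused")  # no API call
    if rung == "hard_stop_100":
        return Result(error="cost_ladder_halted")  # no API call
    try:
        return Result(text=call_model())
    except Exception as exc:
        # Failure becomes a value, not a crash; the call is not repeated.
        return Result(error=f"sdk_unknown_{type(exc).__name__}")
```

Counting invocations of the injected `call_model` is then enough to assert both "no SDK call while paused" and "single SDK call on failure".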

Tests (42 new; 275 total, all passing)

`test_anthropic_engine.py` (24 tests):

Construction fail-CLOSED on both creds unset; API key only; OAuth only; API key wins; whitespace-only treated as unset
System prompt fail-CLOSED on missing/empty/unreadable; env-path override
Cost-ladder: NORMAL→opus, WARN_75→sonnet, DOWNSHIFT_90→haiku, HARD_STOP_100→refuse, unknown→opus + WARN
Operational-state gating: paused/stopped refuse, no SDK call
Happy path: text + tokens + duration in ResponseResult
Message history: alternating roles preserved; consecutive same-role concatenated; system prompt passed to SDK
SDK error mapping (6 modes: 401/429/500/400/timeout/unknown)
NO retry on 5xx or 429 (single SDK call asserted)
SECURITY: credential NEVER in error codes / text / model_used / log lines across 6 failure modes
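
The message-history behavior tested above (alternating roles preserved, consecutive same-role turns concatenated) can be sketched as follows. The `(role, text)` tuple shape and the blank-line separator are assumptions; `build_messages` is a hypothetical name.

```python
from typing import Dict, Iterable, List, Tuple


def build_messages(turns: Iterable[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Project (role, text) turns into role/content dicts, merging runs
    of the same role so roles strictly alternate in the output."""
    messages: List[Dict[str, str]] = []
    for role, text in turns:
        if messages and messages[-1]["role"] == role:
            # Same speaker twice in a row: fold into the previous message.
            messages[-1]["content"] += "\n\n" + text
        else:
            messages.append({"role": role, "content": text})
    return messages
```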

`test_context_loader.py` (18 tests):

Missing/empty file → empty context with "unknown" state strings
Channel filtering, thread-ts exact match, None-thread isolation
Filtered inbound entries skipped (filtered_non_joshua / dropped_paused / handler_error)
Failed outbound entries skipped (`send_status: failed`)
Turns ordered oldest→newest
max_turns slicing keeps most-recent
`DEFAULT_MAX_TURNS == 10` asserted
Malformed JSON lines logged + skipped; valid lines still loaded
Operational state surfaces from initialized holder
Log path env override
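
The loader behaviors listed above can be sketched in a few lines. This is not `load_slack_dm_context()` itself: it omits the operational and cost state lookup, and the JSONL field names `channel`, `thread_ts`, `status`, and `ts` are assumptions (only `send_status` and the filtered status values appear in this PR).

```python
import json
import logging
from pathlib import Path
from typing import Any, Dict, List, Optional

logger = logging.getLogger("kora.context.sketch")  # hypothetical logger name

DEFAULT_MAX_TURNS = 10
SKIPPED_STATUSES = {"filtered_non_joshua", "dropped_paused", "handler_error"}


def load_turns(
    log_path: Path,
    channel: str,
    thread_ts: Optional[str],
    max_turns: int = DEFAULT_MAX_TURNS,
) -> List[Dict[str, Any]]:
    """Return the most recent matching entries, oldest first."""
    if not log_path.exists():
        return []
    turns: List[Dict[str, Any]] = []
    with log_path.open(encoding="utf-8") as fh:
        for lineno, raw in enumerate(fh, 1):
            line = raw.strip()
            if not line:
                continue
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                logger.warning("skipping malformed line %d in %s", lineno, log_path)
                continue
            if not isinstance(entry, dict):
                continue
            # Exact match on both keys; a None thread only matches None.
            if entry.get("channel") != channel or entry.get("thread_ts") != thread_ts:
                continue
            if entry.get("status") in SKIPPED_STATUSES:
                continue
            if entry.get("send_status") == "failed":
                continue
            turns.append(entry)
    turns.sort(key=lambda e: e.get("ts", 0.0))
    return turns[-max_turns:] if max_turns > 0 else []
```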

§5 ship checklist

- [x] Base `feature/phase2-upgrades`
- [x] Title `feat(kora): KR-FEAT-AI-RESPONSE-LOOP STn — <scope>`
- [x] §4 questions resolved
- [x] `anthropic` dep present in pyproject (moved into `[web]` extra)
- [x] API key NEVER logged (asserted across 6 failure modes)
- [x] Cost-ladder rung respected on every reasoning call
- [x] PAUSED state respected (test for both PAUSED + STOPPED)
- [x] Engine failure → ResponseResult.error (no crash; canned fallback in ST2)
- [x] K-DG: literal field names pasted from grep; `.current` (`@property`) vs `active_rung()` (method) caught; CostRung values verified
- [x] Tests pass locally (275/275 across full suite)

What's next
ST2 — wire reasoning into the Slack DM handler:

New `kora_cli/listeners/reasoning_engine_listener.py` with `current_reasoning_engine()` accessor mirroring the `current_pool()` pattern
Update `SlackDMHandler` to call `reasoning_engine.respond()` instead of constructing the echo text
Extend outbound JSONL schema with `model_used` / `input_tokens` / `output_tokens` / `reasoning_duration_ms` / `reasoning_error`
Engine failure → canned fallback ("Kora is currently unable to respond; operator notified") — NOT a re-echo, NOT a crash
Cost-ladder write integration: `record_inference()` with CanonicalUsage built from the response
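
As a rough picture of the planned schema extension, the sketch below adds the five reasoning fields to an existing outbound record before it is serialized as one JSONL line. The field names come from the list above; the helper name and the shape of the base record are assumptions, and ST2 may implement this differently.

```python
import json
from typing import Any, Dict, Optional


def extend_outbound_record(
    record: Dict[str, Any],
    *,
    model_used: Optional[str] = None,
    input_tokens: Optional[int] = None,
    output_tokens: Optional[int] = None,
    reasoning_duration_ms: Optional[int] = None,
    reasoning_error: Optional[str] = None,
) -> Dict[str, Any]:
    """Return a copy of the record with the reasoning fields attached."""
    extended = dict(record)  # leave the caller's dict untouched
    extended.update(
        model_used=model_used,
        input_tokens=input_tokens,
        output_tokens=output_tokens,
        reasoning_duration_ms=reasoning_duration_ms,
        reasoning_error=reasoning_error,
    )
    return extended


def to_jsonl_line(record: Dict[str, Any]) -> str:
    """Serialize one record as a single newline-terminated JSON line."""
    return json.dumps(record, ensure_ascii=False) + "\n"
```

Writing all five keys on every record, with `null` where a value is absent, keeps the log uniform for the context loader and any later panel that reads it.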

🤖 Generated with Claude Code

…c impl + context loader

First ST of Feature 5's biggest user-facing milestone. Lays the clean reasoning surface (Protocol + value classes + Anthropic impl + JSONL context loader + operator-editable system prompt) that ST2 will wire into the Slack DM handler to replace the echo. NOT yet plugged into any handler — that's ST2.

## New package

**`kora_cli/reasoning/`**:

- `engine.py` (~180 lines) — `ReasoningEngine` Protocol + `IncomingMessage` / `ConversationContext` / `ConversationTurn` / `ResponseResult` value classes. Protocol over ABC for test-seam composition.
- `anthropic_engine.py` (~350 lines) — `AnthropicReasoningEngine` implementing the protocol against `anthropic==0.86.0`. Two-mode credential cascade (see K-DG drift below). Cost-ladder-aware model selection. Operational-state gating. 60s per-call timeout. NO retry on 5xx per PM Q3 default. Credential-sanitized error mapping.
- `context_loader.py` (~200 lines) — `load_slack_dm_context()` reads `${HERMES_HOME}/slack_dm_log.jsonl`, filters to (channel, thread), skips filtered/failed entries, projects to `ConversationTurn` list (oldest→newest, capped at max_turns). Resolves operational-state + cost-rung strings from holders. Defensive parsing (malformed lines logged + skipped).

**`kora_docs/00_canonical_current_state/kora_system_prompt.md`** (~150 lines) — first-draft Kora identity prompt per PM Q1 YES. Authored from project_kora memory: digital extension framing, brevity-by-default, no preamble/sycophancy, technical fluency assumption, operational-state respect, cost-ladder awareness, NOT-Claude-the-public-assistant boundary. Operator-editable; fail-CLOSED if missing at daemon boot.

**`[web]` extra extended** with `anthropic==0.86.0` so daemon deploys auto-pick up the SDK. Drift 2 resolution.

## K-DG drift findings (surfaced to PM + acted on)

**Drift 1 — credential env name conflict**: bucket spec said `KORA_ANTHROPIC_API_KEY`; existing env-mapping (PR #110) + gate-2 anti-secrets + "Max plan via Agent SDK billing" framing all point to `CLAUDE_CODE_OAUTH_TOKEN`. **Implementation supports BOTH via cascade**:

- `KORA_ANTHROPIC_API_KEY` (if set) → SDK `api_key=...` (Console billing)
- `CLAUDE_CODE_OAUTH_TOKEN` (fallback) → SDK `auth_token=...` (Max plan)
- Both unset → `ReasoningEngineNotConfigured` fail-CLOSED

Operator chooses; existing OAuth token works without any new Doppler secret. PM may tighten to a single env path in a follow-up if preferred.

**Drift 2 — `anthropic` lib was extra-only**: moved to `[web]` extra alongside fastapi/uvicorn/slowapi so daemon deploys don't need a separate `pip install -e ".[anthropic]"` invocation.

**Drift 3 — CostRung enum value naming**: bucket spec used `normal / warned / constrained / halted`; actual enum is `NORMAL / WARN_75 / DOWNSHIFT_90 / HARD_STOP_100` (`agent/cost_state_holder.py:114-117`). Engine consumes the canonical `.value` strings via `ConversationContext.current_cost_ladder_rung` literal type: `"normal" / "warn_75" / "downshift_90" / "hard_stop_100" / "unknown"`.

**Drift 4 — `holder.active_rung()` is a method, not a property**; `cost_state_holder.get_cost_holder()` (not `get_holder`). All verified via grep at `agent/cost_state_holder.py:259`, `:502`.

## Cost-ladder model selection (per R4.1 §9.6 rung mapping)

| CostRung | Model used |
|---|---|
| `normal` | `claude-opus-4-7` |
| `warn_75` | `claude-sonnet-4-6` |
| `downshift_90` | `claude-haiku-4-5-20251001` |
| `hard_stop_100` | refuse — no API call; `error="cost_ladder_halted"` |
| `unknown` (defensive) | `claude-opus-4-7` + WARN log |

## Operational-state gating (per PM Q4 YES default)

`PAUSED` or `STOPPED` → refuse, no API call; `error="operational_state_paused"`. The handler maps this to a canned acknowledgment so Joshua isn't met with silence.

## Error code taxonomy (stable across the codebase)

`sdk_auth` (401/403) / `sdk_rate_limited` (429) / `sdk_5xx` / `sdk_4xx_<code>` / `sdk_timeout` / `sdk_transport` / `sdk_unknown_<class>` / `cost_ladder_halted` / `operational_state_paused` / `response_projection_failed`.

## §4 PM-opens — all defaults accepted

- Q1: YES first-draft system prompt — composed (~150 lines)
- Q2: 10-turn context — default in loader (`DEFAULT_MAX_TURNS=10`)
- Q3: NO retry on 5xx — single SDK call exactly (tested)
- Q4: PAUSED/STOPPED gating — implemented + tested

## Tests (42 new, 275 total all passing)

**`test_anthropic_engine.py`** (24 tests):

- Construction: fail-CLOSED on both creds unset; API key only; OAuth only; API key wins when both set; whitespace-only treated as unset
- System prompt: fail-CLOSED on missing/empty/unreadable; env-path override works
- Cost-ladder: NORMAL→opus, WARN_75→sonnet, DOWNSHIFT_90→haiku, HARD_STOP_100→refuse (no SDK call), unknown→opus default + WARN
- Operational-state gating: paused/stopped refuse (no SDK call)
- Happy path: successful call returns text + tokens + duration
- Message history: alternating roles preserved; consecutive same-role concatenated; system prompt passed to SDK
- SDK error mapping: 401/429/500/400/timeout/unknown — all 6 distinct
- NO retry on 5xx or 429 (single SDK call asserted)
- **SECURITY**: diverse failure-mode sequence (401/429/500/timeout/transport/unknown) — credential env value NEVER in error codes, text, model_used, or any log line

**`test_context_loader.py`** (18 tests):

- Missing/empty file → empty context + "unknown" state strings
- Channel filtering (same/diff channel)
- Thread filtering (exact match; None matches None)
- Filtered inbound entries skipped (filtered_non_joshua / dropped_paused / handler_error)
- Failed outbound entries skipped (send_status: failed)
- Turns ordered oldest→newest (sort by timestamp)
- max_turns slices most-recent
- DEFAULT_MAX_TURNS == 10 (asserted)
- Malformed JSON lines logged + skipped, valid lines still loaded
- Operational state surfaces from initialized holder (paused)
- Log path env override

## §5 ship checklist

- [x] Base `feature/phase2-upgrades`
- [x] Title format `feat(kora): KR-FEAT-AI-RESPONSE-LOOP STn — <scope>`
- [x] §4 questions resolved (all defaults accepted)
- [x] anthropic dep moved into `[web]` extra
- [x] API key NEVER logged (asserted across 6 failure modes)
- [x] Cost-ladder rung respected (5 enum values tested)
- [x] PAUSED state respected (tested for both PAUSED + STOPPED)
- [x] Engine failure paths return ResponseResult.error (no crash)
- [x] K-DG verified: literal field names pasted from grep; `.current` (property) vs `active_rung()` (method) caught; CostRung values verified against agent/cost_state_holder.py
- [x] Tests pass locally (**275/275** across full suite)

## What's next

**ST2** — wire reasoning into the Slack DM handler:

1. New `kora_cli/listeners/reasoning_engine_listener.py` registering via `register_daemon_listener("reasoning_engine", ...)` mirroring the `current_pool()` pattern from KR-MCP-CONSUMPTION ST1 with a `current_reasoning_engine()` accessor.
2. Update `SlackDMHandler` to call the reasoning engine instead of constructing the echo text.
3. Extend outbound JSONL schema with `model_used`, `input_tokens`, `output_tokens`, `reasoning_duration_ms`, `reasoning_error`.
4. Engine failure → canned fallback ("Kora is currently unable to respond; operator notified") — NOT a re-echo, NOT a crash.
5. Cost-ladder write integration: call `record_inference()` with the canonical usage from the response.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rafe-walker merged commit 0fc9a03 into feature/phase2-upgrades May 22, 2026

rafe-walker deleted the feat/kora-KR-FEAT-AI-RESPONSE-LOOP-ST1 branch May 22, 2026 23:49

This was referenced May 23, 2026

feat(kora): KR-REASONING-PANEL — Kora reasoning activity lens (stub) #132

Merged

feat(kora): KR-ALERTS-PANEL-FLIP — aggregate real alerts from existing sources #145

Merged

rafe-walker merged 1 commit into feature/phase2-upgrades from feat/kora-KR-FEAT-AI-RESPONSE-LOOP-ST1
