feat(kora): KR-REASONING-ROUTE-THROUGH-GATEWAY-ST2 — actual wire-up by rafe-walker · Pull Request #178 · rafe-walker/kora

rafe-walker · 2026-05-24T04:14:41Z

Summary

ST2 of 2. Replaces ST1's NotImplementedError scaffold with the actual AIAgent.run_conversation route-through. KORA_REASONING_USE_GATEWAY toggle stays default OFF; ST3 (tiny follow-on) will flip the default after parity validates in production.

AIAgent ctor mapping (the critical table)

| Kwarg | Value | Rationale |
|---|---|---|
| `model` | `MODEL_HAIKU` constant | Default; plugin's `pre_api_request_mutable` hook overrides per-call based on cost-router decision |
| `provider` | `"anthropic"` | Kora's only provider |
| `api_mode` | `"anthropic_messages"` | Anthropic native messages API (matches Kora's bypass) |
| `max_iterations` | `5` | PINNED to Kora's `MAX_TOOL_USE_ITERATIONS`; Hermes default 90 would quintuple monthly cost (the spec's critical pin) |
| `max_tokens` | `self._max_output_tokens` | Forwarded from engine config |
| `quiet_mode` | `True` | Daemon path; suppress stdout |
| other ~35 kwargs | Hermes defaults | Acceptable for v1: Kora doesn't customize session_id mgmt, fallback chains, providers_*, callbacks, etc. |

Post-construction overrides:

- `agent.tools = []`: bypass Hermes auto-loaded toolset (toolless v1; tool-bridge is ST2B)
- `agent.valid_tool_names = set()`
- `agent.route = derived_route`: threaded into kora_hermes plugin hooks
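The ctor mapping and the post-construction overrides can be sketched together as below. This is a minimal illustration, not the PR's code: the `AIAgent` class here is a local stand-in carrying only the kwargs from the table (the real Hermes class takes roughly 35 more), the model string is a placeholder, and `build_gateway_agent` is a hypothetical helper name.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set

MODEL_HAIKU = "haiku-placeholder"  # placeholder; the real constant lives in Kora
MAX_TOOL_USE_ITERATIONS = 5        # Kora's existing cap, per the table


@dataclass
class AIAgent:
    """Stand-in for the Hermes agent; NOT the real class or its full signature."""
    model: str
    provider: str
    api_mode: str
    max_iterations: int = 90  # Hermes default, per the table
    max_tokens: Optional[int] = None
    quiet_mode: bool = False
    # Simulates a toolset that the agent loads on its own at construction time.
    tools: List[Dict[str, Any]] = field(default_factory=lambda: [{"name": "auto_loaded"}])
    valid_tool_names: Set[str] = field(default_factory=lambda: {"auto_loaded"})
    route: str = ""


def build_gateway_agent(max_output_tokens: int, derived_route: str) -> AIAgent:
    """Hypothetical helper: build a one-shot agent with the pinned kwargs."""
    agent = AIAgent(
        model=MODEL_HAIKU,
        provider="anthropic",
        api_mode="anthropic_messages",
        max_iterations=MAX_TOOL_USE_ITERATIONS,  # the critical pin
        max_tokens=max_output_tokens,
        quiet_mode=True,
    )
    # Post-construction overrides: toolless v1, route threaded for plugin hooks.
    agent.tools = []
    agent.valid_tool_names = set()
    agent.route = derived_route
    return agent
```

Setting the overrides after construction, rather than through kwargs, matches the description above: the toolset is auto-loaded by the constructor, so it has to be cleared afterwards.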

New local Hermes extensions added

None in this PR. The tool-bridge (Kora reasoning tools → Hermes tool dispatch) would need a new local extension, most plausibly a hook like pre_tool_call_can_provide_result that lets plugins short-circuit Hermes's dispatch with a plugin-computed result. That design and implementation is non-trivial and fits cleanly in a separate KR-HERMES-LOCAL-EXT-TOOL-BRIDGE bucket. Per spec §4 (STOP-ASK on non-trivial extensions), it is deferred and flagged here so the PM can ratify the design before implementation.

Behavior parity test results

All parity tests with toggle ON pass:

| Behavior | Status |
|---|---|
| Cost-ladder default → Haiku | ✓ (verified via plugin override dict's `model` key) |
| `/opus` prefix → Opus | ✓ |
| Decision-language → Opus | ✓ |
| `KORA_FORCE_OPUS` env → Opus | ✓ |
| `api_call_count >= 2` → Opus (iteration earning signal) | ✓ |
| Caching markers on system (content-block list) + last tool | ✓ |
| Empty tools list → no caching override on tools | ✓ |
| ResponseResult projection: text/model/tokens/cache/error mapping | ✓ |
| Refuse-paths: paused / hard_stop_100 short-circuit BEFORE AIAgent ctor | ✓ |
| Route threading: slack_dm / email_inbound / mcp_tool | ✓ |
| Exception in run_conversation → `gateway_exception:<ClassName>` error | ✓ |
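The model-selection and caching rows above can be illustrated with a small hook sketch. Everything named here is an assumption made for illustration: the model strings are placeholders, `DECISION_MARKERS` is an invented stand-in for whatever decision-language detection the cost router really uses, the cost-rung input is omitted, and the function names only mirror the hook described in this PR. The `cache_control: {"type": "ephemeral"}` marker on a system content block and on the last tool follows Anthropic's prompt-caching format.

```python
import os
from typing import Any, Dict, List

MODEL_HAIKU = "haiku-placeholder"  # placeholder model identifiers
MODEL_OPUS = "opus-placeholder"
# Hypothetical markers; the real decision-language detection is not shown in the PR.
DECISION_MARKERS = ("should we", "decide", "trade-off")


def select_model(user_text: str, api_call_count: int) -> str:
    """Default to Haiku; escalate to Opus on any of the listed signals."""
    text = user_text.strip().lower()
    if os.environ.get("KORA_FORCE_OPUS"):  # assumption: any non-empty value forces Opus
        return MODEL_OPUS
    if text.startswith("/opus"):
        return MODEL_OPUS
    if any(marker in text for marker in DECISION_MARKERS):
        return MODEL_OPUS
    if api_call_count >= 2:  # iteration earning signal
        return MODEL_OPUS
    return MODEL_HAIKU


def wrap_system_as_cacheable(system: str) -> List[Dict[str, Any]]:
    """Turn a system string into a content-block list with a cache marker."""
    return [{"type": "text", "text": system, "cache_control": {"type": "ephemeral"}}]


def wrap_tools_as_cacheable(tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Copy the tool list and mark only the last tool as a cache breakpoint."""
    wrapped = [dict(tool) for tool in tools]
    if wrapped:
        wrapped[-1]["cache_control"] = {"type": "ephemeral"}
    return wrapped


def pre_api_request_mutable(
    api_kwargs: Dict[str, Any], user_text: str, api_call_count: int
) -> Dict[str, Any]:
    """Return an override dict; the 'tools' key is omitted when no tools exist."""
    override: Dict[str, Any] = {
        "model": select_model(user_text, api_call_count),
        "system": wrap_system_as_cacheable(api_kwargs.get("system", "")),
    }
    tools = api_kwargs.get("tools") or []
    if tools:
        override["tools"] = wrap_tools_as_cacheable(tools)
    return {"override": override}
```

Marking only the last tool is enough because a cache breakpoint covers the prefix up to and including the marked block.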

Bypass-path-unchanged tests (toggle OFF or unset):

- Default behavior runs the existing bypass cleanly ✓
- No spurious AIAgent construction ✓
- ResponseResult shape identical to pre-ST2 ✓
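The control flow these tests exercise (refuse-paths first, bypass when the toggle is off, otherwise a per-call agent whose blocking `run_conversation` is pushed off the event loop) might look roughly like the sketch below. It is a simplified illustration under assumptions: the toggle is passed in as a boolean instead of being read from `KORA_REASONING_USE_GATEWAY`, results are plain dicts, the `refused:<state>` error string is invented, and `respond`, `build_agent` and `bypass` are hypothetical names.

```python
import asyncio
from typing import Any, Callable, Dict

REFUSE_STATES = ("paused", "hard_stop_100")


async def respond(
    user_text: str,
    cost_state: str,
    use_gateway: bool,
    build_agent: Callable[[], Any],
    bypass: Callable[[str], Dict[str, Any]],
) -> Dict[str, Any]:
    # Refuse-paths short-circuit before any agent is constructed.
    if cost_state in REFUSE_STATES:
        return {"text": "", "error": f"refused:{cost_state}"}  # hypothetical error value

    # Toggle OFF (the default): the existing bypass runs unchanged.
    if not use_gateway:
        return bypass(user_text)

    # Toggle ON: build a one-shot agent for this call only.
    agent = build_agent()
    try:
        # run_conversation is synchronous; run it in a worker thread so the
        # daemon's event loop is not blocked.
        result = await asyncio.to_thread(agent.run_conversation, user_text)
    except Exception as exc:
        return {"text": "", "error": f"gateway_exception:{type(exc).__name__}"}
    return {"text": result.get("final_response", ""), "error": None}
```

Passing the agent factory in as a callable is what makes the "no spurious AIAgent construction" checks easy to assert in a test: the factory simply records whether it was called.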

ResponseResult projection (`run_conversation` dict → dataclass)

Per the Hermes return shape at agent/conversation_loop.py:4077-4102:

| Hermes dict key | ResponseResult field |
|---|---|
| `final_response` | `text` |
| `model` | `model_used` (with engine-side fallback) |
| `input_tokens` / `output_tokens` | same |
| `cache_write_tokens` | `cache_creation_input_tokens` (name differs: Hermes uses cache_write, Kora uses cache_creation_input per Anthropic SDK usage object naming) |
| `cache_read_tokens` | `cache_read_input_tokens` |
| `completed` + `interrupted` | `error` (`gateway_interrupted` / `gateway_incomplete` / `None`) |
| (n/a) | `tools_used = []` (ST2B populates from messages history) |
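A minimal sketch of this projection follows, assuming a simplified `ResponseResult` with only the fields named in the table. The precedence between `interrupted` and `completed` (interrupted checked first) and the zero defaults for missing token counts are assumptions; the table does not state them. `project_gateway_result` is a stand-in name for the `_project_gateway_result` helper.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ResponseResult:
    """Simplified stand-in; the real dataclass may carry more fields."""
    text: str
    model_used: str
    input_tokens: int = 0
    output_tokens: int = 0
    cache_creation_input_tokens: int = 0
    cache_read_input_tokens: int = 0
    error: Optional[str] = None
    tools_used: List[str] = field(default_factory=list)


def project_gateway_result(result: Dict[str, Any], fallback_model: str) -> ResponseResult:
    """Map a run_conversation-style dict onto ResponseResult."""
    # Assumption: an interrupted run is reported as such even if incomplete.
    if result.get("interrupted"):
        error: Optional[str] = "gateway_interrupted"
    elif not result.get("completed", False):
        error = "gateway_incomplete"
    else:
        error = None

    return ResponseResult(
        text=result.get("final_response") or "",
        model_used=result.get("model") or fallback_model,  # engine-side fallback
        input_tokens=result.get("input_tokens", 0),
        output_tokens=result.get("output_tokens", 0),
        # Name differs between the two sides: cache_write -> cache_creation_input.
        cache_creation_input_tokens=result.get("cache_write_tokens", 0),
        cache_read_input_tokens=result.get("cache_read_tokens", 0),
        error=error,
        tools_used=[],  # toolless in ST2
    )
```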

Deferred to follow-on buckets

Documented as work remaining beyond ST2:

- KR-HERMES-LOCAL-EXT-TOOL-BRIDGE: tool bridge design + impl (likely a new pre_tool_call_can_provide_result extension OR registering Kora reasoning tools as Hermes tools via PluginContext.register_tool). Without this, the toggle ON path has NO tool-use capability; that is acceptable for the parity-validation phase but blocks the default flip.
- KR-PLUGIN-CONSTITUTION: pre_tool_call constitution pre-screen wiring (depends on tool bridge).
- KR-PLUGIN-TOOL-AUDIT: post_tool_call audit emit wiring (depends on tool bridge).
- ST3: flip KORA_REASONING_USE_GATEWAY default to ON (after ST2B + parity validation in dev).

ST3 flip timing recommendation

Wait for ST2B before flipping default. Reasoning:

- ST2 ships toolless route-through. With toggle ON, Kora's reasoning loses tool-use capability: she can't fetch operational state, check ledger, look up health, etc.
- That degrades reasoning quality on questions that need tool data. Acceptable in test runs (operator can grep for expected behavior); UNACCEPTABLE for Joshua's production DMs.
- ST2B's tool bridge restores parity. After ST2B + a 24h burn-in test (operator sends ~10 representative DMs), ST3 flips default.

Recommended ST3 timing: 24-48h after ST2B merges. Not immediate.

Test plan

- All ST2 tests in tests/plugins/test_kora_hermes_plugin_st2.py pass (cost-ladder routing + caching markers + e2e gateway path + refuse-paths + route threading + interrupt mapping + exception mapping + bypass unchanged)
- 50/50 kora_hermes plugin tests (ST1 + ST2 combined) pass
- 616/616 in-scope serial regression green (reasoning + handlers + listeners + agent + plugins)
- Full repo xdist: 10598 passed, 70 failed. 48 are established baseline flakes carried forward; +22 are pre-existing tests/plugins/{memory,web} environment-dependent failures (require optional blake3 install extras not in shared test venv). Zero failures in tests/plugins/test_kora_hermes_plugin*.py or in any code path ST2 touches.
- Toggle default OFF → bypass path runs unchanged; zero production behavior change

Files changed

| File | Change |
|---|---|
| `kora_cli/reasoning/anthropic_engine.py` | `_respond_via_gateway` implementation + `_project_gateway_result` helper |
| `plugins/kora_hermes/__init__.py` | Real wiring for `_pre_api_request_mutable` (cost-router + caching) + `_post_llm_call` (structured-log marker); `_current_cost_rung` helper |
| `tests/plugins/test_kora_hermes_plugin.py` | Removed 4 ST1-specific tests pinning the NotImplementedError stub (ST2 tests cover the new behavior more comprehensively) |
| `tests/plugins/test_kora_hermes_plugin_st2.py` | NEW: 20 ST2 tests |

🤖 Generated with Claude Code

ST2 of 2. Replaces ST1's NotImplementedError scaffold with the actual AIAgent route-through. KORA_REASONING_USE_GATEWAY toggle stays default OFF; ST3 will be a tiny PR flipping the default after parity validates in production.

# What ST2 ships

(a) `_respond_via_gateway` implementation:

- Per-call AIAgent construction (one-shot reply pattern; ctor is cheap)
- **max_iterations=5 PINNED** (matches Kora's existing MAX_TOOL_USE_ITERATIONS; Hermes default 90 would quintuple monthly cost; this is the spec's critical pin)
- agent.tools=[] override (toolless v1 route-through; tool-bridge is explicit ST2B follow-on)
- agent.route set from _source_to_kora_route(message.source) so kora_hermes plugin hooks fire correctly
- run_conversation runs sync; offloaded via asyncio.to_thread so the daemon's event loop isn't blocked
- Refuse-paths (paused, hard_stop_100) preserved before AIAgent construction; toggle is behavior-neutral on these

(b) `_project_gateway_result` helper at module level:

- Maps run_conversation's dict return → ResponseResult dataclass (per agent/conversation_loop.py:4077-4102 shape)
- cache_write_tokens → cache_creation_input_tokens (name differs; Hermes uses cache_write, Kora uses cache_creation_input per Anthropic SDK usage object naming)
- completed + interrupted → error field (gateway_interrupted / gateway_incomplete / None)
- tools_used=[] (ST2B populates from messages history)

(c) kora_hermes plugin handlers, real wiring:

- `_pre_api_request_mutable`: calls cost_router.select_model_pre_call (existing #165 logic) with iteration from api_call_count + cost_rung from CostStateHolder; wraps system + last tool with cache_control: ephemeral via existing _wrap_system_as_cacheable / _wrap_tools_as_cacheable helpers (KR-CHEAP-PROMPT-CACHING #158 semantic). Returns {"override": {model, system, tools}}. Failures fail-safe: leave api_kwargs unchanged + WARN log.
- `_post_llm_call`: structured-log marker [kora.gateway.post_llm_call] with route + model. Per-call CanonicalUsage accumulation stays at the handler layer (slack_dm_handler's _record_inference_to_cost_ladder), same as bypass.
- `_pre_tool_call`, `_post_tool_call`, `_pre_tool_list_finalized`, `_on_session_start` remain no-op gated on _is_kora_call (await ST2B / dedicated buckets).

# AIAgent ctor kwarg mapping (the critical table)

| Kwarg | Value | Rationale |
|---|---|---|
| model | MODEL_HAIKU constant | Default; plugin's pre_api_request_mutable hook overrides per-call based on cost-router decision |
| provider | "anthropic" | Kora's only provider |
| api_mode | "anthropic_messages" | Anthropic native messages API (matches Kora's bypass) |
| max_iterations | **5** | PINNED to Kora's MAX_TOOL_USE_ITERATIONS; Hermes default 90 would quintuple monthly cost |
| max_tokens | self._max_output_tokens | Forwarded from engine config |
| quiet_mode | True | Daemon path; suppress print() to stdout |
| other ~35 kwargs | Hermes defaults | Acceptable for v1: Kora doesn't customize session_id mgmt, fallback chains, providers_*, callbacks, etc. |

Post-construction overrides:

- agent.tools = []  # bypass Hermes auto-loaded toolset
- agent.valid_tool_names = set()
- agent.route = derived_route  # threaded into hooks

# No new Hermes local extensions in this PR

The tool-bridge (Kora reasoning tools → Hermes tool dispatch) WOULD need a new local extension (most plausibly a hook like ``pre_tool_call_can_provide_result`` that lets plugins short-circuit Hermes's dispatch with a plugin-computed result). But that design + impl is non-trivial AND lands cleanly in a separate KR-HERMES-LOCAL-EXT-TOOL-BRIDGE bucket. Per spec §4 STOP-ASK on non-trivial extensions: deferred + flagged in PR body so PM can ratify the design before implementation.

# Entry-point callers

NO caller-side updates needed in this bucket: the engine's ``_respond_via_gateway`` reads ``message.source`` and calls ``_source_to_kora_route`` internally. The 3 source-mapped callers (slack_dm_handler, email_inbound_handler, MCP-driven) already pass IncomingMessage with the right source; the engine maps source → route automatically. For explicit-route callers (probe_wake_consumer → probe_investigation; cron → scheduled_task), today they fall through to "" route (no-op in plugin) when the toggle is on. Future KR-CALLER-ROUTE-EXPLICIT (small follow-on) can either extend _source_to_kora_route's table OR plumb route as a constructor arg through engine.respond. Out of scope for ST2; toggle stays OFF in production so this doesn't matter today.

# Behavior parity test results

ALL parity tests with toggle ON pass:

- Cost-ladder default → Haiku (verified via plugin override dict's "model" key)
- /opus prefix → Opus
- Decision-language → Opus
- Force-Opus env → Opus
- api_call_count >= 2 → Opus (iteration earning signal)
- Caching markers on system (content-block list) + last tool
- Empty tools list → no caching override on tools
- ResponseResult projection: text + model + tokens + cache tokens + error mapping (interrupted / incomplete / completed-cleanly)
- Refuse-paths: paused / hard_stop_100 short-circuit BEFORE AIAgent construction (no spurious agent build)
- Route threading: slack_dm / email_inbound / mcp_tool all propagate correctly to agent.route
- Exception in run_conversation → ResponseResult.error = "gateway_exception:<ClassName>"

Bypass-path-unchanged tests (toggle OFF or unset):

- Default behavior runs the existing bypass cleanly
- No spurious AIAgent construction (fake_class.assert_not_called)
- ResponseResult shape identical to pre-ST2

# Deferred to follow-on buckets (documented as work remaining BEYOND this ST2)

- **ST2B / KR-HERMES-LOCAL-EXT-TOOL-BRIDGE**: tool bridge design + impl (likely a new pre_tool_call result-providing extension, OR registering Kora reasoning tools as Hermes tools via PluginContext.register_tool). Without this, the toggle ON path has NO tool-use capability; acceptable for parity-validation phase but blocks default flip.
- **KR-PLUGIN-CONSTITUTION**: pre_tool_call constitution pre-screen wiring (depends on tool bridge).
- **KR-PLUGIN-TOOL-AUDIT**: post_tool_call audit emit wiring (depends on tool bridge).
- **ST3**: flip KORA_REASONING_USE_GATEWAY default to ON (after ST2B + parity validation in dev).

# ST3 flip timing recommendation

**Wait for ST2B** before flipping default. Reasoning:

- ST2 ships toolless route-through. With toggle ON, Kora's reasoning loses tool-use capability: she can't fetch operational state, check ledger, look up health, etc.
- That degrades reasoning quality on questions that need tool data. Acceptable in test runs (operator can grep for expected behavior); UNACCEPTABLE for Joshua's production DMs.
- ST2B's tool bridge restores parity. After ST2B + a 24h burn-in test (operator sends ~10 representative DMs), ST3 flips default.

Recommended ST3 timing: 24-48h after ST2B merges. Not immediate.

# Regression

616/616 in-scope (reasoning + handlers + listeners + agent + plugins) pass serially.

Full repo xdist: 10598 passed, 70 failed. 48 are the established baseline flakes carried forward from prior buckets (test_anthropic_adapter + test_backup + test_config + test_gateway_* + test_kanban_db + test_list_picker_providers + test_model_switch_* + test_startup_plugin_gating + test_web_server* family + the 5 pre-existing FE-snapshot pin drifts). The +22 failures are in tests/plugins/{memory,web}; these require optional install extras (blake3 for isokron) that aren't in the shared test venv, so they are pre-existing environment-dependent failures, not introduced by ST2. Zero failures in tests/plugins/test_kora_hermes_plugin*.py or in any code path ST2 touches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rafe-walker merged commit 9f04999 into feature/phase2-upgrades May 24, 2026

rafe-walker deleted the feat/kora-KR-REASONING-ROUTE-THROUGH-GATEWAY-ST2 branch May 24, 2026 04:17

rafe-walker mentioned this pull request May 24, 2026

feat(kora): KR-REASONING-ROUTE-THROUGH-GATEWAY-ST2B — tool-bridge #181

Merged

6 tasks
