This repository was archived by the owner on May 26, 2026. It is now read-only.

feat(kora): KR-REASONING-ROUTE-THROUGH-GATEWAY-ST2 — actual wire-up #178

Merged: rafe-walker merged 1 commit into feature/phase2-upgrades from feat/kora-KR-REASONING-ROUTE-THROUGH-GATEWAY-ST2 on May 24, 2026

@rafe-walker (Owner)

Summary

ST2 of 2. Replaces ST1's NotImplementedError scaffold with the actual AIAgent.run_conversation route-through. KORA_REASONING_USE_GATEWAY toggle stays default OFF; ST3 (tiny follow-on) will flip the default after parity validates in production.

AIAgent ctor mapping (the critical table)

| Kwarg | Value | Rationale |
|---|---|---|
| model | MODEL_HAIKU constant | Default; plugin's pre_api_request_mutable hook overrides per-call based on cost-router decision |
| provider | "anthropic" | Kora's only provider |
| api_mode | "anthropic_messages" | Anthropic native messages API (matches Kora's bypass) |
| max_iterations | 5 | PINNED to Kora's MAX_TOOL_USE_ITERATIONS; Hermes default 90 would quintuple monthly cost (the spec's critical pin) |
| max_tokens | self._max_output_tokens | Forwarded from engine config |
| quiet_mode | True | Daemon path; suppress stdout |
| other ~35 kwargs | Hermes defaults | Acceptable for v1: Kora doesn't customize session_id mgmt, fallback chains, providers_*, callbacks, etc. |

Post-construction overrides:

  • agent.tools = [] — bypass Hermes auto-loaded toolset (toolless v1; tool-bridge is ST2B)
  • agent.valid_tool_names = set()
  • agent.route = derived_route — threaded into kora_hermes plugin hooks
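A minimal sketch of the construction pattern described above, for readers who want to see the kwargs and overrides in one place. `StubAgent` is a hypothetical stand-in for Hermes's `AIAgent` (only the six kwargs from the table are modeled), `build_agent` is an illustrative name, and the value of `MODEL_HAIKU` is a placeholder because the real constant is not shown in this PR.

```python
from dataclasses import dataclass, field

MODEL_HAIKU = "haiku-placeholder"  # assumption: real constant value not shown in the PR
MAX_TOOL_USE_ITERATIONS = 5        # Kora's existing cap, per the table


@dataclass
class StubAgent:
    """Hypothetical stand-in for AIAgent; models only the kwargs in the table."""
    model: str
    provider: str
    api_mode: str
    max_iterations: int
    max_tokens: int
    quiet_mode: bool
    # Simulates a toolset that the real agent would auto-load at construction.
    tools: list = field(default_factory=lambda: ["auto_loaded_tool"])
    valid_tool_names: set = field(default_factory=lambda: {"auto_loaded_tool"})
    route: str = ""


def build_agent(max_output_tokens: int, derived_route: str, agent_cls=StubAgent):
    """Construct a per-call agent, then apply the post-construction overrides."""
    agent = agent_cls(
        model=MODEL_HAIKU,                       # default; plugin hook may override per call
        provider="anthropic",
        api_mode="anthropic_messages",
        max_iterations=MAX_TOOL_USE_ITERATIONS,  # pinned, never the framework default
        max_tokens=max_output_tokens,
        quiet_mode=True,
    )
    agent.tools = []                # toolless v1: drop the auto-loaded toolset
    agent.valid_tool_names = set()
    agent.route = derived_route     # read later by the plugin hooks
    return agent
```

Pinning `max_iterations` at the call site rather than relying on the default keeps the cost-critical value visible next to the other ctor arguments.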

New local Hermes extensions added

None in this PR. The tool-bridge (Kora reasoning tools → Hermes tool dispatch) WOULD need a new local extension — most plausibly a hook like pre_tool_call_can_provide_result that lets plugins short-circuit Hermes's dispatch with a plugin-computed result. But that design + impl is non-trivial AND lands cleanly in a separate KR-HERMES-LOCAL-EXT-TOOL-BRIDGE bucket. Per spec §4 STOP-ASK on non-trivial extensions: deferred + flagged here so PM can ratify the design before implementation.

Behavior parity test results

All parity tests with toggle ON pass:

| Behavior | Status |
|---|---|
| Cost-ladder default → Haiku | ✓ (verified via plugin override dict's model key) |
| /opus prefix → Opus | ✓ |
| Decision-language → Opus | ✓ |
| KORA_FORCE_OPUS env → Opus | ✓ |
| api_call_count >= 2 → Opus (iteration earning signal) | ✓ |
| Caching markers on system (content-block list) + last tool | ✓ |
| Empty tools list → no caching override on tools | ✓ |
| ResponseResult projection: text/model/tokens/cache/error mapping | ✓ |
| Refuse-paths: paused / hard_stop_100 short-circuit BEFORE AIAgent ctor | ✓ |
| Route threading: slack_dm / email_inbound / mcp_tool | ✓ |
| Exception in run_conversation → gateway_exception:<ClassName> error | ✓ |
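The model-selection rows above can be read as a small decision function. The sketch below is an illustration only, not the actual `cost_router.select_model_pre_call`: the function name `select_model`, the model placeholders, the `DECISION_MARKERS` phrases, and the truthiness check on `KORA_FORCE_OPUS` are all assumptions, and the real router also takes the cost rung into account.

```python
import os

MODEL_HAIKU = "haiku"  # placeholder identifiers, not real model names
MODEL_OPUS = "opus"

# Hypothetical phrases standing in for the real "decision-language" detector.
DECISION_MARKERS = ("should we", "decide", "trade-off")


def select_model(text: str, api_call_count: int, env=None) -> str:
    """Return the model for the next call; Haiku unless a signal earns Opus."""
    env = os.environ if env is None else env
    if env.get("KORA_FORCE_OPUS"):          # operator override via environment
        return MODEL_OPUS
    if text.lstrip().startswith("/opus"):   # explicit user prefix
        return MODEL_OPUS
    if api_call_count >= 2:                 # iteration earning signal
        return MODEL_OPUS
    lowered = text.lower()
    if any(marker in lowered for marker in DECISION_MARKERS):
        return MODEL_OPUS
    return MODEL_HAIKU                      # cost-ladder default
```

Passing `env` explicitly keeps the function testable without mutating the process environment.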

Bypass-path-unchanged tests (toggle OFF or unset):

  • Default behavior runs the existing bypass cleanly ✓
  • No spurious AIAgent construction ✓
  • ResponseResult shape identical to pre-ST2 ✓

ResponseResult projection (run_conversation dict → dataclass)

Per agent/conversation_loop.py:4077-4102 Hermes return shape:

| Hermes dict key | ResponseResult field |
|---|---|
| final_response | text |
| model | model_used (with engine-side fallback) |
| input_tokens / output_tokens | same |
| cache_write_tokens | cache_creation_input_tokens (name differs: Hermes uses cache_write, Kora uses cache_creation_input per Anthropic SDK usage object naming) |
| cache_read_tokens | cache_read_input_tokens |
| completed + interrupted | error (gateway_interrupted / gateway_incomplete / None) |
| (n/a) | tools_used = [] (ST2B populates from messages history) |
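The mapping above can be sketched as a pure function. This is an illustration under assumptions, not the `_project_gateway_result` helper itself: the `ResponseResult` dataclass here carries only the fields named in the table, `fallback_model` stands in for the engine-side fallback, and checking `interrupted` before `completed` is an assumed precedence.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ResponseResult:
    """Illustrative subset of the fields named in the projection table."""
    text: str
    model_used: str
    input_tokens: int = 0
    output_tokens: int = 0
    cache_creation_input_tokens: int = 0
    cache_read_input_tokens: int = 0
    error: Optional[str] = None
    tools_used: list = field(default_factory=list)


def project_gateway_result(result: dict, fallback_model: str) -> ResponseResult:
    """Project a run_conversation-style dict onto the dataclass."""
    if result.get("interrupted"):
        error = "gateway_interrupted"       # assumption: interrupted wins over completed
    elif not result.get("completed", False):
        error = "gateway_incomplete"
    else:
        error = None
    return ResponseResult(
        text=result.get("final_response") or "",
        model_used=result.get("model") or fallback_model,
        input_tokens=int(result.get("input_tokens") or 0),
        output_tokens=int(result.get("output_tokens") or 0),
        # Key names differ between the two sides; only the name changes.
        cache_creation_input_tokens=int(result.get("cache_write_tokens") or 0),
        cache_read_input_tokens=int(result.get("cache_read_tokens") or 0),
        error=error,
        tools_used=[],                      # toolless in ST2
    )
```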

Deferred to follow-on buckets

Documented as work remaining beyond ST2:

  • KR-HERMES-LOCAL-EXT-TOOL-BRIDGE: tool bridge design + impl (likely a new pre_tool_call_can_provide_result extension OR registering Kora reasoning tools as Hermes tools via PluginContext.register_tool). Without this, the toggle ON path has NO tool-use capability — acceptable for parity-validation phase but blocks default flip.
  • KR-PLUGIN-CONSTITUTION: pre_tool_call constitution pre-screen wiring (depends on tool bridge).
  • KR-PLUGIN-TOOL-AUDIT: post_tool_call audit emit wiring (depends on tool bridge).
  • ST3: flip KORA_REASONING_USE_GATEWAY default to ON (after ST2B + parity validation in dev).

ST3 flip timing recommendation

Wait for ST2B before flipping default. Reasoning:

  • ST2 ships toolless route-through. With toggle ON, Kora's reasoning loses tool-use capability — she can't fetch operational state, check ledger, look up health, etc.
  • That degrades reasoning quality on questions that need tool data. Acceptable in test runs (operator can grep for expected behavior); UNACCEPTABLE for Joshua's production DMs.
  • ST2B's tool bridge restores parity. After ST2B + a 24h-burn-in test (operator sends ~10 representative DMs), ST3 flips default.

Recommended ST3 timing: 24-48h after ST2B merges. Not immediate.

Test plan

  • All ST2 tests in tests/plugins/test_kora_hermes_plugin_st2.py pass (cost-ladder routing + caching markers + e2e gateway path + refuse-paths + route threading + interrupt mapping + exception mapping + bypass unchanged)
  • 50/50 kora_hermes plugin tests (ST1 + ST2 combined) pass
  • 616/616 in-scope serial regression green (reasoning + handlers + listeners + agent + plugins)
  • Full repo xdist: 10598 passed, 70 failed. 48 are established baseline flakes carried forward; +22 are pre-existing tests/plugins/{memory,web} environment-dependent failures (require optional blake3 install extras not in shared test venv). Zero failures in tests/plugins/test_kora_hermes_plugin*.py or in any code path ST2 touches.
  • Toggle default OFF → bypass path runs unchanged; zero production behavior change

Files changed

| File | Change |
|---|---|
| kora_cli/reasoning/anthropic_engine.py | _respond_via_gateway implementation + _project_gateway_result helper |
| plugins/kora_hermes/__init__.py | Real wiring for _pre_api_request_mutable (cost-router + caching) + _post_llm_call (structured-log marker); _current_cost_rung helper |
| tests/plugins/test_kora_hermes_plugin.py | Removed 4 ST1-specific tests pinning the NotImplementedError stub (ST2 tests cover the new behavior more comprehensively) |
| tests/plugins/test_kora_hermes_plugin_st2.py | NEW: 20 ST2 tests |

🤖 Generated with Claude Code

ST2 of 2. Replaces ST1's NotImplementedError scaffold with the
actual AIAgent route-through. KORA_REASONING_USE_GATEWAY toggle
stays default OFF; ST3 will be a tiny PR flipping the default
after parity validates in production.

# What ST2 ships

(a) `_respond_via_gateway` implementation:
    - Per-call AIAgent construction (one-shot reply pattern;
      ctor is cheap)
    - **max_iterations=5 PINNED** (matches Kora's existing
      MAX_TOOL_USE_ITERATIONS; Hermes default 90 would
      quintuple monthly cost — the spec's critical pin)
    - agent.tools=[] override (toolless v1 route-through;
      tool-bridge is explicit ST2B follow-on)
    - agent.route set from _source_to_kora_route(message.source)
      so kora_hermes plugin hooks fire correctly
    - run_conversation runs sync; offloaded via
      asyncio.to_thread so the daemon's event loop isn't blocked
    - Refuse-paths (paused, hard_stop_100) preserved before
      AIAgent construction — toggle is behavior-neutral on these

(b) `_project_gateway_result` helper at module level:
    - Maps run_conversation's dict return → ResponseResult
      dataclass (per agent/conversation_loop.py:4077-4102 shape)
    - cache_write_tokens → cache_creation_input_tokens (name
      differs; Hermes uses cache_write, Kora uses
      cache_creation_input per Anthropic SDK usage object
      naming)
    - completed + interrupted → error field
      (gateway_interrupted / gateway_incomplete / None)
    - tools_used=[] (ST2B populates from messages history)

(c) kora_hermes plugin handlers — real wiring:
    - `_pre_api_request_mutable`: calls
      cost_router.select_model_pre_call (existing #165 logic)
      with iteration from api_call_count + cost_rung from
      CostStateHolder; wraps system + last tool with
      cache_control: ephemeral via existing
      _wrap_system_as_cacheable / _wrap_tools_as_cacheable
      helpers (KR-CHEAP-PROMPT-CACHING #158 semantic).
      Returns {"override": {model, system, tools}}. Failures
      fail-safe: leave api_kwargs unchanged + WARN log.
    - `_post_llm_call`: structured-log marker
      [kora.gateway.post_llm_call] with route + model. Per-call
      CanonicalUsage accumulation stays at the handler layer
      (slack_dm_handler's _record_inference_to_cost_ladder),
      same as bypass.
    - `_pre_tool_call`, `_post_tool_call`,
      `_pre_tool_list_finalized`, `_on_session_start` remain
      no-op gated on _is_kora_call (await ST2B / dedicated
      buckets).
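To make the `_pre_api_request_mutable` contract in (c) concrete, here is a hedged sketch of an override-returning hook with the fail-safe behavior. The function names, the `select_model` callback, and returning `None` on failure are assumptions for illustration; only the `{"override": {...}}` shape and the `cache_control: ephemeral` markers come from the description above.

```python
import copy
import logging

logger = logging.getLogger("kora_hermes_sketch")  # hypothetical logger name

EPHEMERAL = {"type": "ephemeral"}


def wrap_system_as_cacheable(system: str) -> list:
    """Turn a plain system string into a content-block list with a cache marker."""
    return [{"type": "text", "text": system, "cache_control": dict(EPHEMERAL)}]


def wrap_tools_as_cacheable(tools: list) -> list:
    """Mark only the last tool; the caller's list is left unmodified."""
    wrapped = copy.deepcopy(tools)
    wrapped[-1]["cache_control"] = dict(EPHEMERAL)
    return wrapped


def pre_api_request_mutable(api_kwargs: dict, select_model):
    """Return {"override": {...}}, or None so api_kwargs stay unchanged on failure."""
    try:
        override = {
            "model": select_model(api_kwargs),
            "system": wrap_system_as_cacheable(api_kwargs["system"]),
        }
        tools = api_kwargs.get("tools") or []
        if tools:  # empty tools list: no caching override on tools
            override["tools"] = wrap_tools_as_cacheable(tools)
        return {"override": override}
    except Exception:
        # Fail-safe: log a warning and leave the request as it was.
        logger.warning("pre_api_request_mutable failed; leaving api_kwargs unchanged",
                       exc_info=True)
        return None
```

Deep-copying the tool list before adding the marker means a failed or discarded override cannot leak cache markers into the original request.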

# AIAgent ctor kwarg mapping (the critical table)

| Kwarg | Value | Rationale |
|---|---|---|
| model | MODEL_HAIKU constant | Default; plugin's pre_api_request_mutable hook overrides per-call based on cost-router decision |
| provider | "anthropic" | Kora's only provider |
| api_mode | "anthropic_messages" | Anthropic native messages API (matches Kora's bypass) |
| max_iterations | **5** | PINNED to Kora's MAX_TOOL_USE_ITERATIONS; Hermes default 90 would quintuple monthly cost |
| max_tokens | self._max_output_tokens | Forwarded from engine config |
| quiet_mode | True | Daemon path; suppress print() to stdout |
| other ~35 kwargs | Hermes defaults | Acceptable for v1 — Kora doesn't customize session_id mgmt, fallback chains, providers_*, callbacks, etc. |

Post-construction overrides:
  - agent.tools = []           # bypass Hermes auto-loaded toolset
  - agent.valid_tool_names = set()
  - agent.route = derived_route  # threaded into hooks

# No new Hermes local extensions in this PR

The tool-bridge (Kora reasoning tools → Hermes tool dispatch)
WOULD need a new local extension (most plausibly a hook like
``pre_tool_call_can_provide_result`` that lets plugins short-
circuit Hermes's dispatch with a plugin-computed result). But
that design + impl is non-trivial AND lands cleanly in a
separate KR-HERMES-LOCAL-EXT-TOOL-BRIDGE bucket. Per spec §4
STOP-ASK on non-trivial extensions: deferred + flagged in PR
body so PM can ratify the design before implementation.

# Entry-point callers

NO caller-side updates needed in this bucket — the engine's
``_respond_via_gateway`` reads ``message.source`` and calls
``_source_to_kora_route`` internally. The 3 source-mapped
callers (slack_dm_handler, email_inbound_handler, MCP-driven)
already pass IncomingMessage with the right source; the engine
maps source → route automatically.

For explicit-route callers (probe_wake_consumer →
probe_investigation; cron → scheduled_task), today they fall
through to "" route (no-op in plugin) when the toggle is on. Future
KR-CALLER-ROUTE-EXPLICIT (small follow-on) can either extend
_source_to_kora_route's table OR plumb route as a constructor
arg through engine.respond. Out of scope for ST2; toggle stays
OFF in production so this doesn't matter today.
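The two options for explicit-route callers can be sketched side by side. Everything below is illustrative: the source keys in `SOURCE_TO_ROUTE` are hypothetical (the PR names the routes but not the `message.source` values), and `resolve_route` is an assumed name for the "plumb route as an argument" variant.

```python
from typing import Optional

# Hypothetical source keys; only the route values appear in the PR.
SOURCE_TO_ROUTE = {
    "slack": "slack_dm",
    "email": "email_inbound",
    "mcp": "mcp_tool",
}


def source_to_kora_route(source: str) -> str:
    """Table lookup; unknown sources fall through to "" (a no-op in the plugin)."""
    return SOURCE_TO_ROUTE.get(source, "")


def resolve_route(source: str, explicit_route: Optional[str] = None) -> str:
    """Option two: an explicitly supplied route takes precedence over the table."""
    return explicit_route if explicit_route else source_to_kora_route(source)
```

Option one would instead add entries such as a probe or cron source to the table, which keeps callers unchanged but couples routes to source naming.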

# Behavior parity test results

ALL parity tests with toggle ON pass:
  - Cost-ladder default → Haiku (verified via plugin override
    dict's "model" key)
  - /opus prefix → Opus
  - Decision-language → Opus
  - Force-Opus env → Opus
  - api_call_count >= 2 → Opus (iteration earning signal)
  - Caching markers on system (content-block list) + last tool
  - Empty tools list → no caching override on tools
  - ResponseResult projection: text + model + tokens + cache
    tokens + error mapping (interrupted / incomplete /
    completed-cleanly)
  - Refuse-paths: paused / hard_stop_100 short-circuit BEFORE
    AIAgent construction (no spurious agent build)
  - Route threading: slack_dm / email_inbound / mcp_tool all
    propagate correctly to agent.route
  - Exception in run_conversation → ResponseResult.error =
    "gateway_exception:<ClassName>"

Bypass-path-unchanged tests (toggle OFF or unset):
  - Default behavior runs the existing bypass cleanly
  - No spurious AIAgent construction (fake_class.assert_not_called)
  - ResponseResult shape identical to pre-ST2
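The control flow these tests exercise (refuse-paths first, then the toggle, then a thread-offloaded blocking call with exception mapping) can be sketched as follows. `FakeAgent`, `respond`, the plain-dict results, and using the refuse state as the error value are assumptions for illustration; `asyncio.to_thread` is the standard-library call named in the commit message.

```python
import asyncio

REFUSE_STATES = {"paused", "hard_stop_100"}


class FakeAgent:
    """Hypothetical stand-in for AIAgent with a blocking run_conversation."""

    def __init__(self, fail: bool = False):
        self.fail = fail

    def run_conversation(self, user_message: str) -> dict:
        if self.fail:
            raise TimeoutError("simulated upstream failure")
        return {"final_response": f"echo: {user_message}", "completed": True}


async def respond(user_message, state, use_gateway, agent_factory, bypass):
    """Refuse-paths, then toggle, then the gateway call off the event loop."""
    if state in REFUSE_STATES:
        # Short-circuit before any agent is built (assumed error value).
        return {"text": "", "error": state}
    if not use_gateway:
        return bypass(user_message)       # toggle OFF: existing path, no agent built
    agent = agent_factory()               # per-call construction
    try:
        # run_conversation is synchronous; run it in a worker thread.
        result = await asyncio.to_thread(agent.run_conversation, user_message)
    except Exception as exc:
        return {"text": "", "error": f"gateway_exception:{type(exc).__name__}"}
    return {"text": result["final_response"], "error": None}
```

Injecting `agent_factory` makes the "no spurious construction" property checkable by counting factory calls, which mirrors the mock-based assertion mentioned in the list above.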

# Deferred to follow-on buckets

(documented as work remaining BEYOND this ST2)

  - **ST2B / KR-HERMES-LOCAL-EXT-TOOL-BRIDGE**: tool bridge
    design + impl (likely a new pre_tool_call result-providing
    extension, OR registering Kora reasoning tools as Hermes
    tools via PluginContext.register_tool). Without this, the
    toggle ON path has NO tool-use capability — acceptable for
    parity-validation phase but blocks default flip.
  - **KR-PLUGIN-CONSTITUTION**: pre_tool_call constitution
    pre-screen wiring (depends on tool bridge).
  - **KR-PLUGIN-TOOL-AUDIT**: post_tool_call audit emit wiring
    (depends on tool bridge).
  - **ST3**: flip KORA_REASONING_USE_GATEWAY default to ON
    (after ST2B + parity validation in dev).

# ST3 flip timing recommendation

**Wait for ST2B** before flipping default. Reasoning:
  - ST2 ships toolless route-through. With toggle ON, Kora's
    reasoning loses tool-use capability — she can't fetch
    operational state, check ledger, look up health, etc.
  - That degrades reasoning quality on questions that need
    tool data. Acceptable in test runs (operator can grep for
    expected behavior); UNACCEPTABLE for Joshua's production
    DMs.
  - ST2B's tool bridge restores parity. After ST2B + a
    24h-burn-in test (operator sends ~10 representative DMs),
    ST3 flips default.

Recommended ST3 timing: 24-48h after ST2B merges. Not
immediate.

# Regression

616/616 in-scope (reasoning + handlers + listeners + agent +
plugins) pass serially.

Full repo xdist: 10598 passed, 70 failed. 48 are the
established baseline flakes carried forward from prior buckets
(test_anthropic_adapter + test_backup + test_config +
test_gateway_* + test_kanban_db + test_list_picker_providers +
test_model_switch_* + test_startup_plugin_gating +
test_web_server* family + the 5 pre-existing FE-snapshot pin
drifts). The +22 failures are in tests/plugins/{memory,web}.
These require optional install extras (blake3 for isokron)
that aren't in the shared test venv, so they are pre-existing
environment-dependent failures, not introduced by ST2. Zero
failures in tests/plugins/test_kora_hermes_plugin*.py or in
any code path ST2 touches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rafe-walker rafe-walker merged commit 9f04999 into feature/phase2-upgrades May 24, 2026
@rafe-walker rafe-walker deleted the feat/kora-KR-REASONING-ROUTE-THROUGH-GATEWAY-ST2 branch May 24, 2026 04:17