This repository was archived by the owner on May 26, 2026. It is now read-only.
feat(KR-P2-I-integration ST2): chain-event emit on operational-state transitions #33
Closed
rafe-walker wants to merge 1 commit into
…transitions
Adds the listener that writes a chain event for every transition
through the OperationalStateHolder. PM-verified vocabulary against
foundation/0159_kora_r41_operational_state_event_vocabulary.sql on
isokron-prod (2026-05-21):
* ALWAYS emit ``kora.operational_state.transitioned`` — single
generic edge event per substrate-team design. Payload carries
from/to primary_state, new claim_permission, sorted
degradation_reasons, and trigger.
* ADDITIONALLY emit when a per-trigger literal exists:
- (BOOTING → READY, "all §9.2 gates pass") → kora.boot.ready
- (BOOTING → STOPPED, "invariant gate failure …") →
kora.boot.failed
- (any → PAUSED, trigger contains "cost 100%") →
kora.paused.cost_limit
* Operator-pause and substrate-pause: only the generic event.
Per-trigger literals can land in a follow-on substrate vocab
migration if cockpit needs the signal.
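The selection rules above can be sketched as a pure function. This is a minimal illustration, not the code in agent/operational_state_emit.py: representing states as plain strings, the function names, and case-sensitive substring matching are all assumptions made for the example.

```python
from typing import List, Optional

# Event literals as named in this commit message.
GENERIC_TRANSITION_EVENT = "kora.operational_state.transitioned"
BOOT_READY_EVENT = "kora.boot.ready"
BOOT_FAILED_EVENT = "kora.boot.failed"
PAUSED_COST_LIMIT_EVENT = "kora.paused.cost_limit"


def select_extra_literal(from_state: str, to_state: str, trigger: str) -> Optional[str]:
    """Return the per-trigger literal for this transition, or None.

    Assumption: states are plain strings and matching is a case-sensitive
    substring test on the trigger text.
    """
    if from_state == "BOOTING" and to_state == "READY" and "all §9.2 gates pass" in trigger:
        return BOOT_READY_EVENT
    if from_state == "BOOTING" and to_state == "STOPPED" and "invariant gate failure" in trigger:
        return BOOT_FAILED_EVENT
    if to_state == "PAUSED" and "cost 100%" in trigger:
        return PAUSED_COST_LIMIT_EVENT
    return None


def events_for(from_state: str, to_state: str, trigger: str) -> List[str]:
    """The generic event is always first; the extra literal follows when one applies."""
    extra = select_extra_literal(from_state, to_state, trigger)
    return [GENERIC_TRANSITION_EVENT] + ([extra] if extra else [])
```

Under these assumptions an operator pause yields only the generic event, while a BOOTING → STOPPED transition whose trigger lacks "invariant gate failure" does not produce kora.boot.failed.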
Fail-LOUD: preflight failures (missing provider / connection /
workspace_id) and substrate-side raises both surface as
OperationalStateEmitError. The state-machine listener-error
handler logs but doesn't roll back — listeners are observability,
not policy — so a broken emit is loudly logged but the state
machine keeps moving.
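A rough sketch of the "log, don't roll back" behaviour described above. SketchHolder is a hypothetical stand-in for OperationalStateHolder; the synchronous listener signature, the logger name, and the choice to keep notifying later listeners after one fails are assumptions, not details taken from ST1.

```python
import logging
from typing import Callable, List

logger = logging.getLogger("operational_state_sketch")

# Assumed listener shape: (from_state, to_state, trigger) -> None.
Listener = Callable[[str, str, str], None]


class SketchHolder:
    """Hypothetical stand-in for OperationalStateHolder."""

    def __init__(self, initial: str) -> None:
        self.state = initial
        self._listeners: List[Listener] = []

    def add_listener(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def transition(self, to_state: str, trigger: str) -> None:
        # Commit the new state first; listeners only observe it.
        from_state, self.state = self.state, to_state
        for listener in self._listeners:
            try:
                listener(from_state, to_state, trigger)
            except Exception:
                # Loud log with traceback, but no rollback of self.state.
                logger.exception(
                    "listener failed on %s -> %s (%s)", from_state, to_state, trigger
                )


def failing_listener(from_state: str, to_state: str, trigger: str) -> None:
    """Simulates a broken emit."""
    raise RuntimeError("emit failed")
```

With failing_listener registered, a transition still leaves the holder in the new state and the failure appears in the log with a traceback.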
agent/operational_state_emit.py exposes:
- GENERIC_TRANSITION_EVENT, BOOT_READY_EVENT, BOOT_FAILED_EVENT,
PAUSED_COST_LIMIT_EVENT — constants matched against
foundation/0159
- emit_state_transition(provider, from, to, trigger) — async
callable, fail-LOUD
- make_emit_listener(provider) — factory returning a
StateTransitionListener; ST3 wires this into the holder at
provider init
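To make the shape of the exposed API concrete, here is a hedged sketch of a listener factory that closes over a provider and emits the generic event with the payload fields listed earlier. SketchState, RecordingProvider, its append_event method, the payload key names, and the async listener signature are all hypothetical; only the function names and the event literal come from this commit message.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, FrozenSet, List, Tuple

GENERIC_TRANSITION_EVENT = "kora.operational_state.transitioned"


@dataclass(frozen=True)
class SketchState:
    """Hypothetical stand-in for OperationalState."""
    primary_state: str
    claim_permission: str
    degradation_reasons: FrozenSet[str] = frozenset()


class RecordingProvider:
    """Hypothetical provider that records events instead of writing to the chain."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, Dict[str, Any]]] = []

    async def append_event(self, event_type: str, payload: Dict[str, Any]) -> None:
        self.events.append((event_type, payload))


def build_payload(from_state: SketchState, to_state: SketchState, trigger: str) -> Dict[str, Any]:
    # Key names are assumptions; sorted() gives a stable order for consumers.
    return {
        "from_primary_state": from_state.primary_state,
        "to_primary_state": to_state.primary_state,
        "claim_permission": to_state.claim_permission,
        "degradation_reasons": sorted(to_state.degradation_reasons),
        "trigger": trigger,
    }


async def emit_state_transition(provider, from_state, to_state, trigger) -> None:
    payload = build_payload(from_state, to_state, trigger)
    await provider.append_event(GENERIC_TRANSITION_EVENT, payload)


def make_emit_listener(provider) -> Callable[[SketchState, SketchState, str], Awaitable[None]]:
    # The closure binds the provider once, so the holder only sees the listener.
    async def _listener(from_state: SketchState, to_state: SketchState, trigger: str) -> None:
        await emit_state_transition(provider, from_state, to_state, trigger)

    return _listener
```

Binding the provider in a closure keeps the holder unaware of the provider, which fits the stated plan of ST3 calling the factory once at provider init.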
Tests (tests/test_operational_state_emit.py): 16 cases covering
payload shape, per-trigger literal selection (incl. lenient
substring match for varied caller wording), the BOOTING → STOPPED
disambiguation (invariant-failure vs STOP-KORA L4/L5), fail-LOUD
preflight (None provider, missing connection, workspace_id raise
or empty), generic-plus-extra dual-emit, substrate-failure
propagation, and the make_emit_listener factory.
Builds on ST1 (#32). ST3 will land the wire-in.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 21, 2026
rafe-walker added a commit that referenced this pull request on May 21, 2026
…transitions (#36)
Always emit kora.operational_state.transitioned (generic edge event) + conditional per-trigger literals (kora.boot.ready / kora.boot.failed / kora.paused.cost_limit) matching substrate foundation/0159 source-of-truth vocab. Fail-LOUD via OperationalStateEmitError. 16 unit tests. Replaces auto-closed PR #33.
rafe-walker pushed a commit that referenced this pull request on May 21, 2026
…tate holder
agent/operational_state_wire.py (new): wire_operational_state(provider)
is the one boot-time call that:
1. init_holder(OperationalState(primary_state=BOOTING,
claim_permission=NONE)) — module-level singleton, idempotent.
2. holder.add_listener(make_emit_listener(provider)) — registers
the chain-event listener from ST2.
3. Triggers BOOTING → READY via the connection's submit_and_wait
(agent_init.py is synchronous; we run the coro on the IsoKron
dedicated IO loop and block boot until the emit listener returns).
Per spec §3 ST3: the transition is UNCONDITIONAL in v1. KR-P2-H
follow-on adds the §9.2 invariant-gate-check guard that decides
whether to transition READY or fall through to STOPPED — that
bucket is blocked on substrate-round Bucket C event vocab and
ships later.
Fail-soft posture: every error here is caught + logged with the
greppable [kora.operational_state.wire_in] tag. Operator boot
proceeds even if holder construction or emit raises — only the
observability surface is degraded. KR-P2-H will revisit which
failures should block boot.
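One way the synchronous wire-in and its fail-soft posture could look, sketched with the standard library only. SketchConnection, the submit_and_wait signature and timeout, and wire_sketch are hypothetical; the real IsoKron connection may differ. The sketch runs a coroutine on a dedicated loop thread via asyncio.run_coroutine_threadsafe and blocks the caller until it finishes.

```python
import asyncio
import logging
import threading
from typing import Any, Callable, Coroutine, List, Optional

logger = logging.getLogger("kora.operational_state")
TAG = "[kora.operational_state.wire_in]"


class SketchConnection:
    """Hypothetical connection owning a dedicated IO loop on a daemon thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def submit_and_wait(self, coro: Coroutine[Any, Any, Any], timeout: float = 5.0) -> Any:
        # Schedule on the IO loop from synchronous code, then block for the result.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout=timeout)

    def close(self) -> None:
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


def wire_sketch(
    connection: Optional[SketchConnection],
    transition: Callable[[], Coroutine[Any, Any, Any]],
) -> bool:
    """Run the boot transition; log and return False instead of raising."""
    if connection is None:
        logger.warning("%s no connection; holder stays in BOOTING", TAG)
        return False
    try:
        connection.submit_and_wait(transition())
        return True
    except Exception:
        logger.warning("%s transition failed; boot continues", TAG, exc_info=True)
        return False


async def to_ready(log: List[str]) -> None:
    """Stand-in for the BOOTING -> READY transition."""
    log.append("READY")


async def failing_transition() -> None:
    """Stand-in for a transition whose emit raises."""
    raise RuntimeError("emit failed")
```

Passing a zero-argument callable rather than a coroutine object means no coroutine is created on the no-connection branch, so nothing is left un-awaited.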
agent/agent_init.py: after agent._memory_manager.initialize_all
succeeds, look up the IsoKron provider and call
wire_operational_state(provider). Wrapped in a defensive
try/except so a wire-in module-import failure also doesn't break
boot.
Tests (tests/test_operational_state_wire.py): 4 cases covering
the happy path (holder in READY + generic + boot.ready emits with
the canonical "all §9.2 gates pass" trigger), the no-connection
branch (holder created in BOOTING + greppable WARNING), the
submit-raises branch (wire-in stays fail-soft + WARNING), and
idempotence (second call with second listener doubles the per-
transition emit but state stays at READY).
Stacked on top of ST1 (#32) + ST2 (#33). ST4 wires SeaTicketPoller
claim/release (depends on KR-P2-E ST1 merged), ST5 flips the
/api/operational-state endpoint from stub to read get_holder().
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
KR-P2-I-integration ST2 of 5. Adds the listener that writes a chain event for every transition through `OperationalStateHolder`. Stacked on top of #32 (ST1); base is `feat/kora-KR-P2-I-integration-st1-holder`. Retarget to `main` after ST1 merges.

Event vocabulary (PM-verified against substrate)
Verified against `packages/db/migrations/foundation/0159_kora_r41_operational_state_event_vocabulary.sql` on rafe-walker/isokron (`bd165eb` HEAD):

- Always emit `kora.operational_state.transitioned`. This is the substrate team's documented design: one generic edge event per transition; payload carries the from/to state, claim_permission, sorted degradation_reasons, and trigger.
- Additionally emit a per-trigger informational literal when one exists in the vocab:
  - BOOTING → READY, `"all §9.2 gates pass"` in trigger: `kora.boot.ready`
  - BOOTING → STOPPED, `"invariant gate failure"` in trigger: `kora.boot.failed`
  - `*` → PAUSED, `"cost 100%"` in trigger: `kora.paused.cost_limit`

BOOTING → STOPPED is intentionally disambiguated: STOP-KORA L4/L5 from BOOTING reaches the same arrow as invariant-gate-failure but is operator action, not a boot failure, so `kora.boot.failed` only fires when the trigger string contains "invariant gate failure".

Trigger matching is substring, so caller wording like `"cost 100% — auto-pause"` or `"STOP-KORA L2: cost 100% triggered budget watcher"` all land on `kora.paused.cost_limit`.

Failure mode: fail-LOUD
Mirrors `agent/constitution_audit.py`. `emit_state_transition` raises `OperationalStateEmitError` on:

- `provider is None`
- `provider._connection is None`
- `_resolve_workspace_id()` raises or returns empty
- `kora__append_event` call raises (Sea MCP unavailable, CHECK violation, chain lock failure, etc.)

The holder's listener-exception handler (ST1) catches and logs so one broken emit doesn't deadlock the state machine, but the raise + log is the observability path. No silent swallow.
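The preflight checks above could be arranged roughly as follows. This is a sketch under stated assumptions: emit_checked, the stub classes, and the `connection.append_event(...)` call are hypothetical stand-ins (the PR only names the `kora__append_event` call, not how it is invoked), and the error's constructor is a guess built around the `.cause` attribute mentioned in the tests.

```python
import asyncio
from typing import Any, Dict, List, Optional, Tuple


class OperationalStateEmitError(RuntimeError):
    """Sketch of the error type; the constructor shape is an assumption."""

    def __init__(self, message: str, cause: Optional[BaseException] = None) -> None:
        super().__init__(message)
        self.cause = cause


async def emit_checked(provider: Any, event_type: str, payload: Dict[str, Any]) -> None:
    # Preflight: every missing prerequisite raises instead of returning quietly.
    if provider is None:
        raise OperationalStateEmitError("provider is None")
    connection = getattr(provider, "_connection", None)
    if connection is None:
        raise OperationalStateEmitError("provider._connection is None")
    try:
        workspace_id = provider._resolve_workspace_id()
    except Exception as exc:
        raise OperationalStateEmitError("workspace_id lookup raised", cause=exc) from exc
    if not workspace_id:
        raise OperationalStateEmitError("workspace_id is empty")
    try:
        # Hypothetical transport call standing in for kora__append_event.
        await connection.append_event(workspace_id, event_type, payload)
    except Exception as exc:
        raise OperationalStateEmitError("append_event raised", cause=exc) from exc


class StubConnection:
    """Hypothetical connection; fail=True simulates the substrate raising."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.calls: List[Tuple[str, str, Dict[str, Any]]] = []

    async def append_event(self, workspace_id: str, event_type: str, payload: Dict[str, Any]) -> None:
        if self.fail:
            raise ConnectionError("Sea MCP unavailable")
        self.calls.append((workspace_id, event_type, payload))


class StubProvider:
    def __init__(self, connection: Optional[StubConnection], workspace_id: str = "ws-1") -> None:
        self._connection = connection
        self._workspace_id = workspace_id

    def _resolve_workspace_id(self) -> str:
        return self._workspace_id


def run_and_capture(coro) -> Optional[OperationalStateEmitError]:
    """Run the coroutine; return the emit error if one was raised, else None."""
    try:
        asyncio.run(coro)
    except OperationalStateEmitError as exc:
        return exc
    return None
```

Wrapping the transport exception while keeping it reachable through `.cause` (and `__cause__` via `raise ... from`) lets the holder's handler log one error type without losing the underlying failure.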
`agent/operational_state_emit.py` (new, 273 LOC)

- `GENERIC_TRANSITION_EVENT`, `BOOT_READY_EVENT`, `BOOT_FAILED_EVENT`, `PAUSED_COST_LIMIT_EVENT` (named so the literal value is greppable from logs)
- `_select_extra_literal(from, to, trigger)`: returns the supplementary literal or `None`. Pure function, no I/O.
- `_build_payload(from, to, trigger)`: builds the payload dict. Pure function. Cockpit relies on stable sort for `degradation_reasons` so we `sorted(...)` rather than `list(...)`.
- `emit_state_transition(provider, from, to, trigger)`: async. Always emits generic; conditionally emits extra. Raises on any failure.
- `make_emit_listener(provider)`: factory returning a `StateTransitionListener` bound to the provider. ST3 calls this once and `holder.add_listener(...)` registers it.

Tests: `tests/test_operational_state_emit.py` (399 LOC)

16 cases:

- `provider=None`, missing connection, workspace_id raise, workspace_id empty
- `Sea MCP unavailable` raises bubble as `OperationalStateEmitError` with the underlying cause in `.cause`
- `make_emit_listener` returns a usable listener

Tests use a stubbed IsoKron connection that runs the coro on the test's event loop and wraps it in a `concurrent.futures.Future` so `asyncio.wrap_future` works; there is no real MCP transport in the unit tests. An integration test against a live provider lands in ST3.

§1 verifications: all green
Sub-task chain
Test plan
- `pytest tests/test_operational_state_emit.py`
- … `make_emit_listener` outside this PR's tests, that's an early wire-in to flag

🤖 Generated with Claude Code