This repository was archived by the owner on May 26, 2026. It is now read-only.
feat(KR-P2-I-integration ST1): OperationalStateHolder singleton + transition broker #32
Merged
…nsition broker
The skeleton at agent/operational_state.py ships the immutable
state shape, the §9.1 transition table, and query helpers — but
has no notion of "what is Kora's current state right now." This
module adds that.
agent/operational_state_holder.py:
- OperationalStateHolder: holds the live OperationalState behind
an asyncio.Lock so concurrent transition_to() calls serialize.
- transition_to() validates (from, to) against TRANSITION_TABLE
via is_valid_transition; raises InvalidStateTransitionError on
bad arrows. Same-state calls (degradation_reasons / claim_permission
updates with primary_state unchanged) bypass the table check —
R4.1 §9.1 models DEGRADED as a flag, not a primary_state edge.
- Listeners fire AFTER the held state is swapped and the lock is
released. A listener exception is logged but does not roll back
the transition — listeners are observability, not policy. ST2
registers the chain-event emit as a listener.
- Module-level init_holder() is idempotent (first call wins;
mirrors the IsoKronMemoryProvider singleton pattern).
get_holder() returns None before init.
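The holder behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code in agent/operational_state_holder.py: the enum members, the table rows, the reduced transition_to signature, and the choice of async listeners are all assumptions made so the example is self-contained.

```python
import asyncio
import logging
from dataclasses import dataclass, replace
from enum import Enum
from typing import Awaitable, Callable, FrozenSet, List

log = logging.getLogger(__name__)


class PrimaryState(Enum):
    BOOTING = "booting"
    READY = "ready"
    PAUSED = "paused"
    STOPPED = "stopped"


# Hypothetical subset of the transition table; the real rows live in the skeleton.
TRANSITION_TABLE = frozenset({
    (PrimaryState.BOOTING, PrimaryState.READY),
    (PrimaryState.BOOTING, PrimaryState.STOPPED),
    (PrimaryState.READY, PrimaryState.PAUSED),
    (PrimaryState.PAUSED, PrimaryState.READY),
})


class InvalidStateTransitionError(Exception):
    pass


@dataclass(frozen=True)
class OperationalState:
    primary_state: PrimaryState
    degradation_reasons: FrozenSet[str] = frozenset()


# Assumption: listeners are async callables taking (old, new, trigger).
Listener = Callable[[OperationalState, OperationalState, str], Awaitable[None]]


class OperationalStateHolder:
    def __init__(self, initial: OperationalState) -> None:
        self._state = initial
        self._lock = asyncio.Lock()
        self._listeners: List[Listener] = []

    @property
    def current(self) -> OperationalState:
        return self._state

    def add_listener(self, listener: Listener) -> None:
        self._listeners.append(listener)

    async def transition_to(self, new_primary, trigger, *, add_reasons=()):
        async with self._lock:
            old = self._state
            same_state = new_primary == old.primary_state
            # Same-state calls skip the table check: DEGRADED is a flag, not an edge.
            if not same_state and (old.primary_state, new_primary) not in TRANSITION_TABLE:
                raise InvalidStateTransitionError(f"{old.primary_state} -> {new_primary}")
            new = replace(
                old,
                primary_state=new_primary,
                degradation_reasons=old.degradation_reasons | frozenset(add_reasons),
            )
            self._state = new
        # The lock is released here, so listeners see the already-swapped state.
        for listener in list(self._listeners):
            try:
                await listener(old, new, trigger)
            except Exception:
                log.exception("listener failed; the transition is not rolled back")
        return new
```

Because the swap happens before any listener runs, a listener that reads holder.current sees the new state, and a listener that raises only produces a log entry.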
Tests (tests/test_operational_state_holder.py): 14 cases covering
the table-check, same-state bypass, the with_* composition
(claim_permission + add/remove reasons), listener fire ordering +
exception isolation + non-rollback, asyncio-lock serialization of
concurrent transitions, and the init_holder idempotence /
get_holder / reset-helper semantics.
This is ST1 of 5; ST2 lands the emit listener, ST3 wires
IsoKronMemoryProvider init to call init_holder, ST4 wires the
SeaTicketPoller claim/release transitions (depends on KR-P2-E ST1),
ST5 flips the /api/operational-state stub to read from get_holder().
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 21, 2026
rafe-walker pushed a commit that referenced this pull request May 21, 2026
…transitions
Adds the listener that writes a chain event for every transition
through the OperationalStateHolder. PM-verified vocabulary against
foundation/0159_kora_r41_operational_state_event_vocabulary.sql on
isokron-prod (2026-05-21):
* ALWAYS emit ``kora.operational_state.transitioned`` — single
generic edge event per substrate-team design. Payload carries
from/to primary_state, new claim_permission, sorted
degradation_reasons, and trigger.
* ADDITIONALLY emit when a per-trigger literal exists:
- (BOOTING → READY, "all §9.2 gates pass") → kora.boot.ready
- (BOOTING → STOPPED, "invariant gate failure …") →
kora.boot.failed
- (any → PAUSED, trigger contains "cost 100%") →
kora.paused.cost_limit
* Operator-pause and substrate-pause: only the generic event.
Per-trigger literals can land in a follow-on substrate vocab
migration if cockpit needs the signal.
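The emit rules above can be illustrated with a small selection function. This is a sketch under assumptions: states are plain strings, the function names and payload keys are hypothetical, and the case-insensitive substring matching is only one possible reading of the "lenient" match.

```python
from typing import Dict, Iterable, List

GENERIC_TRANSITION_EVENT = "kora.operational_state.transitioned"
BOOT_READY_EVENT = "kora.boot.ready"
BOOT_FAILED_EVENT = "kora.boot.failed"
PAUSED_COST_LIMIT_EVENT = "kora.paused.cost_limit"


def events_for_transition(from_state: str, to_state: str, trigger: str) -> List[str]:
    """Return the generic event plus at most one per-trigger literal."""
    events = [GENERIC_TRANSITION_EVENT]  # always emitted
    wording = trigger.lower()
    if (from_state, to_state) == ("BOOTING", "READY") and "gates pass" in wording:
        events.append(BOOT_READY_EVENT)
    elif (from_state, to_state) == ("BOOTING", "STOPPED") and "invariant gate failure" in wording:
        events.append(BOOT_FAILED_EVENT)
    elif to_state == "PAUSED" and "cost 100%" in wording:
        events.append(PAUSED_COST_LIMIT_EVENT)
    return events


def build_payload(from_state: str, to_state: str, claim_permission: str,
                  degradation_reasons: Iterable[str], trigger: str) -> Dict[str, object]:
    """Assemble the generic payload; the key names are illustrative only."""
    return {
        "from_primary_state": from_state,
        "to_primary_state": to_state,
        "claim_permission": claim_permission,
        "degradation_reasons": sorted(degradation_reasons),
        "trigger": trigger,
    }
```

An operator pause such as events_for_transition("READY", "PAUSED", "operator pause") yields only the generic event, matching the rule that operator-pause and substrate-pause have no per-trigger literal.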
Fail-LOUD: preflight failures (missing provider / connection /
workspace_id) and substrate-side raises both surface as
OperationalStateEmitError. The state-machine listener-error
handler logs but doesn't roll back — listeners are observability,
not policy — so a broken emit is loudly logged but the state
machine keeps moving.
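A fail-LOUD preflight of the kind described might be sketched as below. The attribute names provider.connection and provider.workspace_id, and the helper name preflight, are assumptions for illustration; only OperationalStateEmitError is named in the commit.

```python
from typing import Any, Tuple


class OperationalStateEmitError(RuntimeError):
    """Raised for every preflight or substrate failure so nothing fails silently."""


def preflight(provider: Any) -> Tuple[Any, str]:
    """Validate the provider before emitting; raise instead of returning a default."""
    if provider is None:
        raise OperationalStateEmitError("no provider")
    connection = getattr(provider, "connection", None)
    if connection is None:
        raise OperationalStateEmitError("provider has no connection")
    try:
        workspace_id = provider.workspace_id  # may be a property that raises
    except Exception as exc:
        raise OperationalStateEmitError("workspace_id lookup raised") from exc
    if not workspace_id:
        raise OperationalStateEmitError("workspace_id is empty")
    return connection, workspace_id
```

Wrapping every failure in one exception type keeps the emit loud, while the holder's listener-error handler (which logs and continues) keeps the state machine moving.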
agent/operational_state_emit.py exposes:
- GENERIC_TRANSITION_EVENT, BOOT_READY_EVENT, BOOT_FAILED_EVENT,
PAUSED_COST_LIMIT_EVENT — constants matched against
foundation/0159
- emit_state_transition(provider, from, to, trigger) — async
callable, fail-LOUD
- make_emit_listener(provider) — factory returning a
StateTransitionListener; ST3 wires this into the holder at
provider init
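The listener factory can be pictured as a closure over the provider. In this sketch emit_state_transition is a recording stand-in rather than the real emit, the listener is assumed to be async, and the parameter names from_state / to_state are illustrative (`from` is a Python keyword, so the real names must differ from the shorthand above).

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Tuple

StateTransitionListener = Callable[[Any, Any, str], Awaitable[None]]

# Stand-in sink so the example can show what the listener forwards.
EMITTED: List[Tuple[Any, Any, Any, str]] = []


async def emit_state_transition(provider: Any, from_state: Any, to_state: Any,
                                trigger: str) -> None:
    """Recording stand-in; the real function writes chain events and fails loud."""
    EMITTED.append((provider, from_state, to_state, trigger))


def make_emit_listener(provider: Any) -> StateTransitionListener:
    """Bind the provider once so the holder only has to pass (old, new, trigger)."""

    async def _listener(old: Any, new: Any, trigger: str) -> None:
        await emit_state_transition(provider, old, new, trigger)

    return _listener
```

The holder never needs to know about the provider: it calls the listener with the transition, and the closure supplies the rest.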
Tests (tests/test_operational_state_emit.py): 16 cases covering
payload shape, per-trigger literal selection (incl. lenient
substring match for varied caller wording), the BOOTING → STOPPED
disambiguation (invariant-failure vs STOP-KORA L4/L5), fail-LOUD
preflight (None provider, missing connection, workspace_id raise
or empty), generic-plus-extra dual-emit, substrate-failure
propagation, and the make_emit_listener factory.
Builds on ST1 (#32). ST3 will land the wire-in.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rafe-walker pushed a commit that referenced this pull request May 21, 2026
…tate holder
agent/operational_state_wire.py (new): wire_operational_state(provider)
is the one boot-time call that:
1. init_holder(OperationalState(primary_state=BOOTING,
claim_permission=NONE)) — module-level singleton, idempotent.
2. holder.add_listener(make_emit_listener(provider)) — registers
the chain-event listener from ST2.
3. Triggers BOOTING → READY via the connection's submit_and_wait
(agent_init.py is synchronous; we run the coro on the IsoKron
dedicated IO loop and block boot until the emit listener returns).
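Step 3 relies on running a coroutine from synchronous boot code. A minimal sketch of that pattern, assuming a background event-loop thread, is shown below; the class IOLoopThread and this submit_and_wait signature are hypothetical and are not the IsoKron connection's actual API.

```python
import asyncio
import threading
from typing import Any, Coroutine, Optional


class IOLoopThread:
    """Hypothetical stand-in for a dedicated IO loop running on its own thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def submit_and_wait(self, coro: Coroutine[Any, Any, Any],
                        timeout: Optional[float] = None) -> Any:
        """Schedule coro on the IO loop and block the calling thread for its result."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def close(self) -> None:
        """Stop the loop from the outside, then wait for the thread to finish."""
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()
```

A synchronous caller would then write something like io.submit_and_wait(holder.transition_to(...)), which returns only after the transition and its listeners have completed on the IO loop.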
Per spec §3 ST3: the transition is UNCONDITIONAL in v1. KR-P2-H
follow-on adds the §9.2 invariant-gate-check guard that decides
whether to transition READY or fall through to STOPPED — that
bucket is blocked on substrate-round Bucket C event vocab and
ships later.
Fail-soft posture: every error here is caught + logged with the
greppable [kora.operational_state.wire_in] tag. Operator boot
proceeds even if holder construction or emit raises — only the
observability surface is degraded. KR-P2-H will revisit which
failures should block boot.
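The fail-soft posture can be sketched as a wrapper that logs with the greppable tag and never re-raises. The function name, the injected callables, and the boolean return value are assumptions that keep the example self-contained; the real wire_operational_state takes only the provider.

```python
import logging
from typing import Any, Callable

log = logging.getLogger("kora.operational_state")
TAG = "[kora.operational_state.wire_in]"


def wire_operational_state_sketch(
    provider: Any,
    *,
    init: Callable[[], Any],
    attach_listener: Callable[[Any, Any], None],
    go_ready: Callable[[Any], None],
) -> bool:
    """Run the three wire-in steps; log and return False instead of raising."""
    try:
        holder = init()
    except Exception:
        log.warning("%s holder construction failed", TAG, exc_info=True)
        return False
    try:
        attach_listener(holder, provider)
        go_ready(holder)
    except Exception:
        log.warning("%s BOOTING -> READY wire-in failed", TAG, exc_info=True)
        return False
    return True
```

Boot code can ignore the return value entirely; a failure leaves only the observability surface degraded, and grepping the log for the tag finds every such case.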
agent/agent_init.py: after agent._memory_manager.initialize_all
succeeds, look up the IsoKron provider and call
wire_operational_state(provider). Wrapped in a defensive
try/except so a wire-in module-import failure also doesn't break
boot.
Tests (tests/test_operational_state_wire.py): 4 cases covering
the happy path (holder in READY + generic + boot.ready emits with
the canonical "all §9.2 gates pass" trigger), the no-connection
branch (holder created in BOOTING + greppable WARNING), the
submit-raises branch (wire-in stays fail-soft + WARNING), and
idempotence (second call with second listener doubles the per-
transition emit but state stays at READY).
Stacked on top of ST1 (#32) + ST2 (#33). ST4 wires SeaTicketPoller
claim/release (depends on KR-P2-E ST1 merged), ST5 flips the
/api/operational-state endpoint from stub to read get_holder().
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary

KR-P2-I-integration ST1 of 5. Adds the live state holder that the KR-P2-I-skeleton was missing: the skeleton ships the immutable shape, the §9.1 transition table, and query helpers, but nothing to actually hold the current state.

agent/operational_state_holder.py (new, 203 LOC)

- OperationalStateHolder: holds the live OperationalState behind an asyncio.Lock so concurrent transition_to() calls serialize.
- transition_to(new_primary_state, trigger, *, new_claim_permission, add_reasons, remove_reasons): validates (from, to) against TRANSITION_TABLE via is_valid_transition; raises InvalidStateTransitionError on bad arrows. Same-state calls bypass the table check: R4.1 §9.1 models DEGRADED as the presence of degradation_reasons, not a primary_state edge, so (READY, READY) while adding DegradationReason.AUTH must succeed even though no (READY, READY) row exists.
- init_holder(): idempotent; first call wins (mirrors the IsoKronMemoryProvider singleton pattern).
- get_holder() returns None before init.

Tests: tests/test_operational_state_holder.py (288 LOC)

14 cases:

- … current snapshot
- … BOOTING → BOOTING, "transient gate failure") still works
- with_* composition (claim_permission + add_reasons + remove_reasons)
- … (old, new, trigger); holder.current is already new when the listener fires
- transition_to calls serialize via the asyncio lock (verified via a sleep-in-listener race)
- get_holder() returns None before init
- init_holder constructs the singleton + is idempotent (first state wins)
- _reset_holder_for_tests clears the singleton

Honest scope

No IsoKronMemoryProvider wiring (ST3), no emit listener (ST2), no SeaTicketPoller integration (ST4), no endpoint flip (ST5).

Sub-task chain (this bucket)

- …st1-holder
- …st2-emit
- …st3-provider-wire
- …st4-poller-wire
- …st5-flip-endpoint

§1 verifications

- agent/operational_state.py skeleton API surface (PrimaryState / DegradationReason / ClaimPermission enums + OperationalState + TRANSITION_TABLE + is_valid_transition / transitions_from / transitions_to): imports verified.
- … /api/operational-state returns primary_state / claim_permission / degradation_reasons / is_degraded / transition_history / valid_next_states / stub): read at kora_cli/web_server.py:3160. ST5 will preserve.
- … kora.operational_state.transitioned, see packages/db/migrations/foundation/0159_kora_r41_operational_state_event_vocabulary.sql line 277) plus per-trigger informational literals for kora.boot.ready, kora.boot.failed, kora.paused.cost_limit. ST2 will implement the always-emit-generic + conditionally-emit-per-trigger shape.

Test plan

- pytest tests/test_operational_state_holder.py
- … init_holder outside this PR's tests, the CC#3-side CI grep should flag it

🤖 Generated with Claude Code