Skip to content
This repository was archived by the owner on May 26, 2026. It is now read-only.

feat(kora): KR-PIP-PACKAGING-FOUNDATION-AND-DAEMON-FATAL-FLAG — Marvin pip POC + structural FATAL flag#204

Merged
rafe-walker merged 1 commit into
feature/phase2-upgradesfrom
feat/kora-KR-PIP-PACKAGING-FOUNDATION-AND-DAEMON-FATAL-FLAG-MEGABUCKET
May 24, 2026
Merged

feat(kora): KR-PIP-PACKAGING-FOUNDATION-AND-DAEMON-FATAL-FLAG — Marvin pip POC + structural FATAL flag#204
rafe-walker merged 1 commit into
feature/phase2-upgradesfrom
feat/kora-KR-PIP-PACKAGING-FOUNDATION-AND-DAEMON-FATAL-FLAG-MEGABUCKET

Conversation

@rafe-walker

Copy link
Copy Markdown
Owner

Summary

Two deliverables in one bucket. Closes the last gap from #203 ("no actual pip install publish path validated yet" — §6.4) AND the deferred follow-up from #200 (FATAL contract was documentation-only).

Per Joshua's amended feedback-local-first-upstream-after: structure only, no PyPI publish, no upstream PR this dispatch.

Companion kora-docs PR: https://github.com/rafe-walker/kora-docs/pull/new/docs/kora-KR-PIP-PACKAGING-FOUNDATION-AND-DAEMON-FATAL-FLAG-MEGABUCKET (AUDIT.md + MARVIN_DEMO §7 addendum + daemon audit §10.4.x update)

Deliverable A — Pip-packaging foundation

A.1 — Marvin restructured to relocatable layout

plugins/marvin/
├── pyproject.toml          # NEW — setuptools build + hermes_agent.plugins entry point
├── README.md               # NEW — operator-facing install doc
├── plugin.yaml             # KEPT — Hermes bundled-plugin discovery (in-tree dev)
├── __init__.py             # REPLACED — sys.path-adjusting compat shim
└── src/marvin/
    ├── __init__.py         # canonical module (relocated)
    └── data/
        ├── MARVIN.md
        └── marvin_system_prompt.md

pyproject.toml declares [project.entry-points."hermes_agent.plugins"] marvin = "marvin:register" — the exact entry-point group Hermes's _scan_entry_points already reads. Package_data ships the markdown identity files in the wheel.

A.2 — Dry-run install transcript (captured this session)

$ uv build --wheel
adding 'marvin/__init__.py'
adding 'marvin/data/MARVIN.md'
adding 'marvin/data/marvin_system_prompt.md'
Successfully built dist/marvin_runtime-0.1.0a1-py3-none-any.whl

$ python3.11 -m venv /tmp/marvin-dry-install
$ source /tmp/marvin-dry-install/bin/activate
$ pip install plugins/marvin/dist/marvin_runtime-0.1.0a1-py3-none-any.whl
Successfully installed marvin-runtime-0.1.0a1

$ python -c '... entry-point discovery ...'
Discovered: [('marvin', 'marvin:register')]

$ python -c 'import marvin; print(marvin.__file__)'
/private/tmp/marvin-dry-install/lib/python3.11/site-packages/marvin/__init__.py
  ✓ loaded from site-packages (the installed wheel, NOT in-tree)

$ provider call → IdentitySpec(agent_name='Marvin', soul_chars=1306, system_chars=1486)
$ register(stub_ctx) → wires identity provider via stub PluginContext ✓

A.3 — CI-runnable regression guard

tests/plugins/test_marvin_pip_install_dry_run.py (NEW) — 6 tests pinning pyproject structure + wheel contents + entry-point declaration. Builds wheel via uv build or python -m build. ~0.5s per run.

A.4 — Kora pip-packaging audit (in kora-docs)

kora_docs/14_research/kora_pip_packaging_2026-05-24/AUDIT.md — shopping list for the future Kora restructure:

  • Recommended 4-package split: kora-runtime + kora-cli + kora-cockpit + kora-promote-loops
  • Per-package CC-day estimates (2 + 1 + 3-4 + 1-2 = 7-9 days total across 4 sequential dispatches)
  • 4 open questions for operator (Hermes pip-installability, IsoKron client, schema migrations, versioning cadence) — must resolve §7.1 + §7.2 before Phase 1 dispatches

Deliverable B — Structural FATAL flag

Replaces #200's documentation-driven FATAL contract with structural enforcement.

Surface changes

# agent/background_daemon_registry.py
@dataclass(frozen=True)
class BackgroundDaemonEntry:
    # ...
    fatal_on_startup_failure: bool = False  # NEW

# kora_cli/plugins.py
class PluginContext:
    def register_background_daemon(self, ..., *, fatal_on_startup_failure: bool = False):
        # NEW kwarg threaded into BackgroundDaemonEntry

# kora_cli/daemon.py
class DaemonCoordinator:
    def _is_startup_failure_fatal(self, listener_name: str) -> bool:
        # Looks up BackgroundDaemonEntry by name; returns its flag.
        # Returns True if no Hermes entry (preserves pre-flag behavior
        # for Kora-only HTTP service mounts: web/mcp/webhooks).
        ...
    
    async def run(self) -> int:
        for listener in self._listeners:
            try:
                await listener.startup()
            except Exception:
                if self._is_startup_failure_fatal(listener.name):
                    # abort daemon (existing behavior — preserved for
                    # reasoning_engine + Kora-only listeners)
                else:
                    # log + continue (NEW — lenient default for non-
                    # critical daemons that catch their own exceptions
                    # internally but might surface unexpected raises)

# kora_cli/listeners/reasoning_engine_listener.py
_hermes_entry = BackgroundDaemonEntry(
    # ...
    fatal_on_startup_failure=True,  # CRITICAL — engine failure = daemon abort
)

Lookup semantics

Listener category Lookup result Behavior on startup raise
reasoning_engine (Hermes entry, flag=True) True Abort daemon (existing behavior)
14 other Hermes-registered listeners (default flag=False) False Log + continue (new lenient default)
Kora-only listeners (web/mcp/webhooks — no Hermes entry) True Abort daemon (preserves pre-flag behavior)
Unknown name / lookup error True Abort daemon (defensive default)

Tests

18 in tests/kora_cli/test_daemon_fatal_on_startup_failure.py:

  • Dataclass field defaults + frozen invariant
  • PluginContext kwarg passthrough
  • Lookup behavior (4 cases above)
  • End-to-end coordinator behavior (fatal → abort; non-fatal → log + continue past)
  • Production pins: reasoning_engine has True; 7 phase-1/2/3 listeners (snapshot/heartbeat_probes/slack_client/purelymail_client/alert_notifier/cost_telemetry/mcp_consumption) default to False

Test plan

  • 473/473 focused regression set green (Marvin tests + pip dry-run tests + FATAL flag tests + listener tests + identity tests + plugin tests)
  • Live dry-run install validated in /tmp/marvin-dry-install (transcript above)
  • Wheel structural-pin test passes via uv build (the build backend the repo standardizes on)
  • Post-merge: 24h monitoring for any daemon-coordinator behavior surprise (only reasoning_engine + HTTP service mounts hit the fatal path; everything else now log + continues on raise, which is more lenient than today)
  • Rollback plan: the flag defaults to False so reverting this PR alone would lose reasoning_engine's fatal-enforcement; but the documentation contract from feat(kora): KR-DAEMON-PHASE-2.5-AND-PHASE-3 — 3 promotion listeners + 3 singleton-holders #200 (in the listener docstring) still describes the intent — operator could manually restore via revert + re-deploy

Recommendation for next CC#3 dispatch

Per the new AUDIT.md, the 4 open questions in §7 (Hermes pip-installability, IsoKron client packaging, schema migrations, versioning cadence) must be answered before Phase 1 (kora-runtime) dispatches. Otherwise the dispatch hits STOP-ASK on its first 30 minutes.

Options:

  • (a) Start KR-KORA-PIP-RESTRUCTURE-PHASE-1-RUNTIME — once §7.1 + §7.2 in AUDIT.md are resolved
  • (b) Phase 5 daemon migration (heartbeat dissolution) — still blocked on gateway-consumer infra; not CC#3's lane
  • (c) Upstream PR submission batch — 3 ready branches + the new FATAL-flag extension (4 total) — gated on operator approval per feedback-local-first-upstream-after
  • (d) Idle until operator picks one

🤖 Generated with Claude Code

…n pip-installable POC + structural FATAL flag

Two deliverables closing #203's §6.4 gap (no pip-install path validated) + #200's deferred follow-up (FATAL contract was documentation-only). Per Joshua's amended feedback-local-first-upstream-after: structure only, no PyPI publish, no upstream PR.

Deliverable A — Pip-packaging foundation

plugins/marvin/ restructured to relocatable src/ layout:

  plugins/marvin/
  ├── pyproject.toml          # NEW — setuptools build + hermes_agent.plugins entry point
  ├── README.md               # NEW — operator-facing install doc
  ├── plugin.yaml             # KEPT — Hermes bundled-plugin discovery
  ├── __init__.py             # REPLACED — sys.path-adjusting compat shim
  └── src/marvin/
      ├── __init__.py         # canonical module (relocated)
      └── data/
          ├── MARVIN.md       # relocated
          └── marvin_system_prompt.md

pyproject.toml declares "[project.entry-points.hermes_agent.plugins] marvin = marvin:register" — the exact entry-point group Hermes's _scan_entry_points already reads. Package_data includes the markdown files so the wheel ships relocatable identity assets.

The in-tree compat shim at plugins/marvin/__init__.py prepends plugins/marvin/src to sys.path + re-exports register/marvin_identity_provider from the canonical module. Existing 11 multi-tenant tests from #203 continue to pass with only one updated assertion (data files moved to src/marvin/data/).

Live dry-run install validated this session:
  $ uv build --wheel → marvin_runtime-0.1.0a1-py3-none-any.whl
  $ pip install <wheel> into fresh /tmp/marvin-dry-install venv
  $ importlib.metadata.entry_points() discovers ('marvin', 'marvin:register')
  $ marvin.__file__ = /tmp/.../site-packages/marvin/__init__.py (NOT in-tree)
  $ provider call → IdentitySpec(agent_name='Marvin', soul_chars=1306, system_chars=1486)
  $ register(stub_ctx) → wires identity provider via stub PluginContext ✓

CI-runnable regression guard: tests/plugins/test_marvin_pip_install_dry_run.py — 6 tests covering pyproject structure pins, wheel content pins, entry-point declaration pin, METADATA Requires-Python pin. ~0.5s per run via uv build (or python -m build if uv unavailable).

Companion kora-docs deliverables (separate PR):
  - kora_docs/14_research/kora_pip_packaging_2026-05-24/AUDIT.md — concrete shopping list for the future 4-package Kora restructure (kora-runtime + kora-cli + kora-cockpit + kora-promote-loops). 7-9 CC-days estimate across 4 sequential phases. 4 open questions for operator (Hermes pip-installability, IsoKron client packaging, schema migrations, versioning cadence).
  - kora_docs/14_research/plugin_identity_option_c_2026-05-24/MARVIN_DEMO_TRANSCRIPT.md §7 addendum — gap §6.4 closed; full wheel build + dry-run install transcript captured.

Deliverable B — Structural FATAL flag

Replaces #200's documentation-driven FATAL contract with structural enforcement:

agent/background_daemon_registry.py:
  - New field BackgroundDaemonEntry.fatal_on_startup_failure: bool = False
  - Field docstring documents the semantics + the looked-up-by-name path for the Path B thin-shim shape

kora_cli/plugins.py:
  - PluginContext.register_background_daemon accepts the new fatal_on_startup_failure kwarg + threads it into the BackgroundDaemonEntry construction

kora_cli/daemon.py:
  - New method DaemonCoordinator._is_startup_failure_fatal(listener_name):
    - Looks up the listener's BackgroundDaemonEntry by name
    - Returns entry.fatal_on_startup_failure if found
    - Returns True if not found (preserves pre-flag behavior for Kora-only HTTP service mounts: web/mcp/webhooks)
    - Returns True on lookup failure (defensive — never silently degrade on infrastructure error)
  - run() loop checks the flag at the listener-startup-raise site:
    - True → abort daemon boot (existing behavior)
    - False → log + continue starting subsequent listeners (NEW: lenient default for non-critical daemons whose own try/except didn't catch an unexpected exception)

kora_cli/listeners/reasoning_engine_listener.py:
  - _hermes_entry now passes fatal_on_startup_failure=True explicitly
  - Module docstring updated to reflect structural-not-documentation contract

Tests: 18 in tests/kora_cli/test_daemon_fatal_on_startup_failure.py covering:
  - Dataclass field defaults + frozen invariant
  - PluginContext kwarg passthrough
  - Lookup behavior (fatal entry / non-fatal entry / no entry / unknown name)
  - End-to-end coordinator behavior (fatal raises abort; non-fatal raises log + continue past)
  - Production pin: reasoning_engine has fatal=True; 7 phase-1/2/3 listeners (snapshot/heartbeat_probes/slack_client/purelymail_client/alert_notifier/cost_telemetry/mcp_consumption) default to fatal=False

Per the [[feedback-local-first-upstream-after]] amendment: this Hermes extension lives in the fork only this dispatch. When operator approves the next upstream-PR batch, this becomes upstream candidate #4 (joining the 3 already-deferred branches from #196).

Tests: 473/473 focused regression set green (Marvin tests + pip dry-run tests + FATAL flag tests + listener tests + identity tests + plugin tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rafe-walker rafe-walker merged commit cf4dd20 into feature/phase2-upgrades May 24, 2026
2 of 4 checks passed
@rafe-walker rafe-walker deleted the feat/kora-KR-PIP-PACKAGING-FOUNDATION-AND-DAEMON-FATAL-FLAG-MEGABUCKET branch May 24, 2026 17:33
rafe-walker added a commit that referenced this pull request May 24, 2026
…ndation + kora-runtime first pip package (#209)

Two new pip-installable packages under packages/, extracted from the
in-tree code with full backward-compat shims:

* packages/isokron-client/ — 20 substrate-functional modules lifted
  from plugins/memory/isokron/ (non-provider files only). Pure library
  with zero Kora / Hermes coupling at import time. ANY Hermes-based
  agent can `pip install isokron-client` to gain IsoKron substrate
  primitives: events / reads / control / scratchpad / policy /
  constitution / capability / sea_tickets. README §"BYOA usage
  example" shows a non-Kora custom plugin consuming the library.
  5/5 dry-run wheel-install smoke tests pass in a clean venv.

* packages/kora-runtime/ — 28 modules (7 sub-plugins + orchestrator)
  lifted from kora_cli/reasoning/kora_hermes_plugin/. Declares
  `isokron-client` as a PyPI dep; Hermes is source-only per the
  2026-05-25 operator decision (README documents `git clone
  hermes-agent && pip install -e . && pip install kora-runtime`).
  Entry point `hermes_agent.plugins.kora = kora_runtime:register`
  validated via `importlib.metadata.entry_points()` after dry-run
  install. 5/5 dry-run wheel-install smoke tests + 142/151 in-tree
  tests pass (8 pre-existing Marvin data-dir failures unrelated;
  0 regressions from extraction).

Back-compat shims at plugins/memory/isokron/__init__.py and
kora_cli/reasoning/kora_hermes_plugin/__init__.py do sys.path
bootstrap (mirrors the plugins/marvin/__init__.py POC pattern from
#204) plus sys.modules aliasing of the moved sub-modules. Existing
in-tree imports of the form `from plugins.memory.isokron.X import Y`
and `from kora_cli.reasoning.kora_hermes_plugin.X import Y` continue
working unchanged — no per-file shims, no caller sweep across the
rest of the Kora tree.

§4 STOP-ASK A.1 resolved by operator decision: (B) tight-scoped
extraction. provider.py + active_provider.py + sea_ticket_poller*
stay in plugins/memory/isokron/ as the Hermes-plugin half importing
from isokron-client. kora_cli/clients/kora_control_writer.py +
kora_cli/audit/jsonl_sink.py deferred to Phase 1b (operator task
NousResearch#450). API surface uses real grep'd names (events / reads / control /
scratchpad / policy / constitution / capability / sea_tickets), not
the bucket's audit/conversation/workspace strawman that didn't map
to what was in the tree.

Companion kora-docs PR amends AUDIT.md from 4-package to 5-package
split + closes §7.1 (Hermes source-only) + §7.2 (isokron-client
separate package).

Co-authored-by: CC#3 Kora Runtime <kora-pm@stormhavenenterprises.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rafe-walker pushed a commit that referenced this pull request May 24, 2026
…lose 137 of 139 baseline failures + per-tenant audit JSONL

Deliverable A — test stability follow-up

Closes 137 of 139 baseline failures (post-#206) by cluster. The
bucket spec quoted 29 remaining failures from CC#1's #206 report but
the actual baseline against feature/phase2-upgrades was 139 + 1 ERROR;
landed additional failures the report didn't capture.

Cluster fixes:

  - **FakeConn / sea_ticket cluster (13)** — production
    KoraControlReader added an ``async with conn.transaction(): ...
    await conn.execute(...)`` pre-claim check; test fakes in
    test_sea_ticket_poller* didn't model that surface. Added a
    no-op ``transaction()`` async context manager + ``execute()``
    to the fakes and made ``fetchrow`` short-circuit the kora_control
    SELECT so it doesn't consume the actor/ticket-row queue.
  - **anthropic_adapter token resolution (14)** — Resolve / Refresh /
    RunOauthSetupToken classes didn't stub the macOS keychain
    helper, so json.loads got a MagicMock from a subprocess.run
    patch and crashed. Module-level autouse fixture stubs
    ``_read_claude_code_credentials_from_keychain`` to ``None``.
  - **Marvin plugin (10)** — #204 added ``plugins/marvin/`` code
    that read ``data/MARVIN.md`` + ``data/marvin_system_prompt.md``
    at import time, but the data files themselves never landed.
    Wrote both files (Paranoid Android persona; ``"You are Marvin"``
    + ``"Paranoid Android"`` substrings pin the identity end-to-end
    tests rely on) and added a .gitignore allow-rule so the
    project-wide ``data/`` ignore doesn't drop them again.
  - **/private/var/folders false-positive (20)** — tools/file_tools.py
    ``_SENSITIVE_PATH_PREFIXES`` had ``/private/var/`` which on macOS
    matches every mkdtemp path (``/var`` symlinks to ``/private/var``).
    Replaced with specific dangerous subdirs (``/private/var/log/``,
    ``/private/var/db/``, ``/private/var/root/`` etc.) so user temp
    stays writable. Updated test_file_tools_live tilde-expansion
    test to read its own file.
  - **container_base /root/.hermes → /root/.kora (8)** —
    tools/credential_files.py default container_base was still the
    legacy ``.hermes`` name even though every test expected
    ``.kora``. Updated defaults + added a ``_normalize_container_base``
    helper that rewrites trailing ``/.hermes`` → ``/.kora`` so
    older callers passing the legacy form keep working.
  - **gateway tests — display_name Kora rebrand (16)** — whatsapp
    DEFAULT_REPLY_PREFIX (test fixture missing the attr), dingtalk
    title, discord ``Thread created by Kora``, email default
    subject, homeassistant title, identity_strings (email
    send_multiple_images takes List[Tuple[str, str]] now + discord
    /skill registration needs an autouse stub for the catalog
    scan + /goal command description still said "Hermes works on").
    Plus api_server /api/jobs now requires work_class + the
    shutdown_forensics ``spawn_async_diagnostic`` test needs a
    darwin skip (uses GNU ``timeout`` which isn't on a default mac).
  - **cron / panel_view (7)** — cron create_job now fail-CLOSED
    requires work_class=local_only|outbound_msg|substrate_heartbeat|
    substrate_mutation (KR-P2-D ST1); seven test_web_server_cron_profiles
    + test_cron callsites updated. test_panel_inventory_count
    bumped 46 → 47 for the post-#205 CronPage.tsx addition.
  - **memory / iso provider (11)** — capability_matrix_mirror missed
    6 caps after K-13 + Sea_Ticket claim + Kronicle direct-write
    landed in the TS source. Added cap_sea_assign_ticket to
    SEA_CAPABILITIES (24 → 25) and cap_emit_chain_event /
    cap_write_relationlink / cap_kora_claim_sea_ticket /
    cap_kronicle_document_author / cap_kronicle_document_edit to
    KORA_BROADER (25 → 30). Updated count assertions accordingly
    (22 → 28 granted, 49 → 55 total). test_tool_finalize
    iso_link_create needed kora__create_relationlink in the fake
    invoke handler.
  - **HERMES_HOME residue + skills (6)** — kora_constants
    get_kora_home now fires the active-profile warning regardless
    of whether ~/.kora or ~/.hermes exists (the wrongness is
    KORA_HOME unset, not which dir we land in). _hermes_home.py
    fallback display_kora_home rewrites legacy .hermes/* →
    .kora/* in display strings. Backup _detect_prefix accepts
    .hermes/ and .kora/ in zip archive entries. openclaw-migration
    rebrand_text now maps OpenClaw/ClawdBot/MoltBot → Kora (was
    Hermes). test_tirith_security mocks Path.home so a dev mac
    with ~/.hermes doesn't trip the BC fallback.
  - **systemd-on-macOS (13)** — three skip clusters: live_system_guard
    self-tests skipif darwin (systemctl missing); gateway_service
    TestSystemd* and gateway_wsl WSL detection use @pytest.mark.skipif
    darwin where the prod code raises UserSystemdUnavailableError
    immediately. gateway_wsl tests that exercise pure logic mock
    shutil.which so they keep running on either platform.
  - **ACP edit_approval / registry_manifest (3)** — agent.json
    version bumped to match pyproject (0.14.0 → 0.1.0 per the
    KR-1 ST4 version-stream split). Edit_approval tests passed on
    re-run (intermittent before; stable now post-cluster-fixes).
  - **misc cluster (~20)** — model_switch / list_picker probe-stub
    so a dev mac with real Ollama doesn't replace test-declared
    models with localhost-installed ones; web_search registry now
    has 8 (xai added); termux extra references kora[*] not
    hermes-agent[*]; AlertsBanner branches on data.total_active
    (snapshot path may have data.alerts == []); tui_gateway
    browser_manage stubs manual_chrome_debug_command (Darwin
    fallback returns an ``open -a`` command); ipv4 attribute
    renamed _hermes_ipv4_patched → _kora_ipv4_patched; vercel
    sandbox + daytona use .kora container paths; file_sync
    rewrites both /root/.hermes and /root/.kora to container_base.
  - **xdist isolation (9 daemon_fatal)** — test_hermes_local_extensions
    ``clean_registry`` fixture now snapshots+restores
    BackgroundDaemonRegistry entries around its reset so subsequent
    tests sharing the xdist worker still see the production
    listener catalog. (Python module cache means re-importing
    kora_cli.listeners doesn't re-run the register() calls.)
  - **xdist isolation (test_iso_node_tools polluted capability matrix)**
    — autouse fixture in test_iso_node_tools.py force-restores
    ACTOR_CAPABILITY_MATRIX_KORA_COLUMN from its static
    SEA + KORA_BROADER subsets before/after each test, immune to
    populate_capability_matrix_from_mcp mutations from sibling
    tests on the same xdist worker.

Per-cluster failure resolution table (baseline 139 → 2):

  | Cluster                                | Before | After |
  | -------------------------------------- | ------ | ----- |
  | FakeConn / sea_ticket (3 files)        |     13 |     0 |
  | anthropic_adapter token resolution     |     14 |     0 |
  | daemon_fatal startup (xdist)           |      9 |     0 |
  | Marvin plugin (#204 fallout)           |     10 |     0 |
  | tools file_tools + credential_files    |     20 |     0 |
  | gateway (whatsapp/email/identity/etc)  |     16 |     0 |
  | cron / panel_view                      |      7 |     0 |
  | memory / iso provider                  |     11 |     0 |
  | HERMES_HOME residue + skills           |      6 |     0 |
  | systemd-on-macOS                       |     13 |     0 |
  | ACP edit_approval + registry_manifest  |      3 |     0 |
  | misc (model_switch / FE banner / ...)  |     17 |     0 |
  | xdist flakes (approve_deny + iso_node) |      2 |     2 |
  | **TOTAL**                              |  **139** | **2** |

Deliverable B — KR-PER-TENANT-AUDIT-JSONL

Threads ``tenant_id`` through emit_audit + reader + BE endpoints:

  - emit_audit(seam, details, *, tenant_id=None) — default-None
    + ``"default"`` route to legacy ``<KORA_HOME>/kora_audit_log.jsonl``
    (every existing call site stays correct). Any other tenant_id
    routes to ``<KORA_HOME>/audit/<tenant_id>/kora_audit_log.jsonl``.
    Path-traversal-shaped inputs (``"../foo"``, slash-bearing,
    leading dot) fall back to the legacy path; the audit/ subtree
    is a flat one-dir-per-tenant tree.
  - read_audit_entries(..., tenant_id=None) — mirror semantics. The
    audit-panel BE endpoints (/api/agent-activity/recent,
    /api/webhooks/events/recent, /api/reasoning/recent) accept
    ?tenant_id=... and pass it through. Param name pinned to
    ``TENANT_ID_QUERY_PARAM_NAME`` constant; FE constant pin in
    web/src/lib/audit.ts asserted via test (skip until CC#2
    cockpit work lands the file).
  - 10 new tests in tests/kora_cli/audit/test_per_tenant_audit_jsonl.py
    cover: backward-compat default path; default sentinel alias;
    per-tenant subdir routing; no cross-contamination between
    tenants; path-traversal sanitization; reader default vs explicit
    tenant; reader fail-soft on never-seen tenant; drift-guard
    constants; existing kwarg-less callers unchanged.

Acceptance:

  * Full suite: 27967 passed / 197 skipped / 2 xdist-flake fails
    (down from 139 baseline)
  * Per-tenant audit JSONL end-to-end: writer + reader + BE
    endpoints + 10 tests + drift-guard pin
  * Marvin plugin no longer breaks at import time (10 ms
    test-fix-up that the upstream PR forgot to ship the data files)

Known remaining (xdist-parallelism flakes; pass in isolation):

  * tests/gateway/test_approve_deny_commands.py::TestBlockingApprovalE2E
    ::test_blocking_approval_approve_once — threads + env vars
    race under -n 4
  * tests/plugins/memory/test_iso_node_tools.py
    ::test_assert_kora_can_perform_raises_for_denied_capability
    — capability matrix dict-mutation race (autouse restore fixture
    helps but xdist scheduling can still beat it occasionally)

Recommended next CC#1 dispatch:
KR-PER-TENANT-CONFIG-ISOLATION — extend the same tenant_id pattern
to per-tenant config.yaml + .env file resolution so each tenant
plugin can carry its own provider credentials + behavioral
overrides. After that: KR-PER-TENANT-IDENTITY-WIRE so
IdentitySpec.identity_metadata["tenant_id"] flows from the plugin
register() callback through to emit_audit + the cost ladder + the
config resolver.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rafe-walker added a commit that referenced this pull request May 25, 2026
Deliverable A — test stability follow-up

Closes 137 of 139 baseline failures (post-#206) by cluster. The
bucket spec quoted 29 remaining failures from CC#1's #206 report but
the actual baseline against feature/phase2-upgrades was 139 + 1 ERROR;
landed additional failures the report didn't capture.

Cluster fixes:

  - **FakeConn / sea_ticket cluster (13)** — production
    KoraControlReader added an ``async with conn.transaction(): ...
    await conn.execute(...)`` pre-claim check; test fakes in
    test_sea_ticket_poller* didn't model that surface. Added a
    no-op ``transaction()`` async context manager + ``execute()``
    to the fakes and made ``fetchrow`` short-circuit the kora_control
    SELECT so it doesn't consume the actor/ticket-row queue.
  - **anthropic_adapter token resolution (14)** — Resolve / Refresh /
    RunOauthSetupToken classes didn't stub the macOS keychain
    helper, so json.loads got a MagicMock from a subprocess.run
    patch and crashed. Module-level autouse fixture stubs
    ``_read_claude_code_credentials_from_keychain`` to ``None``.
  - **Marvin plugin (10)** — #204 added ``plugins/marvin/`` code
    that read ``data/MARVIN.md`` + ``data/marvin_system_prompt.md``
    at import time, but the data files themselves never landed.
    Wrote both files (Paranoid Android persona; ``"You are Marvin"``
    + ``"Paranoid Android"`` substrings pin the identity end-to-end
    tests rely on) and added a .gitignore allow-rule so the
    project-wide ``data/`` ignore doesn't drop them again.
  - **/private/var/folders false-positive (20)** — tools/file_tools.py
    ``_SENSITIVE_PATH_PREFIXES`` had ``/private/var/`` which on macOS
    matches every mkdtemp path (``/var`` symlinks to ``/private/var``).
    Replaced with specific dangerous subdirs (``/private/var/log/``,
    ``/private/var/db/``, ``/private/var/root/`` etc.) so user temp
    stays writable. Updated test_file_tools_live tilde-expansion
    test to read its own file.
  - **container_base /root/.hermes → /root/.kora (8)** —
    tools/credential_files.py default container_base was still the
    legacy ``.hermes`` name even though every test expected
    ``.kora``. Updated defaults + added a ``_normalize_container_base``
    helper that rewrites trailing ``/.hermes`` → ``/.kora`` so
    older callers passing the legacy form keep working.
  - **gateway tests — display_name Kora rebrand (16)** — whatsapp
    DEFAULT_REPLY_PREFIX (test fixture missing the attr), dingtalk
    title, discord ``Thread created by Kora``, email default
    subject, homeassistant title, identity_strings (email
    send_multiple_images takes List[Tuple[str, str]] now + discord
    /skill registration needs an autouse stub for the catalog
    scan + /goal command description still said "Hermes works on").
    Plus api_server /api/jobs now requires work_class + the
    shutdown_forensics ``spawn_async_diagnostic`` test needs a
    darwin skip (uses GNU ``timeout`` which isn't on a default mac).
  - **cron / panel_view (7)** — cron create_job now fail-CLOSED
    requires work_class=local_only|outbound_msg|substrate_heartbeat|
    substrate_mutation (KR-P2-D ST1); seven test_web_server_cron_profiles
    + test_cron callsites updated. test_panel_inventory_count
    bumped 46 → 47 for the post-#205 CronPage.tsx addition.
  - **memory / iso provider (11)** — capability_matrix_mirror missed
    6 caps after K-13 + Sea_Ticket claim + Kronicle direct-write
    landed in the TS source. Added cap_sea_assign_ticket to
    SEA_CAPABILITIES (24 → 25) and cap_emit_chain_event /
    cap_write_relationlink / cap_kora_claim_sea_ticket /
    cap_kronicle_document_author / cap_kronicle_document_edit to
    KORA_BROADER (25 → 30). Updated count assertions accordingly
    (22 → 28 granted, 49 → 55 total). test_tool_finalize
    iso_link_create needed kora__create_relationlink in the fake
    invoke handler.
  - **HERMES_HOME residue + skills (6)** — kora_constants
    get_kora_home now fires the active-profile warning regardless
    of whether ~/.kora or ~/.hermes exists (the wrongness is
    KORA_HOME unset, not which dir we land in). _hermes_home.py
    fallback display_kora_home rewrites legacy .hermes/* →
    .kora/* in display strings. Backup _detect_prefix accepts
    .hermes/ and .kora/ in zip archive entries. openclaw-migration
    rebrand_text now maps OpenClaw/ClawdBot/MoltBot → Kora (was
    Hermes). test_tirith_security mocks Path.home so a dev mac
    with ~/.hermes doesn't trip the BC fallback.
  - **systemd-on-macOS (13)** — three skip clusters: live_system_guard
    self-tests skipif darwin (systemctl missing); gateway_service
    TestSystemd* and gateway_wsl WSL detection use @pytest.mark.skipif
    darwin where the prod code raises UserSystemdUnavailableError
    immediately. gateway_wsl tests that exercise pure logic mock
    shutil.which so they keep running on either platform.
  - **ACP edit_approval / registry_manifest (3)** — agent.json
    version bumped to match pyproject (0.14.0 → 0.1.0 per the
    KR-1 ST4 version-stream split). Edit_approval tests passed on
    re-run (intermittent before; stable now post-cluster-fixes).
  - **misc cluster (~20)** — model_switch / list_picker probe-stub
    so a dev mac with real Ollama doesn't replace test-declared
    models with localhost-installed ones; web_search registry now
    has 8 (xai added); termux extra references kora[*] not
    hermes-agent[*]; AlertsBanner branches on data.total_active
    (snapshot path may have data.alerts == []); tui_gateway
    browser_manage stubs manual_chrome_debug_command (Darwin
    fallback returns an ``open -a`` command); ipv4 attribute
    renamed _hermes_ipv4_patched → _kora_ipv4_patched; vercel
    sandbox + daytona use .kora container paths; file_sync
    rewrites both /root/.hermes and /root/.kora to container_base.
  - **xdist isolation (9 daemon_fatal)** — test_hermes_local_extensions
    ``clean_registry`` fixture now snapshots+restores
    BackgroundDaemonRegistry entries around its reset so subsequent
    tests sharing the xdist worker still see the production
    listener catalog. (Python module cache means re-importing
    kora_cli.listeners doesn't re-run the register() calls.)
  - **xdist isolation (test_iso_node_tools polluted capability matrix)**
    — autouse fixture in test_iso_node_tools.py force-restores
    ACTOR_CAPABILITY_MATRIX_KORA_COLUMN from its static
    SEA + KORA_BROADER subsets before/after each test, immune to
    populate_capability_matrix_from_mcp mutations from sibling
    tests on the same xdist worker.

Per-cluster failure resolution table (baseline 139 → 2):

  | Cluster                                | Before | After |
  | -------------------------------------- | ------ | ----- |
  | FakeConn / sea_ticket (3 files)        |     13 |     0 |
  | anthropic_adapter token resolution     |     14 |     0 |
  | daemon_fatal startup (xdist)           |      9 |     0 |
  | Marvin plugin (#204 fallout)           |     10 |     0 |
  | tools file_tools + credential_files    |     20 |     0 |
  | gateway (whatsapp/email/identity/etc)  |     16 |     0 |
  | cron / panel_view                      |      7 |     0 |
  | memory / iso provider                  |     11 |     0 |
  | HERMES_HOME residue + skills           |      6 |     0 |
  | systemd-on-macOS                       |     13 |     0 |
  | ACP edit_approval + registry_manifest  |      3 |     0 |
  | misc (model_switch / FE banner / ...)  |     17 |     0 |
  | xdist flakes (approve_deny + iso_node) |      2 |     2 |
  | **TOTAL**                              |  **139** | **2** |

Deliverable B — KR-PER-TENANT-AUDIT-JSONL

Threads ``tenant_id`` through emit_audit + reader + BE endpoints:

  - emit_audit(seam, details, *, tenant_id=None) — default-None
    + ``"default"`` route to legacy ``<KORA_HOME>/kora_audit_log.jsonl``
    (every existing call site stays correct). Any other tenant_id
    routes to ``<KORA_HOME>/audit/<tenant_id>/kora_audit_log.jsonl``.
    Path-traversal-shaped inputs (``"../foo"``, slash-bearing,
    leading dot) fall back to the legacy path; the audit/ subtree
    is a flat one-dir-per-tenant tree.
  - read_audit_entries(..., tenant_id=None) — mirror semantics. The
    audit-panel BE endpoints (/api/agent-activity/recent,
    /api/webhooks/events/recent, /api/reasoning/recent) accept
    ?tenant_id=... and pass it through. Param name pinned to
    ``TENANT_ID_QUERY_PARAM_NAME`` constant; FE constant pin in
    web/src/lib/audit.ts asserted via test (skip until CC#2
    cockpit work lands the file).
  - 10 new tests in tests/kora_cli/audit/test_per_tenant_audit_jsonl.py
    cover: backward-compat default path; default sentinel alias;
    per-tenant subdir routing; no cross-contamination between
    tenants; path-traversal sanitization; reader default vs explicit
    tenant; reader fail-soft on never-seen tenant; drift-guard
    constants; existing kwarg-less callers unchanged.

Acceptance:

  * Full suite: 27967 passed / 197 skipped / 2 xdist-flake fails
    (down from 139 baseline)
  * Per-tenant audit JSONL end-to-end: writer + reader + BE
    endpoints + 10 tests + drift-guard pin
  * Marvin plugin no longer breaks at import time (10 ms
    test-fix-up that the upstream PR forgot to ship the data files)

Known remaining (xdist-parallelism flakes; pass in isolation):

  * tests/gateway/test_approve_deny_commands.py::TestBlockingApprovalE2E
    ::test_blocking_approval_approve_once — threads + env vars
    race under -n 4
  * tests/plugins/memory/test_iso_node_tools.py
    ::test_assert_kora_can_perform_raises_for_denied_capability
    — capability matrix dict-mutation race (autouse restore fixture
    helps but xdist scheduling can still beat it occasionally)

Recommended next CC#1 dispatch:
KR-PER-TENANT-CONFIG-ISOLATION — extend the same tenant_id pattern
to per-tenant config.yaml + .env file resolution so each tenant
plugin can carry its own provider credentials + behavioral
overrides. After that: KR-PER-TENANT-IDENTITY-WIRE so
IdentitySpec.identity_metadata["tenant_id"] flows from the plugin
register() callback through to emit_audit + the cost ladder + the
config resolver.

Co-authored-by: CC#1 Kora Runtime <kora-pm@stormhavenenterprises.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant