Skip to content

setup: don't halt install re-runs + clearer Step 6 handoff#229

Closed
uzyn wants to merge 2 commits into
mainfrom
worktree-setup-cont
Closed

setup: don't halt install re-runs + clearer Step 6 handoff#229
uzyn wants to merge 2 commits into
mainfrom
worktree-setup-cont

Conversation

@uzyn

@uzyn uzyn commented May 12, 2026

Copy link
Copy Markdown
Owner

Summary

Two related fixes to the aimx setup first-run experience.

1. install.sh: don't halt re-runs after interrupted setup

When a user Ctrl+Cs out of aimx setup before its final step, /usr/local/bin/aimx is on disk but no systemd unit / OpenRC init script was written. Re-running curl -fsSL https://aimx.email/install.sh | sh then hit the upgrade branch and bailed fatally on systemctl start aimx ("Unit aimx.service not found"), trapping the operator — every re-run rolled itself back.

A new setup_completed helper probes /etc/systemd/system/aimx.service and /etc/init.d/aimx (overridable via AIMX_SETUP_COMPLETED_SYSTEMD / AIMX_SETUP_COMPLETED_OPENRC for tests). The upgrade-vs-fresh decision is gated on it: if neither unit is on disk the script falls through to the existing fresh-install path, which re-lays the binary at the target tag and execs aimx setup cleanly. Genuine upgrades (unit installed, version diff) are unchanged.

Before:

install: starting aimx.service (systemd)
Failed to start aimx.service: Unit aimx.service not found.
install: ✗ service start failed; rolling back
install: error: upgrade failed at start step; service restored if possible

After:

install: binary present but aimx setup did not complete (no service unit); continuing setup
install: dry-run: would exec 'sudo aimx setup' (binary owns wizard)

2. aimx setup closing handoff is clearer

  • Promotes the 🐦‍⬛ AIMX setup — complete banner so it sits between Step 5 and the Step 6 handoff, and drops the second checklist underneath it (the post-Step-5 checklist already shows the same state).
  • Replaces the single-line → aimx agents setup callout with a boxed callout under a sentence telling the operator to run the agents guided setup as a regular user (not root). Box geometry is locked in by an equal-width assertion across top border / body row / bottom border, and the body row hard-codes so the borders stay aligned under NO_COLOR / non-TTY where prompt_mark would otherwise widen to [>].
  • Rewrites the closing capability line. Previously it claimed agents had access to email; at this point the daemon is up but no agent is wired yet. New wording: "AIMX is now running at @<domain> and ready to send and receive email." followed by "Once you've wired your MCP to your LLM, try asking your agent, e.g.".
  • Prefixes "AIMX has been set up successfully." with the same green ☑ glyph the checklist uses, so it reads as the final box checked.

Rendered (TTY, sg.aimx.click example domain):

✓ Step 5 complete.

  ☑ Preflight checks on port 25
  ☑ Set up domain and DNS
  ☑ Set up TLS certificate
  ☑ Set up trust policy
  ☑ Install AIMX
  ☐ Set up MCP for agent(s)


🐦‍⬛ AIMX setup — complete


Step 6: Set up MCP for agent(s)

AIMX bundles MCP and skill files for the following AI agents & harnesses:

  • Claude Code
  • Codex CLI
  • OpenCode
  • Gemini CLI
  • Goose
  • OpenClaw
  • Hermes
  • NanoClaw

To wire AIMX into the harness you have installed, run the agents guided setup as a regular user (not root):

  ┌─────────────────────────────┐
  │  → aimx agents setup        │
  └─────────────────────────────┘


☑ AIMX has been set up successfully.

AIMX is now running at @sg.aimx.click and ready to send and receive email.

Once you've wired your MCP to your LLM, try asking your agent, e.g.
  claude -p "Set up agent@sg.aimx.click and respond to me via email the moment you receive my instructions via email."

Test plan

install.sh

  • sh -n install.sh
  • shellcheck install.sh
  • shellcheck -s sh tests/install_sh.sh
  • sh tests/install_sh.sh — 114/114 pass (4 new assertions in the # setup_completed gate section).
  • Dry-run smoke: AIMX_DRY_RUN=1 AIMX_SETUP_COMPLETED_SYSTEMD=/nonexistent AIMX_SETUP_COMPLETED_OPENRC=/nonexistent sh install.sh --tag 0.0.9.1 prints "continuing setup" + fresh-install dry-run lines.
  • Dry-run smoke (regression): with the systemd unit file present, sh install.sh --tag 99.99.99 still prints upgrading X -> 99.99.99 / would stop aimx.service, swap binary, restart (upgrade).
  • End-to-end on a throwaway VPS: fresh box → install → Ctrl+C at "domain" prompt → re-run installer and confirm wizard re-enters without a systemctl start failure.

aimx setup closing handoff

  • cargo test — 1149 unit + 116 integration tests pass.
  • cargo clippy -- -D warnings clean.
  • cargo fmt -- --check clean.
  • Targeted tests: mcp_section_lines_are_in_documented_order, mcp_section_recommends_running_aimx_agents_setup, mcp_section_lists_every_registered_agent, final_banner_signals_completion, closing_message_carries_required_phrasing, closing_message_does_not_overclaim_agent_capability, run_setup_step_six_uses_handoff_state, run_setup_emits_section_headers_in_spec_order.
  • Visual smoke on a TTY: confirm the banner sits between Step 5 and Step 6, the box renders, and the closing block reads as accurate (daemon running, agents still need wiring).
  • NO_COLOR=1 smoke: confirm the box borders stay aligned (the body row uses literal , not prompt_mark's [>] fallback).

If `aimx setup` is Ctrl+C'd before its final step, the binary is on
disk but no systemd/OpenRC unit was written, so the next install.sh
run hit the upgrade branch and bailed fatally on `systemctl start
aimx` ("Unit not found"). Gate the upgrade decision on a new
`setup_completed` probe (unit file present) — fall through to the
fresh-install path otherwise, which re-lays the binary and re-enters
the wizard.

@uzyn uzyn left a comment

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes Overview

install.sh now refuses to take the upgrade branch when the on-disk binary lacks an accompanying service unit. A new setup_completed helper probes /etc/systemd/system/aimx.service and /etc/init.d/aimx (overridable via AIMX_SETUP_COMPLETED_SYSTEMD / AIMX_SETUP_COMPLETED_OPENRC for tests). If neither file is on disk the script falls through to the existing fresh-install path, which re-lays the binary at the target tag and execs aimx setup. tests/install_sh.sh gains 4 assertions covering no-unit / systemd-unit-present / openrc-script-present / source-grep regression on the gate.

Scope Alignment

Matches the PR description exactly. No scope creep — the diff is exclusively the gate plus its tests, and the upgrade branch's behavior when the gate returns true is bitwise-identical to the pre-PR logic (I diffed the inner block — the case "${_cmp}" arms, the --force handling, and the unparseable-tag fallthrough all move under the new else unchanged).

Potential Bugs

None spotted. Verified independently:

  • The probe paths match the canonical install locations used by SystemOpsImpl::install_service_file in src/setup.rs:467 (/etc/systemd/system/aimx.service) and src/setup.rs:484 (/etc/init.d/aimx).
  • install_and_verify_service runs as Step 5 of the wizard (src/setup.rs:2644), so a unit file on disk is a sound durable signal that the wizard reached its final code path. A Ctrl+C anywhere earlier (port-25 preflight, domain prompt, trust prompt, DNS retry loop) leaves no unit file.
  • set -eu interaction is safe: ${VAR:-default} handles unset under nounset, and if ! setup_completed runs under errexit-suppressed conditional context.
  • _skip_swap stays unset on the new fallthrough path, so ${_skip_swap:-0} at lines 1345 and 1416 correctly evaluates to 0 and the download/install still runs.
  • _is_upgrade stays 0 on the new path, so the upgrade-specific systemctl start aimx block at line 1382 is unreachable — which is the entire point of the fix.

Security Issues

None. The two new env vars are probe paths only — they cannot redirect what gets executed, only whether the script takes the upgrade branch. A hostile env that pointed both at /dev/null would force the fresh-install path, which is strictly safer than the unguarded upgrade path it replaces.

Test Coverage

Solid for what it gates. The four new assertions exercise both true-return branches (systemd path present, OpenRC path present) and the false-return branch, plus a source-grep tripwire that the gate is wired into main().

One small note (non-blocker): the source-grep regression in test (d) is tied to the exact literal string if ! setup_completed. A future refactor to e.g. setup_completed || { ... } or _done=$(setup_completed && echo y) would fail the test even though the gate is still correctly in place. Acceptable as a tripwire, but worth knowing that "the test fails" doesn't always mean "the gate is gone."

The end-to-end VPS smoke (item 5 in the PR test plan) is the only un-checked box — that's the one that actually proves the originally-reported transcript no longer reproduces. Worth doing post-merge if not done yet.

Code Quality

_systemd_unit / _openrc_unit inside setup_completed are not scoped (POSIX sh has no local). They leak into the outer shell. No collisions with any current caller, and consistent with how other helpers in this file handle locals, so it's fine — flagging only for future readers.

The function's comment block is genuinely useful: it explains the why (interrupted setup), the what (durable signal), and the consequence of not having the gate (rollback trap). This is what doc comments in shell scripts should look like.

Summary and Recommended Actions

Verdict: Ready to merge.

  • Blockers: none
  • Non-blockers: none
  • Nice-to-haves: run the end-to-end VPS smoke from item 5 of the test plan to confirm the originally-reported transcript is gone in practice; consider tightening the source-grep regression (test d) to assert on a behavior rather than a literal string at some future cleanup.

Recommended merge commit message

install.sh: don't halt re-runs after interrupted setup

A Ctrl+C out of `aimx setup` left `/usr/local/bin/aimx` on disk
without a systemd unit / OpenRC init script. Re-running the
installer then took the upgrade branch and died on
`systemctl start aimx` ("Unit aimx.service not found"), so every
re-run rolled itself back and trapped the operator.

Add a `setup_completed` helper that probes
`/etc/systemd/system/aimx.service` and `/etc/init.d/aimx` (the
files `aimx setup` writes as its final step). Gate the
upgrade-vs-fresh decision on it: with no unit file on disk the
script falls through to the existing fresh-install path, which
re-lays the binary at the target tag and `exec`s `aimx setup`
cleanly. Genuine upgrades (unit installed, version diff) are
unchanged.

Tests cover no-unit, systemd-unit-present, openrc-script-present,
and a source-grep regression on the gate (114/114 pass).

- Promote the `🐦‍⬛ AIMX setup — complete` banner so it sits between
  Step 5 and the Step 6 handoff, and stop reprinting the checklist
  underneath it (the post-Step-5 list already shows the same state
  with Step 6 as ☐, so the second one was noise).
- Replace the single-line `→ aimx agents setup` callout with a
  labeled, boxed callout under a sentence directing the operator to
  run the agents guided setup as a regular user (not root). The box
  geometry is locked in by a new equal-width assertion on top
  border / body row / bottom border. The body row hard-codes `→`
  (one column) rather than calling `term::prompt_mark` so the
  borders stay aligned under `NO_COLOR` / non-TTY where prompt_mark
  would otherwise widen to `[>]`.
- Rewrite the closing capability line: drop the previous wording
  (which claimed agents already had access) for "AIMX is now
  running at @<domain> and ready to send and receive email." —
  accurate at the moment Step 6 runs (daemon up, MCP not yet
  wired). Follow-up reads "Once you've wired your MCP to your
  LLM, try asking your agent, e.g." so the example prompt is
  framed as post-wiring.
- Prefix `AIMX has been set up successfully.` with the same green
  ☑ glyph the checklist uses so it reads as the final box checked.
- Tests: rework `mcp_section_lines_are_in_documented_order` around
  the boxed callout (new anchors + equal-width geometry asserts);
  rename `final_banner_signals_completion_and_renders_step_list`
  → `final_banner_signals_completion` and flip its `print_step_list`
  assertion to absence; refresh the closing-message asserts and add
  `closing_message_does_not_overclaim_agent_capability` as a
  regression guard (using a split-literal trick so the assertion
  doesn't self-trigger via `include_str!`).
- book/setup.md: refresh the Step 6 description and the example
  closing-message block to match.
@uzyn uzyn changed the title install.sh: don't halt re-runs after interrupted setup setup: don't halt install re-runs + clearer Step 6 handoff May 12, 2026

@uzyn uzyn left a comment

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes Overview

This PR bundles two independent first-run UX fixes.

1. install.sh — adds a setup_completed probe (presence of /etc/systemd/system/aimx.service or /etc/init.d/aimx, overridable via AIMX_SETUP_COMPLETED_SYSTEMD / AIMX_SETUP_COMPLETED_OPENRC for tests) and gates the upgrade-vs-fresh decision on it. With no unit file on disk, the script falls through to the existing fresh-install path instead of taking the upgrade branch and dying on systemctl start aimx.

2. aimx setup Step 6 / closing handoff — promotes the 🐦‍⬛ AIMX setup — complete banner above the Step 6 section (dropping the second checklist beneath it), rewrites the aimx agents setup callout as a boxed call-to-action with hard-coded to keep border alignment under NO_COLOR / non-TTY, prefixes "AIMX has been set up successfully." with the same green ☑ glyph the checklist uses, and rewrites the closing capability line so it no longer claims agents already have access (they don't until aimx agents setup runs).

Scope Alignment

Matches the PR description exactly. No scope creep.

Potential Bugs

None spotted. Verified independently:

install.sh

Already reviewed in the prior review thread on this PR and re-confirmed against this branch. The probe paths match the canonical install locations (/etc/systemd/system/aimx.service from SystemOpsImpl::install_service_file, /etc/init.d/aimx). set -eu interactions are safe. _skip_swap / _is_upgrade stay unset on the new fallthrough path, so the unguarded systemctl start aimx block in the upgrade branch is unreachable when the gate fires — which is the entire point of the fix.

src/setup.rs

  • print_final_banner() has exactly one production caller (run_setup at line 2680); the signature change from (&checklist) to () is internally consistent. The only other references are in the test source-grep, which still works.
  • The box geometry assertion in mcp_section_lines_are_in_documented_order is sound: top.chars().count() == body.chars().count() == bottom.chars().count() == 33. Verified against the literal strings — top is " " + "┌" + 29×"─" + "┐", body is " " + "│" + " → aimx agents setup " + "│", both 33 chars under chars().count(). and are single Unicode codepoints with display width 1, so the assertion correctly catches misalignment.
  • term::step_glyph(term::StepState::Done) exists and returns ColoredString, which implements Display. The closing-line render is correct.
  • The split-literal trick in closing_message_does_not_overclaim_agent_capability (["Your agents", " now have access"].concat()) is necessary: without it, include_str!("setup.rs") would self-trigger the assertion because the test source itself would contain the forbidden phrase verbatim. Subtle but correct.

Order of operations in run_setup (lines 2657–2686)

  1. print_step_complete(5, &checklist) renders the running checklist with Step 6 still as ☐ Pending — matches the PR description's example.
  2. checklist.set(6, Handoff) mutates internal state.
  3. print_final_banner() prints just the headline (no checklist).
  4. section_header(6, ...), display_mcp_section(), print_closing_message(&domain) follow.

The two println!() calls (banner trailing blank + section_header leading blank) produce the two-blank gap between the banner and "Step 6:" shown in the PR description's render. Correct.

Security Issues

None. The two new env vars in install.sh are probe paths only — they cannot redirect what gets executed, only whether the script takes the upgrade branch. Same conclusion as the prior review.

Test Coverage

Strong. The new and reworked tests cover what changed:

  • install_sh.sh: four new assertions in the setup_completed gate block exercise no-unit, systemd-unit-present, openrc-unit-present, and a source-grep tripwire on the gate.
  • mcp_section_lines_are_in_documented_order: reworked around the boxed callout with new anchors (To wire AIMX, , , ) and equal-width geometry asserts on all three box rows. Catches both reordering and silent misalignment.
  • final_banner_signals_completion (renamed from final_banner_signals_completion_and_renders_step_list): flipped to assert print_step_list is absent from the banner body — guards against a regression that would reprint the checklist under the banner.
  • closing_message_carries_required_phrasing: refreshed for the new wording (AIMX is now running at, ready to send and receive email, Once you've wired your MCP to your LLM).
  • closing_message_does_not_overclaim_agent_capability: new regression guard against re-introducing the old "Your agents now have access" wording.

Code Quality

One non-blocker observation:

checklist.set(6, term::StepState::Handoff) at line 2679 is now a dead state mutation. The checklist was last rendered by print_step_complete(5) at line 2658 (before the Handoff is set), and print_final_banner() no longer reprints it. Nothing after line 2679 reads the checklist value. The run_setup_step_six_uses_handoff_state test still passes because it source-greps for the literal term::StepState::Handoff, not for any observable behavior.

This is at most cosmetic — the assignment is harmless and the test guards against an architectural rollback (re-introducing the runuser drop-through). But the test's stated rationale ("so the closing checklist signals the operator-action follow-up") and book/setup.md's "Marked ⎘ Handoff on the checklist" claim are both slightly misleading: the operator never sees the Handoff glyph anywhere in the rendered output (the displayed checklist shows ☐). If you want to clean this up later, either drop the set call and convert the test to verify the absence of the runuser path more directly, or update the comments to clarify the Handoff state is now informational-only. Not blocking this PR.

Summary and Recommended Actions

Verdict: Ready to merge.

  • Blockers: none
  • Non-blockers: none
  • Nice-to-haves:
    • checklist.set(6, Handoff) and its source-grep test now describe state the operator never sees; consider trimming or relabeling later (see Code Quality).
    • Two un-checked items in the test plan (end-to-end VPS smoke for the install.sh fix; TTY + NO_COLOR visual smoke on the Step 6 box). Worth completing post-merge to confirm the originally-reported transcript is gone in practice and that the box stays aligned under NO_COLOR.

Recommended merge commit message

setup: don't halt install re-runs + clearer Step 6 handoff

install.sh: A Ctrl+C out of `aimx setup` left `/usr/local/bin/aimx`
on disk without a service unit, so re-running the installer took
the upgrade branch and bailed on `systemctl start aimx` ("Unit
not found"), trapping the operator. Gate the upgrade decision on
a new `setup_completed` probe (unit file present); fall through
to the fresh-install path otherwise, which re-lays the binary
and re-enters the wizard.

aimx setup: Promote the `🐦‍⬛ AIMX setup — complete` banner so it
sits between Step 5 and the Step 6 handoff, drop the redundant
second checklist underneath, and replace the single-line `aimx
agents setup` callout with a boxed call-to-action under a
sentence directing the operator to run it as a regular user
(box geometry is locked in by an equal-width assertion; the body
row hard-codes `→` so borders stay aligned under `NO_COLOR`).
Rewrite the closing capability line away from "agents now have
access" — at this point the daemon is up but no agent is wired
yet — to "AIMX is now running at @<domain> and ready to send and
receive email." Prefix the success banner with the same green ☑
the checklist uses.

@uzyn uzyn closed this in #231 May 13, 2026
@uzyn uzyn deleted the worktree-setup-cont branch May 15, 2026 06:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant