setup: don't halt install re-runs + clearer Step 6 handoff#229
Conversation
If `aimx setup` is Ctrl+C'd before its final step, the binary is on
disk but no systemd/OpenRC unit was written, so the next install.sh
run hit the upgrade branch and bailed fatally on `systemctl start
aimx` ("Unit not found"). Gate the upgrade decision on a new
`setup_completed` probe (unit file present) — fall through to the
fresh-install path otherwise, which re-lays the binary and re-enters
the wizard.
uzyn
left a comment
There was a problem hiding this comment.
Changes Overview
install.sh now refuses to take the upgrade branch when the on-disk binary lacks an accompanying service unit. A new setup_completed helper probes /etc/systemd/system/aimx.service and /etc/init.d/aimx (overridable via AIMX_SETUP_COMPLETED_SYSTEMD / AIMX_SETUP_COMPLETED_OPENRC for tests). If neither file is on disk the script falls through to the existing fresh-install path, which re-lays the binary at the target tag and execs aimx setup. tests/install_sh.sh gains 4 assertions covering no-unit / systemd-unit-present / openrc-script-present / source-grep regression on the gate.
Scope Alignment
Matches the PR description exactly. No scope creep — the diff is exclusively the gate plus its tests, and the upgrade branch's behavior when the gate returns true is bitwise-identical to the pre-PR logic (I diffed the inner block — the case "${_cmp}" arms, the --force handling, and the unparseable-tag fallthrough all move under the new else unchanged).
Potential Bugs
None spotted. Verified independently:
- The probe paths match the canonical install locations used by
SystemOpsImpl::install_service_fileinsrc/setup.rs:467(/etc/systemd/system/aimx.service) andsrc/setup.rs:484(/etc/init.d/aimx). install_and_verify_serviceruns as Step 5 of the wizard (src/setup.rs:2644), so a unit file on disk is a sound durable signal that the wizard reached its final code path. A Ctrl+C anywhere earlier (port-25 preflight, domain prompt, trust prompt, DNS retry loop) leaves no unit file.set -euinteraction is safe:${VAR:-default}handles unset undernounset, andif ! setup_completedruns under errexit-suppressed conditional context._skip_swapstays unset on the new fallthrough path, so${_skip_swap:-0}at lines 1345 and 1416 correctly evaluates to 0 and the download/install still runs._is_upgradestays 0 on the new path, so the upgrade-specificsystemctl start aimxblock at line 1382 is unreachable — which is the entire point of the fix.
Security Issues
None. The two new env vars are probe paths only — they cannot redirect what gets executed, only whether the script takes the upgrade branch. A hostile env that pointed both at /dev/null would force the fresh-install path, which is strictly safer than the unguarded upgrade path it replaces.
Test Coverage
Solid for what it gates. The four new assertions exercise both true-return branches (systemd path present, OpenRC path present) and the false-return branch, plus a source-grep tripwire that the gate is wired into main().
One small note (non-blocker): the source-grep regression in test (d) is tied to the exact literal string if ! setup_completed. A future refactor to e.g. setup_completed || { ... } or _done=$(setup_completed && echo y) would fail the test even though the gate is still correctly in place. Acceptable as a tripwire, but worth knowing that "the test fails" doesn't always mean "the gate is gone."
The end-to-end VPS smoke (item 5 in the PR test plan) is the only un-checked box — that's the one that actually proves the originally-reported transcript no longer reproduces. Worth doing post-merge if not done yet.
Code Quality
_systemd_unit / _openrc_unit inside setup_completed are not scoped (POSIX sh has no local). They leak into the outer shell. No collisions with any current caller, and consistent with how other helpers in this file handle locals, so it's fine — flagging only for future readers.
The function's comment block is genuinely useful: it explains the why (interrupted setup), the what (durable signal), and the consequence of not having the gate (rollback trap). This is what doc comments in shell scripts should look like.
Summary and Recommended Actions
Verdict: Ready to merge.
- Blockers: none
- Non-blockers: none
- Nice-to-haves: run the end-to-end VPS smoke from item 5 of the test plan to confirm the originally-reported transcript is gone in practice; consider tightening the source-grep regression (test d) to assert on a behavior rather than a literal string at some future cleanup.
Recommended merge commit message
install.sh: don't halt re-runs after interrupted setup
A Ctrl+C out of `aimx setup` left `/usr/local/bin/aimx` on disk
without a systemd unit / OpenRC init script. Re-running the
installer then took the upgrade branch and died on
`systemctl start aimx` ("Unit aimx.service not found"), so every
re-run rolled itself back and trapped the operator.
Add a `setup_completed` helper that probes
`/etc/systemd/system/aimx.service` and `/etc/init.d/aimx` (the
files `aimx setup` writes as its final step). Gate the
upgrade-vs-fresh decision on it: with no unit file on disk the
script falls through to the existing fresh-install path, which
re-lays the binary at the target tag and `exec`s `aimx setup`
cleanly. Genuine upgrades (unit installed, version diff) are
unchanged.
Tests cover no-unit, systemd-unit-present, openrc-script-present,
and a source-grep regression on the gate (114/114 pass).
- Promote the `🐦⬛ AIMX setup — complete` banner so it sits between Step 5 and the Step 6 handoff, and stop reprinting the checklist underneath it (the post-Step-5 list already shows the same state with Step 6 as ☐, so the second one was noise). - Replace the single-line `→ aimx agents setup` callout with a labeled, boxed callout under a sentence directing the operator to run the agents guided setup as a regular user (not root). The box geometry is locked in by a new equal-width assertion on top border / body row / bottom border. The body row hard-codes `→` (one column) rather than calling `term::prompt_mark` so the borders stay aligned under `NO_COLOR` / non-TTY where prompt_mark would otherwise widen to `[>]`. - Rewrite the closing capability line: drop the previous wording (which claimed agents already had access) for "AIMX is now running at @<domain> and ready to send and receive email." — accurate at the moment Step 6 runs (daemon up, MCP not yet wired). Follow-up reads "Once you've wired your MCP to your LLM, try asking your agent, e.g." so the example prompt is framed as post-wiring. - Prefix `AIMX has been set up successfully.` with the same green ☑ glyph the checklist uses so it reads as the final box checked. - Tests: rework `mcp_section_lines_are_in_documented_order` around the boxed callout (new anchors + equal-width geometry asserts); rename `final_banner_signals_completion_and_renders_step_list` → `final_banner_signals_completion` and flip its `print_step_list` assertion to absence; refresh the closing-message asserts and add `closing_message_does_not_overclaim_agent_capability` as a regression guard (using a split-literal trick so the assertion doesn't self-trigger via `include_str!`). - book/setup.md: refresh the Step 6 description and the example closing-message block to match.
uzyn
left a comment
There was a problem hiding this comment.
Changes Overview
This PR bundles two independent first-run UX fixes.
1. install.sh — adds a setup_completed probe (presence of /etc/systemd/system/aimx.service or /etc/init.d/aimx, overridable via AIMX_SETUP_COMPLETED_SYSTEMD / AIMX_SETUP_COMPLETED_OPENRC for tests) and gates the upgrade-vs-fresh decision on it. With no unit file on disk, the script falls through to the existing fresh-install path instead of taking the upgrade branch and dying on systemctl start aimx.
2. aimx setup Step 6 / closing handoff — promotes the 🐦⬛ AIMX setup — complete banner above the Step 6 section (dropping the second checklist beneath it), rewrites the aimx agents setup callout as a boxed call-to-action with hard-coded → to keep border alignment under NO_COLOR / non-TTY, prefixes "AIMX has been set up successfully." with the same green ☑ glyph the checklist uses, and rewrites the closing capability line so it no longer claims agents already have access (they don't until aimx agents setup runs).
Scope Alignment
Matches the PR description exactly. No scope creep.
Potential Bugs
None spotted. Verified independently:
install.sh
Already reviewed in the prior review thread on this PR and re-confirmed against this branch. The probe paths match the canonical install locations (/etc/systemd/system/aimx.service from SystemOpsImpl::install_service_file, /etc/init.d/aimx). set -eu interactions are safe. _skip_swap / _is_upgrade stay unset on the new fallthrough path, so the unguarded systemctl start aimx block in the upgrade branch is unreachable when the gate fires — which is the entire point of the fix.
src/setup.rs
print_final_banner()has exactly one production caller (run_setupat line 2680); the signature change from(&checklist)to()is internally consistent. The only other references are in the test source-grep, which still works.- The box geometry assertion in
mcp_section_lines_are_in_documented_orderis sound:top.chars().count() == body.chars().count() == bottom.chars().count() == 33. Verified against the literal strings — top is" " + "┌" + 29×"─" + "┐", body is" " + "│" + " → aimx agents setup " + "│", both 33 chars underchars().count().→and─are single Unicode codepoints with display width 1, so the assertion correctly catches misalignment. term::step_glyph(term::StepState::Done)exists and returnsColoredString, which implementsDisplay. The closing-line render is correct.- The split-literal trick in
closing_message_does_not_overclaim_agent_capability(["Your agents", " now have access"].concat()) is necessary: without it,include_str!("setup.rs")would self-trigger the assertion because the test source itself would contain the forbidden phrase verbatim. Subtle but correct.
Order of operations in run_setup (lines 2657–2686)
print_step_complete(5, &checklist)renders the running checklist with Step 6 still as ☐ Pending — matches the PR description's example.checklist.set(6, Handoff)mutates internal state.print_final_banner()prints just the headline (no checklist).section_header(6, ...),display_mcp_section(),print_closing_message(&domain)follow.
The two println!() calls (banner trailing blank + section_header leading blank) produce the two-blank gap between the banner and "Step 6:" shown in the PR description's render. Correct.
Security Issues
None. The two new env vars in install.sh are probe paths only — they cannot redirect what gets executed, only whether the script takes the upgrade branch. Same conclusion as the prior review.
Test Coverage
Strong. The new and reworked tests cover what changed:
install_sh.sh: four new assertions in thesetup_completed gateblock exercise no-unit, systemd-unit-present, openrc-unit-present, and a source-grep tripwire on the gate.mcp_section_lines_are_in_documented_order: reworked around the boxed callout with new anchors (To wire AIMX,┌,│,└) and equal-width geometry asserts on all three box rows. Catches both reordering and silent misalignment.final_banner_signals_completion(renamed fromfinal_banner_signals_completion_and_renders_step_list): flipped to assertprint_step_listis absent from the banner body — guards against a regression that would reprint the checklist under the banner.closing_message_carries_required_phrasing: refreshed for the new wording (AIMX is now running at,ready to send and receive email,Once you've wired your MCP to your LLM).closing_message_does_not_overclaim_agent_capability: new regression guard against re-introducing the old "Your agents now have access" wording.
Code Quality
One non-blocker observation:
checklist.set(6, term::StepState::Handoff) at line 2679 is now a dead state mutation. The checklist was last rendered by print_step_complete(5) at line 2658 (before the Handoff is set), and print_final_banner() no longer reprints it. Nothing after line 2679 reads the checklist value. The run_setup_step_six_uses_handoff_state test still passes because it source-greps for the literal term::StepState::Handoff, not for any observable behavior.
This is at most cosmetic — the assignment is harmless and the test guards against an architectural rollback (re-introducing the runuser drop-through). But the test's stated rationale ("so the closing checklist signals the operator-action follow-up") and book/setup.md's "Marked ⎘ Handoff on the checklist" claim are both slightly misleading: the operator never sees the Handoff glyph anywhere in the rendered output (the displayed checklist shows ☐). If you want to clean this up later, either drop the set call and convert the test to verify the absence of the runuser path more directly, or update the comments to clarify the Handoff state is now informational-only. Not blocking this PR.
Summary and Recommended Actions
Verdict: Ready to merge.
- Blockers: none
- Non-blockers: none
- Nice-to-haves:
checklist.set(6, Handoff)and its source-grep test now describe state the operator never sees; consider trimming or relabeling later (see Code Quality).- Two un-checked items in the test plan (end-to-end VPS smoke for the install.sh fix; TTY +
NO_COLORvisual smoke on the Step 6 box). Worth completing post-merge to confirm the originally-reported transcript is gone in practice and that the box stays aligned underNO_COLOR.
Recommended merge commit message
setup: don't halt install re-runs + clearer Step 6 handoff
install.sh: A Ctrl+C out of `aimx setup` left `/usr/local/bin/aimx`
on disk without a service unit, so re-running the installer took
the upgrade branch and bailed on `systemctl start aimx` ("Unit
not found"), trapping the operator. Gate the upgrade decision on
a new `setup_completed` probe (unit file present); fall through
to the fresh-install path otherwise, which re-lays the binary
and re-enters the wizard.
aimx setup: Promote the `🐦⬛ AIMX setup — complete` banner so it
sits between Step 5 and the Step 6 handoff, drop the redundant
second checklist underneath, and replace the single-line `aimx
agents setup` callout with a boxed call-to-action under a
sentence directing the operator to run it as a regular user
(box geometry is locked in by an equal-width assertion; the body
row hard-codes `→` so borders stay aligned under `NO_COLOR`).
Rewrite the closing capability line away from "agents now have
access" — at this point the daemon is up but no agent is wired
yet — to "AIMX is now running at @<domain> and ready to send and
receive email." Prefix the success banner with the same green ☑
the checklist uses.
Summary
Two related fixes to the
aimx setupfirst-run experience.1.
install.sh: don't halt re-runs after interrupted setupWhen a user Ctrl+Cs out of
aimx setupbefore its final step,/usr/local/bin/aimxis on disk but no systemd unit / OpenRC init script was written. Re-runningcurl -fsSL https://aimx.email/install.sh | shthen hit the upgrade branch and bailed fatally onsystemctl start aimx("Unit aimx.service not found"), trapping the operator — every re-run rolled itself back.A new
setup_completedhelper probes/etc/systemd/system/aimx.serviceand/etc/init.d/aimx(overridable viaAIMX_SETUP_COMPLETED_SYSTEMD/AIMX_SETUP_COMPLETED_OPENRCfor tests). The upgrade-vs-fresh decision is gated on it: if neither unit is on disk the script falls through to the existing fresh-install path, which re-lays the binary at the target tag andexecsaimx setupcleanly. Genuine upgrades (unit installed, version diff) are unchanged.Before:
After:
2.
aimx setupclosing handoff is clearer🐦⬛ AIMX setup — completebanner so it sits between Step 5 and the Step 6 handoff, and drops the second checklist underneath it (the post-Step-5 checklist already shows the same state).→ aimx agents setupcallout with a boxed callout under a sentence telling the operator to run the agents guided setup as a regular user (not root). Box geometry is locked in by an equal-width assertion across top border / body row / bottom border, and the body row hard-codes→so the borders stay aligned underNO_COLOR/ non-TTY whereprompt_markwould otherwise widen to[>].Rendered (TTY,
sg.aimx.clickexample domain):Test plan
install.shsh -n install.shshellcheck install.shshellcheck -s sh tests/install_sh.shsh tests/install_sh.sh— 114/114 pass (4 new assertions in the# setup_completed gatesection).AIMX_DRY_RUN=1 AIMX_SETUP_COMPLETED_SYSTEMD=/nonexistent AIMX_SETUP_COMPLETED_OPENRC=/nonexistent sh install.sh --tag 0.0.9.1prints "continuing setup" + fresh-install dry-run lines.sh install.sh --tag 99.99.99still printsupgrading X -> 99.99.99/would stop aimx.service, swap binary, restart (upgrade).systemctl startfailure.aimx setupclosing handoffcargo test— 1149 unit + 116 integration tests pass.cargo clippy -- -D warningsclean.cargo fmt -- --checkclean.mcp_section_lines_are_in_documented_order,mcp_section_recommends_running_aimx_agents_setup,mcp_section_lists_every_registered_agent,final_banner_signals_completion,closing_message_carries_required_phrasing,closing_message_does_not_overclaim_agent_capability,run_setup_step_six_uses_handoff_state,run_setup_emits_section_headers_in_spec_order.NO_COLOR=1smoke: confirm the box borders stay aligned (the body row uses literal→, notprompt_mark's[>]fallback).