refactor: re-order code to suit for multi-frontend structure#197
refactor: re-order code to suit for multi-frontend structure#197
Conversation
|
Sorry for bringing big PRs because we have jet lag 😭. I have to deal with conflicts and rebases if using github stacked PRs. For each time before my PR, I will run all tests and make Opus4.5 check my code to make it robust and stable. However, our project will be more and more stable and later we won't need such big PRs. |
There was a problem hiding this comment.
Pull request overview
This PR centralizes daemon lifecycle operations in a shared IPC batch module, updates the CLI and TUI to use these batch operations, and tightens readiness/error handling across the supervisor, IPC layer, and tests.
Changes:
- Introduces
ipc::batchwith high-levelstart_daemons,stop_daemons, andrun_adhocoperations using dependency resolution, and refactorsIpcClientto return structuredRunResult/StartResult/StopResult. - Updates CLI commands (
start,stop,restart,run) and the TUI to use the new batch APIs, improves daemon status modeling (ErroredvsErroredUnknown), and propagates this to the web and TUI views. - Hardens IPC server handling of malformed messages, adds an explicit
Invalidrequest path, and stabilizes HTTP/port readiness e2e tests by cleaning stray processes.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_e2e.rs | Ensures HTTP/port readiness tests clean up stray processes on fixed ports before starting new servers. |
| tests/common/mod.rs | Adds TestEnv::kill_port helper (Unix-only) using lsof/kill and a short delay to free ports between tests. |
| src/web/routes/logs.rs | Refactors logs page tab/button/content generation to use inline-format strings with safe_id/js_id/url_id, keeping SSE clear-handling wired correctly. |
| src/web/routes/index.rs | Updates daemon row status mapping to treat both Errored(_) and ErroredUnknown as "errored". |
| src/web/routes/daemons.rs | Mirrors status mapping update, makes daemon names clickable links inside a clickable row, and slightly adjusts table body markup. |
| src/tui/ui.rs | Adjusts TUI status rendering to show errored (code) for Errored(i32) and a generic errored label for ErroredUnknown. |
| src/tui/mod.rs | Wraps IpcClient in an Arc, switches TUI actions to use start_daemons/stop_daemons, and improves user messages for single/batch start/stop/restart flows. |
| src/tui/app.rs | Changes refresh to take &Arc<IpcClient>, updates status sort order for the new error states, and removes the embedded start_daemon helper in favor of IPC batch logic. |
| src/supervisor/lifecycle.rs | On process exit, distinguishes between Errored(code) and ErroredUnknown, and updates daemon state accordingly for retry logic and UI display. |
| src/supervisor/ipc_handlers.rs | Adds an IpcRequest::Invalid branch that logs and returns a structured IpcResponse::Error for malformed IPC messages. |
| src/ipc/server.rs | Downgrades socket read errors to debug! and, on deserialization failure, constructs an Invalid request so the supervisor can reply with an error instead of silently dropping. |
| src/ipc/mod.rs | Registers the new batch module, and extends IpcRequest with an internal-only Invalid { error: String } variant (skipped by serde). |
| src/ipc/client.rs | Exposes low-level request helpers as pub(crate), refactors run to return a RunResult, and changes stop to return bool while normalizing and improving log messages. |
| src/ipc/batch.rs | New module implementing high-level start_daemons, stop_daemons, and run_adhoc with dependency resolution, disabled-daemon filtering, readiness overrides, and parallelism within dependency levels. |
| src/daemon_status.rs | Changes DaemonStatus::Errored from Option<i32> to Errored(i32) plus ErroredUnknown, updates styling/error-message helpers, and adds TOML roundtrip tests for all status variants. |
| src/cli/stop.rs | Simplifies stop to call IpcClient::stop_daemons, exiting with code 1 if any daemon fails to stop. |
| src/cli/start.rs | Rewrites start to construct StartOptions and delegate all daemon/dependency logic to start_daemons, then prints startup logs using the returned StartResult. |
| src/cli/run.rs | Uses StartOptions and run_adhoc to manage one-off daemons, only printing startup logs when the daemon actually started and exiting non-zero if exit_code is present. |
| src/cli/restart.rs | Refactors restart into a thin wrapper around start_daemons with force = true and new readiness flags; documentation now describes it as equivalent to start --force with dependency resolution. |
| pitchfork.usage.kdl | Updates the restart command usage/help, adding readiness flags and the new long description tying behavior to start --force. |
| docs/cli/restart.md | Regenerated restart CLI docs with [FLAGS] usage, updated description, examples, and readiness flag sections. |
| docs/cli/index.md | Updates the CLI index entry for restart to use the generic [FLAGS] [ID]… usage string. |
| docs/cli/commands.json | Regenerated CLI command metadata for restart, including new flags and updated long help. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| cmd restart help="Restarts a daemon (stops then starts it)" { | ||
| long_help "Restarts a daemon (stops then starts it)\n\nSends SIGTERM to stop the daemon, then starts it again from the\npitchfork.toml configuration.\n\nExamples:\n pitchfork restart api Restart a single daemon\n pitchfork restart api worker Restart multiple daemons\n pitchfork restart --all Restart all running daemons" | ||
| long_help "Restarts a daemon (stops then starts it)\n\nEquivalent to 'start --force' - stops the daemon (SIGTERM) then starts it again\nfrom the pitchfork.toml configuration with dependency resolution.\n\nExamples:\n pitchfork restart api Restart a single daemon\n pitchfork restart api worker Restart multiple daemons\n pitchfork restart --all Restart all running daemons\n pitchfork restart api --delay 5 Wait 5 seconds for daemon to be ready" | ||
| flag "-a --all" help="Restart all running daemons" | ||
| flag --delay help="Delay in seconds before considering daemon ready (default: 3 seconds)" { | ||
| arg <DELAY> | ||
| } | ||
| flag --output help="Wait until output matches this regex pattern before considering daemon ready" { | ||
| arg <OUTPUT> | ||
| } | ||
| flag --http help="Wait until HTTP endpoint returns 2xx status before considering daemon ready" { | ||
| arg <HTTP> | ||
| } | ||
| flag --port help="Wait until TCP port is listening before considering daemon ready" { | ||
| arg <PORT> | ||
| } | ||
| flag --cmd help="Shell command to poll for readiness (exit code 0 = ready)" { | ||
| arg <CMD> | ||
| } | ||
| flag "-q --quiet" help="Suppress startup log output" |
There was a problem hiding this comment.
Here the long_help and -a/--all flag help both describe restart --all as restarting "all running daemons", but the new implementation in src/cli/restart.rs calls start_daemons(&self.id, self.all, opts) which, for all = true, operates on all configured non-disabled daemons (starting stopped ones as well). Please either update this spec text to reflect the actual behavior (e.g. describe it as equivalent to start --all --force), or adjust the implementation so that --all only targets currently running daemons as documented.
| pitchfork restart --all Restart all running daemons | ||
| pitchfork restart api --delay 5 Wait 5 seconds for daemon to be ready" | ||
| )] | ||
| pub struct Restart { | ||
| /// ID of the daemon(s) to restart | ||
| id: Vec<String>, | ||
| /// Restart all running daemons | ||
| #[clap(long, short)] | ||
| all: bool, |
There was a problem hiding this comment.
The new restart implementation delegates to start_daemons(&self.id, self.all, opts), and when all is true start_daemons uses pt.daemons.keys() (all configured daemons) regardless of whether they are currently running. This means pitchfork restart --all will also start daemons that were stopped, while the help text and field doc still say "Restart all running daemons". To avoid confusing behavior, either adjust the --all semantics to only target currently running daemons, or update the help/long_about text (and usage/docs) to clarify that --all affects all configured non-disabled daemons, not just those already running.
|
bugbot run |
|
bugbot run verbose=true |
|
Bugbot request id: serverGenReqId_11097ac4-01c3-48c1-b303-db8da92483f3 |
|
The refactor into One issue worth fixing before merge: This comment was AI-generated. |
@jdx I need your idea about
|
|
I think it should restart the running daemons, similar to |
|
it might be wise to not have |
|
or maybe |
|
In this PR we can implement Later we can consider about how to replace
|
fa3bd2c to
174dfe9
Compare
|
The refactor looks solid overall. The shared The A few things worth noting:
Other improvements are clean: This comment was AI-generated. |
174dfe9 to
44c0618
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Compute daemon IDs to restart | ||
| // For --all, only restart currently running daemons | ||
| let ids: Vec<String> = if self.all { | ||
| // Get all running daemons that are in config | ||
| // (ad-hoc daemons started via `pitchfork run` cannot be restarted from config) | ||
| let active = ipc.active_daemons().await?; | ||
| active | ||
| .into_iter() | ||
| .filter(|d| d.pid.is_some() && pt.daemons.contains_key(&d.id)) | ||
| .map(|d| d.id) | ||
| .collect() | ||
| ipc.get_running_daemons().await? | ||
| } else { |
There was a problem hiding this comment.
Using ipc.get_running_daemons() when --all is set means ids can include ad-hoc daemons that are not present in pitchfork.toml, but start_daemons immediately calls resolve_dependencies(&requested_ids, &pt.daemons)?, which will error out with DaemonNotFound for those IDs instead of skipping them. Previously, restart --all filtered to only config-backed daemons and ignored ad-hoc ones; if you want to preserve that behavior, consider filtering ids here to only those present in the merged config (or otherwise excluding ad-hoc daemons) before calling start_daemons.
jdx
left a comment
There was a problem hiding this comment.
Code Review (AI-generated)
This is a well-motivated refactoring. Extracting the batch start/stop/restart logic from CLI commands into ipc::batch is the right move for sharing across CLI, TUI, and Web UI. The CLI commands are dramatically simplified as a result. A few observations:
1. restart lost validation of ad-hoc daemons
The old restart code validated that each daemon existed in config before stopping anything — preventing you from stopping an ad-hoc daemon that can't be restarted. The new code passes self.id.clone() directly to start_daemons, which will filter disabled daemons but doesn't check if a daemon is ad-hoc/config-based. If someone runs pitchfork restart <adhoc-daemon>, the force-start path will try to stop it, then fail to find it in config. The daemon gets stopped but never restarted.
The --all path uses get_running_daemons() which also includes ad-hoc daemons — the old code filtered to only pt.daemons.contains_key().
2. stop lost reverse-dependency ordering for specific daemons
The old stop_specific didn't do dependency resolution, but stop_all did. The new stop always calls stop_daemons which does dependency resolution. This is arguably better behavior, but it's a behavioral change worth noting — pitchfork stop foo will now also stop daemons that depend on foo (if foo is their dependency). Actually, looking more closely, stop_daemons only resolves the ordering of the requested IDs, not their dependents. So this should be fine, but the old stop_specific behavior (just stop what was asked, no dependency resolution at all) was simpler.
3. Errored(Option<i32>) → Errored(i32) + ErroredUnknown is a serialization-breaking change
The DaemonStatus enum is serialized to the state file (state.toml). Changing Errored(Option<i32>) to Errored(i32) / ErroredUnknown means existing state files with Errored(None) won't deserialize correctly after upgrade. The roundtrip test covers new values but not migration from old format. This could cause issues on upgrade.
4. IpcRequest::Invalid with #[serde(skip)] is well done
Using #[serde(skip)] for the Invalid variant and constructing it on deserialization failure is a clean pattern — it lets the handler respond with an error instead of silently dropping the connection.
5. Process termination verification loop is good but hardcoded
The new loop in lifecycle.rs that waits for a process to fully terminate (up to 500ms with 50ms intervals) addresses a real race condition. However the 500ms max wait might be tight for processes that take longer to release resources. Consider making this configurable or at least logging a warning if the loop exhausts without the process terminating.
6. build_run_options returns Result<_, String> instead of miette::Result
The build_run_options helper returns std::result::Result<RunOptions, String> while the rest of the codebase uses miette::Result. This makes the Web UI callsite work (since it formats into HTML) but is inconsistent. The String error type loses diagnostic context.
7. get_all_configured_daemons parses all TOML files every call
IpcClient::get_all_configured_daemons() calls PitchforkToml::all_merged() which presumably re-reads and merges all config files. In start_daemons, PitchforkToml::all_merged() is called again. So when start --all is used, the config is parsed twice. Minor, but could be passed through.
8. TUI std::slice::from_ref(&id) for single daemon start
Using std::slice::from_ref(&id) to wrap a single String into a slice for start_daemons works but is a bit unusual. A small start_daemon convenience method on IpcClient that wraps the single-daemon case would be cleaner.
9. kill_port test helper uses kill -9
The new kill_port helper force-kills with SIGKILL. This works for test cleanup but is aggressive — if the process is in the middle of something, it won't clean up. For test stability this is probably fine, just noting it.
Overall this is a solid refactoring that achieves its goal of making batch operations reusable. The main concern is the behavioral change around restart validation of ad-hoc daemons (point 1) and the state file serialization compatibility (point 3).
|
@jdx addressed problem 1, 3, and 5. Note that in problem 1 we save the command of the ad-hoc daemons and use the saved command to restart. This allows an adhoc daemon to be restart and brings less confusion to user. In problem 3 I think |
42d7e8c to
cb4220a
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
3da4510 to
50a2d59
Compare
## 🤖 New release * `pitchfork-cli`: 1.2.0 -> 1.3.0 <details><summary><i><b>Changelog</b></i></summary><p> <blockquote> ## [1.3.0](v1.2.0...v1.3.0) - 2026-02-01 ### Added - *(list)* show available daemons and align logics with TUI ([#206](#206)) - *(logs)* support `--since <humantime>`, use pager by default ([#204](#204)) - support `pitchfork.local.toml` ([#198](#198)) - impl `stop --all` ([#195](#195)) - beautify web ui ([#191](#191)) - add ready_cmd option ([#187](#187)) ### Fixed - refactor the logic of stopping a daemon and add tests ([#192](#192)) ### Other - re-order code to suit for multi-frontend structure ([#197](#197)) - *(deps)* update rust crate xx to v2.3.1 ([#203](#203)) - *(deps)* update rust crate clap to v4.5.56 ([#202](#202)) - *(ci)* run linting on all files in CI ([#196](#196)) - Update README.md logo ([#184](#184)) </blockquote> </p></details> --- This PR was generated with [release-plz](https://github.com/release-plz/release-plz/). <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Low Risk** > Low risk: this PR only updates version numbers and release notes/docs, with no runtime code changes. > > **Overview** > Releases **v1.3.0** by bumping the crate version from `1.2.0` to `1.3.0` (including `Cargo.lock`) and updating the `pitchfork.usage.kdl`/generated CLI docs to match. > > Updates `CHANGELOG.md` with the `1.3.0` release notes and links to the included feature/fix PRs. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit d278b90. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Refactor IPC client to use batch operations and improve TUI daemon management
It's a big PR and includes these refactors:
restartcommand and make its behaviour just likestart --force, considering dependency order and parallel starting