Skip to content

refactor: re-order code to suit for multi-frontend structure#197

Merged
jdx merged 4 commits intojdx:mainfrom
gaojunran:refactor-multi-frontend
Jan 31, 2026
Merged

refactor: re-order code to suit for multi-frontend structure#197
jdx merged 4 commits intojdx:mainfrom
gaojunran:refactor-multi-frontend

Conversation

@gaojunran
Copy link
Contributor

@gaojunran gaojunran commented Jan 29, 2026

Refactor IPC client to use batch operations and improve TUI daemon management

It's a big PR and includes these refactors:

  1. Move batch operations (start/stop daemons) out of CLI interface and these operations can be shared in TUI/Web UI simply.
  2. Improve error handling
  3. Improve stability for e2e tests (http/port ready check)
  4. Refactor restart command and make its behaviour just like start --force, considering dependency order and parallel starting

Copilot AI review requested due to automatic review settings January 29, 2026 06:41
@gaojunran
Copy link
Contributor Author

gaojunran commented Jan 29, 2026

Sorry for bringing big PRs because we have jet lag 😭. I have to deal with conflicts and rebases if using github stacked PRs.

For each time before my PR, I will run all tests and make Opus4.5 check my code to make it robust and stable.

However, our project will be more and more stable and later we won't need such big PRs.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR centralizes daemon lifecycle operations in a shared IPC batch module, updates the CLI and TUI to use these batch operations, and tightens readiness/error handling across the supervisor, IPC layer, and tests.

Changes:

  • Introduces ipc::batch with high-level start_daemons, stop_daemons, and run_adhoc operations using dependency resolution, and refactors IpcClient to return structured RunResult/StartResult/StopResult.
  • Updates CLI commands (start, stop, restart, run) and the TUI to use the new batch APIs, improves daemon status modeling (Errored vs ErroredUnknown), and propagates this to the web and TUI views.
  • Hardens IPC server handling of malformed messages, adds an explicit Invalid request path, and stabilizes HTTP/port readiness e2e tests by cleaning stray processes.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tests/test_e2e.rs Ensures HTTP/port readiness tests clean up stray processes on fixed ports before starting new servers.
tests/common/mod.rs Adds TestEnv::kill_port helper (Unix-only) using lsof/kill and a short delay to free ports between tests.
src/web/routes/logs.rs Refactors logs page tab/button/content generation to use inline-format strings with safe_id/js_id/url_id, keeping SSE clear-handling wired correctly.
src/web/routes/index.rs Updates daemon row status mapping to treat both Errored(_) and ErroredUnknown as "errored".
src/web/routes/daemons.rs Mirrors status mapping update, makes daemon names clickable links inside a clickable row, and slightly adjusts table body markup.
src/tui/ui.rs Adjusts TUI status rendering to show errored (code) for Errored(i32) and a generic errored label for ErroredUnknown.
src/tui/mod.rs Wraps IpcClient in an Arc, switches TUI actions to use start_daemons/stop_daemons, and improves user messages for single/batch start/stop/restart flows.
src/tui/app.rs Changes refresh to take &Arc<IpcClient>, updates status sort order for the new error states, and removes the embedded start_daemon helper in favor of IPC batch logic.
src/supervisor/lifecycle.rs On process exit, distinguishes between Errored(code) and ErroredUnknown, and updates daemon state accordingly for retry logic and UI display.
src/supervisor/ipc_handlers.rs Adds an IpcRequest::Invalid branch that logs and returns a structured IpcResponse::Error for malformed IPC messages.
src/ipc/server.rs Downgrades socket read errors to debug! and, on deserialization failure, constructs an Invalid request so the supervisor can reply with an error instead of silently dropping.
src/ipc/mod.rs Registers the new batch module, and extends IpcRequest with an internal-only Invalid { error: String } variant (skipped by serde).
src/ipc/client.rs Exposes low-level request helpers as pub(crate), refactors run to return a RunResult, and changes stop to return bool while normalizing and improving log messages.
src/ipc/batch.rs New module implementing high-level start_daemons, stop_daemons, and run_adhoc with dependency resolution, disabled-daemon filtering, readiness overrides, and parallelism within dependency levels.
src/daemon_status.rs Changes DaemonStatus::Errored from Option<i32> to Errored(i32) plus ErroredUnknown, updates styling/error-message helpers, and adds TOML roundtrip tests for all status variants.
src/cli/stop.rs Simplifies stop to call IpcClient::stop_daemons, exiting with code 1 if any daemon fails to stop.
src/cli/start.rs Rewrites start to construct StartOptions and delegate all daemon/dependency logic to start_daemons, then prints startup logs using the returned StartResult.
src/cli/run.rs Uses StartOptions and run_adhoc to manage one-off daemons, only printing startup logs when the daemon actually started and exiting non-zero if exit_code is present.
src/cli/restart.rs Refactors restart into a thin wrapper around start_daemons with force = true and new readiness flags; documentation now describes it as equivalent to start --force with dependency resolution.
pitchfork.usage.kdl Updates the restart command usage/help, adding readiness flags and the new long description tying behavior to start --force.
docs/cli/restart.md Regenerated restart CLI docs with [FLAGS] usage, updated description, examples, and readiness flag sections.
docs/cli/index.md Updates the CLI index entry for restart to use the generic [FLAGS] [ID]… usage string.
docs/cli/commands.json Regenerated CLI command metadata for restart, including new flags and updated long help.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 82 to 100
cmd restart help="Restarts a daemon (stops then starts it)" {
long_help "Restarts a daemon (stops then starts it)\n\nSends SIGTERM to stop the daemon, then starts it again from the\npitchfork.toml configuration.\n\nExamples:\n pitchfork restart api Restart a single daemon\n pitchfork restart api worker Restart multiple daemons\n pitchfork restart --all Restart all running daemons"
long_help "Restarts a daemon (stops then starts it)\n\nEquivalent to 'start --force' - stops the daemon (SIGTERM) then starts it again\nfrom the pitchfork.toml configuration with dependency resolution.\n\nExamples:\n pitchfork restart api Restart a single daemon\n pitchfork restart api worker Restart multiple daemons\n pitchfork restart --all Restart all running daemons\n pitchfork restart api --delay 5 Wait 5 seconds for daemon to be ready"
flag "-a --all" help="Restart all running daemons"
flag --delay help="Delay in seconds before considering daemon ready (default: 3 seconds)" {
arg <DELAY>
}
flag --output help="Wait until output matches this regex pattern before considering daemon ready" {
arg <OUTPUT>
}
flag --http help="Wait until HTTP endpoint returns 2xx status before considering daemon ready" {
arg <HTTP>
}
flag --port help="Wait until TCP port is listening before considering daemon ready" {
arg <PORT>
}
flag --cmd help="Shell command to poll for readiness (exit code 0 = ready)" {
arg <CMD>
}
flag "-q --quiet" help="Suppress startup log output"
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the long_help and -a/--all flag help both describe restart --all as restarting "all running daemons", but the new implementation in src/cli/restart.rs calls start_daemons(&self.id, self.all, opts) which, for all = true, operates on all configured non-disabled daemons (starting stopped ones as well). Please either update this spec text to reflect the actual behavior (e.g. describe it as equivalent to start --all --force), or adjust the implementation so that --all only targets currently running daemons as documented.

Copilot uses AI. Check for mistakes.
Comment on lines +21 to 29
pitchfork restart --all Restart all running daemons
pitchfork restart api --delay 5 Wait 5 seconds for daemon to be ready"
)]
pub struct Restart {
/// ID of the daemon(s) to restart
id: Vec<String>,
/// Restart all running daemons
#[clap(long, short)]
all: bool,
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new restart implementation delegates to start_daemons(&self.id, self.all, opts), and when all is true start_daemons uses pt.daemons.keys() (all configured daemons) regardless of whether they are currently running. This means pitchfork restart --all will also start daemons that were stopped, while the help text and field doc still say "Restart all running daemons". To avoid confusing behavior, either adjust the --all semantics to only target currently running daemons, or update the help/long_about text (and usage/docs) to clarify that --all affects all configured non-disabled daemons, not just those already running.

Copilot uses AI. Check for mistakes.
@jdx
Copy link
Owner

jdx commented Jan 29, 2026

bugbot run

@jdx
Copy link
Owner

jdx commented Jan 29, 2026

bugbot run verbose=true

@cursor
Copy link

cursor bot commented Jan 29, 2026

Bugbot request id: serverGenReqId_11097ac4-01c3-48c1-b303-db8da92483f3

@jdx
Copy link
Owner

jdx commented Jan 29, 2026

The refactor into ipc/batch.rs looks good overall — sharing batch operations across CLI/TUI/Web is a clean approach, and the dependency-level parallel execution is well structured.

One issue worth fixing before merge: restart --all semantics mismatch. The help text says "Restart all running daemons", but start_daemons with all=true uses pt.daemons.keys() (all configured daemons), meaning it will also start daemons that were previously stopped. stop_daemons correctly filters to running daemons, so the inconsistency is specifically in the restartstart_daemons path. Either filter to running daemons or update the docs to reflect the actual behavior.

This comment was AI-generated.

@gaojunran
Copy link
Contributor Author

One issue worth fixing before merge: restart --all semantics mismatch. The help text says "Restart all running daemons", but start_daemons with all=true uses pt.daemons.keys() (all configured daemons), meaning it will also start daemons that were previously stopped. stop_daemons correctly filters to running daemons, so the inconsistency is specifically in the restartstart_daemons path. Either filter to running daemons or update the docs to reflect the actual behavior.

@jdx I need your idea about restart --all. Should it be

  • the same as start --all --force, to restart all the configured daemons regardless of the status (this will make it clear that restart is JUST start --force)
    OR
  • just restart the running daemons (this seems to be more reasonable? not sure)

@jdx
Copy link
Owner

jdx commented Jan 29, 2026

I think it should restart the running daemons, similar to pf stop --all

@jdx
Copy link
Owner

jdx commented Jan 29, 2026

it might be wise to not have --all on start for this reason and have something like --all-configured or something to make it clear

@jdx
Copy link
Owner

jdx commented Jan 29, 2026

or maybe --all-local or --local

@gaojunran
Copy link
Contributor Author

gaojunran commented Jan 30, 2026

In this PR we can implement restart --all in the way like stop --all, just restarting the running daemons.

Later we can consider about how to replace start --all. My considerations:

  • Is the name local in --all-local a little misleading because I think we will finally need pitchfork.local.toml which should be git-ignored?
  • We need --all-local / -l(or an alternative name) and --all-global / -g

@gaojunran gaojunran force-pushed the refactor-multi-frontend branch from fa3bd2c to 174dfe9 Compare January 30, 2026 04:40
@jdx
Copy link
Owner

jdx commented Jan 30, 2026

The refactor looks solid overall. The shared batch.rs module significantly reduces duplication — start.rs, stop.rs, and restart.rs all drop to ~15-40 lines each. The TUI now goes through the same code paths as CLI, which fixes prior inconsistencies (e.g., wait_ready: false, std::env::current_dir() vs env::CWD).

The restart --all semantics are now correct — it fetches running daemon IDs via get_running_daemons() before passing them to start_daemons with force: true, so it won't start previously-stopped daemons.

A few things worth noting:

  1. restart no longer does an explicit stop-then-sleep-then-start. The old code did stopsleep(500ms)start. Now it relies on the supervisor's force-restart path via force: true. Should work, but worth verifying there are no race conditions where the old process hasn't fully exited before the new one spawns.

  2. stop <specific daemons> now always uses dependency resolution. Previously only stop --all did reverse-dependency ordering. The new stop_daemons in batch.rs always resolves dependencies for config-based daemons. This is arguably better behavior but is a change from before.

  3. Web UI still uses old patterns. The web routes weren't updated to use the batch module. Not a blocker, but if the goal is shared frontend logic, the web UI should eventually use start_daemons/stop_daemons too.

Other improvements are clean: DaemonStatus::Errored split, RunResult struct replacing the tuple return, invalid IPC request handling, inline format args in web templates.

This comment was AI-generated.

Copilot AI review requested due to automatic review settings January 31, 2026 06:44
@gaojunran gaojunran force-pushed the refactor-multi-frontend branch from 174dfe9 to 44c0618 Compare January 31, 2026 06:44
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +59 to 63
// Compute daemon IDs to restart
// For --all, only restart currently running daemons
let ids: Vec<String> = if self.all {
// Get all running daemons that are in config
// (ad-hoc daemons started via `pitchfork run` cannot be restarted from config)
let active = ipc.active_daemons().await?;
active
.into_iter()
.filter(|d| d.pid.is_some() && pt.daemons.contains_key(&d.id))
.map(|d| d.id)
.collect()
ipc.get_running_daemons().await?
} else {
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using ipc.get_running_daemons() when --all is set means ids can include ad-hoc daemons that are not present in pitchfork.toml, but start_daemons immediately calls resolve_dependencies(&requested_ids, &pt.daemons)?, which will error out with DaemonNotFound for those IDs instead of skipping them. Previously, restart --all filtered to only config-backed daemons and ignored ad-hoc ones; if you want to preserve that behavior, consider filtering ids here to only those present in the merged config (or otherwise excluding ad-hoc daemons) before calling start_daemons.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Owner

@jdx jdx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review (AI-generated)

This is a well-motivated refactoring. Extracting the batch start/stop/restart logic from CLI commands into ipc::batch is the right move for sharing across CLI, TUI, and Web UI. The CLI commands are dramatically simplified as a result. A few observations:

1. restart lost validation of ad-hoc daemons

The old restart code validated that each daemon existed in config before stopping anything — preventing you from stopping an ad-hoc daemon that can't be restarted. The new code passes self.id.clone() directly to start_daemons, which will filter disabled daemons but doesn't check if a daemon is ad-hoc/config-based. If someone runs pitchfork restart <adhoc-daemon>, the force-start path will try to stop it, then fail to find it in config. The daemon gets stopped but never restarted.

The --all path uses get_running_daemons() which also includes ad-hoc daemons — the old code filtered to only pt.daemons.contains_key().

2. stop lost reverse-dependency ordering for specific daemons

The old stop_specific didn't do dependency resolution, but stop_all did. The new stop always calls stop_daemons which does dependency resolution. This is arguably better behavior, but it's a behavioral change worth noting — pitchfork stop foo will now also stop daemons that depend on foo (if foo is their dependency). Actually, looking more closely, stop_daemons only resolves the ordering of the requested IDs, not their dependents. So this should be fine, but the old stop_specific behavior (just stop what was asked, no dependency resolution at all) was simpler.

3. Errored(Option<i32>)Errored(i32) + ErroredUnknown is a serialization-breaking change

The DaemonStatus enum is serialized to the state file (state.toml). Changing Errored(Option<i32>) to Errored(i32) / ErroredUnknown means existing state files with Errored(None) won't deserialize correctly after upgrade. The roundtrip test covers new values but not migration from old format. This could cause issues on upgrade.

4. IpcRequest::Invalid with #[serde(skip)] is well done

Using #[serde(skip)] for the Invalid variant and constructing it on deserialization failure is a clean pattern — it lets the handler respond with an error instead of silently dropping the connection.

5. Process termination verification loop is good but hardcoded

The new loop in lifecycle.rs that waits for a process to fully terminate (up to 500ms with 50ms intervals) addresses a real race condition. However the 500ms max wait might be tight for processes that take longer to release resources. Consider making this configurable or at least logging a warning if the loop exhausts without the process terminating.

6. build_run_options returns Result<_, String> instead of miette::Result

The build_run_options helper returns std::result::Result<RunOptions, String> while the rest of the codebase uses miette::Result. This makes the Web UI callsite work (since it formats into HTML) but is inconsistent. The String error type loses diagnostic context.

7. get_all_configured_daemons parses all TOML files every call

IpcClient::get_all_configured_daemons() calls PitchforkToml::all_merged() which presumably re-reads and merges all config files. In start_daemons, PitchforkToml::all_merged() is called again. So when start --all is used, the config is parsed twice. Minor, but could be passed through.

8. TUI std::slice::from_ref(&id) for single daemon start

Using std::slice::from_ref(&id) to wrap a single String into a slice for start_daemons works but is a bit unusual. A small start_daemon convenience method on IpcClient that wraps the single-daemon case would be cleaner.

9. kill_port test helper uses kill -9

The new kill_port helper force-kills with SIGKILL. This works for test cleanup but is aggressive — if the process is in the middle of something, it won't clean up. For test stability this is probably fine, just noting it.


Overall this is a solid refactoring that achieves its goal of making batch operations reusable. The main concern is the behavioral change around restart validation of ad-hoc daemons (point 1) and the state file serialization compatibility (point 3).

@gaojunran
Copy link
Contributor Author

gaojunran commented Jan 31, 2026

@jdx addressed problem 1, 3, and 5. Note that in problem 1 we save the command of the ad-hoc daemons and use the saved command to restart. This allows an adhoc daemon to be restart and brings less confusion to user.

In problem 3 I think Errored<Option<i32>> is better than ErroredUnknown + Errored<i32>, so the logic is rollbacked

@gaojunran gaojunran force-pushed the refactor-multi-frontend branch from 42d7e8c to cb4220a Compare January 31, 2026 13:00
Copilot AI review requested due to automatic review settings January 31, 2026 13:00
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

gaojunran and others added 2 commits January 31, 2026 21:23
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@gaojunran gaojunran force-pushed the refactor-multi-frontend branch from 3da4510 to 50a2d59 Compare January 31, 2026 13:27
@jdx jdx merged commit 104521f into jdx:main Jan 31, 2026
3 checks passed
@jdx jdx mentioned this pull request Jan 30, 2026
jdx added a commit that referenced this pull request Feb 1, 2026
## 🤖 New release

* `pitchfork-cli`: 1.2.0 -> 1.3.0

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [1.3.0](v1.2.0...v1.3.0) -
2026-02-01

### Added

- *(list)* show available daemons and align logics with TUI
([#206](#206))
- *(logs)* support `--since <humantime>`, use pager by default
([#204](#204))
- support `pitchfork.local.toml`
([#198](#198))
- impl `stop --all` ([#195](#195))
- beautify web ui ([#191](#191))
- add ready_cmd option
([#187](#187))

### Fixed

- refactor the logic of stopping a daemon and add tests
([#192](#192))

### Other

- re-order code to suit for multi-frontend structure
([#197](#197))
- *(deps)* update rust crate xx to v2.3.1
([#203](#203))
- *(deps)* update rust crate clap to v4.5.56
([#202](#202))
- *(ci)* run linting on all files in CI
([#196](#196))
- Update README.md logo
([#184](#184))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Low risk: this PR only updates version numbers and release notes/docs,
with no runtime code changes.
> 
> **Overview**
> Releases **v1.3.0** by bumping the crate version from `1.2.0` to
`1.3.0` (including `Cargo.lock`) and updating the
`pitchfork.usage.kdl`/generated CLI docs to match.
> 
> Updates `CHANGELOG.md` with the `1.3.0` release notes and links to the
included feature/fix PRs.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
d278b90. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants