Skip to content

Windows support: CUDA detection, cross-platform justfile, clean server shutdown#272

Merged
jamiepine merged 5 commits intomainfrom
windows-support
Mar 15, 2026
Merged

Windows support: CUDA detection, cross-platform justfile, clean server shutdown#272
jamiepine merged 5 commits intomainfrom
windows-support

Conversation

@jamiepine
Copy link
Owner

@jamiepine jamiepine commented Mar 15, 2026

Summary

  • Justfile cross-platform support: All recipes now work on Windows (PowerShell) with [unix]/[windows] variants. just setup auto-installs CUDA PyTorch when an NVIDIA GPU is detected. Added build-local, build-server-cuda recipes. PyInstaller included in setup.
  • CUDA detection in dev mode: /health endpoint now reports backend_variant=cuda when torch.cuda.is_available() instead of hardcoding cpu. GPU Acceleration UI cleaned up — no redundant info, download button hidden when CUDA is already active.
  • Clean server shutdown: Server monitors parent PID via --parent-pid watchdog thread instead of Rust spawning taskkill/wmic console windows. --version moved before heavy imports for instant response.
  • CORS fix: Added http://tauri.localhost origin (Windows production builds use http, not https).
  • UI fixes: Hidden drag region on Windows (native titlebar), dev-mode update status, VRAM 0 rendering fix, pt-8 top padding on Windows.
  • Docs: Updated README and CONTRIBUTING for cross-platform just workflow.

Changes

File What
justfile Windows recipes for all commands, CUDA PyTorch auto-install, build-local/build-server-cuda, PyInstaller in setup
backend/server.py --parent-pid watchdog, --version fast path before torch imports
backend/main.py CORS http://tauri.localhost, backend_variant from torch.cuda.is_available()
tauri/.../main.rs Pass --parent-pid, remove taskkill/wmic/process tree kill code (~170 lines deleted)
README.md Platform table, build-local docs
CONTRIBUTING.md Unified setup flow, removed manual setup required for Windows

Summary by CodeRabbit

  • Documentation

    • Overhauled setup and build docs: platform prerequisites, streamlined Just-based workflow, Windows-specific notes, model download, OpenAPI/client generation, and local build instructions (including CUDA guidance).
  • New Features

    • Frontend-only dev command; Windows CUDA-enabled build and local-build targets; explicit local installer build guidance; new runtime toggle to disable parent-watchdog.
  • Bug Fixes

    • Smoother shutdown when parent process exits, improved Windows webview compatibility, and more robust start/stop flows with frontend handshake.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 15, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a parent-process watchdog propagated from the Tauri app to spawned backend processes, exposes a runtime API to disable the watchdog, simplifies Tauri shutdown flow, and replaces Makefile-based setup with a Just-centric, Windows/CUDA-aware dev and build workflow; PyInstaller build logic updated for CUDA DLL handling.

Changes

Cohort / File(s) Summary
Documentation & Setup
CONTRIBUTING.md, README.md
Replace Makefile-centric instructions with Just-based workflow, add platform/GPU notes, Windows CUDA guidance, model/download and OpenAPI generation steps, and new dev/build commands (just dev, just dev-frontend, just build, just build-local).
Backend Server & API
backend/server.py, backend/main.py
Add --parent-pid CLI arg, implement _start_parent_watchdog(parent_pid, ...) with cross-platform liveness checks and startup wiring; add disable_watchdog() API endpoint; early fast-path --version handling and extended CORS origins.
Tauri Runtime & Process Management
tauri/src-tauri/src/main.rs
Thread parent PID into server invocations (--parent-pid), remove Windows process-tree helpers, introduce guarded window-close handshake and rely on watchdog for shutdown; adjust stop/restart flows and keep-running handling.
Build, Packaging & Just targets
backend/build_binary.py, justfile
PyInstaller: add hidden imports, implement CUDA DLL discovery/exclusion and Windows runtime swap to install CPU-only torch for CPU builds (with restore), wrap build in try/finally; add Windows-aware Just targets (dev-frontend, build-server-cuda, build-local, db-init) and update process handling in Justfile.

Sequence Diagram(s)

sequenceDiagram
    participant Tauri as Tauri App
    participant Backend as Backend Server
    participant OS as Operating System

    Tauri->>Tauri: determine parent_pid (current PID)
    Tauri->>Backend: spawn backend with "--parent-pid <pid>"
    Backend->>Backend: parse CLI args, start watchdog thread
    rect rgba(100,150,255,0.5)
      Note over Backend: Normal operation
      Backend->>OS: poll/check parent PID liveness
      OS-->>Backend: alive / dead
    end
    rect rgba(255,150,100,0.5)
      Note over Tauri,Backend: Parent termination -> shutdown
      Tauri->>Backend: (parent exits / app closed)
      Backend->>Backend: watchdog detects parent dead
      Backend->>Backend: initiate graceful shutdown
      Backend->>OS: release resources and exit
    end
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I hopped a PID into a tiny shell,

I watched the parent breathe and rang a gentle bell.
No messy kills — just courteous goodnight,
CUDA dreams compiled beneath soft light.
I nibbled bugs and left the runners right.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the three main changes in this PR: Windows support enablement, cross-platform justfile improvements, and clean server shutdown via watchdog.
Docstring Coverage ✅ Passed Docstring coverage is 84.62% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch windows-support
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/server.py`:
- Around line 118-123: The CLI accepts any integer for the "--parent-pid"
argument which allows non-positive values to enable the watchdog and cause
invalid shutdown behavior; update the argument parsing and watchdog enablement
so that parser.add_argument("--parent-pid", ...) yields a value that is only
treated as set when it's a positive int (e.g., validate or coerce to None for
<=0) and then gate the watchdog startup logic (the code that checks/uses
parent-pid to start the watchdog) to run only when parent_pid > 0; locate the
parser.add_argument call for "--parent-pid" and the subsequent watchdog
startup/monitoring code and add a simple positive-PID check before enabling the
watchdog.

In `@justfile`:
- Around line 192-196: The Windows copy step can fail if the target directory
doesn't exist; before the Copy-Item that writes to "{{ tauri_dir
}}/src-tauri/binaries/voicebox-server-$triple.exe" ensure the directory exists
(use Test-Path/New-Item or New-Item -ItemType Directory -Force) so the Copy-Item
for the sidecar (referenced by Copy-Item and $triple) won’t error on clean
checkouts; insert the directory-creation guard just prior to the Copy-Item line
that places the Windows binary in the src-tauri/binaries folder.

In `@README.md`:
- Line 243: Replace the product name spelling "XCode" with the correct "Xcode"
in the README prerequisites line (the string "**Prerequisites:**
[Bun](https://bun.sh), [Rust](https://rustup.rs), [Python
3.11+](https://python.org), [Tauri
Prerequisites](https://v2.tauri.app/start/prerequisites/), and
[XCode](https://developer.apple.com/xcode/) on macOS.") so the link text and
display use "Xcode".

In `@tauri/src-tauri/src/main.rs`:
- Around line 525-537: The graceful shutdown in stop_server currently only POSTs
/shutdown and doesn't handle failure; update stop_server so that after the
reqwest POST to http://127.0.0.1:{SERVER_PORT}/shutdown you check the Result
(timeout/error or non-success status) and, on Windows only, invoke a fallback
force-terminate path that kills the old server process: obtain the server
PID/handle the code already stores (or read the PID file used by
start_server/restart_server), then perform a platform-specific kill (e.g., call
taskkill /PID <pid> /T /F or use the WinAPI) and confirm the process exited
before returning; reference stop_server, restart_server, start_server and
SERVER_PORT when locating where to add the check and the Windows-only
force-terminate fallback.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9083b2c5-8682-41fd-9abd-2b79308d575f

📥 Commits

Reviewing files that changed from the base of the PR and between 7a511e3 and 4d6c976.

📒 Files selected for processing (6)
  • CONTRIBUTING.md
  • README.md
  • backend/main.py
  • backend/server.py
  • justfile
  • tauri/src-tauri/src/main.rs

Comment on lines +525 to +537
// Send graceful shutdown via HTTP — the server's parent-pid watchdog
// will also handle cleanup if this app process exits.
println!("Sending graceful shutdown via HTTP...");
let client = reqwest::blocking::Client::builder()
.timeout(std::time::Duration::from_secs(2))
.build()
.unwrap();

let shutdown_result = client
let _ = client
.post(&format!("http://127.0.0.1:{}/shutdown", SERVER_PORT))
.send();

if shutdown_result.is_ok() {
println!("HTTP shutdown sent, waiting for graceful exit...");
// Wait up to 3 seconds for graceful shutdown
for i in 0..30 {
std::thread::sleep(std::time::Duration::from_millis(100));
if !is_process_running(pid) {
println!("Process exited gracefully after {}ms", i * 100);
return Ok(());
}
}
println!("Graceful shutdown timed out, forcing kill...");
} else {
println!("HTTP shutdown failed, forcing kill...");
}

// Layer 2: Kill process tree with enumeration
println!("Killing process tree for wrapper PID {}...", pid);
kill_windows_process_tree(pid)?;

// Layer 3: Verify and kill by name if still running
std::thread::sleep(std::time::Duration::from_millis(200));
if is_process_running(pid) {
println!("Process tree kill failed, killing by name...");
use std::process::Command;
let _ = Command::new("taskkill")
.args(["/IM", "voicebox-server.exe", "/T", "/F"])
.output();
}

// Layer 4: Final verification
std::thread::sleep(std::time::Duration::from_millis(200));
if is_process_running(pid) {
eprintln!("WARNING: Failed to kill server after all attempts");
} else {
println!("Server killed successfully");
}
}

#[cfg(unix)]
{
println!("stop_server: Process group kill completed");
println!("Shutdown request sent (server watchdog will handle cleanup)");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add a hard-stop fallback on Windows when graceful shutdown fails

On Windows, stop_server now only posts /shutdown (Line [533]). If that request fails/hangs, restart_server (Line [562] onward) can proceed while the old server is still alive, and start_server may reuse that stale process. Please add a fallback force-terminate path when shutdown HTTP is not successful.

🛠️ Suggested patch
         #[cfg(windows)]
         {
             // Send graceful shutdown via HTTP — the server's parent-pid watchdog
             // will also handle cleanup if this app process exits.
             println!("Sending graceful shutdown via HTTP...");
             let client = reqwest::blocking::Client::builder()
                 .timeout(std::time::Duration::from_secs(2))
                 .build()
                 .unwrap();
 
-            let _ = client
+            let shutdown_ok = client
                 .post(&format!("http://127.0.0.1:{}/shutdown", SERVER_PORT))
-                .send();
+                .send()
+                .map(|r| r.status().is_success())
+                .unwrap_or(false);
+
+            if !shutdown_ok {
+                println!("Graceful shutdown failed, forcing process termination...");
+                let _ = std::process::Command::new("taskkill")
+                    .args(["/PID", &pid.to_string(), "/T", "/F"])
+                    .output();
+            }
 
             println!("Shutdown request sent (server watchdog will handle cleanup)");
         }

Also applies to: 565-570

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tauri/src-tauri/src/main.rs` around lines 525 - 537, The graceful shutdown in
stop_server currently only POSTs /shutdown and doesn't handle failure; update
stop_server so that after the reqwest POST to
http://127.0.0.1:{SERVER_PORT}/shutdown you check the Result (timeout/error or
non-success status) and, on Windows only, invoke a fallback force-terminate path
that kills the old server process: obtain the server PID/handle the code already
stores (or read the PID file used by start_server/restart_server), then perform
a platform-specific kill (e.g., call taskkill /PID <pid> /T /F or use the
WinAPI) and confirm the process exited before returning; reference stop_server,
restart_server, start_server and SERVER_PORT when locating where to add the
check and the Windows-only force-terminate fallback.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
justfile (1)

197-197: ⚠️ Potential issue | 🟠 Major

Create the sidecar destination directory before copying on Windows.

Copy-Item can fail on clean checkouts when tauri/src-tauri/binaries does not exist yet.

🛠️ Suggested patch
     & "{{ python }}" backend/build_binary.py; \
     if ($LASTEXITCODE -ne 0) { throw "build_binary.py failed with exit code $LASTEXITCODE" }; \
     $triple = (rustc --print host-tuple); \
+    New-Item -ItemType Directory -Path "{{ tauri_dir }}/src-tauri/binaries" -Force | Out-Null; \
     Copy-Item "backend/dist/voicebox-server.exe" "{{ tauri_dir }}/src-tauri/binaries/voicebox-server-$triple.exe" -Force; \
     Write-Host "Copied sidecar: voicebox-server-$triple.exe"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@justfile` at line 197, The Copy-Item command in the justfile can fail if the
sidecar destination directory (tauri/src-tauri/binaries) doesn't exist; before
the Copy-Item that references "{{ tauri_dir
}}/src-tauri/binaries/voicebox-server-$triple.exe" ensure the destination
directory is created (e.g., check Test-Path or use New-Item -ItemType Directory
-Force) so the directory exists prior to copying the
backend/dist/voicebox-server.exe file.
tauri/src-tauri/src/main.rs (1)

528-537: ⚠️ Potential issue | 🟠 Major

Handle /shutdown failure on Windows and fall back to force-terminate.

stop_server currently ignores both request errors and non-success responses; restart can race with a still-running old server process.

🛠️ Suggested patch
-            let client = reqwest::blocking::Client::builder()
+            let client = reqwest::blocking::Client::builder()
                 .timeout(std::time::Duration::from_secs(2))
                 .build()
-                .unwrap();
+                .map_err(|e| format!("Failed to create shutdown client: {}", e))?;
 
-            let _ = client
+            let shutdown_ok = client
                 .post(&format!("http://127.0.0.1:{}/shutdown", SERVER_PORT))
-                .send();
+                .send()
+                .map(|r| r.status().is_success())
+                .unwrap_or(false);
+
+            if !shutdown_ok {
+                println!("Graceful shutdown failed, forcing process termination...");
+                let _ = std::process::Command::new("taskkill")
+                    .args(["/PID", &pid.to_string(), "/T", "/F"])
+                    .output();
+            }
 
             println!("Shutdown request sent (server watchdog will handle cleanup)");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tauri/src-tauri/src/main.rs` around lines 528 - 537, In stop_server, don't
ignore the POST result to /shutdown; capture the result of
client.post(...).send() and check for Err or a non-success HTTP status, and if
either occurs perform a Windows-specific fallback to force-terminate the old
server process (e.g., use a Windows API/Command or kill-by-pid) to avoid races;
reference the existing client builder, the POST to
format!("http://127.0.0.1:{}/shutdown", SERVER_PORT) and the stop_server
function when adding this error handling and the conditional Windows-only
fallback.
backend/server.py (1)

151-156: ⚠️ Potential issue | 🟠 Major

Validate --parent-pid as strictly positive before enabling watchdog.

Current gating accepts 0/negative values, which can cause invalid parent-liveness behavior.

🛠️ Suggested patch
         args = parser.parse_args()
+        if args.parent_pid is not None and args.parent_pid <= 0:
+            parser.error("--parent-pid must be a positive integer")
@@
-        if args.parent_pid is not None:
+        if args.parent_pid is not None:
             _parent_pid = args.parent_pid
             _data_dir = args.data_dir
             `@app.on_event`("startup")
             async def _on_startup():
                 _start_parent_watchdog(_parent_pid, _data_dir)

Also applies to: 176-177

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 151 - 156, The --parent-pid argument is
currently accepted even when zero or negative, causing invalid parent-liveness
behavior; update validation so the parsed parent_pid is treated as None (or
rejected) unless strictly > 0: after parsing (where parser.add_argument defines
"--parent-pid") add a check for parent_pid = args.parent_pid and if parent_pid
is None or parent_pid <= 0 set parent_pid = None (or raise ArgumentTypeError)
and ensure the watchdog enabling logic (the code that checks parent_pid later,
e.g., the block referenced at lines ~176-177 that starts the
parent-monitor/watchdog) only runs when parent_pid is a positive int; refer to
the argument name "--parent-pid" and the variable parent_pid/watchdog enabling
code to locate and apply this validation.
🧹 Nitpick comments (2)
backend/build_binary.py (1)

186-219: _get_cuda_dll_excludes() is currently dead code.

The helper is defined but never consumed in this script, so it adds maintenance overhead and can mislead readers about active exclusion behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/build_binary.py` around lines 186 - 219, The _get_cuda_dll_excludes()
helper is dead code—remove the entire function definition (named
_get_cuda_dll_excludes) from backend/build_binary.py and also delete any unused
local symbols that only supported it (e.g., exclude_dlls, cuda_prefixes) and any
imports that become unused as a result; if CUDA DLL exclusion is actually
intended, instead call _get_cuda_dll_excludes() from the build path that
configures PyInstaller excludes (replace the unused helper with an explicit
invocation in the packaging flow), but otherwise delete the function to avoid
confusion and update imports/linters accordingly.
backend/server.py (1)

75-76: Avoid silent exception swallowing in watchdog logger setup.

Catching and discarding all exceptions here makes production watchdog failures hard to diagnose.

🛠️ Proposed adjustment
-        except Exception:
-            pass
+        except Exception as e:
+            logging.getLogger(__name__).warning(
+                "Failed to initialize watchdog file logger: %s", e
+            )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 75 - 76, The except Exception: pass in the
watchdog logger setup silently swallows errors; replace it so exceptions are
logged and not discarded—catch the exception and call the module logger (or
processLogger) with logger.exception(...) or logger.error(..., exc_info=True) to
record the stacktrace and context (e.g., in the watchdog setup block where
except Exception: appears), and optionally re-raise or surface the error if
logger setup failure should stop startup.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/build_binary.py`:
- Around line 170-181: Wrap the PyInstaller run in a try/finally so the CUDA
restore block always executes: place the call to PyInstaller.__main__.run(args)
inside a try, and move the existing restore logic that checks restore_cuda and
calls subprocess.run(...) into the finally branch; if you need to surface build
errors, re-raise the exception after the restore (or let it propagate) so
environment restoration happens even on failure. Ensure you still reference the
restore_cuda flag and the same pip install invocation used currently.

---

Duplicate comments:
In `@backend/server.py`:
- Around line 151-156: The --parent-pid argument is currently accepted even when
zero or negative, causing invalid parent-liveness behavior; update validation so
the parsed parent_pid is treated as None (or rejected) unless strictly > 0:
after parsing (where parser.add_argument defines "--parent-pid") add a check for
parent_pid = args.parent_pid and if parent_pid is None or parent_pid <= 0 set
parent_pid = None (or raise ArgumentTypeError) and ensure the watchdog enabling
logic (the code that checks parent_pid later, e.g., the block referenced at
lines ~176-177 that starts the parent-monitor/watchdog) only runs when
parent_pid is a positive int; refer to the argument name "--parent-pid" and the
variable parent_pid/watchdog enabling code to locate and apply this validation.

In `@justfile`:
- Line 197: The Copy-Item command in the justfile can fail if the sidecar
destination directory (tauri/src-tauri/binaries) doesn't exist; before the
Copy-Item that references "{{ tauri_dir
}}/src-tauri/binaries/voicebox-server-$triple.exe" ensure the destination
directory is created (e.g., check Test-Path or use New-Item -ItemType Directory
-Force) so the directory exists prior to copying the
backend/dist/voicebox-server.exe file.

In `@tauri/src-tauri/src/main.rs`:
- Around line 528-537: In stop_server, don't ignore the POST result to
/shutdown; capture the result of client.post(...).send() and check for Err or a
non-success HTTP status, and if either occurs perform a Windows-specific
fallback to force-terminate the old server process (e.g., use a Windows
API/Command or kill-by-pid) to avoid races; reference the existing client
builder, the POST to format!("http://127.0.0.1:{}/shutdown", SERVER_PORT) and
the stop_server function when adding this error handling and the conditional
Windows-only fallback.

---

Nitpick comments:
In `@backend/build_binary.py`:
- Around line 186-219: The _get_cuda_dll_excludes() helper is dead code—remove
the entire function definition (named _get_cuda_dll_excludes) from
backend/build_binary.py and also delete any unused local symbols that only
supported it (e.g., exclude_dlls, cuda_prefixes) and any imports that become
unused as a result; if CUDA DLL exclusion is actually intended, instead call
_get_cuda_dll_excludes() from the build path that configures PyInstaller
excludes (replace the unused helper with an explicit invocation in the packaging
flow), but otherwise delete the function to avoid confusion and update
imports/linters accordingly.

In `@backend/server.py`:
- Around line 75-76: The except Exception: pass in the watchdog logger setup
silently swallows errors; replace it so exceptions are logged and not
discarded—catch the exception and call the module logger (or processLogger) with
logger.exception(...) or logger.error(..., exc_info=True) to record the
stacktrace and context (e.g., in the watchdog setup block where except
Exception: appears), and optionally re-raise or surface the error if logger
setup failure should stop startup.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8bb7385d-63dc-455a-b383-14857cbf6509

📥 Commits

Reviewing files that changed from the base of the PR and between 4d6c976 and f196374.

📒 Files selected for processing (4)
  • backend/build_binary.py
  • backend/server.py
  • justfile
  • tauri/src-tauri/src/main.rs

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (1)
backend/server.py (1)

69-76: Don't swallow watchdog log setup failures.

This drops permission/path errors even though the file logger is your only production breadcrumb for watchdog behavior. At least warn to the root logger and narrow the catch to OSError.

♻️ Proposed change
     if data_dir:
         try:
             log_dir = os.path.join(data_dir, "logs")
             os.makedirs(log_dir, exist_ok=True)
             fh = logging.FileHandler(os.path.join(log_dir, "watchdog.log"))
             fh.setFormatter(logging.Formatter('%(asctime)s - %(message)s'))
             watchdog_logger.addHandler(fh)
-        except Exception:
-            pass
+        except OSError as exc:
+            logger.warning("Failed to initialize watchdog log file in %s: %s", log_dir, exc)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 69 - 76, The try/except around creating
log_dir and adding the FileHandler for watchdog_logger currently swallows all
exceptions; change it to catch only OSError (or PermissionError) and log a
warning to the root logger with the exception details instead of silently
passing. Specifically, update the block that creates log_dir, instantiates
logging.FileHandler and calls watchdog_logger.addHandler(...) to catch OSError
as e and call logging.getLogger().warning(...) (or .exception(...) if preferred)
with a clear message and the exception info so failures to set up the file
handler are visible.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/server.py`:
- Around line 113-118: The code currently checks parent liveness using
_is_pid_alive and simply returns if the parent is gone, which disables the
watchdog but leaves the server orphaned; replace the bare return after
watchdog_logger.warning with code that invokes the application's normal shutdown
path (i.e., call the same shutdown handler used for graceful termination instead
of returning). Locate where the app performs shutdown (the existing shutdown
coroutine/function or on_shutdown handler) and call or schedule that function
here so the process follows the same cleanup logic as a normal shutdown rather
than exiting via return; keep the watchdog_logger.warning and pass context
(parent_pid) into the shutdown invocation for clarity.
- Around line 120-122: The current use of os.kill(os.getpid(), signal.SIGTERM)
in the parent-liveness check (inside the block that calls _is_pid_alive and logs
via watchdog_logger) forcibly terminates the process on Windows and prevents
FastAPI/Uvicorn shutdown handlers from running; replace that hard kill with an
in-process exit so cleanup runs — for example, import sys and call sys.exit(0)
or raise SystemExit() instead of os.kill, or if you have access to the Uvicorn
server object set server.should_exit = True; update the code paths that
reference parent_pid and the watchdog_logger block accordingly and add the
needed sys import.

In `@justfile`:
- Around line 203-209: The build-server-cuda recipe is missing the error
handling used by build-server; update the recipe (build-server-cuda) to set
PowerShell's $ErrorActionPreference = "Stop" at the start, run the Python step
(& "{{ python }}" backend/build_binary.py --cuda) and immediately check
$LASTEXITCODE, exiting with a non-zero status (or throw) if it's non-zero, and
add -ErrorAction Stop to file operations (New-Item and Copy-Item) so failures
don’t get swallowed; mirror the same error-check and exit logic that
build-server uses to ensure build and copy failures are propagated.
- Around line 109-117: The dev (and dev-web) target currently stores the uvicorn
supervisor in $backendJob and uses Stop-Process -Id $backendJob.Id which only
kills the supervisor and leaves the worker running; change the cleanup to kill
the whole process tree using Windows taskkill for the stored $backendJob.Id
(e.g. call taskkill /PID <id> /T /F or start a process that runs that command)
so the uvicorn worker is also terminated; update the finally/cleanup block that
references $backendJob and ensure the same change is applied to the dev-web
target.
- Around line 254-260: The db-init recipe should not change directories; remove
the cd {{ backend_dir }} invocation in both the [unix] and [windows] variants
and invoke the venv interpreter ({{ python }}) from the repo root, importing the
full module path backend.database (call python -c "from backend.database import
init_db; init_db()") so the {{ python }} variable remains valid; update both
db-init entries (refer to the db-init target and the {{ python }} and {{
backend_dir }} variables) to run the interpreter without changing directories
and use backend.database as the import path.

---

Nitpick comments:
In `@backend/server.py`:
- Around line 69-76: The try/except around creating log_dir and adding the
FileHandler for watchdog_logger currently swallows all exceptions; change it to
catch only OSError (or PermissionError) and log a warning to the root logger
with the exception details instead of silently passing. Specifically, update the
block that creates log_dir, instantiates logging.FileHandler and calls
watchdog_logger.addHandler(...) to catch OSError as e and call
logging.getLogger().warning(...) (or .exception(...) if preferred) with a clear
message and the exception info so failures to set up the file handler are
visible.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d13732c0-4efa-402b-a935-07da7ab414f2

📥 Commits

Reviewing files that changed from the base of the PR and between f196374 and f1ba73a.

📒 Files selected for processing (3)
  • README.md
  • backend/server.py
  • justfile
🚧 Files skipped from review as they are similar to previous changes (1)
  • README.md

…/T for process tree, build-server-cuda error handling, db-init path
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
backend/server.py (1)

116-118: ⚠️ Potential issue | 🟡 Minor

Exiting instead of returning would prevent orphan servers.

If the parent is already dead when the watchdog starts, returning here leaves the server running as an orphan. The server should exit in this case, consistent with the behavior when the parent dies later.

🛠️ Proposed fix
         if not alive:
             watchdog_logger.warning(f"Parent PID {parent_pid} not found on first check — disabling watchdog")
-            return
+            watchdog_logger.info("Exiting to avoid orphan server")
+            if sys.platform == "win32":
+                os._exit(1)
+            else:
+                os.kill(os.getpid(), signal.SIGTERM)
+            return  # unreachable, but makes intent clear
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 116 - 118, In the watchdog startup check, do
not simply return when the parent is already dead; replace the early "return"
with a process exit so the server does not become an orphan—use a clean exit
(e.g., sys.exit(1) or os._exit(1)) and keep the existing watchdog_logger.warning
that references parent_pid and alive to record the reason; update the code in
the function containing this block (the watchdog/startup check using variables
alive, parent_pid and watchdog_logger) to call the exit API instead of
returning.
🧹 Nitpick comments (2)
backend/server.py (2)

68-76: Consider logging the exception instead of silently passing.

The bare except Exception: pass swallows all errors during log directory setup. While log setup failure shouldn't crash the watchdog, logging the exception would help debug production issues.

🔧 Proposed fix
         try:
             log_dir = os.path.join(data_dir, "logs")
             os.makedirs(log_dir, exist_ok=True)
             fh = logging.FileHandler(os.path.join(log_dir, "watchdog.log"))
             fh.setFormatter(logging.Formatter('%(asctime)s - %(message)s'))
             watchdog_logger.addHandler(fh)
-        except Exception:
-            pass
+        except Exception as e:
+            # Log to stderr since file handler failed
+            logging.getLogger(__name__).debug(f"Could not set up watchdog file logging: {e}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 68 - 76, The current try/except around log
directory and FileHandler creation swallows errors; update the except in the
block that references data_dir, log_dir, FileHandler and watchdog_logger to
catch Exception as e and log the failure instead of passing — e.g., emit a clear
error via watchdog_logger.error or logging.exception including the exception
information (exc_info=True) and a contextual message like "failed to set up
watchdog log handler"; keep the function flow unchanged so failures don't raise
further.

122-128: Comment is misleading: os._exit(0) also bypasses cleanup handlers.

The comment explains why os.kill(SIGTERM) is unsuitable on Windows (it calls TerminateProcess), then uses os._exit(0) which also exits immediately without running cleanup handlers or atexit functions.

Using sys.exit(0) from a thread won't terminate the process (it only raises SystemExit in that thread). Since there's no clean cross-platform way to gracefully shutdown uvicorn from a background thread, os._exit(0) may be the pragmatic choice—but the comment should reflect that this is a hard exit, not a graceful one.

📝 Clarify the comment
                 if sys.platform == "win32":
-                    # sys.exit triggers SystemExit, allowing uvicorn to run
-                    # shutdown handlers. os.kill(SIGTERM) on Windows calls
-                    # TerminateProcess which hard-kills without cleanup.
+                    # On Windows, os.kill(SIGTERM) calls TerminateProcess which
+                    # hard-kills without cleanup. sys.exit() from a thread only
+                    # raises SystemExit in this thread, not the main thread.
+                    # os._exit(0) is a hard exit but at least terminates cleanly.
                     os._exit(0)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 122 - 128, The comment above the
platform-specific exit branch is misleading: replace it to state that
os._exit(0) performs a hard process exit that bypasses cleanup handlers/atexit
(unlike sys.exit which raises SystemExit in the current thread), and clarify
that because there is no reliable cross-platform way to gracefully stop uvicorn
from a background thread the code deliberately chooses a hard exit on Windows
(os._exit(0)) while using os.kill(os.getpid(), signal.SIGTERM) on non-Windows;
update the comment adjacent to the if sys.platform == "win32": / os._exit(0) /
os.kill(os.getpid(), signal.SIGTERM) block to reflect this rationale and the
cleanup implications.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/build_binary.py`:
- Around line 187-219: Remove the dead helper _get_cuda_dll_excludes including
its inner symbols (torch_lib, cuda_prefixes, exclude_dlls) from
backend/build_binary.py; it is never invoked and the current Windows CPU build
flow already handles CUDA by swapping to CPU torch, and PyInstaller cannot
selectively exclude DLL filenames via CLI. After deleting the function, run a
quick search to ensure no callers remain and tidy up any now-unused imports
(e.g., Path) that existed solely for this function.

---

Duplicate comments:
In `@backend/server.py`:
- Around line 116-118: In the watchdog startup check, do not simply return when
the parent is already dead; replace the early "return" with a process exit so
the server does not become an orphan—use a clean exit (e.g., sys.exit(1) or
os._exit(1)) and keep the existing watchdog_logger.warning that references
parent_pid and alive to record the reason; update the code in the function
containing this block (the watchdog/startup check using variables alive,
parent_pid and watchdog_logger) to call the exit API instead of returning.

---

Nitpick comments:
In `@backend/server.py`:
- Around line 68-76: The current try/except around log directory and FileHandler
creation swallows errors; update the except in the block that references
data_dir, log_dir, FileHandler and watchdog_logger to catch Exception as e and
log the failure instead of passing — e.g., emit a clear error via
watchdog_logger.error or logging.exception including the exception information
(exc_info=True) and a contextual message like "failed to set up watchdog log
handler"; keep the function flow unchanged so failures don't raise further.
- Around line 122-128: The comment above the platform-specific exit branch is
misleading: replace it to state that os._exit(0) performs a hard process exit
that bypasses cleanup handlers/atexit (unlike sys.exit which raises SystemExit
in the current thread), and clarify that because there is no reliable
cross-platform way to gracefully stop uvicorn from a background thread the code
deliberately chooses a hard exit on Windows (os._exit(0)) while using
os.kill(os.getpid(), signal.SIGTERM) on non-Windows; update the comment adjacent
to the if sys.platform == "win32": / os._exit(0) / os.kill(os.getpid(),
signal.SIGTERM) block to reflect this rationale and the cleanup implications.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7dac785c-036c-4888-ba47-9f294b64a206

📥 Commits

Reviewing files that changed from the base of the PR and between f1ba73a and e239be5.

📒 Files selected for processing (3)
  • backend/build_binary.py
  • backend/server.py
  • justfile

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
tauri/src-tauri/src/main.rs (1)

533-537: ⚠️ Potential issue | 🟠 Major

Handle failed Windows /shutdown and add force-stop fallback.

Line [533] discards the HTTP result, so stop_server can report success while the old server is still alive. Since PID/state is already cleared, restart_server can then reuse stale state/process behavior.

Suggested patch
         #[cfg(windows)]
         {
+            use std::process::Command;
             // Send graceful shutdown via HTTP — the server's parent-pid watchdog
             // will also handle cleanup if this app process exits.
             println!("Sending graceful shutdown via HTTP...");
             let client = reqwest::blocking::Client::builder()
                 .timeout(std::time::Duration::from_secs(2))
                 .build()
                 .unwrap();

-            let _ = client
+            let shutdown_ok = client
                 .post(&format!("http://127.0.0.1:{}/shutdown", SERVER_PORT))
-                .send();
+                .send()
+                .map(|r| r.status().is_success())
+                .unwrap_or(false);
+
+            if !shutdown_ok {
+                println!("Graceful shutdown failed, forcing process termination...");
+                let _ = Command::new("taskkill")
+                    .args(["/PID", &pid.to_string(), "/T", "/F"])
+                    .output();
+            }

             println!("Shutdown request sent (server watchdog will handle cleanup)");
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tauri/src-tauri/src/main.rs` around lines 533 - 537, The shutdown HTTP
request result is being ignored causing stop_server to report success even if
the server didn't stop; update the code around the client.post(...).send() call
in stop_server to check the Result/Response, treat non-success or errors
(especially on Windows) as a failure, and implement a force-stop fallback that
kills the process by PID when the HTTP shutdown fails; ensure stop_server only
clears state after a successful shutdown or after the force-stop path completes,
and make restart_server rely on the cleared state only after stop_server
confirms termination.
backend/server.py (1)

125-141: ⚠️ Potential issue | 🟠 Major

Unify parent-death handling; avoid orphaning on first check and hard-exit on Windows.

Line [127] returns when the parent is already gone, which leaves the server running orphaned.
Line [138] uses os._exit(0), which bypasses normal cleanup/shutdown hooks. Use one consistent shutdown path for both the initial and loop checks.

In CPython on Windows, what cleanup/shutdown hooks (atexit, finally blocks, FastAPI/Uvicorn shutdown events) run when calling os._exit(0) versus sys.exit() from a background thread?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 125 - 141, The early return when the parent
PID is missing and the hard os._exit(0) call on Windows create inconsistent
shutdown behavior; update the parent-death handling so the initial check follows
the same graceful shutdown path as the loop: instead of returning when not
alive, log the parent-gone message and invoke the same shutdown branch used
inside the while (use sys.exit(0) on Windows to allow cleanup and
uvicorn/FastAPI shutdown handlers, and os.kill(os.getpid(), signal.SIGTERM) on
non-Windows), and remove/replace any os._exit(0) usage; keep references to
_is_pid_alive, watchdog_logger, _watchdog_disabled and ensure both the initial
check and the loop use the identical shutdown sequence.
🧹 Nitpick comments (1)
backend/server.py (1)

84-85: Don’t silently swallow watchdog file-logger setup failures.

Catching broad Exception and pass makes production watchdog failures hard to diagnose. Prefer narrowing the exception and logging once.

Suggested patch
-        except Exception:
-            pass
+        except OSError as e:
+            watchdog_logger.warning("Failed to initialize watchdog file logger: %s", e)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 84 - 85, Replace the silent broad except in
the watchdog file-logger setup with a narrowed exception handler that logs the
error; specifically, instead of "except Exception: pass" around the watchdog
file-logger setup in backend/server.py, catch likely OS/file errors (e.g.,
OSError, IOError, PermissionError) as e and call the module logger (e.g.,
logger.error or logger.exception) with a clear message like "watchdog
file-logger setup failed" and the exception details so the failure is recorded;
do not swallow other unexpected exceptions (let them propagate) and ensure the
logging call uses the existing logger instance used elsewhere in server.py.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tauri/src-tauri/src/main.rs`:
- Around line 770-776: The POST to "/watchdog/disable" currently ignores the
response and errors; change the call using the reqwest::blocking::Client (the
client built with Client::builder() and SERVER_PORT) to capture the Result from
.send(), check for errors and for a non-success HTTP status (e.g.,
response.status().is_success()), and handle failures by logging the error and
retrying a few times or returning a failure result so the app can ensure the
server doesn't self-terminate when keep_running_on_close=true; make the check
around the .post(...).send() call and react to both Err(_) and non-2xx statuses.

---

Duplicate comments:
In `@backend/server.py`:
- Around line 125-141: The early return when the parent PID is missing and the
hard os._exit(0) call on Windows create inconsistent shutdown behavior; update
the parent-death handling so the initial check follows the same graceful
shutdown path as the loop: instead of returning when not alive, log the
parent-gone message and invoke the same shutdown branch used inside the while
(use sys.exit(0) on Windows to allow cleanup and uvicorn/FastAPI shutdown
handlers, and os.kill(os.getpid(), signal.SIGTERM) on non-Windows), and
remove/replace any os._exit(0) usage; keep references to _is_pid_alive,
watchdog_logger, _watchdog_disabled and ensure both the initial check and the
loop use the identical shutdown sequence.

In `@tauri/src-tauri/src/main.rs`:
- Around line 533-537: The shutdown HTTP request result is being ignored causing
stop_server to report success even if the server didn't stop; update the code
around the client.post(...).send() call in stop_server to check the
Result/Response, treat non-success or errors (especially on Windows) as a
failure, and implement a force-stop fallback that kills the process by PID when
the HTTP shutdown fails; ensure stop_server only clears state after a successful
shutdown or after the force-stop path completes, and make restart_server rely on
the cleared state only after stop_server confirms termination.

---

Nitpick comments:
In `@backend/server.py`:
- Around line 84-85: Replace the silent broad except in the watchdog file-logger
setup with a narrowed exception handler that logs the error; specifically,
instead of "except Exception: pass" around the watchdog file-logger setup in
backend/server.py, catch likely OS/file errors (e.g., OSError, IOError,
PermissionError) as e and call the module logger (e.g., logger.error or
logger.exception) with a clear message like "watchdog file-logger setup failed"
and the exception details so the failure is recorded; do not swallow other
unexpected exceptions (let them propagate) and ensure the logging call uses the
existing logger instance used elsewhere in server.py.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 30502e96-91fe-4878-8cbf-e3f12fbe89e5

📥 Commits

Reviewing files that changed from the base of the PR and between e239be5 and 410413d.

📒 Files selected for processing (3)
  • backend/main.py
  • backend/server.py
  • tauri/src-tauri/src/main.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/main.py

@jamiepine jamiepine merged commit 732270b into main Mar 15, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant