Context
Follow-up from the #1279 rework (#1282). The init wizard confirms a config-reload restart completed by reading the daemon's PID-file start time (DaemonManager.TryGetRecordedStartTime()) and requiring it to be strictly newer than the value captured before the config write (HealthCheckStepViewModel.IsRestartedGeneration: current > before).
Problem
A filesystem timestamp is a fragile readiness proxy:
- Wall-clock step-back (NTP correction, VM resume, container host clock adjustment) between capturing generationBefore and the restarted daemon's StartedAt makes current > before false forever → the wizard polls to the full ReloadReadyTimeout (90s) and falsely reports "Daemon did not become ready" on a daemon that actually reloaded fine.
- Coarse/equal-tick collisions across two restarts have the same effect.
- Two independent writers touch the PID file (the CLI writes a 1-line preliminary PID in DaemonManager.Start(); the daemon's PidFileService writes the authoritative 2-line form), so a torn read can momentarily yield a null value or a file that is not yet in the 2-line form.
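To make the first two failure modes concrete, here is a minimal, language-neutral sketch in Python (not the project's actual code). The function is a hypothetical stand-in for the strict current > before comparison described above; treating a missing reading as "not restarted yet" is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_restarted_generation(before: datetime, current: Optional[datetime]) -> bool:
    """Hypothetical stand-in for the strict `current > before` check.

    Assumption: an unreadable (None) start time counts as "not restarted yet".
    """
    return current is not None and current > before


before = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

# Normal case: the daemon restarts 5 s after the capture.
normal = is_restarted_generation(before, before + timedelta(seconds=5))

# Same restart, but the wall clock was stepped back 30 s in between,
# so the recorded start time is *earlier* than the captured value.
stepped_back = is_restarted_generation(
    before, before + timedelta(seconds=5) - timedelta(seconds=30)
)

# Coarse timestamps: the new start time lands on the same tick.
equal_tick = is_restarted_generation(before, before)

# Torn read of the PID file: no usable start time at all.
torn_read = is_restarted_generation(before, None)
```

Only the normal case passes the check. The step-back and equal-tick results do not change on later polls, which is why the wizard runs out the full timeout; a torn read, by contrast, is transient and clears on the next poll.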
Proposal
Use the authoritative signal the daemon already exposes instead of a derived file timestamp:
DaemonStartClock.StartedAt is surfaced as StartedAtUtc on GET /api/health/status (DaemonRuntimeStatusService). Gate readiness on that advancing, or on a monotonic restart counter, rather than the PID-file mtime.
- Auth caveat: /api/health/status requires authorization (loopback is trusted in ExposureMode.Local, which is the init case), whereas the current /api/health/ready poll is anonymous. A monotonic counter on the anonymous readiness endpoint would sidestep the auth question entirely, so it is worth considering.
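One possible shape for the proposed gate, sketched in Python as a language-neutral illustration rather than the project's code. Every name here (wait_for_new_generation, read_generation) is hypothetical, and read_generation stands in for whatever call returns either StartedAtUtc or a restart counter from the daemon. The sketch compares for inequality instead of "newer than", so a clock step-back cannot make the check fail, and it measures the timeout on a monotonic clock.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for_new_generation(
    read_generation: Callable[[], Optional[T]],  # hypothetical: None if the daemon is unreachable
    before: T,
    timeout_s: float = 90.0,   # mirrors the 90 s ReloadReadyTimeout mentioned above
    interval_s: float = 0.5,   # assumed poll interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the daemon reports a generation different from `before`."""
    deadline = clock() + timeout_s
    while True:
        current = read_generation()
        # Any change counts as a restart; ordering is never consulted.
        if current is not None and current != before:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

With a restart counter, `before` is simply the counter value read before the config write. With StartedAtUtc, inequality still tolerates a step-back, though two restarts landing on an identical timestamp would remain indistinguishable, which is one argument for the counter.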
Scope
Non-blocking robustness improvement; the current PID-file approach works in the common case and is covered by tests. Orthogonal to #1279/#1282.