tedge-watchdog: does not handle systemd service file overrides #2354
Description
Describe the bug
In systemd, rather than editing service files in /lib/systemd/system/ directly, it is advisable to use one of two commands. systemctl edit --full SERVICE_NAME.service creates a new unit file, /etc/systemd/system/SERVICE_NAME.service, containing the user's modifications. systemctl edit SERVICE_NAME.service creates a directory, /etc/systemd/system/SERVICE_NAME.service.d, in which the user's modifications are saved as override.conf.
For that reason, it isn't enough to parse /lib/systemd/system/SERVICE_NAME.service for the WatchdogSec attribute. A change made with systemctl edit does not touch the original file, but systemd still picks it up. The result is undesirable: tedge-watchdog cannot detect that the service has the WatchdogSec attribute, so it never notifies systemd on the service's behalf, and systemd repeatedly kills the service.
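To make the lookup order concrete, here is a minimal sketch of how the effective WatchdogSec could be resolved from the files mentioned above. It is not the current tedge-watchdog code: the function names are hypothetical, only the /etc/systemd/system and /lib/systemd/system locations named in this issue are consulted, and treating an empty value or 0 as "disabled" is an assumption. systemd itself searches more directories and supports more syntax than this handles.

```rust
use std::fs;
use std::path::Path;

/// Returns the last `WatchdogSec=` value assigned in the `[Service]` section
/// of one unit-file fragment, or `None` if the fragment never assigns it.
/// An empty assignment (`WatchdogSec=`) is reported as `Some("")`.
fn watchdog_sec_in_fragment(contents: &str) -> Option<String> {
    let mut in_service = false;
    let mut found = None;
    for line in contents.lines() {
        let line = line.trim();
        // Skip blank lines and comments ('#' and ';' both start a comment).
        if line.is_empty() || line.starts_with('#') || line.starts_with(';') {
            continue;
        }
        if line.starts_with('[') && line.ends_with(']') {
            in_service = line == "[Service]";
            continue;
        }
        if !in_service {
            continue;
        }
        if let Some((key, value)) = line.split_once('=') {
            if key.trim() == "WatchdogSec" {
                found = Some(value.trim().to_string());
            }
        }
    }
    found
}

/// Merges the main unit file with its drop-ins; later fragments win.
/// Assumption: an empty value or "0" means the watchdog is disabled.
fn effective_watchdog_sec(main_unit: &str, drop_ins: &[&str]) -> Option<String> {
    let mut effective = watchdog_sec_in_fragment(main_unit);
    for drop_in in drop_ins {
        if let Some(value) = watchdog_sec_in_fragment(drop_in) {
            effective = Some(value);
        }
    }
    effective.filter(|v| !v.is_empty() && v.as_str() != "0")
}

/// Hypothetical loader: reads the main unit file and its drop-ins from disk.
/// A full copy under /etc (from `systemctl edit --full`) replaces the
/// packaged file; drop-ins (from `systemctl edit`) are read in name order.
fn load_fragments(service: &str) -> std::io::Result<(String, Vec<String>)> {
    let etc = Path::new("/etc/systemd/system");
    let lib = Path::new("/lib/systemd/system");
    let file_name = format!("{}.service", service);

    let etc_unit = etc.join(&file_name);
    let main_path = if etc_unit.exists() { etc_unit } else { lib.join(&file_name) };
    let main_unit = fs::read_to_string(main_path)?;

    let mut drop_in_paths = Vec::new();
    let drop_in_dir = etc.join(format!("{}.d", file_name));
    if drop_in_dir.is_dir() {
        for entry in fs::read_dir(drop_in_dir)? {
            let path = entry?.path();
            if path.extension().map_or(false, |ext| ext == "conf") {
                drop_in_paths.push(path);
            }
        }
    }
    drop_in_paths.sort();

    let mut drop_ins = Vec::new();
    for path in drop_in_paths {
        drop_ins.push(fs::read_to_string(path)?);
    }
    Ok((main_unit, drop_ins))
}
```

With a packaged unit file that has no WatchdogSec and an override.conf containing WatchdogSec=30 under [Service], effective_watchdog_sec reports the value from the override, which is the case the current parsing misses.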
To Reproduce
- Connect to c8y cloud
- Stop tedge-mapper-c8y and tedge-watchdog services
- Use systemctl edit --full tedge-mapper-c8y and add WatchdogSec=30 in [Service] section, as described in the documentation
- Start tedge-mapper-c8y and tedge-watchdog services
- systemctl status tedge-watchdog should contain log line: WARN tedge_watchdog::systemd_watchdog: Watchdog is not enabled for device/main/service/tedge-mapper-c8y
- journalctl -u tedge-mapper-c8y -f should show the service being killed with signal SIGABRT due to not notifying systemd in time.
Expected behavior
tedge-watchdog should detect the WatchdogSec attribute even if it was added using the systemctl edit command.
Additional context
Instead of parsing a service file from a hardcoded location, it would be better to use systemd's D-Bus interface or a crate if such exists.
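As a rough illustration of that direction, the sketch below asks systemd for the effective value instead of reading any file. It shells out to systemctl show, which queries the service manager, rather than using a D-Bus crate. The function names are hypothetical, and the interpretation of the output is an assumption: the WatchdogUSec property is expected to print as 0 when the watchdog is disabled and as a time span such as 30s otherwise.

```rust
use std::process::Command;

/// Interprets the value printed by
/// `systemctl show --property=WatchdogUSec --value UNIT`.
/// Assumption: an empty value or "0" means the watchdog is disabled.
fn watchdog_enabled(show_value: &str) -> bool {
    let value = show_value.trim();
    !(value.is_empty() || value == "0")
}

/// Hypothetical helper: asks systemd whether the unit has a watchdog
/// timeout, so that overrides made with `systemctl edit` are included.
fn query_watchdog_enabled(unit: &str) -> std::io::Result<bool> {
    let output = Command::new("systemctl")
        .args(["show", "--property=WatchdogUSec", "--value", unit])
        .output()?;
    let stdout = String::from_utf8_lossy(&output.stdout);
    Ok(output.status.success() && watchdog_enabled(&stdout))
}
```

Because systemd has already merged the unit file and its overrides by the time it answers, this approach would not need to know where the files live.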
Lastly, this is not strictly a bug: the code works when the user precisely follows the instructions in the documentation, and the problem appears only when the user does things differently from how they're described there. However, using the systemctl edit command to edit systemd unit files is a recommended good practice, and following it leaves the unit inoperable due to constant timeouts, so I've decided to classify it as a bug.