Hi,
It seems that the pcp recipe is broken (at least on kirkstone with systemd enabled). The recipe builds correctly, but on the target system some pcp services fail to start.
Log:
# systemctl --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● pmcd.service loaded failed failed Performance Metrics Collector Daemon
● pmie.service loaded failed failed Performance Metrics Inference Engine
● pmie_farm.service loaded failed failed pmie farm service
● pmlogger.service loaded failed failed Performance Metrics Archive Logger
● pmlogger_farm.service loaded failed failed pmlogger farm service
NOTE: I'm building for x86-64, and this log was taken directly on the target device. When I launch the image in QEMU, systemctl --failed does not list these services as failed, but they still do not start. It seems that restarts take longer in QEMU, so systemd's restart limit is never triggered and the services just restart endlessly.
pmcd.service:
# journalctl -xu pmcd
Aug 09 09:54:00 XXXX systemd[1]: Starting Performance Metrics Collector Daemon...
░░ Subject: A start job for unit pmcd.service has begun execution
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit pmcd.service has begun execution.
░░
░░ The job identifier is 84.
Aug 09 09:54:00 XXXX pmcd[303]: Rebuilding PMNS ...
Aug 09 09:54:01 XXXX systemd[1]: pmcd.service: Failed to parse MAINPID= field in notification message, ignoring:
Aug 09 09:54:01 XXXX systemd[1]: Started Performance Metrics Collector Daemon.
░░ Subject: A start job for unit pmcd.service has finished successfully
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit pmcd.service has finished successfully.
░░
░░ The job identifier is 84.
Aug 09 09:54:01 XXXX pmcd[691]: PMCD process ... 561
Aug 09 09:54:01 XXXX pmcd[691]: /usr/libexec/pcp/lib/pmcd:
Aug 09 09:54:01 XXXX pmcd[691]: Warning: process ID in /var/run/pmcd.pid () is different.
Aug 09 09:54:01 XXXX pmcd[691]: Check logfile /var/log/pmcd/pmcd.log. When you are ready to proceed,
Aug 09 09:54:01 XXXX pmcd[691]: remove /var/run/pmcd.pid before retrying.
Aug 09 09:54:01 XXXX systemd[1]: pmcd.service: Control process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ An ExecStop= process belonging to unit pmcd.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 1.
Aug 09 09:54:01 XXXX systemd[1]: pmcd.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ The unit pmcd.service has entered the 'failed' state with result 'exit-code'.
# cat /var/log/pmcd/pmcd.log
Log for pmcd on XXXX started Tue Aug 9 09:54:02 2022
Cannot find pmcd DSO at "/var/lib/pcp/pmdas/pmcd/pmda_pmcd.so"
Cannot find pmproxy DSO at "/var/lib/pcp/pmdas/mmv/pmda_mmv.so"
Cannot find mmv DSO at "/var/lib/pcp/pmdas/mmv/pmda_mmv.so"
Cannot find jbd2 DSO at "/var/lib/pcp/pmdas/jbd2/pmda_jbd2.so"
pmcd: unexpected end-of-file at initial exchange with kvm PMDA
active agent dom pid in out ver protocol parameters
============ === ===== === === === ======== ==========
root 1 %5 2375 6 7 bin pipe cmd=/var/lib/pcp/pmdas/root/pmdaroot
proc 3 %5 2376 9 10 bin pipe cmd=/var/lib/pcp/pmdas/proc/pmdaproc -d 3
xfs 11 %5 2377 11 12 bin pipe cmd=/var/lib/pcp/pmdas/xfs/pmdaxfs -d 11
linux 60 %5 2378 13 14 bin pipe cmd=/var/lib/pcp/pmdas/linux/pmdalinux
Host access list:
00 01 Cur/MaxCons host-spec host-mask lvl host-name
== == =========== ======================================= ======================================= === ==============
y y 0 0 127.0.1.1 255.255.255.255 0 localhost
y y 0 0 / / 1 unix:
n 0 0 0.0.0.0 0.0.0.0 4 .*
n 0 0 :: :: 8 :*
User access list empty: user-based access control turned off
Group access list empty: group-based access control turned off
pmcd: PID = , PDU version = 2
pmcd request port(s):
sts fd port family address
=== ==== ===== ====== =======
ok 4 unix /var/run/pmcd.socket
ok 0 44321 inet INADDR_ANY
ok 3 44321 ipv6 INADDR_ANY
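To collect the "Cannot find ... DSO" lines from pmcd.log without reading the whole file by hand, something like the following could be used. It is only a rough sketch: the function name `missing_dsos` is made up for illustration, and the regular expression assumes the exact line format shown in the log above.

```python
import re

# Matches lines such as:
#   Cannot find pmcd DSO at "/var/lib/pcp/pmdas/pmcd/pmda_pmcd.so"
# (format assumed from the pmcd.log excerpt above)
MISSING_DSO = re.compile(r'^Cannot find (\S+) DSO at "([^"]+)"$')


def missing_dsos(log_text):
    """Return (agent, path) pairs for every 'Cannot find ... DSO' line."""
    found = []
    for line in log_text.splitlines():
        match = MISSING_DSO.match(line.strip())
        if match:
            found.append((match.group(1), match.group(2)))
    return found


if __name__ == "__main__":
    with open("/var/log/pmcd/pmcd.log") as log:
        for agent, path in missing_dsos(log.read()):
            print(f"{agent}: {path}")
```

Comparing the reported paths against the contents of the image should show whether the DSOs are simply not packaged or are installed somewhere else.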
pmie.service:
# journalctl -xu pmie
Aug 09 09:54:01 XXXX systemd[1]: Starting Performance Metrics Inference Engine...
░░ Subject: A start job for unit pmie.service has begun execution
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit pmie.service has begun execution.
░░
░░ The job identifier is 101.
Aug 09 09:54:01 XXXX systemd[1]: pmie.service: Failed with result 'protocol'.
pmlogger.service:
# journalctl -xu pmlogger
Aug 09 09:54:01 XXXX systemd[1]: Starting Performance Metrics Archive Logger...
░░ Subject: A start job for unit pmlogger.service has begun execution
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit pmlogger.service has begun execution.
░░
░░ The job identifier is 107.
Aug 09 09:54:01 XXXX pmlogger[598]: /usr/libexec/pcp/lib/pmlogger: Warning: Performance Co-Pilot archive logger(s) not permanently enabled.
Aug 09 09:54:01 XXXX pmlogger[598]: To enable pmlogger, run the following as root:
Aug 09 09:54:01 XXXX pmlogger[598]: # ln -sf ../init.d/pmlogger /etc/rc.d/rc2.d/S94pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]: # ln -sf ../init.d/pmlogger /etc/rc.d/rc2.d/K06pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]: # ln -sf ../init.d/pmlogger /etc/rc.d/rc3.d/S94pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]: # ln -sf ../init.d/pmlogger /etc/rc.d/rc3.d/K06pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]: # ln -sf ../init.d/pmlogger /etc/rc.d/rc4.d/S94pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]: # ln -sf ../init.d/pmlogger /etc/rc.d/rc4.d/K06pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]: # ln -sf ../init.d/pmlogger /etc/rc.d/rc5.d/S94pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]: # ln -sf ../init.d/pmlogger /etc/rc.d/rc5.d/K06pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]: /usr/libexec/pcp/lib/pmlogger:
Aug 09 09:54:01 XXXX pmlogger[598]: Warning: Performance Co-Pilot installation is incomplete (at least the
Aug 09 09:54:01 XXXX pmlogger[598]: script "pmlogger_check" is missing) and the PCP archive logger(s)
Aug 09 09:54:01 XXXX pmlogger[598]: cannot be started.
Aug 09 09:54:01 XXXX systemd[1]: pmlogger.service: Failed with result 'protocol'.
I can see two possible issues here. First, it seems that the PID of pmcd is not written to the /var/run/pmcd.pid file. The PID file is created with the following permissions on pmcd startup:
-r--r--r-- 1 root root 0 Aug 9 10:29 pmcd.pid
but it is empty. The pmcd daemon complains that its PID differs from the one in the (empty) PID file and exits. Additionally, the systemd unit file for pmcd.service points to this file as the service PID file, so systemd may be rejecting the empty PID file too.
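The PID file state described above can be checked with a small script. This is a minimal sketch, not part of pcp: `check_pid_file` and its return values are hypothetical names chosen for illustration.

```python
def check_pid_file(path, expected_pid):
    """Classify a PID file as 'missing', 'empty', 'invalid', 'mismatch' or 'ok'."""
    try:
        with open(path) as pid_file:
            content = pid_file.read().strip()
    except FileNotFoundError:
        return "missing"
    if not content:
        # This is the case seen here: the file exists but has size 0.
        return "empty"
    if not (content.isascii() and content.isdigit()):
        return "invalid"
    return "ok" if int(content) == expected_pid else "mismatch"
```

On the affected system, `check_pid_file("/var/run/pmcd.pid", pid)` would presumably return `"empty"` for any `pid`, which matches the empty parentheses in the "process ID in /var/run/pmcd.pid () is different" warning.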
The second issue could be that the systemd unit files of these services declare:
[Service]
Type=notify
but systemd reports errors such as:
systemd[1]: pmcd.service: Failed to parse MAINPID= field in notification message, ignoring:
or:
systemd[1]: pmlogger.service: Failed with result 'protocol'.
Maybe pcp lacks systemd support, so Type=notify just won't work in that case?
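For what it's worth, the two issues might be connected. With Type=notify, the service sends newline-separated KEY=VALUE pairs such as READY=1 and MAINPID=<pid> to systemd. If the PID were taken from the empty PID file, the message would contain a bare "MAINPID=", which would explain the empty value at the end of the "Failed to parse MAINPID= field" line. The sketch below only models that idea: both function names are hypothetical, the parser is a simplification rather than systemd's actual code, and whether pcp really builds the message this way is an assumption I have not verified.

```python
def build_notify_message(pid_text):
    """Build a notification payload the way a start script might (assumption)."""
    return f"READY=1\nMAINPID={pid_text}"


def parse_mainpid(message):
    """Return the MAINPID value as an int, or None if absent or unparsable.

    Simplified model of the receiving side, not systemd's real parser.
    """
    for line in message.split("\n"):
        key, sep, value = line.partition("=")
        if key == "MAINPID" and sep:
            if value.isascii() and value.isdigit() and int(value) > 0:
                return int(value)
            return None
    return None


if __name__ == "__main__":
    # An empty PID file yields an empty string, hence a bare "MAINPID=".
    print(repr(build_notify_message("")))
    print(parse_mainpid(build_notify_message("")))
    print(parse_mainpid(build_notify_message("691")))
```

If this is what happens, fixing the PID file contents might also make the MAINPID warning go away, although the 'protocol' failures of pmie and pmlogger could still have a separate cause.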