I think this is due to the original process having an exclusive lock on the spool file. Here is the relevant strace output:
[pid 2218] lstat("/var/lib/filebeat/spool.dat", {st_mode=S_IFREG|0600, st_size=101916672, ...}) = 0
[pid 2218] openat(AT_FDCWD, "/var/lib/filebeat/spool.dat", O_RDWR|O_CREAT|O_CLOEXEC, 0600) = 5
[pid 2218] epoll_ctl(4, EPOLL_CTL_ADD, 5, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2808401664, u64=140491188653824}}) = -1 EPERM (Operation not permitted)
[pid 2218] epoll_ctl(4, EPOLL_CTL_DEL, 5, 0xc4204f101c) = -1 EPERM (Operation not permitted)
[pid 2218] flock(5, LOCK_EX <unfinished ...>
[pid 2219] <... pselect6 resumed> ) = 0 (Timeout)
[pid 2219] epoll_pwait(4, [], 128, 0, NULL, 6810877) = 0
[pid 2219] pselect6(0, NULL, NULL, NULL, {0, 10000000}, NULL) = 0 (Timeout)
[pid 2219] epoll_pwait(4, [], 128, 0, NULL, 6810877) = 0
[pid 2219] futex(0xc420096d48, FUTEX_WAKE, 1) = 1
[pid 2230] <... futex resumed> ) = 0
[pid 2219] pselect6(0, NULL, NULL, NULL, {0, 20000}, NULL <unfinished ...>
[pid 2230] futex(0xc420096d48, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 2219] <... pselect6 resumed> ) = 0 (Timeout)
[pid 2219] futex(0x20ee650, FUTEX_WAIT, 0, {60, 0} <unfinished ...>
[pid 2221] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 2221] futex(0x20ee650, FUTEX_WAKE, 1) = 1
[pid 2219] <... futex resumed> ) = 0
[pid 2221] futex(0xc420096d48, FUTEX_WAKE, 1 <unfinished ...>
[pid 2230] <... futex resumed> ) = 0
...
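The `flock(5, LOCK_EX <unfinished ...>` call in the trace above never returns, which is what you would expect if another process already holds a lock on the same file. As a rough way to confirm that without hanging, here is a minimal sketch that probes the file with a non-blocking `flock`. The helper name `spool_lock_is_held` is made up for illustration and this is not how Beats itself checks the lock; it only assumes a Linux host and advisory `flock` locks.

```python
import fcntl
import os
import tempfile


def spool_lock_is_held(path):
    """Return True if some other open file description holds a flock() lock on path.

    Hypothetical helper for illustration only, not part of Beats.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            # LOCK_NB makes flock() fail immediately instead of blocking
            # the way the config test does in the trace.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        # We got the lock, so nobody else had it; release it again.
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Self-contained demo on a temporary file standing in for the spool file.
    with tempfile.NamedTemporaryFile() as tmp:
        holder = os.open(tmp.name, os.O_RDWR)
        fcntl.flock(holder, fcntl.LOCK_EX)   # plays the role of the running beat
        print(spool_lock_is_held(tmp.name))  # lock is held by `holder`
        fcntl.flock(holder, fcntl.LOCK_UN)
        os.close(holder)
        print(spool_lock_is_held(tmp.name))  # lock has been released
```

Running the same probe against /var/lib/filebeat/spool.dat while the beat is up should show whether the lock is the thing the config test is waiting on. Note that the probe needs read/write access to the file, and it also reports True when only a shared lock is held.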
I am using filebeat & metricbeat 6.6.0 on CentOS 7.
If a currently running beat has a spool file configured (for example,
'queue.spool.file' => {}) and a config test is run against a configuration that specifies the same spool file, the configuration test will hang indefinitely.
I was not able to test the version that fixes a similar issue in #9874.