Describe the bug
in_tail pos file compaction doesn't seem to work as expected. The expectation is that compaction removes entries for deleted log files from the pos file, but we are not seeing that happen. The logs show that compaction was triggered, but when we inspect the pos file, the entries for deleted log files are still there.
For example, we see that compaction took place via this log:
2021-06-23 21:55:07 +0000 [info]: #0 [in_tail_container_logs] Clean up the pos file
However, when we look at the pos file on the node, it contains entries for log files that we know no longer exist. For example, we see these lines in the pos file:
/var/log/containers/alex-busybox_default_alex-busybox-0584af93443bad99c0b64bf291190609f5d22557c7be937f36c9af2dde9ce54b.log 000000000001e6c3 0000000018776252
/var/log/containers/alex-busybox_default_alex-busybox-d8180a67753ee1ab99c49d028a1f58676c211f233b74934930aecfeef54e7d0d.log 000000000000134f 000000001012930f
/var/log/containers/alex-busybox_default_alex-busybox-a01184a2cb8064d7c776488188c1d1bbd5bf57c509ea3024da9dbee7fabc7c38.log 00000000000046b0 0000000018605f76
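For anyone reading these entries: each line appears to hold the log path followed by two 16-digit hexadecimal fields, which we take to be the read offset and the inode. The following is a minimal sketch for decoding one line under that assumption; `PosEntry` and `parse_pos_line` are hypothetical helper names, not part of Fluentd.

```python
from typing import NamedTuple


class PosEntry(NamedTuple):
    """One decoded pos file line (hypothetical helper type, not a Fluentd API)."""
    path: str
    offset: int  # assumed: byte offset read so far
    inode: int   # assumed: inode of the tailed file


def parse_pos_line(line: str) -> PosEntry:
    # Split from the right so a path containing whitespace stays intact.
    path, offset_hex, inode_hex = line.rstrip("\n").rsplit(None, 2)
    return PosEntry(path, int(offset_hex, 16), int(inode_hex, 16))


# First entry from the pos file shown above.
sample = (
    "/var/log/containers/alex-busybox_default_alex-busybox-"
    "0584af93443bad99c0b64bf291190609f5d22557c7be937f36c9af2dde9ce54b.log "
    "000000000001e6c3 0000000018776252"
)
entry = parse_pos_line(sample)
print(entry.offset, entry.inode)
```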
But these logs do NOT exist on the node anymore, because the corresponding pods were deleted long ago:
# ls -l /var/log/containers/alex-busybox_default_alex-busybox-0584af93443bad99c0b64bf291190609f5d22557c7be937f36c9af2dde9ce54b.log
ls: cannot access '/var/log/containers/alex-busybox_default_alex-busybox-0584af93443bad99c0b64bf291190609f5d22557c7be937f36c9af2dde9ce54b.log': No such file or directory
# ls -l var/log/containers/alex-busybox_default_alex-busybox-d8180a67753ee1ab99c49d028a1f58676c211f233b74934930aecfeef54e7d0d.log
ls: cannot access 'var/log/containers/alex-busybox_default_alex-busybox-d8180a67753ee1ab99c49d028a1f58676c211f233b74934930aecfeef54e7d0d.log': No such file or directory
# ls -l /var/log/containers/alex-busybox_default_alex-busybox-a01184a2cb8064d7c776488188c1d1bbd5bf57c509ea3024da9dbee7fabc7c38.log
ls: cannot access '/var/log/containers/alex-busybox_default_alex-busybox-a01184a2cb8064d7c776488188c1d1bbd5bf57c509ea3024da9dbee7fabc7c38.log': No such file or directory
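Checking entries one by one with `ls` gets tedious on a busy node. The same check can be sketched as a small script that lists every pos file entry whose path is missing on disk. It assumes each line ends with two whitespace-separated hex fields after the path; `stale_entries` is a hypothetical helper name, and the `exists` parameter is only there so the check can be exercised without touching the filesystem.

```python
import os
from typing import Callable, Iterable, List


def stale_entries(
    pos_lines: Iterable[str],
    exists: Callable[[str], bool] = os.path.exists,
) -> List[str]:
    """Return the paths recorded in pos file lines that are missing on disk."""
    stale = []
    for line in pos_lines:
        parts = line.rstrip("\n").rsplit(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        path = parts[0]
        if not exists(path):
            stale.append(path)
    return stale


# In-memory demo using one entry from this report.
demo = [
    "/var/log/containers/alex-busybox_default_alex-busybox-"
    "d8180a67753ee1ab99c49d028a1f58676c211f233b74934930aecfeef54e7d0d.log "
    "000000000000134f 000000001012930f\n"
]
for path in stale_entries(demo):
    print("stale:", path)
```

To run it against the real file on a node, pass the open file object, for example `stale_entries(open("/var/log/kube-fluentd-operator-fluentd-containers.log.pos"))`.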
To Reproduce
Tail Kubernetes container logs with in_tail, with a pos file and pos file compaction configured, while creating and deleting Kubernetes containers.
Expected behavior
The expectation is that compaction will remove entries for deleted log files from the pos file.
Your Environment
Fluentd or td-agent version: fluentd 1.13.0.
Operating system: Ubuntu 20.04.1 LTS
Kernel version: 5.4.0-62-generic
Your Configuration
<source>
  @type tail
  @id in_tail_container_logs
  path /var/log/containers/*.log
  pos_file /var/log/kube-fluentd-operator-fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  read_bytes_limit_per_second 8192
  pos_file_compaction_interval 1h
  <parse>
    @type multiline
    # cri-o
    format1 /^(?<partials>([^\n]+ (stdout|stderr) P [^\n]+\n)*)/
    format2 /(?<time>[^\n]+) (?<stream>stdout|stderr) F (?<log>[^\n]*)/
    # docker
    format3 /|(?<json>{.*})/
    time_format %Y-%m-%dT%H:%M:%S.%N%:z
  </parse>
</source>
Your Error Log
There are no errors in the log, but we do see compaction taking place:
2021-06-23 21:55:07 +0000 [info]: #0 [in_tail_container_logs] Clean up the pos file