in_tail pos file compaction doesn't seem to work as expected #3433

@alex-vmw

Describe the bug
in_tail pos file compaction doesn't seem to work as expected. We expect compaction to remove entries for deleted log files from the pos file, but that is not happening. The logs show that compaction was triggered, yet the pos file still contains entries for deleted log files.

For example, this log line shows that compaction took place:

2021-06-23 21:55:07 +0000 [info]: #0 [in_tail_container_logs] Clean up the pos file

However, the pos file on the node still contains entries for log files that we know no longer exist. For example, we see these lines in the pos file:

/var/log/containers/alex-busybox_default_alex-busybox-0584af93443bad99c0b64bf291190609f5d22557c7be937f36c9af2dde9ce54b.log      000000000001e6c3        0000000018776252
/var/log/containers/alex-busybox_default_alex-busybox-d8180a67753ee1ab99c49d028a1f58676c211f233b74934930aecfeef54e7d0d.log      000000000000134f        000000001012930f
/var/log/containers/alex-busybox_default_alex-busybox-a01184a2cb8064d7c776488188c1d1bbd5bf57c509ea3024da9dbee7fabc7c38.log      00000000000046b0        0000000018605f76
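For readers unfamiliar with the layout: each pos file line holds three whitespace-separated fields, namely the watched path, the read offset, and the file's inode, the last two written as 16-digit hexadecimal numbers. Below is a minimal Python sketch for decoding one line. The `PosEntry` type and the `parse_pos_line` helper are hypothetical names used for illustration; they are not part of Fluentd.

```python
from typing import NamedTuple


class PosEntry(NamedTuple):
    path: str
    offset: int  # read position in bytes, decoded from hex
    inode: int   # inode number of the tracked file, decoded from hex


def parse_pos_line(line: str) -> PosEntry:
    """Split one pos file line into path, offset, and inode."""
    # Split from the right so a path containing spaces stays intact.
    path, offset_hex, inode_hex = line.rstrip("\n").rsplit(None, 2)
    return PosEntry(path, int(offset_hex, 16), int(inode_hex, 16))


# Second entry from the pos file excerpt above.
entry = parse_pos_line(
    "/var/log/containers/alex-busybox_default_alex-busybox-"
    "d8180a67753ee1ab99c49d028a1f58676c211f233b74934930aecfeef54e7d0d.log"
    "\t000000000000134f\t000000001012930f"
)
print(entry.offset)  # 4943
```

So the second entry, for instance, records that 4943 bytes of that file had been read.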

But these log files do NOT exist on the node anymore, as the corresponding pods were deleted long ago:

# ls -l /var/log/containers/alex-busybox_default_alex-busybox-0584af93443bad99c0b64bf291190609f5d22557c7be937f36c9af2dde9ce54b.log
ls: cannot access '/var/log/containers/alex-busybox_default_alex-busybox-0584af93443bad99c0b64bf291190609f5d22557c7be937f36c9af2dde9ce54b.log': No such file or directory
# ls -l var/log/containers/alex-busybox_default_alex-busybox-d8180a67753ee1ab99c49d028a1f58676c211f233b74934930aecfeef54e7d0d.log
ls: cannot access 'var/log/containers/alex-busybox_default_alex-busybox-d8180a67753ee1ab99c49d028a1f58676c211f233b74934930aecfeef54e7d0d.log': No such file or directory
# ls -l /var/log/containers/alex-busybox_default_alex-busybox-a01184a2cb8064d7c776488188c1d1bbd5bf57c509ea3024da9dbee7fabc7c38.log
ls: cannot access '/var/log/containers/alex-busybox_default_alex-busybox-a01184a2cb8064d7c776488188c1d1bbd5bf57c509ea3024da9dbee7fabc7c38.log': No such file or directory
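The manual `ls` checks above could be automated for the whole pos file. The following is only a sketch, assuming the pos file layout is path, offset, inode separated by whitespace; `stale_pos_entries` is a hypothetical helper name, and the pos file path is the one from the configuration in this report.

```python
import os
from typing import Iterable, List


def stale_pos_entries(lines: Iterable[str]) -> List[str]:
    """Return the paths listed in a pos file that no longer exist on disk."""
    stale = []
    for line in lines:
        if len(line.split()) < 3:
            continue  # skip blank or malformed lines
        # The last two fields are offset and inode; the rest is the path.
        path = line.rstrip("\n").rsplit(None, 2)[0]
        # os.path.exists follows symlinks, so a dangling symlink
        # is also reported as stale.
        if not os.path.exists(path):
            stale.append(path)
    return stale


if __name__ == "__main__":
    # pos_file value from the <source> block in this report.
    pos_file = "/var/log/kube-fluentd-operator-fluentd-containers.log.pos"
    if os.path.exists(pos_file):
        with open(pos_file) as f:
            for path in stale_pos_entries(f):
                print(path)
```

Running this on the node after a "Clean up the pos file" log line would list every entry that compaction left behind for a missing file.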

To Reproduce
Tail the logs of Kubernetes containers using in_tail with pos_file and pos_file_compaction_interval configured, while creating and deleting Kubernetes containers.

Expected behavior
The expectation is that compaction will remove entries for deleted log files from the pos file.

Your Environment
Fluentd or td-agent version: fluentd 1.13.0.
Operating system: Ubuntu 20.04.1 LTS
Kernel version: 5.4.0-62-generic

Your Configuration

<source>
  @type tail
  @id in_tail_container_logs
  path /var/log/containers/*.log
  pos_file /var/log/kube-fluentd-operator-fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  read_bytes_limit_per_second 8192
  pos_file_compaction_interval 1h
  <parse>
    @type multiline
    # cri-o
    format1 /^(?<partials>([^\n]+ (stdout|stderr) P [^\n]+\n)*)/
    format2 /(?<time>[^\n]+) (?<stream>stdout|stderr) F (?<log>[^\n]*)/
    # docker
    format3 /|(?<json>{.*})/
    time_format %Y-%m-%dT%H:%M:%S.%N%:z
  </parse>
</source>

Your Error Log
There are no errors in the log, but we do see compaction taking place:

2021-06-23 21:55:07 +0000 [info]: #0 [in_tail_container_logs] Clean up the pos file

Labels: bug