Describe the bug
We were doing stress test on our EFK stack. We deploy a simple openresty web server that writes access logs to stdout, and use vetega tool to send HTTP requests at a constant rate (QPS: 10000 for 180 seconds). We find that when log file is rotated quite fast, it takes a long time (around 2 mins) for in_tail to discover the new log file, and in_tail loses some of the rotated files.
To Reproduce
Docker logging driver config:
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m",
"max-file": "50",
"compress": "true"
},
"storage-driver": "overlay2"
}
# echo "GET http://172.16.151.239" | vegeta attack -rate=10000 -duration=180s | tee results.bin | vegeta report
Requests [total, rate, throughput] 1799999, 9999.95, 9999.94
Duration [total, attack, wait] 3m0s, 3m0s, 324.017µs
Latencies [min, mean, 50, 90, 95, 99, max] 107.842µs, 904.156µs, 306.604µs, 1.96ms, 3.064ms, 8.051ms, 114.653ms
Bytes In [total, mean] 0, 0.00
Bytes Out [total, mean] 0, 0.00
Success [ratio] 100.00%
Status Codes [code:count] 204:1799999
Error Set:
Expected behavior
We expect Kibana to also receive 1799999 logs. However, it only receives 622756 entires.

Your Environment
- Fluentd or td-agent version:
v1.12.0
- Operating system:
Debian GNU/Linux 10 (buster)
- Kernel version:
5.4.0-48-generic
Your Configuration
<source>
@type tail
@log_level debug
@id in_tail_container_logs
path /var/log/containers/*.log
pos_file /var/log/fluentd-containers.log.pos
tag kubernetes.*
exclude_path ["/var/log/containers/*_kube-system_*.log", "/var/log/containers/*_default_*.log", "/var/log/containers/*_rook-ceph_*.log", "/var/log/contain
ers/*fluent*.log"]
read_from_head true
follow_inodes true
<parse>
@type json
time_format "%Y-%m-%dT%H:%M:%S.%NZ"
keep_time_key true
</parse>
</source>
<match kubernetes.** >
@type elasticsearch_data_stream
data_stream_name logs-efk-container
@log_level debug
hosts https://elasticsearch-client.default.svc.cluster.local:9200
user elastic
password xxxxxx
ssl_verify false
reload_connections false
reconnect_on_error true
reload_on_failure true
log_es_400_reason false
request_timeout 1000s
include_timestamp true
<buffer>
flush_thread_count 8
flush_interval 1s
retry_max_interval 5
retry_forever true
</buffer>
</match>
Additional context
Docker logs
# wc -l *.log*
34143 8a703c5ad9672f8f8101d6ad47c4ed61b4b04d69621e84cbd0851ba1ee3d3d5f-json.log
588621 8a703c5ad9672f8f8101d6ad47c4ed61b4b04d69621e84cbd0851ba1ee3d3d5f-json.log.1
588621 8a703c5ad9672f8f8101d6ad47c4ed61b4b04d69621e84cbd0851ba1ee3d3d5f-json.log.2
588620 8a703c5ad9672f8f8101d6ad47c4ed61b4b04d69621e84cbd0851ba1ee3d3d5f-json.log.3
1800005 total
Get the last log at 22:24. It is at line number 588614 of the log file - 8a703c5ad9672f8f8101d6ad47c4ed61b4b04d69621e84cbd0851ba1ee3d3d5f-json.log.3

# grep '2021-06-24T14:24:00.849102284Z' 8a703c5ad9672f8f8101d6ad47c4ed61b4b04d69621e84cbd0851ba1ee3d3d5f-json.log.3 -n
588614:{"log":"172.24.0.214 - - [24/Jun/2021:22:24:00 +0800] \"GET / HTTP/1.1\" 204 0 \"-\" \"Go-http-client/1.1\"\n","stream":"stdout","time":"2021-06-24T14:24:00.849102284Z"}
Get the first log at 22:28. It is at the first line of the log file - 8a703c5ad9672f8f8101d6ad47c4ed61b4b04d69621e84cbd0851ba1ee3d3d5f-json.log

# grep '2021-06-24T14:25:58.573922367Z' 8a703c5ad9672f8f8101d6ad47c4ed61b4b04d69621e84cbd0851ba1ee3d3d5f-json.log -n
1:{"log":"172.24.0.214 - - [24/Jun/2021:22:25:58 +0800] \"GET / HTTP/1.1\" 204 0 \"-\" \"Go-http-client/1.1\"\n","stream":"stdout","time":"2021-06-24T14:25:58.573922367Z"}
It proofs that in_tail fails to check below 2 files when log rotation happening quite fast.
8a703c5ad9672f8f8101d6ad47c4ed61b4b04d69621e84cbd0851ba1ee3d3d5f-json.log.1.gz
8a703c5ad9672f8f8101d6ad47c4ed61b4b04d69621e84cbd0851ba1ee3d3d5f-json.log.2.gz
Describe the bug
We were doing stress test on our EFK stack. We deploy a simple openresty web server that writes access logs to stdout, and use vetega tool to send HTTP requests at a constant rate (QPS: 10000 for 180 seconds). We find that when log file is rotated quite fast, it takes a long time (around 2 mins) for in_tail to discover the new log file, and in_tail loses some of the rotated files.
To Reproduce
Docker logging driver config:
Expected behavior

We expect Kibana to also receive 1799999 logs. However, it only receives 622756 entires.
Your Environment
v1.12.0Debian GNU/Linux 10 (buster)5.4.0-48-genericYour Configuration
Additional context
Docker logs
Get the last log at 22:24. It is at line number 588614 of the log file - 8a703c5ad9672f8f8101d6ad47c4ed61b4b04d69621e84cbd0851ba1ee3d3d5f-json.log.3

Get the first log at 22:28. It is at the first line of the log file - 8a703c5ad9672f8f8101d6ad47c4ed61b4b04d69621e84cbd0851ba1ee3d3d5f-json.log

It proofs that in_tail fails to check below 2 files when log rotation happening quite fast.