Describe the bug
Continue on #3434. I followed @ashie suggestion and did the stress test again on our EFK stack with read_bytes_limit_per_second parameter and Fluentd version v1.13.2. However, I found that logs are duplicated in Elasticsearch.
To Reproduce
Container Runtime Version: containerd://1.4
Kubelet logging configuration: --container-log-max-files=50 --container-log-max-size=100Mi
# echo "GET http://172.16.5.154" | vegeta attack -rate=10000 -duration=120s | tee results.bin | vegeta report
Requests [total, rate, throughput] 1200000, 10000.01, 9999.99
Duration [total, attack, wait] 2m0s, 2m0s, 289.975µs
Latencies [min, mean, 50, 90, 95, 99, max] 119.123µs, 488.862µs, 280.767µs, 913.659µs, 1.335ms, 4.028ms, 61.799ms
Bytes In [total, mean] 0, 0.00
Bytes Out [total, mean] 0, 0.00
Success [ratio] 100.00%
Status Codes [code:count] 204:1200000
Error Set:
Expected behavior
Expect Kibana to also receive 1,200,000 logs. However, it receives 1,622,486 entires.

Your Environment
- Fluentd version: 1.13.2
- Operating system: Debian GNU/Linux 10 (buster)
- Kernel version: 5.4.0-72-generic
Your Configuration
Fluentd Chart values.yaml:
dashboards:
enabled: "false"
image:
tag: v1.13.2-debian-elasticsearch7-1.0
resources:
limits:
cpu: 400m
memory: 400Mi
requests:
cpu: 200m
memory: 200Mi
plugins:
- fluent-plugin-elasticsearch
nodeSelector:
node-role.kubernetes.io/efktest: "true"
service:
type: "ClusterIP"
annotations: {}
ports:
- name: "aggregator"
protocol: TCP
containerPort: 24224
volumes:
- name: varlogcontainers
hostPath:
path: /var/log/containers
- name: varlogpods
hostPath:
path: /var/log/pods
- name: fluentpos
hostPath:
path: /var/log/fluent/
- name: etcfluentd-main
configMap:
name: fluentd-main
defaultMode: 0777
- name: etcfluentd-config
configMap:
name: fluentd-config
defaultMode: 0777
volumeMounts:
- name: varlogcontainers
mountPath: /var/log/containers
readOnly: true
- name: varlogpods
mountPath: /var/log/pods
readOnly: true
- name: fluentpos
mountPath: /var/log/fluent/
- name: etcfluentd-main
mountPath: /etc/fluent
- name: etcfluentd-config
mountPath: /etc/fluent/config.d/
## Fluentd configurations:
fileConfigs:
00_global.conf: |-
<system>
log_level debug
</system>
01_sources.conf: |-
<source>
@type tail
@id in_tail_container_logs
path /var/log/containers/*.log
pos_file /var/log/fluent/fluentd-containers.log.pos
tag kubernetes.*
exclude_path ["/var/log/containers/*_kube-system_*.log", "/var/log/containers/*_default_*.log", "/var/log/containers/*fluent*.log"]
read_from_head true
follow_inodes true
refresh_interval 1
read_bytes_limit_per_second 100000
<parse>
@type none
message_key message
</parse>
</source>
# Send the logs to the elasticsearch; one config for one namesapce
<match kubernetes.** >
@type elasticsearch_data_stream
data_stream_name logs-efk-container
hosts https://elasticsearch-client.default.svc.cluster.local:9200
user elastic
password xxxxxx
ssl_verify false
reload_connections false
reconnect_on_error true
reload_on_failure true
log_es_400_reason false
request_timeout 1000s
include_timestamp true
<buffer>
@type memory
flush_interval 3s
retry_max_times 10
</buffer>
</match>
02_filters.conf: |-
03_dispatch.conf: |-
04_outputs.conf: |-
Your Error Log
Additional context
When the log is rotated, 2 entries added to the pos file.
# ls -lrth
\total 167M
-rw-r----- 1 root root 103M Jul 21 16:47 0.log.20210721-164752
-rw-r----- 1 root root 47M Jul 21 16:48 0.log
Pos file:
# cat /var/log/fluent/fluentd-containers.log.pos
/var/log/containers/openresty-efk-stress-test-558464d485-wg7l5_efktest_openresty-7d2236f476b49cf828caec125558eb0724532481067814e85cf2c50b85754b3f.log 00000000009c0246 000000006a4316ab
/var/log/containers/openresty-efk-stress-test-558464d485-wg7l5_efktest_openresty-7d2236f476b49cf828caec125558eb0724532481067814e85cf2c50b85754b3f.log 00000000005fdfd8 000000006a4316ac
/var/log/containers/openresty-efk-stress-test-558464d485-wg7l5_efktest_openresty-7d2236f476b49cf828caec125558eb0724532481067814e85cf2c50b85754b3f.log 00000000005e3ffe 000000006a4316ac
One example of the duplicated log:
# grep 2021-07-21T16:47:52.958955667+08:00 *
0.log:2021-07-21T16:47:52.958955667+08:00 stdout F 172.24.0.69 - - [21/Jul/2021:16:47:52 +0800] "GET / HTTP/1.1" 204 0 "-" "Go-http-client/1.1"
The above log is found duplicated in Kibana. However, it only appears once in 0.log file.

Describe the bug
Continue on #3434. I followed @ashie suggestion and did the stress test again on our EFK stack with read_bytes_limit_per_second parameter and Fluentd version
v1.13.2. However, I found that logs are duplicated in Elasticsearch.To Reproduce
Container Runtime Version:
containerd://1.4Kubelet logging configuration:
--container-log-max-files=50 --container-log-max-size=100MiExpected behavior
Expect Kibana to also receive 1,200,000 logs. However, it receives 1,622,486 entires.

Your Environment
Your Configuration
Fluentd Chart
values.yaml:Your Error Log
Additional context
When the log is rotated, 2 entries added to the pos file.
Pos file:
One example of the duplicated log:
The above log is found duplicated in Kibana. However, it only appears once in 0.log file.
