Describe the enhancement:
The current S3 input, when used without SQS notifications, calls the ListObjects API to collect all logs/objects from the given S3 bucket. There is no filtering, so users get both old and new logs from the bucket.
It would be nice to have a start_timestamp config parameter that lets users specify a timestamp. Instead of ingesting all logs from the bucket, the input can make the same ListObjects API call, filter the results by start_timestamp, and only store logs whose LastModified is >= start_timestamp.
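The proposed filter can be sketched as follows. This is only an illustration of the comparison, not the input's actual implementation: the `S3Object` type, the `filter_by_start_timestamp` helper, and the object keys are hypothetical, and the listing is a hard-coded stand-in for what ListObjects would return.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class S3Object:
    # Hypothetical stand-in for one entry of a ListObjects "Contents" array.
    key: str
    last_modified: datetime


def filter_by_start_timestamp(objects, start_timestamp):
    """Keep only objects with LastModified >= start_timestamp (hypothetical helper)."""
    return [obj for obj in objects if obj.last_modified >= start_timestamp]


# Timestamp in the same format as the proposed start_timestamp option.
start = datetime.fromisoformat("2024-10-11T00:00:00+00:00")

# Made-up listing: one object before, one exactly at, and one after the cutoff.
listed = [
    S3Object("old.log.gz", datetime.fromisoformat("2024-10-10T23:59:59+00:00")),
    S3Object("boundary.log.gz", datetime.fromisoformat("2024-10-11T00:00:00+00:00")),
    S3Object("new.log.gz", datetime.fromisoformat("2024-10-11T17:21:43+00:00")),
]

kept = [obj.key for obj in filter_by_start_timestamp(listed, start)]
print(kept)
```

Because the comparison is `>=`, an object modified exactly at start_timestamp is kept.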
Describe a specific use case for the enhancement or feature:
With the config below, we should only store logs with LastModified at or after start_timestamp: 2024-10-11T00:00:00+00:00.
filebeat.inputs:
- type: aws-s3
  enabled: true
  bucket_arn: arn:aws:s3:::test-s3-bucket
  start_timestamp: 2024-10-11T00:00:00+00:00
This is what the ListObjects API call returns:
kaiyansheng ~ $ aws s3api list-objects --profile elastic-observability --bucket test-s3-bucket-ks
{
    "Contents": [
        {
            "Key": "AWSLogs/123/",
            "LastModified": "2024-10-11T17:14:41+00:00",
            "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
            "Size": 0,
            "StorageClass": "STANDARD",
            "Owner": {
                "ID": "xxx"
            }
        },
        {
            "Key": "AWSLogs/123/vpcflowlogs/us-east-1/2024/10/11/627286350134_vpcflowlogs_us-east-1_fl-076d15c25200b764f_20241011T1715Z_b12bee6c.log.gz",
            "LastModified": "2024-10-11T17:21:43+00:00",
            "ETag": "\"910555fdc5893bc433a020f5baee904e\"",
            "Size": 1021,
            "StorageClass": "STANDARD",
            "Owner": {
                "ID": "xxx"
            }
        },
        ...
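As a rough check of the expected behavior against this output, the sketch below parses a trimmed copy of the response (only the two entries shown, with just Key and LastModified) and applies the cutoff. The `keys_at_or_after` helper is hypothetical, and the second cutoff value is made up purely to show an object being excluded.

```python
import json
from datetime import datetime

# Trimmed copy of the two entries shown in the CLI output; other fields omitted.
response = json.loads("""
{
    "Contents": [
        {
            "Key": "AWSLogs/123/",
            "LastModified": "2024-10-11T17:14:41+00:00"
        },
        {
            "Key": "AWSLogs/123/vpcflowlogs/us-east-1/2024/10/11/627286350134_vpcflowlogs_us-east-1_fl-076d15c25200b764f_20241011T1715Z_b12bee6c.log.gz",
            "LastModified": "2024-10-11T17:21:43+00:00"
        }
    ]
}
""")


def keys_at_or_after(resp, start_timestamp):
    """Return keys whose LastModified is >= start_timestamp (hypothetical helper)."""
    cutoff = datetime.fromisoformat(start_timestamp)
    return [
        obj["Key"]
        for obj in resp.get("Contents", [])
        if datetime.fromisoformat(obj["LastModified"]) >= cutoff
    ]


# Cutoff from the example config: both listed entries are newer, so both pass.
from_config = keys_at_or_after(response, "2024-10-11T00:00:00+00:00")

# Made-up later cutoff: only the flow log object (modified 17:21:43) passes.
later = keys_at_or_after(response, "2024-10-11T17:20:00+00:00")

print(len(from_config), len(later))
```

With the start_timestamp from the example config, both entries shown here pass the filter, including the zero-size `AWSLogs/123/` prefix key.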