[AWS-S3] Add a timestamp filter for s3 polling mode #41232

@kaiyan-sheng

Describe the enhancement:

The current S3 input, when used without SQS notifications, calls the ListObjects API to collect all logs/objects from the given S3 bucket. There is no filtering, so users get both old and new logs from the bucket.

It would be nice to have a start_timestamp config parameter that lets users specify a cutoff timestamp. Instead of ingesting all logs from the bucket, we can make the same ListObjects API call, filter the results by start_timestamp, and only store logs whose LastModified >= start_timestamp.
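To make the proposed semantics concrete, here is a minimal Python sketch of the filter, not the actual Filebeat implementation. The helper names `parse_timestamp` and `filter_objects` and the sample keys are hypothetical, and treating timestamps without an offset as UTC is an assumption. The comparison is inclusive, matching LastModified >= start_timestamp.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 string into a timezone-aware datetime."""
    # Older Python versions do not accept a trailing "Z", so normalize it.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: a timestamp without an offset is interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def filter_objects(contents, start_timestamp: str):
    """Keep listing entries whose LastModified is at or after start_timestamp."""
    start = parse_timestamp(start_timestamp)
    return [
        obj for obj in contents
        if parse_timestamp(obj["LastModified"]) >= start
    ]


if __name__ == "__main__":
    # Hypothetical listing entries, shaped like the "Contents" array
    # returned by `aws s3api list-objects`.
    listing = [
        {"Key": "old.log.gz", "LastModified": "2024-10-10T23:59:59+00:00"},
        {"Key": "new.log.gz", "LastModified": "2024-10-11T17:21:43+00:00"},
    ]
    for obj in filter_objects(listing, "2024-10-11T00:00:00+00:00"):
        print(obj["Key"])
```

Comparing timezone-aware datetimes rather than raw strings keeps the filter correct when start_timestamp and LastModified use different UTC offsets.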

Describe a specific use case for the enhancement or feature:
With the config below, we should only store logs with LastModified at or after start_timestamp: 2024-10-11T00:00:00+00:00.

filebeat.inputs:
- type: aws-s3
  enabled: true
  bucket_arn: arn:aws:s3:::test-s3-bucket
  start_timestamp: 2024-10-11T00:00:00+00:00

This is what the ListObjects API call returns:

kaiyansheng ~  $ aws s3api list-objects --profile elastic-observability --bucket test-s3-bucket-ks
{
    "Contents": [
        {
            "Key": "AWSLogs/123/",
            "LastModified": "2024-10-11T17:14:41+00:00",
            "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
            "Size": 0,
            "StorageClass": "STANDARD",
            "Owner": {
                "ID": "xxx"
            }
        },
        {
            "Key": "AWSLogs/123/vpcflowlogs/us-east-1/2024/10/11/627286350134_vpcflowlogs_us-east-1_fl-076d15c25200b764f_20241011T1715Z_b12bee6c.log.gz",
            "LastModified": "2024-10-11T17:21:43+00:00",
            "ETag": "\"910555fdc5893bc433a020f5baee904e\"",
            "Size": 1021,
            "StorageClass": "STANDARD",
            "Owner": {
                "ID": "xxx"
            }
        },
...
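Since the listing already includes LastModified for every object, the filter can be applied client-side while paging through the results. As far as I know, the list API has no server-side filter on LastModified, so every page still has to be fetched. Below is a rough Python sketch assuming a boto3-style S3 client. The function name `list_objects_since` is hypothetical, and it uses the `list_objects_v2` paginator rather than the original ListObjects call.

```python
from datetime import datetime
from typing import Iterator, Tuple


def list_objects_since(
    s3_client, bucket: str, start_timestamp: datetime
) -> Iterator[Tuple[str, datetime]]:
    """Yield (key, last_modified) for objects modified at or after start_timestamp.

    s3_client is expected to behave like a boto3 S3 client; start_timestamp
    must be timezone-aware so it can be compared with LastModified.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is absent when a page holds no objects.
        for obj in page.get("Contents", []):
            # boto3 returns LastModified as a timezone-aware datetime.
            if obj["LastModified"] >= start_timestamp:
                yield obj["Key"], obj["LastModified"]
```

With a real client this could be called as `list_objects_since(boto3.client("s3"), "test-s3-bucket", datetime(2024, 10, 11, tzinfo=timezone.utc))`. Note that the filter only reduces what gets stored, not the number of list requests made against the bucket.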
