aws-s3 input's bucket polling accumulates state in the registry #39116

@faec

When scanning an S3 bucket, the input saves metadata for each object to the registry (including whether the object has been successfully downloaded). Each object's metadata consumes approximately 1KB of space in the registry.

The code was intended to delete this metadata after each bucket scan, but the deletion was implemented incorrectly (see also #39065), so most S3 object metadata persists indefinitely and is never cleaned up. It keeps accumulating even after objects have been removed from the original bucket or the target bucket has been changed, so the input adds ~1GB to the registry for every million objects it has ever seen, across all time and all buckets. These entries are also held in memory while Filebeat runs, which can significantly increase memory requirements on large buckets.
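
To make the intended behavior concrete, here is a minimal sketch of one possible cleanup policy: after a scan, keep only the entries for objects that the latest listing of the current bucket actually returned. It is not Filebeat's code or the actual fix. `ObjectState`, `prune_states`, and `estimated_registry_bytes` are hypothetical names, the registry is modeled as a plain dict, and the 1KB figure is the approximation quoted above.

```python
from dataclasses import dataclass

# Approximate per-object registry cost, taken from the description above.
APPROX_BYTES_PER_STATE = 1024


@dataclass
class ObjectState:
    """Hypothetical registry entry for one S3 object (not Filebeat's real type)."""
    bucket: str
    key: str
    stored: bool = False  # whether the object was successfully downloaded


def prune_states(registry: dict, bucket: str, listed_keys: set) -> dict:
    """Keep only entries for objects returned by the latest scan of `bucket`.

    Entries for other buckets, and for objects that no longer exist,
    are dropped instead of being persisted forever.
    """
    return {
        state_id: state
        for state_id, state in registry.items()
        if state.bucket == bucket and state.key in listed_keys
    }


def estimated_registry_bytes(num_states: int) -> int:
    """Rough registry size for a given number of stored object states."""
    return num_states * APPROX_BYTES_PER_STATE
```

With this policy the registry size is bounded by the number of objects currently in the bucket rather than by every object ever seen. Using the same approximation, `estimated_registry_bytes(1_000_000)` gives 1,024,000,000 bytes, which matches the ~1GB per million objects figure above. The sketch assumes the input tracks a single bucket at a time, so entries for any other bucket are treated as stale.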
