[AWS] improve S3 input states copy by only storing filtered entries #41869

Merged
Kavindu-Dodan merged 2 commits into elastic:main from Kavindu-Dodan:chore/improve-s3-store-when-using-prefix
Dec 3, 2024

@Kavindu-Dodan Kavindu-Dodan commented Dec 3, 2024

Proposed commit message

Improves the S3 polling mode state registry copy by taking bucket_list_prefix into account. Before this change, the input stored all registry entries loaded from the underlying registry (for example, when storing, restarting Beats, or upgrading while pointing to the same registry). With this improvement, when bucket_list_prefix is used, the state registry copy only holds entries matching the given prefix.

This improvement benefits the state registry clean-up planned in #41694. In addition, when Beats restarts, this change reduces the memory used by the input-specific state copy, because only the entries matching the configured prefix are stored.
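
The filtering idea described above can be illustrated with a minimal Go sketch. This is not the code from this PR: the `objectState` type, the `filterStates` function, and the map-based registry are hypothetical stand-ins, assuming only that each persisted entry carries the S3 object key and that an empty prefix means "no filtering".

```go
package main

import "strings"

// objectState is a hypothetical, simplified stand-in for one persisted
// S3 object state. It is not the type used in Beats.
type objectState struct {
	Bucket string
	Key    string // S3 object key, e.g. "logs/app/1.json"
	Stored bool
}

// filterStates returns the registry entries whose object key starts with
// listPrefix. An empty listPrefix keeps every entry, because every string
// has the empty string as a prefix.
func filterStates(registry map[string]objectState, listPrefix string) map[string]objectState {
	out := make(map[string]objectState)
	for id, st := range registry {
		if strings.HasPrefix(st.Key, listPrefix) {
			out[id] = st
		}
	}
	return out
}
```

With a registry holding keys under `logs/`, under `metrics/`, and at the bucket root, `filterStates(registry, "logs/")` keeps only the `logs/` entries, so the in-memory copy grows with the number of matching entries rather than with the size of the whole registry.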

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

None - this change only concerns the input-specific copy of the registry entries

How to test this PR locally

Requires a Filebeat build from this branch and an S3 bucket containing entries with prefixes.

  • Generate a mix of S3 bucket entries, some with prefixes and some without. You may use the data-gen tool (see footnote 1)
  • Build Filebeat from this branch and configure multiple inputs with prefixes
  • Observe how the state registry copy is filled at startup. You may first run without a prefix to store all entries, and later restart Beats with a prefix to observe the loading behavior
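
As a rough starting point for the second step, a configuration with two prefixed inputs might look like the fragment below. The bucket ARN and the prefix values are placeholders chosen for illustration; only the option names come from the aws-s3 input.

```yaml
filebeat.inputs:
  # Input 1: lists only objects under the placeholder prefix "logs/"
  - type: aws-s3
    bucket_arn: arn:aws:s3:::example-bucket   # placeholder bucket
    bucket_list_prefix: logs/
    bucket_list_interval: 60s
    number_of_workers: 2

  # Input 2: same placeholder bucket, different prefix
  - type: aws-s3
    bucket_arn: arn:aws:s3:::example-bucket
    bucket_list_prefix: metrics/
    bucket_list_interval: 60s
    number_of_workers: 2
```

Removing `bucket_list_prefix` from one input for the first run, then restoring it before a restart, reproduces the "store everything, then reload with a prefix" scenario from the last step.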

Related issues

#39116

Footnotes

  1. https://github.com/Kavindu-Dodan/data-gen

@Kavindu-Dodan Kavindu-Dodan requested a review from a team as a code owner December 3, 2024 17:26
@botelastic botelastic bot added the needs_team label (Indicates that the issue/PR needs a Team:* label) Dec 3, 2024
@Kavindu-Dodan Kavindu-Dodan added the Team:obs-ds-hosted-services label (Label for the Observability Hosted Services team) Dec 3, 2024
@elasticmachine

Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)

@botelastic botelastic bot removed the needs_team label (Indicates that the issue/PR needs a Team:* label) Dec 3, 2024
@Kavindu-Dodan Kavindu-Dodan added the enhancement, needs_team (Indicates that the issue/PR needs a Team:* label), and backport-8.x (Automated backport to the 8.x branch with mergify) labels and removed the needs_team label Dec 3, 2024
@Kavindu-Dodan Kavindu-Dodan changed the title [AWS] improve S3 registry states copy by only storing filtered entries [AWS] improve S3 input states copy by only storing filtered entries Dec 3, 2024
@Kavindu-Dodan Kavindu-Dodan force-pushed the chore/improve-s3-store-when-using-prefix branch from 20a8edf to e6296f6 December 3, 2024 18:41
Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>
Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>
@Kavindu-Dodan Kavindu-Dodan force-pushed the chore/improve-s3-store-when-using-prefix branch from e6296f6 to f128cf0 December 3, 2024 18:47

@kaiyan-sheng kaiyan-sheng left a comment

The change looks good. Just one question: if we have two S3 inputs running, one with prefix A and one without any prefix, will the state be missing entries?

@Kavindu-Dodan

The change looks good. Just one question: if we have two S3 inputs running, one with prefix A and one without any prefix, will the state be missing entries?

No, there won't be missing entries. The prefixed input's store will hold and handle only the entries matching its prefix. The non-prefixed input will maintain all entries, including prefixed ones. This holds even when restarting or upgrading while pointing to the same registry.
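
The behavior described in this answer can be sketched as follows. This is an illustrative model under stated assumptions, not the Beats implementation: `inputView`, `newInputView`, and `tracks` are hypothetical names, and the shared registry is reduced to a plain slice of object keys.

```go
package main

import "strings"

// inputView models the in-memory state copy held by one hypothetical
// S3 input. Each input builds its own copy from the shared registry.
type inputView struct {
	prefix string
	keys   map[string]struct{}
}

// newInputView copies into the view only those registry keys that start
// with prefix. With an empty prefix, every key is copied.
func newInputView(prefix string, registryKeys []string) *inputView {
	v := &inputView{prefix: prefix, keys: make(map[string]struct{})}
	for _, k := range registryKeys {
		if strings.HasPrefix(k, prefix) {
			v.keys[k] = struct{}{}
		}
	}
	return v
}

// tracks reports whether key is present in this input's copy.
func (v *inputView) tracks(key string) bool {
	_, ok := v.keys[key]
	return ok
}
```

Given shared keys `A/1.log`, `A/2.log`, `B/1.log`, and `top.log`, a view built with prefix `A/` tracks only the two `A/` keys, while a view built with an empty prefix tracks all four. Because each view is a filtered copy and the shared registry itself is untouched, neither input loses entries it is responsible for.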

@Kavindu-Dodan Kavindu-Dodan merged commit 91070bf into elastic:main Dec 3, 2024
mergify bot pushed a commit that referenced this pull request Dec 3, 2024
…41869)

* s3 state imporvement with prefix filtering

Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>

* add changelog entry

Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>

---------

Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>
(cherry picked from commit 91070bf)
Kavindu-Dodan added a commit that referenced this pull request Dec 4, 2024
…41869) (#41883)

* s3 state imporvement with prefix filtering

Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>

* add changelog entry

Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>

---------

Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>
(cherry picked from commit 91070bf)

Co-authored-by: Kavindu Dodanduwa <Kavindu-Dodan@users.noreply.github.com>
@Kavindu-Dodan Kavindu-Dodan added the backport-8.16 Automated backport with mergify label Dec 5, 2024
mergify bot pushed a commit that referenced this pull request Dec 5, 2024
@Kavindu-Dodan Kavindu-Dodan added the backport-8.17 Automated backport with mergify label Dec 5, 2024
mergify bot pushed a commit that referenced this pull request Dec 5, 2024
Kavindu-Dodan added a commit that referenced this pull request Dec 6, 2024
…oring filtered entries (#41922)

* [AWS] improve S3 input states copy by only storing filtered entries (#41869)

* s3 state imporvement with prefix filtering

Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>

* add changelog entry

Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>

---------

Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>
(cherry picked from commit 91070bf)

* Update CHANGELOG.next.asciidoc

---------

Co-authored-by: Kavindu Dodanduwa <Kavindu-Dodan@users.noreply.github.com>
Kavindu-Dodan added a commit that referenced this pull request Dec 6, 2024
…oring filtered entries (#41921)

* [AWS] improve S3 input states copy by only storing filtered entries (#41869)

* s3 state imporvement with prefix filtering

Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>

* add changelog entry

Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>

---------

Signed-off-by: Kavindu Dodanduwa <kavindu.dodanduwa@elastic.co>
(cherry picked from commit 91070bf)

* Update CHANGELOG.next.asciidoc

---------

Co-authored-by: Kavindu Dodanduwa <Kavindu-Dodan@users.noreply.github.com>
Labels

  • backport-8.x: Automated backport to the 8.x branch with mergify
  • backport-8.16: Automated backport with mergify
  • backport-8.17: Automated backport with mergify
  • enhancement
  • Team:obs-ds-hosted-services: Label for the Observability Hosted Services team
