AWS SD: Optimise MSK Role #18041

Merged
krajorama merged 1 commit into prometheus:main from matt-gp:aws-sd-msk-optimisations
Feb 10, 2026

Conversation

Collaborator

@matt-gp matt-gp commented Feb 8, 2026

This change makes some optimisations to the MSK role.

These optimisations include:

  • Moving the external call that gets the nodes outside the main goroutine so we get all the nodes up front, instead of when we loop through the clusters.
  • Using error groups in the listNodes method (previously wait groups), to mitigate the risk of hitting rate limits when someone has a lot of clusters.
  • Using error groups in the describeClusters method (previously wait groups), for the same reason.

Which issue(s) does the PR fix:

Does this PR introduce a user-facing change?

NONE

@matt-gp matt-gp force-pushed the aws-sd-msk-optimisations branch from 14d7504 to 8c1b4b0 February 8, 2026 16:51
@matt-gp matt-gp marked this pull request as ready for review February 8, 2026 18:50
@matt-gp matt-gp requested review from a team and sysadmind as code owners February 8, 2026 18:50
Collaborator Author

matt-gp commented Feb 8, 2026

@SuperQ @sysadmind would you be able to review when you have a minute?

@matt-gp matt-gp changed the title from "AWS SD: Optmise MSK Role" to "AWS SD: Optimise MSK Role" Feb 8, 2026
Signed-off-by: matt-gp <small_minority@hotmail.com>
@matt-gp matt-gp force-pushed the aws-sd-msk-optimisations branch from 8c1b4b0 to 96a87a0 February 8, 2026 19:42
Member

SuperQ commented Feb 8, 2026

Do we really need that much request concurrency here? That's slightly concerning on its own.

Collaborator Author

matt-gp commented Feb 8, 2026

Happy to lower it if need be. I've been trying to find what the actual limits are but haven't found them anywhere.

For reference, ECS allows up to 20 on a lot of its API.

Should we lower it to something like 3, or go even lower?

Member

SuperQ commented Feb 8, 2026

Ah, yea, I hadn't noticed that ECS also did this. I guess I'm just surprised by the need for it.

Member

@SuperQ SuperQ left a comment

LGTM

@krajorama krajorama merged commit 246341b into prometheus:main Feb 10, 2026
53 of 54 checks passed
@matt-gp matt-gp deleted the aws-sd-msk-optimisations branch February 11, 2026 09:09
3 participants