feat: aws elastic container service Discovery #15856

Closed
dennis-tra wants to merge 6 commits into prometheus:main from dennis-tra:ecs-discovery


@dennis-tra dennis-tra commented Jan 22, 2025

Hi everyone,

this PR adds service discovery support for AWS Elastic Container Service (ECS). Discussions about adding support go back three years: #9310

The current solution for ECS integration with Prometheus is to deploy a service that exposes targets via an HTTP endpoint and let Prometheus pull from there. The service discovery itself is then outsourced to a third-party component, of which several exist.

Needless to say, it would be nicer to avoid the additional operational complexity and have ECS natively supported by Prometheus. Previously, the integration of ECS service discovery was rejected due to rate limiting concerns:

Some SD mechanisms have rate limits that make them challenging to use. As an example we have unfortunately had to reject Amazon ECS service discovery due to the rate limits being so low that it would not be usable for anything beyond small setups.

For reference, the limits are published here: https://docs.aws.amazon.com/AmazonECS/latest/APIReference/request-throttling.html

This ECS service discovery PR performs the following requests in sequence:

  1. ListClusters (unless manually configured) - Burst 50, refill 20, max results 100
  2. ListTasks (for each cluster) - Burst 100, refill 20, max results 100
  3. DescribeTasks - Counts against the same Cluster resource read actions bucket as the previous ListTasks
  4. DescribeTaskDefinition - Burst 50, refill 20, max results 1.
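To get a feel for what these numbers mean in practice, here is a rough back-of-the-envelope sketch. It assumes the limits behave like a plain token bucket that starts full (burst = capacity, refill = tokens per second) and that no other caller in the account shares the bucket. `secondsToIssue` is a hypothetical helper for illustration, not something from this PR.

```go
package main

import "fmt"

// secondsToIssue estimates how long n back-to-back requests take against a
// token bucket that starts full with `burst` tokens and refills at `refill`
// tokens per second. This is a simplified model: it ignores network latency
// and any other clients drawing from the same bucket.
func secondsToIssue(n, burst, refill int) float64 {
	if n <= burst {
		// Everything fits into the initial burst, so there is no waiting.
		return 0
	}
	// The remaining requests are paced by the refill rate.
	return float64(n-burst) / float64(refill)
}

func main() {
	// 500 distinct, uncached task definitions against
	// DescribeTaskDefinition (burst 50, refill 20, one definition per call).
	fmt.Printf("%.1fs\n", secondsToIssue(500, 50, 20)) // prints "22.5s"
}
```

Under this model, a cold start with a few hundred distinct task definitions costs tens of seconds once; it only becomes a recurring cost if every refresh repeats all of the calls.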

The final step is supposedly the reason why ECS service discovery was thought to only apply to "small setups". However, the response can be cached and reused (which is done in this PR), so repeated lookups of the same task definition cost no further API calls. With that, I would argue ECS service discovery would also apply to medium-sized setups (YMMV of course).
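The caching idea can be sketched roughly as follows. This is not the code from this PR, just a minimal illustration that assumes the cache is keyed by the full task definition ARN, including the revision. A registered revision does not change, so a cached entry should stay valid. `taskDefinition`, `describeFunc`, and `taskDefCache` are hypothetical names, and `describeFunc` stands in for the real DescribeTaskDefinition call.

```go
package main

import (
	"fmt"
	"sync"
)

// taskDefinition is a hypothetical stand-in for the fields discovery needs.
type taskDefinition struct {
	Family string
	Ports  []int
}

// describeFunc stands in for one DescribeTaskDefinition API call.
type describeFunc func(arn string) (taskDefinition, error)

// taskDefCache memoizes lookups by task definition ARN.
type taskDefCache struct {
	mu       sync.Mutex
	entries  map[string]taskDefinition
	describe describeFunc
}

func newTaskDefCache(d describeFunc) *taskDefCache {
	return &taskDefCache{entries: make(map[string]taskDefinition), describe: d}
}

// get returns the cached definition or fetches and stores it on a miss.
// Holding the lock during the fetch serializes API calls, which keeps the
// sketch simple; a real implementation may want something finer-grained.
func (c *taskDefCache) get(arn string) (taskDefinition, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if td, ok := c.entries[arn]; ok {
		return td, nil
	}
	td, err := c.describe(arn)
	if err != nil {
		// Errors are not cached, so the next refresh retries the lookup.
		return taskDefinition{}, err
	}
	c.entries[arn] = td
	return td, nil
}

func main() {
	calls := 0
	cache := newTaskDefCache(func(arn string) (taskDefinition, error) {
		calls++ // count how often the "API" is actually hit
		return taskDefinition{Family: "web", Ports: []int{8080}}, nil
	})
	// Three tasks running the same (made-up) task definition revision.
	for i := 0; i < 3; i++ {
		cache.get("arn:aws:ecs:us-east-1:123456789012:task-definition/web:7")
	}
	fmt.Println("API calls:", calls) // prints "API calls: 1"
}
```

With a cache like this, the number of DescribeTaskDefinition calls per refresh depends on how many new task definition revisions appeared since the last refresh, not on how many tasks are running.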

If you use the ECS EC2 launch type there would be additional requests (ec2.DescribeInstances) which I didn't include in the example above because the rate limits seem high enough to integrate EC2 service discovery into Prometheus.

I intend to use the changes here for our own setup because we're not operating at such a large scale. However, I would obviously prefer to not live on a fork indefinitely. I'm happy to iterate on the PR if it has a realistic chance of landing. I acknowledge that a decision has already been made against adding ECS service discovery but maybe the information above warrants a reconsideration.

Some additional remarks regarding this PR:

alanprot and others added 6 commits December 30, 2024 13:36
Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: alanprot <alanprot@gmail.com>
The configuration already contains the credentials. Overwriting it here again with a potential `nil` credential object would prohibit using the credentials that were loaded via the regular credential chain in LoadDefaultConfig.

nsowen commented Mar 21, 2025

Also interested. Are there any plans yet to (fix and) merge this MR?

@krajorama
Member

Hello from the bug scrub!

@dennis-tra, are you willing to become a maintainer of the AWS ECS service discovery to triage/review issues and PRs when necessary?

We'll also add this PR to the Prometheus Dev Summit backlog to discuss the direction.

@dennis-tra
Author

Hi @krajorama, sorry for the delay.

Considering all my other commitments, I'm not sure I'm able to set the time aside. That being said, I have no clue what the regular time investment would be. Is it a PR / month? Less? More? It really depends.

Generally, I'm open to doing that, but I would have limited time to set aside for this.

Collaborator

matt-gp commented Aug 25, 2025

Hey, is anything happening with this? I was about to implement something similar but this ticks all the boxes. Would be happy to maintain it if this is needed.

Author

dennis-tra commented Aug 25, 2025

Hi @matt-gp, feel free to take over the work that I started here. IIUC this PR was discussed here: https://docs.google.com/document/d/1uurQCi5iVufhYHGlBZ8mJMK_freDFKPG0iYBQqJ9fvA/edit?tab=t.0 (section "Bryan: Add AWS ECS SD")

The outcome was: "Let’s only add new SDs where we have a dedicated maintainer for it."

I would love to commit to maintaining this feature but I just don't have the capacity, so if you want to take the lead, it looks like the chances aren't too bad that this will eventually land.

Let me know if you have a new PR and I'll close this one and link to yours 👍

@bboreham
Member

IIUC this PR was discussed

It was not; it is in the backlog of items to discuss.

Collaborator

matt-gp commented Aug 29, 2025

Created #17105
