Skip to content

feat: Unified AWS Service Discovery#17406

Merged
jan--f merged 1 commit intoprometheus:mainfrom
matt-gp:aws-ecs-service-discovery
Nov 7, 2025
Merged

feat: Unified AWS Service Discovery#17406
jan--f merged 1 commit intoprometheus:mainfrom
matt-gp:aws-ecs-service-discovery

Conversation

@matt-gp
Copy link
Collaborator

@matt-gp matt-gp commented Oct 27, 2025

Hi All,

Going on the feedback that was received for 17105, this PR implements Unified AWS Service Discovery, similar to how the Kubernetes one works. This should make adding more (eg MSK) easier in the future.

Currently this will support ec2, ecs and lightsail. Users using the existing ec2 and lightsail implementations will not be affected so they do not have to migrate if they don't want to.

An example of the config would be something along the lines of:

- job_name: ecs
  aws_sd_configs:
    role: ecs
    region: us-east-1
    ...

Which issue(s) does the PR fix:

Does this PR introduce a user-facing change?

[FEATURE] Add unified AWS service discovery #17046

@matt-gp matt-gp force-pushed the aws-ecs-service-discovery branch from 4c8050d to 7e85ce5 Compare October 27, 2025 18:47
Copy link
Member

@SuperQ SuperQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great!

@matt-gp matt-gp force-pushed the aws-ecs-service-discovery branch 4 times, most recently from c6f5bc3 to 8f0f1d1 Compare October 27, 2025 19:22
@matt-gp matt-gp force-pushed the aws-ecs-service-discovery branch 3 times, most recently from 3b88412 to e3774a1 Compare October 27, 2025 20:13
Copy link
Member

@SuperQ SuperQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic!

@SuperQ
Copy link
Member

SuperQ commented Oct 28, 2025

After a couple of releases we could mark the old ec2_sd_configs and lightsail_sd_configs as deprecated.

@bwplotka
Copy link
Member

bwplotka commented Oct 28, 2025

We just had a few incidents of bugs in AWS (EC2) SD (example1, example2, example3). We have a trouble finding some sponsor for AWS infra we could use for testing on every PR, on top of the trouble to find owner for the AWS related SD components.

This PR adds another AWS piece, which only increases the problem. Should we fix the sustainability and testing issues first?

Reading through #17105 (comment) sounds like you want to help us own this, that would be amazing. Is this still true?

I'm not sure how to do this with governance but we could add non maintainer (for now, hopefully you could officially be maintainer soon!), to the https://github.com/prometheus/prometheus/blob/main/MAINTAINERS.md I guess? WDYT @gouthamve @bboreham @sysadmind

Any ideas around e2e testability? 🤔

@matt-gp
Copy link
Collaborator Author

matt-gp commented Oct 28, 2025

Hi, yes I'd be happy too help maintain this and help own it.

In terms of automated testing it I wondered if we could use something like localstack?

@SuperQ
Copy link
Member

SuperQ commented Oct 28, 2025

@bwplotka No, I think fixing the testing is a separate project. I understand the concern, but I think it's better to move forward rather than block everything on vague refactoring / testing requirements.

That's how we ended up with a multi-year discovery moratorium that did not help the community.

@bwplotka
Copy link
Member

bwplotka commented Oct 28, 2025

The fact we literally cannot ensure SD works as a team is kind of not vague (: And it blocks our releases. Just one e2e test again ms infra, even a best effort one would be amazing.

EDIT: Just as a data point, so far auth was the most fragile path.

But yea, let's iterate!

Signed-off-by: matt-gp <small_minority@hotmail.com>
@matt-gp matt-gp force-pushed the aws-ecs-service-discovery branch from e3774a1 to 1b52ab9 Compare November 6, 2025 22:48
@SuperQ
Copy link
Member

SuperQ commented Nov 7, 2025

Any other review comments?

@jan--f jan--f merged commit 3d41840 into prometheus:main Nov 7, 2025
28 checks passed
@marcus-crane
Copy link

Exciting stuff!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants