Skip to content

Fix kuma_sd targetgroup reporting#9157

Merged
roidelapluie merged 3 commits intoprometheus:mainfrom
austince:fix/kuma-sd/unique-sources
Aug 5, 2021
Merged

Fix kuma_sd targetgroup reporting#9157
roidelapluie merged 3 commits intoprometheus:mainfrom
austince:fix/kuma-sd/unique-sources

Conversation

@austince
Copy link
Contributor

@austince austince commented Aug 4, 2021

There are currently a few related issues this PR addresses:

  • xDS targetgroups are currently all given the same source name, so each job can only have max 1 group
  • xDS targetgroups are not tracked between polling, so it is impossible to remove a target
  • The Kuma MADS API's MonitoringAssignment is mapped directly into a targetgroup.Group, but does not contain a unique identifier.

This PR bundles all discovered targets into a single targetgroup.Group by changing the resourceParser func to return a set of targets instead of groups.

I've built an image with these patches from 2.29-rc.0 and pushed it to austince/prometheus:2.29-rc.0-fix2. I've also tested both adding multiple targets in a single job and removing targets. Testing repo here: https://github.com/austince/kuma-sd-playground

Fixes #9155

@austince austince changed the title Fix/kuma sd/unique sources Fix kuma_sd targetgroup reporting Aug 4, 2021
@austince austince force-pushed the fix/kuma-sd/unique-sources branch 2 times, most recently from a8eed7d to 329d98b Compare August 4, 2021 21:47
@austince
Copy link
Contributor Author

austince commented Aug 4, 2021

@roidelapluie, could you have a look at this? 🙏🏼 endless thank-yous 🙏🏼

@roidelapluie roidelapluie changed the base branch from main to release-2.29 August 4, 2021 22:39
@roidelapluie roidelapluie requested a review from codesome as a code owner August 4, 2021 22:39
@roidelapluie
Copy link
Member

Can you rebase this against release-2.29?

@roidelapluie roidelapluie changed the base branch from release-2.29 to main August 4, 2021 22:40
@austince austince force-pushed the fix/kuma-sd/unique-sources branch from 329d98b to 02ec91f Compare August 4, 2021 22:40
@austince austince requested a review from roidelapluie as a code owner August 4, 2021 22:40
@austince austince force-pushed the fix/kuma-sd/unique-sources branch from 02ec91f to 665e9cd Compare August 4, 2021 22:41
@austince austince changed the base branch from main to release-2.29 August 4, 2021 22:42
@austince
Copy link
Contributor Author

austince commented Aug 4, 2021

Wait, which should be the target branch?

@roidelapluie
Copy link
Member

release-2.29, but I have changed back to main for my initial review :) please rebase and change the branch at the same time :)

@austince
Copy link
Contributor Author

austince commented Aug 4, 2021

release-2.29, but I have changed back to main for my initial review :) please rebase and change the branch at the same time :)

ah, okie :) rebase and change back to main incoming...

@austince austince force-pushed the fix/kuma-sd/unique-sources branch from 665e9cd to c27857f Compare August 4, 2021 22:54
@austince austince changed the base branch from release-2.29 to main August 4, 2021 22:54
@austince austince force-pushed the fix/kuma-sd/unique-sources branch from c27857f to d9e0aab Compare August 4, 2021 22:55
@roidelapluie roidelapluie changed the base branch from main to release-2.29 August 5, 2021 07:13
@roidelapluie roidelapluie changed the base branch from release-2.29 to main August 5, 2021 07:13
@roidelapluie
Copy link
Member

Could we just group all the targets in one targetgroup?

@roidelapluie
Copy link
Member

roidelapluie commented Aug 5, 2021

I have discussed with Austin ; it would be better to stop using Labels in TargetGroup, putting all the labels in Targets, therefore having a single TargetGroup with a "kuma" Source. That would simplify the code a lot.

@austince austince force-pushed the fix/kuma-sd/unique-sources branch 2 times, most recently from d4a2ab9 to 2e432e0 Compare August 5, 2021 17:52
@austince
Copy link
Contributor Author

austince commented Aug 5, 2021

I have discussed with Austin ; it would be better to stop using Labels in TargetGroup, putting all the labels in Targets, therefore having a single TargetGroup with a "kuma" Source. That would simplify the code a lot.

Implemented and tested @roidelapluie !

Signed-off-by: austin ce <austin.cawley@gmail.com>
@austince austince force-pushed the fix/kuma-sd/unique-sources branch from 2e432e0 to bd2999a Compare August 5, 2021 19:29
@austince austince requested a review from roidelapluie August 5, 2021 20:01
Signed-off-by: austin ce <austin.cawley@gmail.com>
Signed-off-by: austin ce <austin.cawley@gmail.com>
@austince austince force-pushed the fix/kuma-sd/unique-sources branch from bd2999a to ffbe964 Compare August 5, 2021 20:53
@roidelapluie roidelapluie merged commit 5df3789 into prometheus:main Aug 5, 2021
roidelapluie pushed a commit to roidelapluie/prometheus that referenced this pull request Aug 5, 2021
* Bundle all xDS targets into a single group

Signed-off-by: austin ce <austin.cawley@gmail.com>
@austince austince deleted the fix/kuma-sd/unique-sources branch August 6, 2021 00:12
brancz added a commit that referenced this pull request Aug 6, 2021
Fix `kuma_sd` targetgroup reporting (#9157)
codesome added a commit that referenced this pull request Aug 11, 2021
* Fix `kuma_sd` targetgroup reporting (#9157)

* Bundle all xDS targets into a single group

Signed-off-by: austin ce <austin.cawley@gmail.com>

* Snapshot in-memory chunks on shutdown for faster restarts (#7229)

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

* Rename links

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Remove Individual Data Type Caps in Per-shard Buffering for Remote Write (#8921)

* Moved everything to nPending buffer

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Simplify exemplar capacity addition

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Added pre-allocation

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Don't allocate if not sending exemplars

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Avoid deadlock when processing duplicate series record (#9170)

* Avoid deadlock when processing duplicate series record

`processWALSamples()` needs to be able to send on its output channel
before it can read the input channel, so reads to allow this in case the
output channel is full.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* processWALSamples: update comment

Previous text seems to relate to an earlier implementation.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Optimise WAL loading by removing extra map and caching min-time (#9160)

* BenchmarkLoadWAL: close WAL after use

So that goroutines are stopped and resources released

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* BenchmarkLoadWAL: make series IDs co-prime with #workers

Series are distributed across workers by taking the modulus of the
ID with the number of workers, so multiples of 100 are a poor choice.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* BenchmarkLoadWAL: simulate mmapped chunks

Real Prometheus cuts chunks every 120 samples, then skips those samples
when re-reading the WAL. Simulate this by creating a single mapped chunk
for each series, since the max time is all the reader looks at.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Fix comment

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Remove series map from processWALSamples()

The locks that is commented to reduce contention in are now sharded
32,000 ways, so won't be contended. Removing the map saves memory and
goes just as fast.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* loadWAL: Cache the last mmapped chunk time

So we can skip calling append() for samples it will reject.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Improvements from code review

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Full stops and capitals on comments

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Cache max time in both places mmappedChunks is updated

Including refactor to extract function `setMMappedChunks`, to reduce
code duplication.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Update head min/max time when mmapped chunks added

This ensures we have the correct values if no WAL samples are added for
that series.

Note that `mSeries.maxTime()` was always `math.MinInt64` before, since
that function doesn't consider mmapped chunks.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Split Go and React Tests (#8897)

* Added go-ci and react-ci

Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu>
Signed-off-by: Levi Harrison <git@leviharrison.dev>

* Remove search keymap from new expression editor (#9184)

Signed-off-by: Julius Volz <julius.volz@gmail.com>

Co-authored-by: Austin Cawley-Edwards <austin.cawley@gmail.com>
Co-authored-by: Levi Harrison <git@leviharrison.dev>
Co-authored-by: Julien Pivotto <roidelapluie@inuits.eu>
Co-authored-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
juliusv pushed a commit that referenced this pull request Aug 12, 2021
* PromQL: Fix start and end keywords masking label and metric names

This commit fixes an issue with the "at modifier" that introduced two
new keywords: `start` and `end`. In grouping options and in metric
names, these keywords took precedence over metric or label names, so
that those metrics and labels could no longer be referenced.

Signed-off-by: Clayton Peters <clayton.peters@man.com>

* Add in additional tests for metrics and/or labels called start/end.

Signed-off-by: Clayton Peters <clayton.peters@man.com>

* *: Cut 2.29.0-rc.0

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>

* VERSION: bump to 2.29.0-rc.0

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>

* Remove experimental wording on size-based retention

Followup of #9004

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Fix PR reference in changelog

Signed-off-by: George Brighton <george@gebn.co.uk>

* Describe EC2 availability zone IDs at most once per refresh (#9142)

Signed-off-by: George Brighton <george@gebn.co.uk>

* Describe EC2 availability zones at most once per SD load

Closes #9142.

Signed-off-by: George Brighton <george@gebn.co.uk>

* Incorporate feedback

Signed-off-by: George Brighton <george@gebn.co.uk>

* Integrate feedback

Signed-off-by: George Brighton <george@gebn.co.uk>

* Add a compatibility note for macOS users.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* *: Cut v2.29.0-rc.1

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>

* Fix `kuma_sd` targetgroup reporting (#9157)

* Bundle all xDS targets into a single group

Signed-off-by: austin ce <austin.cawley@gmail.com>

* *: cut v2.29.0-rc.2

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>

* Rename links

Signed-off-by: Levi Harrison <git@leviharrison.dev>

* bump codemirror-promql to 0.17.0

Signed-off-by: Augustin Husson <husson.augustin@gmail.com>

* *: cut v2.29.0

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>

* tsdb: align atomically accessed int64 (#9192)

This prevents a panic in 32-bit archs:
https://pkg.go.dev/sync/atomic#pkg-note-BUG

Fixed #9190

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

* Release 2.29.1 (#9193)

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

Co-authored-by: Clayton Peters <clayton.peters@man.com>
Co-authored-by: Frederic Branczyk <fbranczyk@gmail.com>
Co-authored-by: George Brighton <george@gebn.co.uk>
Co-authored-by: Austin Cawley-Edwards <austin.cawley@gmail.com>
Co-authored-by: Levi Harrison <git@leviharrison.dev>
Co-authored-by: Augustin Husson <husson.augustin@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

kuma_sd does not report all targets

2 participants