Cut v2.39.0-rc.0 #11344

Merged
codesome merged 1 commit into prometheus:release-2.39 from codesome:v2.39-rc.0
Sep 29, 2022

@codesome
Member

@codesome codesome commented Sep 23, 2022

NOTE: this PR is currently open against the main branch because I am still waiting for a few PRs to be merged. Once they are merged, I will create a release-2.39 branch and point this PR at it.

I am still waiting for the following things to be done

@codesome
Member Author

Making it ready for review. The CHANGELOG assumes that the PRs mentioned in the PR description have been merged (I will merge this PR after those are merged).

@beorn7
Member

beorn7 commented Sep 27, 2022

What about #11317? It fixes a quite serious bug. We only need to review…

@codesome
Member Author

Added, thanks

@codesome
Member Author

@rfratto does #9876 require a CHANGELOG entry here?

@rfratto
Contributor

rfratto commented Sep 27, 2022

> @rfratto does #9876 require a CHANGELOG entry here?

@codesome I'd say so, since it's a bug and was causing issues with growing WAL sizes. Is there anything you need from me to generate a CHANGELOG entry?

@codesome
Member Author

[BUGFIX] Agent: Fix validation of flag options and prevent WAL from growing more than desired. #9876

@rfratto could you confirm if the above entry looks fine? (from what I can see it is not just the default options as the PR title says)

@rfratto
Contributor

rfratto commented Sep 27, 2022

@codesome Looks good!

Member

@Nexucis Nexucis left a comment

nice. Thank you @codesome

As a side note, it would be interesting to sort the changelog entries by type, keeping entries for the same component together. For example, instead of

* [FEATURE] **experimental** TSDB: Add support for ingesting out-of-order samples. This is configured via `out_of_order_time_window` field in the config file; check config file docs for more info. #11075
* [ENHANCEMENT] API: `/-/healthy` and `/-/ready` API calls now also respond to a `HEAD` request on top of existing `GET` support. #11160 
* [ENHANCEMENT] PuppetDB SD: Add `__meta_puppetdb_query` label. #11238
* [ENHANCEMENT] AWS Lightsail SD: Add `__meta_lightsail_region` label. #11326
* [ENHANCEMENT] AWS EC2 SD: Add `__meta_ec2_region` label. #11326
* [ENHANCEMENT] TSDB: Improve WAL replay timings. #10973 #11307 #11319
* [ENHANCEMENT] Scrape: Optimise relabeling by re-using memory. #11147
* [ENHANCEMENT] TSDB: Optimise memory by not storing unnecessary data in the memory. #11280 #11288 #11296
* [ENHANCEMENT] UI: Click to copy label-value pair from query result to clipboard. #11229
* [ENHANCEMENT] TSDB: Allow overlapping blocks by default. `--storage.tsdb.allow-overlapping-blocks` now has no effect. #11331
* [BUGFIX] TSDB: Turn off isolation for Head compaction to fix a memory leak. #11317
* [BUGFIX] PromQL: Properly close file descriptor when logging unfinished queries. #11148
* [BUGFIX] TSDB: Fix 'invalid magic number 0' error on Prometheus startup. #11338
* [BUGFIX] Agent: Fix validation of flag options and prevent WAL from growing more than desired. #9876

we would have

* [FEATURE] **experimental** TSDB: Add support for ingesting out-of-order samples. This is configured via `out_of_order_time_window` field in the config file; check config file docs for more info. #11075
* [ENHANCEMENT] API: `/-/healthy` and `/-/ready` API calls now also respond to a `HEAD` request on top of existing `GET` support. #11160 
* [ENHANCEMENT] PuppetDB SD: Add `__meta_puppetdb_query` label. #11238
* [ENHANCEMENT] AWS Lightsail SD: Add `__meta_lightsail_region` label. #11326
* [ENHANCEMENT] AWS EC2 SD: Add `__meta_ec2_region` label. #11326
* [ENHANCEMENT] Scrape: Optimise relabeling by re-using memory. #11147
* [ENHANCEMENT] TSDB: Improve WAL replay timings. #10973 #11307 #11319
* [ENHANCEMENT] TSDB: Optimise memory by not storing unnecessary data in the memory. #11280 #11288 #11296
* [ENHANCEMENT] TSDB: Allow overlapping blocks by default. `--storage.tsdb.allow-overlapping-blocks` now has no effect. #11331
* [ENHANCEMENT] UI: Click to copy label-value pair from query result to clipboard. #11229
* [BUGFIX] TSDB: Turn off isolation for Head compaction to fix a memory leak. #11317
* [BUGFIX] TSDB: Fix 'invalid magic number 0' error on Prometheus startup. #11338
* [BUGFIX] PromQL: Properly close file descriptor when logging unfinished queries. #11148
* [BUGFIX] Agent: Fix validation of flag options and prevent WAL from growing more than desired. #9876
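
The regrouping suggested above could be automated with something like the following. This is only a minimal sketch, not part of the Prometheus release tooling: the function name `group_entries`, the regular expression, and the `TYPE_ORDER` list are assumptions made for illustration. It orders components by first appearance within each type, so its output will not necessarily match the hand-ordered list above line for line.

```python
import re

# Assumed ordering of entry types, taken from the excerpts above.
TYPE_ORDER = ["FEATURE", "ENHANCEMENT", "BUGFIX"]

# Matches "* [TYPE] Component: ..." with an optional "**marker** " before the component.
ENTRY_RE = re.compile(
    r"^\* \[(?P<type>[A-Z]+)\] (?:\*\*\w+\*\* )?(?P<component>[^:]+):"
)


def group_entries(lines):
    """Group changelog lines by type, then by component (first-appearance order)."""
    groups = {}  # (type, component) -> lines, in insertion order
    for line in lines:
        match = ENTRY_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised changelog entry: {line!r}")
        key = (match["type"], match["component"])
        groups.setdefault(key, []).append(line)
    # sorted() is stable, so components keep their first-appearance order per type.
    ordered_keys = sorted(groups, key=lambda key: TYPE_ORDER.index(key[0]))
    return [line for key in ordered_keys for line in groups[key]]


if __name__ == "__main__":
    sample = [
        "* [ENHANCEMENT] TSDB: Improve WAL replay timings. #10973 #11307 #11319",
        "* [ENHANCEMENT] Scrape: Optimise relabeling by re-using memory. #11147",
        "* [ENHANCEMENT] TSDB: Allow overlapping blocks by default. #11331",
        "* [BUGFIX] TSDB: Turn off isolation for Head compaction to fix a memory leak. #11317",
    ]
    for entry in group_entries(sample):
        print(entry)
```

Ordering by first appearance keeps the script from imposing an alphabetical order that the changelog does not use.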

@codesome
Member Author

Thanks, makes sense

@codesome codesome changed the base branch from main to release-2.39 September 28, 2022 16:41
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
@codesome codesome merged commit 9b2b993 into prometheus:release-2.39 Sep 29, 2022
@codesome
Member Author

/prombench v2.38.0

@prombot
Contributor

prombot commented Sep 29, 2022

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-11344 and v2.38.0

After successful deployment, the benchmarking metrics can be viewed at:

Other Commands:
To stop benchmark: /prombench cancel
To restart benchmark: /prombench restart v2.38.0

@codesome
Member Author

Benchmark analysis:

I can clearly see the impact of disabling isolation for head compaction: the isolation watermark difference panel does not spike anymore.

CPU and memory usage look similar; nothing conclusive. @bboreham is this unexpected?

There was one small difference (see this): this PR seems to have a few more compactions at this time than the earlier release. The number of compactions should not have changed. All earlier compactions match.

@bboreham
Member

bboreham commented Oct 1, 2022

> In terms of CPU/Memory, seems similar, nothing conclusive. @bboreham is this unexpected?

Memory: #11280 removed 8 bytes per series; #11288 removed 16; #11296 removed 56.
However #11075 added 48 bytes per series (assuming out-of-order function is not used).
So we should come out 32 bytes per series ahead.
The benchmark has around 5 million series, so we should save 150MB of heap, doubled to 300MB (by Go runtime allowing 100% growth).
The Next GC number is around 20GB, so we are looking for a 1.5% improvement.
I agree it's hard to see in the charts.
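
The estimate above can be reproduced with a short calculation. This is a rough sketch using only the figures quoted in the comment; the constant and function names are made up for illustration, and "20GB" is assumed to mean 20e9 bytes. The exact arithmetic gives 160MB, 320MB and 1.6%, which the comment rounds to 150MB, 300MB and 1.5%.

```python
# Per-series byte changes, as quoted in the comment above.
REMOVED_BYTES = {"#11280": 8, "#11288": 16, "#11296": 56}
ADDED_BYTES = {"#11075": 48}  # assumes the out-of-order feature is not used

SERIES = 5_000_000     # "around 5 million series"
NEXT_GC_BYTES = 20e9   # "around 20GB" (decimal gigabytes assumed)
GROWTH_FACTOR = 2      # Go runtime allowing 100% heap growth between GCs


def net_bytes_per_series():
    """Bytes saved per series after subtracting what #11075 added."""
    return sum(REMOVED_BYTES.values()) - sum(ADDED_BYTES.values())


def heap_saved_bytes():
    """Live heap saved across all series in the benchmark."""
    return net_bytes_per_series() * SERIES


def next_gc_improvement():
    """Fraction of the Next GC target the saving represents."""
    return heap_saved_bytes() * GROWTH_FACTOR / NEXT_GC_BYTES


if __name__ == "__main__":
    print(f"net saving per series: {net_bytes_per_series()} bytes")
    print(f"heap saved: {heap_saved_bytes() / 1e6:.0f} MB")
    print(f"with 100% growth: {heap_saved_bytes() * GROWTH_FACTOR / 1e6:.0f} MB")
    print(f"share of Next GC: {next_gc_improvement():.1%}")
```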

CPU: I would have to look in more detail at profiles of PromBench running to say whether it is hitting any of the things that were improved.

@prombot
Contributor

prombot commented Oct 2, 2022

Benchmark tests are running for 3 days! If this is intended ignore this message otherwise you can cancel it by commenting: /prombench cancel

@codesome
Member Author

codesome commented Oct 2, 2022

/prombench cancel

@prombot
Contributor

prombot commented Oct 2, 2022

Benchmark cancel is in progress.

@codesome
Member Author

codesome commented Oct 4, 2022

Something that was not visible in the prombench run: https://twitter.com/_codesome/status/1577264183208398848

6 participants