feat: Add flag that blocks lvl 1 compactions until upload is confirmed in an external JSON file by prymitive · Pull Request #17435 · prometheus/prometheus

prymitive · 2025-10-31T11:51:18Z

Using Thanos sidecar with Prometheus requires us to disable TSDB compactions on the Prometheus side by setting --storage.tsdb.min-block-duration and --storage.tsdb.max-block-duration to the same value. See https://thanos.io/tip/components/sidecar.md. The main problem this avoids is that Prometheus might compact a given block before Thanos uploads it, creating a gap in Thanos metrics. Thanos does not upload compacted blocks because that would upload the same samples multiple times. You can tell Thanos to upload compacted blocks, but that is aimed at one-time migrations. This patch creates a bridge between Thanos and Prometheus by allowing Prometheus to read the shipper file Thanos creates, in which it tracks which blocks were already uploaded, and by using that data to delay compaction of blocks until they are marked as uploaded by Thanos. Thanks to this, both services can coordinate with each other (in a way) and we can stop disabling compaction on the Prometheus side when Thanos uploads are enabled.

The reason to have this is that disabling compactions has a very dramatic performance cost. Since most time series exist for longer than a single block duration (2h by default), large chunks of each block index will reference the same series, so 10 * 2h blocks will each have an index that is usually fairly big and is almost the same for all 10 blocks. Compaction de-duplicates the index, so merging 10 blocks together would leave us with a single index that is around the same size as the index of each of these 10 2h blocks (plus some extra for series that only exist in some blocks, but not all). Without compaction, every range query that iterates over all 10 blocks has to read each index, so we're doing 10x more work than if we had a single compacted block.
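The index duplication argument can be illustrated with a small count. This is a minimal sketch, not a model of the real TSDB index format: it only counts series entries, and the function name `indexEntries` and the series names are made up for illustration.

```go
package main

import "fmt"

// indexEntries returns how many series entries exist when each block keeps
// its own index (perBlock) versus one merged, de-duplicated index (merged).
func indexEntries(blocks [][]string) (perBlock, merged int) {
	union := make(map[string]struct{})
	for _, series := range blocks {
		perBlock += len(series)
		for _, s := range series {
			union[s] = struct{}{}
		}
	}
	return perBlock, len(union)
}

func main() {
	// Ten 2h blocks: 1000 long-lived series appear in all of them,
	// plus one short-lived series unique to each block.
	var blocks [][]string
	for b := 0; b < 10; b++ {
		series := make([]string, 0, 1001)
		for i := 0; i < 1000; i++ {
			series = append(series, fmt.Sprintf("long_lived_%d", i))
		}
		series = append(series, fmt.Sprintf("short_lived_block_%d", b))
		blocks = append(blocks, series)
	}
	perBlock, merged := indexEntries(blocks)
	fmt.Printf("separate indexes: %d entries, merged index: %d entries\n", perBlock, merged)
}
```

With these made-up numbers the ten separate indexes hold roughly ten times as many entries as the merged one, which is the ratio the description refers to.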

We have been running with this patch for over a month now, and it dramatically reduced CPU usage on instances with Thanos uploads enabled; it also increased the effective retention because we do>

On Thanos side this requires --shipper.ignore-unequal-block-size so Thanos stops complaining about the fact that compactions are enabled on Prometheus.
If this patch is accepted, we could follow up with Thanos to make it aware of this Prometheus flag so that, if it is set, Thanos stops complaining about compactions.

cc @bwplotka

Which issue(s) does the PR fix:

Does this PR introduce a user-facing change?

[FEATURE] Add --storage.tsdb.delay-compact-file.path flag for better interoperability with block upload side processes (e.g. Thanos sidecar). When this flag is set, Prometheus will not compact level 1 blocks until the block ULID is listed in the JSON file at this path.

gregwork · 2025-10-31T13:54:14Z

This would be extremely useful for environments I manage.

bwplotka

Thanks for this!

So if I understand correctly, it's a special mode where Prometheus compaction will depend on a special file using Thanos Shipper meta file format that is a simple JSON with a list of local blocks that were already uploaded to object storage:

type Meta struct {
	Version int `json:"version"`
	// Existing
	Uploaded []ulid.ULID `json:"uploaded"`
}

Commonly the file is generated by Thanos sidecar as thanos.shipper.json
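Reading such a file could look roughly like the following. This is a minimal sketch under stated assumptions: ULIDs are kept as plain strings to avoid a third-party package, the names `uploadMeta` and `uploadedSet` are hypothetical, and the sample ULIDs are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// uploadMeta mirrors the JSON layout quoted above. ULIDs are plain strings
// here so the sketch needs no third-party ULID package.
type uploadMeta struct {
	Version  int      `json:"version"`
	Uploaded []string `json:"uploaded"`
}

// uploadedSet parses the file content and returns the uploaded block IDs as a set.
func uploadedSet(data []byte) (map[string]struct{}, error) {
	var m uploadMeta
	if err := json.Unmarshal(data, &m); err != nil {
		return nil, fmt.Errorf("parse upload meta: %w", err)
	}
	set := make(map[string]struct{}, len(m.Uploaded))
	for _, id := range m.Uploaded {
		set[id] = struct{}{}
	}
	return set, nil
}

func main() {
	// Hypothetical file content; the ULIDs are made up for illustration.
	data := []byte(`{"version":1,"uploaded":["01ARZ3NDEKTSV4RRFFQ69G5FAV","01BX5ZZKBKACTAV9WEVGEMMVRZ"]}`)
	set, err := uploadedSet(data)
	if err != nil {
		panic(err)
	}
	_, ok := set["01ARZ3NDEKTSV4RRFFQ69G5FAV"]
	fmt.Println(len(set), ok)
}
```

A set makes the later "is this block uploaded?" check a single map lookup, regardless of how many blocks the file lists.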

With this change, Prometheus starts level 1+ compaction only once blocks appear in this external JSON file, massively improving hybrid setups (using Thanos with a sidecar which still queries local Prometheus).
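The gating rule described here can be sketched as a predicate over block metadata. The type `blockInfo` and the function `excludeUntilUploaded` are hypothetical stand-ins, not the identifiers used in the PR, and the sketch assumes that only level 1 blocks are held back, as the PR title suggests.

```go
package main

import "fmt"

// blockInfo is a hypothetical, trimmed-down stand-in for TSDB block metadata.
type blockInfo struct {
	ULID  string
	Level int // compaction level; 1 means the block was never compacted
}

// excludeUntilUploaded returns a predicate that reports true for blocks that
// must not be compacted yet: level 1 blocks missing from the uploaded set.
func excludeUntilUploaded(uploaded map[string]struct{}) func(blockInfo) bool {
	return func(b blockInfo) bool {
		if b.Level > 1 {
			return false // already compacted, nothing to wait for
		}
		_, ok := uploaded[b.ULID]
		return !ok
	}
}

func main() {
	uploaded := map[string]struct{}{"block-a": {}}
	exclude := excludeUntilUploaded(uploaded)

	fmt.Println(exclude(blockInfo{ULID: "block-a", Level: 1})) // uploaded, may compact
	fmt.Println(exclude(blockInfo{ULID: "block-b", Level: 1})) // not uploaded, hold back
	fmt.Println(exclude(blockInfo{ULID: "block-c", Level: 2})) // higher level, not gated
}
```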

I assume the benefit is only for users with more than ~4-5h retention. Do we know how common it is to pay for Thanos and yet having longer local retention these days?

Should it be in Prometheus?

So the main argument why we never did this integration with Prometheus is that Thanos was new and it adds special provider handling for Prometheus code, literally only for Thanos. However it's super simple, makes sense and shipper file format never changed for the last 8 years. We could adopt and control this format in Prometheus too to fully make this feature our own. In this case I would literally avoid thanos words in flag and code to make sure any other integration can use this.

Is there an alternative that abstracts this logic to be useful for a wider range of projects (e.g. an API for compactions, or for delaying compactions)? I guess we could have a Prometheus mode that doesn't compact further unless an HTTP API /compact/uulid trigger happens. Would it be useful? It would mean some more code on the Thanos side, but it's doable, and we would add functionality that is perhaps easier for others to use.

I do like the solution from this PR (modulo thanos wording in the flag and code) mostly because we have use cases in Prometheus to have a native block upload to objstore. cc @jesusvazquez @bboreham @SuperQ so maybe it's a good first step towards that support? For the native objstore support we would need to start a bigger discussion on DevSummit and proposals.

However, I am co-creator of Thanos project (and Prometheus maintainer) which makes me potentially biased. I would like to add this AND follow up eventually with the discussion for the native upload. But let's get some opinions from other maintainers (:

cmd/prometheus/main.go

xiu · 2025-11-23T19:32:20Z

> I assume the benefit is only for users with more than ~4-5h retention. Do we know how common it is to pay for Thanos and yet having longer local retention these days?

At least in my deployments, we have alerting local to Prometheus and keep a 30+ days local retention to record SLOs.

bwplotka

Amazing, thanks!

I reviewed in detail and I'd be keen to merge it, but we need another LGTM from someone else too. Will ping around once comments are addressed.

Some readability suggestions and perhaps one quick optimization; otherwise LGTM from my side!

cmd/prometheus/main.go


bwplotka

This is great for me. I think this is important for the ecosystem and it's relevant for any block backup side-channel, not only Thanos.

LGTM, thanks!

I will hold off on merging until a non-Thanos-related maintainer approves.

Perhaps @roidelapluie @beorn7 @bboreham @ArthurSens @krajorama?

beorn7

I'm not opposed. :)

bwplotka · 2025-12-02T10:36:41Z

cmd/prometheus/main.go

+
+// Cache the last read UploadMeta.
+var (
+	tsdbDelayCompactLastMeta     *UploadMeta // The content of uploadMetaPath from the last time we've opened it.


Just noticed: I wish this were not a global, both for future-proofing and as good practice.

Not a big deal as it's in main (not importable), let's merge anyway.
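One way to avoid the package-level variable would be to hold the cached value in a small struct. The following is a sketch of that idea, not the code from the PR: every name in it is hypothetical, and the one-minute lifetime is taken from the "Cache UploadMeta for 1 minute" commit title.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// metaCache holds the last loaded set of uploaded block IDs and reloads it
// once the entry is older than ttl. All names here are hypothetical.
type metaCache struct {
	mu       sync.Mutex
	ttl      time.Duration
	now      func() time.Time
	load     func() (map[string]struct{}, error)
	last     map[string]struct{}
	loadedAt time.Time
}

func newMetaCache(ttl time.Duration, load func() (map[string]struct{}, error)) *metaCache {
	return &metaCache{ttl: ttl, now: time.Now, load: load}
}

// get returns the cached set, calling load only when the cache is empty or stale.
func (c *metaCache) get() (map[string]struct{}, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.last != nil && c.now().Sub(c.loadedAt) < c.ttl {
		return c.last, nil
	}
	m, err := c.load()
	if err != nil {
		return nil, err
	}
	c.last, c.loadedAt = m, c.now()
	return m, nil
}

func main() {
	loads := 0
	c := newMetaCache(time.Minute, func() (map[string]struct{}, error) {
		loads++
		return map[string]struct{}{"01ARZ3NDEKTSV4RRFFQ69G5FAV": {}}, nil
	})
	c.get()
	c.get() // served from the cache, no second load
	fmt.Println("loads:", loads)
}
```

Keeping the clock and the loader as fields makes the cache testable without touching the file system or sleeping.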

bwplotka · 2025-12-02T10:40:02Z

I allowed myself to update PR title and Release notes, hope that makes sense.

… flag Prometheus has a new flag --storage.tsdb.delay-compact-file.path - prometheus/prometheus#17435. When this flag is passed Prometheus will check which blocks are marked as uploaded in external file and only compact these. Thanos should look for this flag and if it's set then it can stop forcing people to disable compactions. Signed-off-by: Lukasz Mierzwa <lukasz@cloudflare.com>

prymitive · 2025-12-02T11:34:00Z

Raised thanos-io/thanos#8582 for Thanos to support this flag

jesusvazquez

I'm late, but this also LGTM. I was concerned at first with all the Thanos mentions inside Prometheus, but the latest modifications have made it agnostic enough. Good work.

xiu · 2025-12-02T13:26:35Z

Thanks a lot, all!

dimitarvdimitrov · 2025-12-08T13:25:11Z

tsdb/compact.go

 			return nil, err
 		}
+		if c.blockExcludeFunc != nil && c.blockExcludeFunc(meta) {
+			break


shouldn't this continue instead of break?

prymitive

No, I don't think it should.
Compactions work from oldest to newest, and uploads do the same (usually).

If you continue here you'll skip compaction of this one block, but:

- all further blocks are NOT yet uploaded
- some or all further blocks are uploaded

If we continue and there are newer blocks to pick from, then you will compact in a non-contiguous way, leaving gaps of individual un-compacted blocks.
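The difference between the two loop behaviours argued about here can be sketched in isolation. This is an illustration of the reasoning in this reply only, not the actual LeveledCompactor.Plan code; the function `candidates` and the block names are made up.

```go
package main

import "fmt"

// candidates walks blocks from oldest to newest and returns the ones that may
// be considered for compaction. With stopAtFirst it stops at the first
// excluded block (the `break` behaviour); otherwise it skips excluded blocks
// and keeps going (the `continue` behaviour).
func candidates(blocks []string, excluded func(string) bool, stopAtFirst bool) []string {
	var out []string
	for _, b := range blocks {
		if excluded(b) {
			if stopAtFirst {
				break
			}
			continue
		}
		out = append(out, b)
	}
	return out
}

func main() {
	blocks := []string{"b1", "b2", "b3", "b4", "b5"} // oldest first
	notUploaded := map[string]bool{"b3": true}       // b3 is still waiting for upload
	excluded := func(b string) bool { return notUploaded[b] }

	fmt.Println("break:   ", candidates(blocks, excluded, true))
	fmt.Println("continue:", candidates(blocks, excluded, false))
}
```

With `break` only the contiguous prefix b1, b2 is offered for compaction; with `continue` the result also includes b4 and b5, leaving b3 as a gap between them.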

dimitarvdimitrov

Oh, I see. This wasn't obvious. So this relies on the sorting that blockDirs and, transitively, os.ReadDir return, which is by filename, which is a block ULID, whose lexicographical ordering is also an ordering by timestamp.

This requires a lot of gymnastics to understand. A comment or two on BlockExcludeFilterFunc and here in the loop would help.
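The ordering property mentioned here follows from how a ULID encodes its timestamp: the first 10 characters are the 48-bit millisecond timestamp in Crockford base32, most significant character first. The sketch below encodes only that time component, with a hand-written helper (`ulidTimePrefix` is a hypothetical name, and the timestamps are arbitrary), to show that sorting the strings sorts the timestamps.

```go
package main

import (
	"fmt"
	"sort"
)

// crockford is the Crockford base32 alphabet used by ULIDs. Its characters are
// in ascending byte order, which is what makes string order match numeric order.
const crockford = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

// ulidTimePrefix encodes a millisecond timestamp as the 10-character time
// component that starts every ULID (most significant character first).
func ulidTimePrefix(ms uint64) string {
	var out [10]byte
	for i := 9; i >= 0; i-- {
		out[i] = crockford[ms&31]
		ms >>= 5
	}
	return string(out[:])
}

func main() {
	// Three arbitrary creation times, deliberately out of order.
	times := []uint64{1764670000000, 1764662800000, 1764677200000}
	var names []string
	for _, t := range times {
		names = append(names, ulidTimePrefix(t))
	}
	sort.Strings(names) // the same ordering a filename-sorted directory listing gives
	fmt.Println(names)
}
```

Note that this orders blocks by the time encoded in the ULID; two ULIDs generated in the same millisecond differ only in their random suffix.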


The LeveledCompactor.Plan() function incorrectly used `break` instead of `continue` when a block matched the BlockExcludeFilter. This caused all blocks after the first excluded block to be silently ignored, preventing them from being considered for compaction. This bug affects the `--storage.tsdb.delay-compact-file.path` feature used for Prometheus/Thanos coordination, introduced in PR prometheus#17435. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

##### [`v3.9.0`](https://github.com/prometheus/prometheus/releases/tag/v3.9.0)

#### Note for users of Native Histograms

In version 3.9, Native Histograms is no longer experimental, and the feature flag `native-histogram` has no effect. You must now turn on the config setting `scrape_native_histograms` to collect Native Histogram samples from exporters.

#### Changelog

- [CHANGE] Native Histograms are no longer experimental! Make the `native-histogram` feature flag a no-op. Use `scrape_native_histograms` config option instead. [#17528](prometheus/prometheus#17528)
- [CHANGE] API: Add maximum limit of 10,000 sets of statistics to TSDB status endpoint. [#17647](prometheus/prometheus#17647)
- [FEATURE] API: Add /api/v1/features for clients to understand which features are supported. [#17427](prometheus/prometheus#17427)
- [FEATURE] Promtool: Add `start_timestamp` field for unit tests. [#17636](prometheus/prometheus#17636)
- [FEATURE] Promtool: Add `--format seriesjson` option to `tsdb dump` to output just series labels in JSON format. [#13409](prometheus/prometheus#13409)
- [FEATURE] Add `--storage.tsdb.delay-compact-file.path` flag for better interoperability with Thanos. [#17435](prometheus/prometheus#17435)
- [FEATURE] UI: Add an option on the query drop-down menu to duplicate that query panel. [#17714](prometheus/prometheus#17714)
- [ENHANCEMENT]: TSDB: add flag `--storage.tsdb.block-reload-interval` to configure TSDB Block Reload Interval. [#16728](prometheus/prometheus#16728)
- [ENHANCEMENT] UI: Add graph option to start the chart's Y axis at zero. [#17565](prometheus/prometheus#17565)
- [ENHANCEMENT] Scraping: Classic protobuf format no longer requires the unit in the metric name. [#16834](prometheus/prometheus#16834)
- [ENHANCEMENT] PromQL, Rules, SD, Scraping: Add native histograms to complement existing summaries. [#17374](prometheus/prometheus#17374)
- [ENHANCEMENT] Notifications: Add a histogram `prometheus_notifications_latency_histogram_seconds` to complement the existing summary. [#16637](prometheus/prometheus#16637)
- [ENHANCEMENT] Remote-write: Add custom scope support for AzureAD authentication. [#17483](prometheus/prometheus#17483)
- [ENHANCEMENT] SD: add a `config` label with job name for most `prometheus_sd_refresh` metrics. [#17138](prometheus/prometheus#17138)
- [ENHANCEMENT] TSDB: New histogram `prometheus_tsdb_sample_ooo_delta`, the distribution of out-of-order samples in seconds. Collected for all samples, accepted or not. [#17477](prometheus/prometheus#17477)
- [ENHANCEMENT] Remote-read: Validate histograms received via remote-read. [#17561](prometheus/prometheus#17561)
- [PERF] TSDB: Small optimizations to postings index. [#17439](prometheus/prometheus#17439)
- [PERF] Scraping: Speed up relabelling of series. [#17530](prometheus/prometheus#17530)
- [PERF] PromQL: Small optimisations in binary operators. [#17524](prometheus/prometheus#17524), [#17519](prometheus/prometheus#17519).
- [BUGFIX] UI: PromQL autocomplete now shows the correct type and HELP text for OpenMetrics counters whose samples end in `_total`. [#17682](prometheus/prometheus#17682)
- [BUGFIX] UI: Fixed codemirror-promql incorrectly showing label completion suggestions after the closing curly brace of a vector selector. [#17602](prometheus/prometheus#17602)
- [BUGFIX] UI: Query editor no longer suggests a duration unit if one is already present after a number. [#17605](prometheus/prometheus#17605)
- [BUGFIX] PromQL: Fix some "vector cannot contain metrics with the same labelset" errors when experimental delayed name removal is enabled. [#17678](prometheus/prometheus#17678)
- [BUGFIX] PromQL: Fix possible corruption of PromQL text if the query had an empty `ignoring()` and non-empty grouping. [#17643](prometheus/prometheus#17643)
- [BUGFIX] PromQL: Fix resets/changes to return empty results for anchored selectors when all samples are outside the range. [#17479](prometheus/prometheus#17479)
- [BUGFIX] PromQL: Check more consistently for many-to-one matching in filter binary operators. [#17668](prometheus/prometheus#17668)
- [BUGFIX] PromQL: Fix collision in unary negation with non-overlapping series. [#17708](prometheus/prometheus#17708)
- [BUGFIX] PromQL: Fix collision in label_join and label_replace with non-overlapping series. [#17703](prometheus/prometheus#17703)
- [BUGFIX] PromQL: Fix bug with inconsistent results for queries with OR expression when experimental delayed name removal is enabled. [#17161](prometheus/prometheus#17161)
- [BUGFIX] PromQL: Ensure that `rate`/`increase`/`delta` of histograms results in a gauge histogram. [#17608](prometheus/prometheus#17608)
- [BUGFIX] PromQL: Do not panic while iterating over invalid histograms. [#17559](prometheus/prometheus#17559)
- [BUGFIX] TSDB: Reject chunk files whose encoded chunk length overflows int. [#17533](prometheus/prometheus#17533)
- [BUGFIX] TSDB: Do not panic during resolution reduction of invalid histograms. [#17561](prometheus/prometheus#17561)
- [BUGFIX] Remote-write Receive: Avoid duplicate labels when experimental type-and-unit-label feature is enabled. [#17546](prometheus/prometheus#17546)
- [BUGFIX] OTLP Receiver: Only write metadata to disk when experimental metadata-wal-records feature is enabled. [#17472](prometheus/prometheus#17472)

##### [`v3.9.1`](https://github.com/prometheus/prometheus/releases/tag/v3.9.1)

- [BUGFIX] Agent: fix crash shortly after startup from invalid type of object. [#17802](prometheus/prometheus#17802)
- [BUGFIX] Scraping: fix relabel keep/drop not working. [#17807](prometheus/prometheus#17807)


* promql: fix histogram_fraction issue when lower falls within the first bucket (prometheus#17424) Signed-off-by: Mohammad Alavi <m.alavi1986@gmail.com> * prepare release 3.8.0-rc.0 Signed-off-by: Jan Fajerski <jfajersk@redhat.com> * test: skip TestRemoteWrite_ReshardingWithoutDeadlock temporarily as flaky (prometheus#17534) (prometheus#17543) (cherry picked from commit 35c3232) Signed-off-by: machine424 <ayoubmrini424@gmail.com> Signed-off-by: Jan Fajerski <jfajersk@redhat.com> Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> * chore(deps): bump prometheus/promci from 0.4.7 to 0.5.0 Signed-off-by: Jan Fajerski <jfajersk@redhat.com> * chore(deps): bump prometheus/promci from 0.5.0 to 0.5.1 Signed-off-by: Jan Fajerski <jfajersk@redhat.com> * chore(deps): bump prometheus/promci from 0.5.1 to 0.5.2 Signed-off-by: Jan Fajerski <jfajersk@redhat.com> * chore(deps): bump prometheus/promci from 0.5.2 to 0.5.3 Signed-off-by: Jan Fajerski <jfajersk@redhat.com> * prw2: Move Remote Write 2.0 CT to be per Sample; Rename to ST (start timestamp) (prometheus#17411) Relates to prometheus#16944 (comment) Signed-off-by: bwplotka <bwplotka@gmail.com> (cherry picked from commit cefefc6) * chore: prepare 3.8.0-rc.1 entry Signed-off-by: bwplotka <bwplotka@gmail.com> * [chore]: bump common dep to support RFC7523 3.1 Signed-off-by: Jorge Turrado <jorge.turrado@mail.schwarz> * Update Prometheus Agent doc (prometheus#17591) * Add a nav title to fix docs website generator. * Make it more clear that "Prometheus Agent" is a mode, not a seaparate service. * Add to index. * Cleanup some wording. * Add a downsides section. 
Signed-off-by: SuperQ <superq@gmail.com> (cherry picked from commit d0d2699) * chore(deps): bump github.com/prometheus/common from 0.67.3 to 0.67.4 (prometheus#17594) Signed-off-by: Jan Fajerski <jfajersk@redhat.com> * prepare release v3.8.0-rc.1 Signed-off-by: Jan Fajerski <jfajersk@redhat.com> * prepare release v3.8.0 Signed-off-by: Jan Fajerski <jfajersk@redhat.com> * chore: Fix function name typo in createBatchSpan comment Signed-off-by: zjumathcode <pai314159@2980.com> * feat: Add flag that blocks lvl 1 compactions until upload is confirmed in an external JSON file (prometheus#17435) * Delay compactions until Thanos uploads all blocks Using Thanos sidecar with Prometheus requires us to disable TSDB compactions on Prometheus side by setting --storage.tsdb.min-block-duration and --storage.tsdb.max-block-duration to the same value. See https://thanos.io/tip/components/sidecar.md. The main problem this avoids is that Prometheus might compact given block before Thanos uploads it, creating a gap in Thanos metrics. Thanos does not upload compacted blocks because that would upload the same sample multiple times. You can tell Thanos to upload compacted blocks but that is aimed at one time migrations. This patch creates a bridge between Thanos and Prometheus by allowing Prometheus to read the shipper file Thanos creates, where it tracks which blocks were already uploaded, and using that data delays compaction of blocks until they are marked as uploaded by Thanos. Thanks to this both services can coordinate with each other (in a way) and we can stop disabling compaction on Prometheus side when Thanos uploads are enabled. The reason to have this is that disabling compactions have very dramatic performance cost. Since most time series exist for longer than a single block duration (2h by default) large chunks of block index will reference the same series, so 10 * 2h blocks will each have an index that is usually fairly big and is almost the same for all 10 blocks. 
Compaction de-duplicates the index so merging 10 blocks together would leave us with a single index that is around the same size as each of these 10 2h blocks would have (plus some extra for series that only exists in some blocks, but not all). Every range query that iterates over all 10 blocks would then have to read each index and so we're doing 10x more work then if we had a single compacted block. Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com> * Rename structs and functions to make this more generic Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com> * Address review comments Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com> * Cache UploadMeta for 1 minute Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com> --------- Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com> * RW2: Allow custom scope in azuread (prometheus#17483) Signed-off-by: Ben Edmunds <sammybenblue2@gmail.com> * docs: Describe how time() is set to start at 0 in unit tests The return value of functions relating to the current time, e.g. time(), is set by promtool to start at timestamp 0 at the start of a test's evaluation. This has the very nice consequence that tests can run reliably without depending on when they are run. It does, however, mean that tests will give out results that can be unexpected by users. If this behaviour is documented, then users will be empowered to write tests for their rules that use time-dependent functions. (Closes: prometheus/docs#1464) Signed-off-by: Gabriel Filion <lelutin@torproject.org> * refactor(tsdb): use one test newTestDB constructor (prometheus#17638) For tests only, we had various ways of opening DB. Reduced to one instead of: * Open * newTestDB * newTestDBOpts * openTestDB This so prometheus#17629 is smaller and bit easier. Also for test maintainability and consistency. Signed-off-by: bwplotka <bwplotka@gmail.com> * Add start_timestamp field for unit tests. 
This commit adds support for configuring a custom start timestamp for Prometheus unit tests, allowing tests to use realistic timestamps instead of starting at Unix epoch 0. Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com> * Fix serialization for empty `ignoring()` in combination with `group_x()` Currently both the backend and frontend printers/formatters/serializers incorrectly transform the following expression: ``` up * ignoring() group_left(__name__) node_boot_time_seconds ``` ...into: ``` up * node_boot_time_seconds ``` ...which yields a different result (including the metric name in the result vs. no metric name). We need to keep empty `ignoring()` modifiers if there is a grouping modifier present. Signed-off-by: Julius Volz <julius.volz@gmail.com> * Simplify StartTime assignment in unit test setup. Remove redundant IsZero check since promqltest.LazyLoader already handles zero StartTime by defaulting to Unix epoch. Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com> * Update golangci-lint and add modernize check (prometheus#17640) * add modernize check Signed-off-by: dongjiang1989 <dongjiang1989@126.com> * fix golangci lint Signed-off-by: dongjiang1989 <dongjiang1989@126.com> --------- Signed-off-by: dongjiang1989 <dongjiang1989@126.com> * fix lint --------- Signed-off-by: Mohammad Alavi <m.alavi1986@gmail.com> Signed-off-by: Jan Fajerski <jfajersk@redhat.com> Signed-off-by: machine424 <ayoubmrini424@gmail.com> Signed-off-by: bwplotka <bwplotka@gmail.com> Signed-off-by: Jorge Turrado <jorge.turrado@mail.schwarz> Signed-off-by: SuperQ <superq@gmail.com> Signed-off-by: zjumathcode <pai314159@2980.com> Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com> Signed-off-by: Ben Edmunds <sammybenblue2@gmail.com> Signed-off-by: Gabriel Filion <lelutin@torproject.org> Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com> Signed-off-by: Julius Volz <julius.volz@gmail.com> Signed-off-by: 
dongjiang1989 <dongjiang1989@126.com> Co-authored-by: Mohammad Alavi <m.alavi1986@gmail.com> Co-authored-by: Jan Fajerski <jfajersk@redhat.com> Co-authored-by: Jan Fajerski <jan--f@users.noreply.github.com> Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Jorge Turrado <jorge.turrado@mail.schwarz> Co-authored-by: Ben Kochie <superq@gmail.com> Co-authored-by: zjumathcode <pai314159@2980.com> Co-authored-by: Łukasz Mierzwa <l.mierzwa@gmail.com> Co-authored-by: Ben Edmunds <Tigger2014@users.noreply.github.com> Co-authored-by: Julien <291750+roidelapluie@users.noreply.github.com> Co-authored-by: Gabriel Filion <lelutin@torproject.org> Co-authored-by: Julius Volz <julius.volz@gmail.com> Co-authored-by: dongjiang <dongjiang1989@126.com> Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>

prymitive requested a review from jesusvazquez as a code owner October 31, 2025 11:51

prymitive force-pushed the thanos-compactions branch 2 times, most recently from 79046a7 to c046572 Compare October 31, 2025 11:56

bwplotka reviewed Nov 17, 2025

bwplotka reviewed Nov 17, 2025

prymitive force-pushed the thanos-compactions branch from 4f317e5 to 9295fff Compare November 18, 2025 17:04

bwplotka reviewed Nov 24, 2025

prymitive added 3 commits November 28, 2025 12:48

Rename structs and functions to make this more generic

b8c528a

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>

Address review comments

9396abe

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>

prymitive force-pushed the thanos-compactions branch from f4cbc91 to 9396abe Compare November 28, 2025 12:48

Cache UploadMeta for 1 minute

60c3fbc

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>

bwplotka approved these changes Dec 1, 2025

beorn7 approved these changes Dec 2, 2025

bwplotka reviewed Dec 2, 2025

bwplotka changed the title ~~Delay compactions until Thanos uploads all blocks~~ feat: Add flag that blocks lvl 1 compactions until upload is confirmed in an external JSON file Dec 2, 2025

bwplotka merged commit 8a1086a into prometheus:main Dec 2, 2025
30 checks passed

prymitive mentioned this pull request Dec 2, 2025

Support newly added --storage.tsdb.delay-compact-file.path Prometheus… thanos-io/thanos#8582

Merged

2 tasks

prymitive deleted the thanos-compactions branch December 2, 2025 11:33

jesusvazquez reviewed Dec 2, 2025

dimitarvdimitrov reviewed Dec 8, 2025

aknuds1 mentioned this pull request Dec 24, 2025

enhancement(tsdb): add test for LeveledCompactor.Plan stopping after excluding block #17738

Merged

simonpasquier mentioned this pull request Jan 9, 2026

Improve integration of Prometheus and Thanos sidecar with the --storage.tsdb.delay-compact-file.path flag prometheus-operator/prometheus-operator#8266

Open

7 participants
