Skip to content

remote_write azure auth : add custom scope support#17483

Merged
bwplotka merged 1 commit intoprometheus:mainfrom
Tigger2014:azuread-scope
Dec 2, 2025
Merged

remote_write azure auth : add custom scope support#17483
bwplotka merged 1 commit intoprometheus:mainfrom
Tigger2014:azuread-scope

Conversation

@Tigger2014
Copy link
Contributor

azuread: add custom scope support for remote write authentication

Summary

This PR adds support for custom OAuth 2.0 scopes in Azure AD authentication for remote write endpoints. Previously, users were limited to cloud-specific monitoring scopes. This enhancement allows authentication against custom Azure applications and non-standard endpoints while maintaining full backward compatibility.

Changes

Core Implementation

  • Added Scope field to AzureADConfig struct in storage/remote/azuread/azuread.go
  • Updated newTokenProvider to use custom scope when provided, falling back to existing cloud-specific audiences
  • Added scope validation using regex pattern to ensure valid OAuth 2.0 scope characters

Behavior

Before

Users could only authenticate against predefined monitoring endpoints:

  • AzurePublic: https://monitor.azure.com//.default
  • AzureGovernment: https://monitor.azure.us//.default
  • AzureChina: https://monitor.azure.cn//.default

After

Users can specify custom scopes while maintaining backward compatibility:

remote_write:
  - url: "https://custom-endpoint.com/api/v1/write"
    azuread:
      cloud: AzurePublic
      oauth:
        client_id: "12345678-1234-1234-1234-123456789012"
        client_secret: "your-secret"
        tenant_id: "87654321-4321-4321-4321-210987654321"
      scope: "api://custom-app-id/.default"

When scope is not specified, the system automatically uses the appropriate cloud-specific monitoring scope, ensuring zero breaking changes.

Use Cases

This enhancement enables:

  • Custom Azure applications: Authenticate against user-registered Azure AD applications
  • Multi-tenant scenarios: Use application-specific scopes for tenant isolation
  • Non-standard endpoints: Support for custom monitoring solutions requiring specific OAuth scopes
  • Workload Identity with custom scopes: Enhanced support for Kubernetes workload identity scenarios

Which issue(s) does the PR fix:

Fixes #16918

Does this PR introduce a user-facing change?

[ENHANCEMENT] azuread: Add custom scope support for remote write authentication. Users can now specify custom OAuth 2.0 scopes in Azure AD configuration while maintaining backward compatibility with existing cloud-specific monitoring scopes.

Backward Compatibility

This change is fully backward compatible:

  • No breaking changes to existing configurations

Signed-off-by: Ben Edmunds <sammybenblue2@gmail.com>
@Tigger2014
Copy link
Contributor Author

@bwplotka I noticed you were last to approve an auth change here. Would you be able to review this change, too?

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very solid change, thanks for this.

LGTM

return nil, err
var scopes []string

// Use custom scope if provided, otherwise fallback to cloud-specific audience
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Use custom scope if provided, otherwise fallback to cloud-specific audience
// Use custom scope if provided, otherwise fallback to cloud-specific audience.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a biggie, merging without, can be fixed later.

@bwplotka bwplotka merged commit 0e682a7 into prometheus:main Dec 2, 2025
28 checks passed
@Tigger2014 Tigger2014 deleted the azuread-scope branch December 2, 2025 12:06
renovate bot added a commit to sdwilsh/ansible-playbooks that referenced this pull request Jan 10, 2026
##### [\`v3.9.0\`](https://github.com/prometheus/prometheus/releases/tag/v3.9.0)

#### Note for users of Native Histograms

In version 3.9, Native Histograms is no longer experimental, and the feature flag `native-histogram` has no effect.  You must now turn on
the config setting `scrape_native_histograms` to collect Native Histogram samples from exporters.

#### Changelog

- \[CHANGE] Native Histograms are no longer experimental! Make the `native-histogram` feature flag a no-op. Use `scrape_native_histograms` config option instead. [#17528](prometheus/prometheus#17528)
- \[CHANGE] API: Add maximum limit of 10,000 sets of statistics to TSDB status endpoint. [#17647](prometheus/prometheus#17647)
- \[FEATURE] API: Add /api/v1/features for clients to understand which features are supported. [#17427](prometheus/prometheus#17427)
- \[FEATURE] Promtool: Add `start_timestamp` field for unit tests. [#17636](prometheus/prometheus#17636)
- \[FEATURE] Promtool: Add `--format seriesjson` option to `tsdb dump` to output just series labels in JSON format. [#13409](prometheus/prometheus#13409)
- \[FEATURE] Add `--storage.tsdb.delay-compact-file.path` flag for better interoperability with Thanos. [#17435](prometheus/prometheus#17435)
- \[FEATURE] UI: Add an option on the query drop-down menu to duplicate that query panel. [#17714](prometheus/prometheus#17714)
- \[ENHANCEMENT]: TSDB: add flag `--storage.tsdb.block-reload-interval` to configure TSDB Block Reload Interval. [#16728](prometheus/prometheus#16728)
- \[ENHANCEMENT] UI: Add graph option to start the chart's Y axis at zero. [#17565](prometheus/prometheus#17565)
- \[ENHANCEMENT] Scraping: Classic protobuf format no longer requires the unit in the metric name. [#16834](prometheus/prometheus#16834)
- \[ENHANCEMENT] PromQL, Rules, SD, Scraping: Add native histograms to complement existing summaries. [#17374](prometheus/prometheus#17374)
- \[ENHANCEMENT] Notifications: Add a histogram `prometheus_notifications_latency_histogram_seconds` to complement the existing summary. [#16637](prometheus/prometheus#16637)
- \[ENHANCEMENT] Remote-write: Add custom scope support for AzureAD authentication. [#17483](prometheus/prometheus#17483)
- \[ENHANCEMENT] SD: add a `config` label with job name for most `prometheus_sd_refresh` metrics. [#17138](prometheus/prometheus#17138)
- \[ENHANCEMENT] TSDB: New histogram `prometheus_tsdb_sample_ooo_delta`, the distribution of out-of-order samples in seconds. Collected for all samples, accepted or not. [#17477](prometheus/prometheus#17477)
- \[ENHANCEMENT] Remote-read: Validate histograms received via remote-read. [#17561](prometheus/prometheus#17561)
- \[PERF] TSDB: Small optimizations to postings index. [#17439](prometheus/prometheus#17439)
- \[PERF] Scraping: Speed up relabelling of series. [#17530](prometheus/prometheus#17530)
- \[PERF] PromQL: Small optimisations in binary operators. [#17524](prometheus/prometheus#17524), [#17519](prometheus/prometheus#17519).
- \[BUGFIX] UI: PromQL autocomplete now shows the correct type and HELP text for OpenMetrics counters whose samples end in `_total`. [#17682](prometheus/prometheus#17682)
- \[BUGFIX] UI: Fixed codemirror-promql incorrectly showing label completion suggestions after the closing curly brace of a vector selector. [#17602](prometheus/prometheus#17602)
- \[BUGFIX] UI: Query editor no longer suggests a duration unit if one is already present after a number. [#17605](prometheus/prometheus#17605)
- \[BUGFIX] PromQL: Fix some "vector cannot contain metrics with the same labelset" errors when experimental delayed name removal is enabled. [#17678](prometheus/prometheus#17678)
- \[BUGFIX] PromQL: Fix possible corruption of PromQL text if the query had an empty `ignoring()` and non-empty grouping. [#17643](prometheus/prometheus#17643)
- \[BUGFIX] PromQL: Fix resets/changes to return empty results for anchored selectors when all samples are outside the range. [#17479](prometheus/prometheus#17479)
- \[BUGFIX] PromQL: Check more consistently for many-to-one matching in filter binary operators. [#17668](prometheus/prometheus#17668)
- \[BUGFIX] PromQL: Fix collision in unary negation with non-overlapping series. [#17708](prometheus/prometheus#17708)
- \[BUGFIX] PromQL: Fix collision in label\_join and label\_replace with non-overlapping series. [#17703](prometheus/prometheus#17703)
- \[BUGFIX] PromQL: Fix bug with inconsistent results for queries with OR expression when experimental delayed name removal is enabled. [#17161](prometheus/prometheus#17161)
- \[BUGFIX] PromQL: Ensure that `rate`/`increase`/`delta` of histograms results in a gauge histogram. [#17608](prometheus/prometheus#17608)
- \[BUGFIX] PromQL: Do not panic while iterating over invalid histograms. [#17559](prometheus/prometheus#17559)
- \[BUGFIX] TSDB: Reject chunk files whose encoded chunk length overflows int. [#17533](prometheus/prometheus#17533)
- \[BUGFIX] TSDB: Do not panic during resolution reduction of invalid histograms. [#17561](prometheus/prometheus#17561)
- \[BUGFIX] Remote-write Receive: Avoid duplicate labels when experimental type-and-unit-label feature is enabled. [#17546](prometheus/prometheus#17546)
- \[BUGFIX] OTLP Receiver: Only write metadata to disk when experimental metadata-wal-records feature is enabled. [#17472](prometheus/prometheus#17472)
renovate bot added a commit to sdwilsh/ansible-playbooks that referenced this pull request Jan 10, 2026
##### [\`v3.9.1\`](https://github.com/prometheus/prometheus/releases/tag/v3.9.1)

- \[BUGFIX] Agent: fix crash shortly after startup from invalid type of object. [#17802](prometheus/prometheus#17802)
- \[BUGFIX] Scraping: fix relabel keep/drop not working. [#17807](prometheus/prometheus#17807)

---
##### [\`v3.9.0\`](https://github.com/prometheus/prometheus/releases/tag/v3.9.0)

#### Note for users of Native Histograms

In version 3.9, Native Histograms is no longer experimental, and the feature flag `native-histogram` has no effect.  You must now turn on
the config setting `scrape_native_histograms` to collect Native Histogram samples from exporters.

#### Changelog

- \[CHANGE] Native Histograms are no longer experimental! Make the `native-histogram` feature flag a no-op. Use `scrape_native_histograms` config option instead. [#17528](prometheus/prometheus#17528)
- \[CHANGE] API: Add maximum limit of 10,000 sets of statistics to TSDB status endpoint. [#17647](prometheus/prometheus#17647)
- \[FEATURE] API: Add /api/v1/features for clients to understand which features are supported. [#17427](prometheus/prometheus#17427)
- \[FEATURE] Promtool: Add `start_timestamp` field for unit tests. [#17636](prometheus/prometheus#17636)
- \[FEATURE] Promtool: Add `--format seriesjson` option to `tsdb dump` to output just series labels in JSON format. [#13409](prometheus/prometheus#13409)
- \[FEATURE] Add `--storage.tsdb.delay-compact-file.path` flag for better interoperability with Thanos. [#17435](prometheus/prometheus#17435)
- \[FEATURE] UI: Add an option on the query drop-down menu to duplicate that query panel. [#17714](prometheus/prometheus#17714)
- \[ENHANCEMENT]: TSDB: add flag `--storage.tsdb.block-reload-interval` to configure TSDB Block Reload Interval. [#16728](prometheus/prometheus#16728)
- \[ENHANCEMENT] UI: Add graph option to start the chart's Y axis at zero. [#17565](prometheus/prometheus#17565)
- \[ENHANCEMENT] Scraping: Classic protobuf format no longer requires the unit in the metric name. [#16834](prometheus/prometheus#16834)
- \[ENHANCEMENT] PromQL, Rules, SD, Scraping: Add native histograms to complement existing summaries. [#17374](prometheus/prometheus#17374)
- \[ENHANCEMENT] Notifications: Add a histogram `prometheus_notifications_latency_histogram_seconds` to complement the existing summary. [#16637](prometheus/prometheus#16637)
- \[ENHANCEMENT] Remote-write: Add custom scope support for AzureAD authentication. [#17483](prometheus/prometheus#17483)
- \[ENHANCEMENT] SD: add a `config` label with job name for most `prometheus_sd_refresh` metrics. [#17138](prometheus/prometheus#17138)
- \[ENHANCEMENT] TSDB: New histogram `prometheus_tsdb_sample_ooo_delta`, the distribution of out-of-order samples in seconds. Collected for all samples, accepted or not. [#17477](prometheus/prometheus#17477)
- \[ENHANCEMENT] Remote-read: Validate histograms received via remote-read. [#17561](prometheus/prometheus#17561)
- \[PERF] TSDB: Small optimizations to postings index. [#17439](prometheus/prometheus#17439)
- \[PERF] Scraping: Speed up relabelling of series. [#17530](prometheus/prometheus#17530)
- \[PERF] PromQL: Small optimisations in binary operators. [#17524](prometheus/prometheus#17524), [#17519](prometheus/prometheus#17519).
- \[BUGFIX] UI: PromQL autocomplete now shows the correct type and HELP text for OpenMetrics counters whose samples end in `_total`. [#17682](prometheus/prometheus#17682)
- \[BUGFIX] UI: Fixed codemirror-promql incorrectly showing label completion suggestions after the closing curly brace of a vector selector. [#17602](prometheus/prometheus#17602)
- \[BUGFIX] UI: Query editor no longer suggests a duration unit if one is already present after a number. [#17605](prometheus/prometheus#17605)
- \[BUGFIX] PromQL: Fix some "vector cannot contain metrics with the same labelset" errors when experimental delayed name removal is enabled. [#17678](prometheus/prometheus#17678)
- \[BUGFIX] PromQL: Fix possible corruption of PromQL text if the query had an empty `ignoring()` and non-empty grouping. [#17643](prometheus/prometheus#17643)
- \[BUGFIX] PromQL: Fix resets/changes to return empty results for anchored selectors when all samples are outside the range. [#17479](prometheus/prometheus#17479)
- \[BUGFIX] PromQL: Check more consistently for many-to-one matching in filter binary operators. [#17668](prometheus/prometheus#17668)
- \[BUGFIX] PromQL: Fix collision in unary negation with non-overlapping series. [#17708](prometheus/prometheus#17708)
- \[BUGFIX] PromQL: Fix collision in label\_join and label\_replace with non-overlapping series. [#17703](prometheus/prometheus#17703)
- \[BUGFIX] PromQL: Fix bug with inconsistent results for queries with OR expression when experimental delayed name removal is enabled. [#17161](prometheus/prometheus#17161)
- \[BUGFIX] PromQL: Ensure that `rate`/`increase`/`delta` of histograms results in a gauge histogram. [#17608](prometheus/prometheus#17608)
- \[BUGFIX] PromQL: Do not panic while iterating over invalid histograms. [#17559](prometheus/prometheus#17559)
- \[BUGFIX] TSDB: Reject chunk files whose encoded chunk length overflows int. [#17533](prometheus/prometheus#17533)
- \[BUGFIX] TSDB: Do not panic during resolution reduction of invalid histograms. [#17561](prometheus/prometheus#17561)
- \[BUGFIX] Remote-write Receive: Avoid duplicate labels when experimental type-and-unit-label feature is enabled. [#17546](prometheus/prometheus#17546)
- \[BUGFIX] OTLP Receiver: Only write metadata to disk when experimental metadata-wal-records feature is enabled. [#17472](prometheus/prometheus#17472)
renovate bot added a commit to sdwilsh/ansible-playbooks that referenced this pull request Jan 10, 2026
##### [\`v3.9.1\`](https://github.com/prometheus/prometheus/releases/tag/v3.9.1)

- \[BUGFIX] Agent: fix crash shortly after startup from invalid type of object. [#17802](prometheus/prometheus#17802)
- \[BUGFIX] Scraping: fix relabel keep/drop not working. [#17807](prometheus/prometheus#17807)

---
##### [\`v3.9.0\`](https://github.com/prometheus/prometheus/releases/tag/v3.9.0)

#### Note for users of Native Histograms

In version 3.9, Native Histograms is no longer experimental, and the feature flag `native-histogram` has no effect.  You must now turn on
the config setting `scrape_native_histograms` to collect Native Histogram samples from exporters.

#### Changelog

- \[CHANGE] Native Histograms are no longer experimental! Make the `native-histogram` feature flag a no-op. Use `scrape_native_histograms` config option instead. [#17528](prometheus/prometheus#17528)
- \[CHANGE] API: Add maximum limit of 10,000 sets of statistics to TSDB status endpoint. [#17647](prometheus/prometheus#17647)
- \[FEATURE] API: Add /api/v1/features for clients to understand which features are supported. [#17427](prometheus/prometheus#17427)
- \[FEATURE] Promtool: Add `start_timestamp` field for unit tests. [#17636](prometheus/prometheus#17636)
- \[FEATURE] Promtool: Add `--format seriesjson` option to `tsdb dump` to output just series labels in JSON format. [#13409](prometheus/prometheus#13409)
- \[FEATURE] Add `--storage.tsdb.delay-compact-file.path` flag for better interoperability with Thanos. [#17435](prometheus/prometheus#17435)
- \[FEATURE] UI: Add an option on the query drop-down menu to duplicate that query panel. [#17714](prometheus/prometheus#17714)
- \[ENHANCEMENT]: TSDB: add flag `--storage.tsdb.block-reload-interval` to configure TSDB Block Reload Interval. [#16728](prometheus/prometheus#16728)
- \[ENHANCEMENT] UI: Add graph option to start the chart's Y axis at zero. [#17565](prometheus/prometheus#17565)
- \[ENHANCEMENT] Scraping: Classic protobuf format no longer requires the unit in the metric name. [#16834](prometheus/prometheus#16834)
- \[ENHANCEMENT] PromQL, Rules, SD, Scraping: Add native histograms to complement existing summaries. [#17374](prometheus/prometheus#17374)
- \[ENHANCEMENT] Notifications: Add a histogram `prometheus_notifications_latency_histogram_seconds` to complement the existing summary. [#16637](prometheus/prometheus#16637)
- \[ENHANCEMENT] Remote-write: Add custom scope support for AzureAD authentication. [#17483](prometheus/prometheus#17483)
- \[ENHANCEMENT] SD: add a `config` label with job name for most `prometheus_sd_refresh` metrics. [#17138](prometheus/prometheus#17138)
- \[ENHANCEMENT] TSDB: New histogram `prometheus_tsdb_sample_ooo_delta`, the distribution of out-of-order samples in seconds. Collected for all samples, accepted or not. [#17477](prometheus/prometheus#17477)
- \[ENHANCEMENT] Remote-read: Validate histograms received via remote-read. [#17561](prometheus/prometheus#17561)
- \[PERF] TSDB: Small optimizations to postings index. [#17439](prometheus/prometheus#17439)
- \[PERF] Scraping: Speed up relabelling of series. [#17530](prometheus/prometheus#17530)
- \[PERF] PromQL: Small optimisations in binary operators. [#17524](prometheus/prometheus#17524), [#17519](prometheus/prometheus#17519).
- \[BUGFIX] UI: PromQL autocomplete now shows the correct type and HELP text for OpenMetrics counters whose samples end in `_total`. [#17682](prometheus/prometheus#17682)
- \[BUGFIX] UI: Fixed codemirror-promql incorrectly showing label completion suggestions after the closing curly brace of a vector selector. [#17602](prometheus/prometheus#17602)
- \[BUGFIX] UI: Query editor no longer suggests a duration unit if one is already present after a number. [#17605](prometheus/prometheus#17605)
- \[BUGFIX] PromQL: Fix some "vector cannot contain metrics with the same labelset" errors when experimental delayed name removal is enabled. [#17678](prometheus/prometheus#17678)
- \[BUGFIX] PromQL: Fix possible corruption of PromQL text if the query had an empty `ignoring()` and non-empty grouping. [#17643](prometheus/prometheus#17643)
- \[BUGFIX] PromQL: Fix resets/changes to return empty results for anchored selectors when all samples are outside the range. [#17479](prometheus/prometheus#17479)
- \[BUGFIX] PromQL: Check more consistently for many-to-one matching in filter binary operators. [#17668](prometheus/prometheus#17668)
- \[BUGFIX] PromQL: Fix collision in unary negation with non-overlapping series. [#17708](prometheus/prometheus#17708)
- \[BUGFIX] PromQL: Fix collision in label\_join and label\_replace with non-overlapping series. [#17703](prometheus/prometheus#17703)
- \[BUGFIX] PromQL: Fix bug with inconsistent results for queries with OR expression when experimental delayed name removal is enabled. [#17161](prometheus/prometheus#17161)
- \[BUGFIX] PromQL: Ensure that `rate`/`increase`/`delta` of histograms results in a gauge histogram. [#17608](prometheus/prometheus#17608)
- \[BUGFIX] PromQL: Do not panic while iterating over invalid histograms. [#17559](prometheus/prometheus#17559)
- \[BUGFIX] TSDB: Reject chunk files whose encoded chunk length overflows int. [#17533](prometheus/prometheus#17533)
- \[BUGFIX] TSDB: Do not panic during resolution reduction of invalid histograms. [#17561](prometheus/prometheus#17561)
- \[BUGFIX] Remote-write Receive: Avoid duplicate labels when experimental type-and-unit-label feature is enabled. [#17546](prometheus/prometheus#17546)
- \[BUGFIX] OTLP Receiver: Only write metadata to disk when experimental metadata-wal-records feature is enabled. [#17472](prometheus/prometheus#17472)
charleskorn pushed a commit to charleskorn/prometheus that referenced this pull request Jan 15, 2026
* promql: fix histogram_fraction issue when lower falls within the first bucket (prometheus#17424)

Signed-off-by: Mohammad Alavi <m.alavi1986@gmail.com>

* prepare release 3.8.0-rc.0

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* test: skip TestRemoteWrite_ReshardingWithoutDeadlock temporarily as flaky (prometheus#17534) (prometheus#17543)

(cherry picked from commit 35c3232)

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>

* chore(deps): bump prometheus/promci from 0.4.7 to 0.5.0

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* chore(deps): bump prometheus/promci from 0.5.0 to 0.5.1

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* chore(deps): bump prometheus/promci from 0.5.1 to 0.5.2

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* chore(deps): bump prometheus/promci from 0.5.2 to 0.5.3

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* prw2: Move Remote Write 2.0 CT to be per Sample; Rename to ST (start timestamp) (prometheus#17411)

Relates to
prometheus#16944 (comment)

Signed-off-by: bwplotka <bwplotka@gmail.com>
(cherry picked from commit cefefc6)

* chore: prepare 3.8.0-rc.1 entry

Signed-off-by: bwplotka <bwplotka@gmail.com>

* [chore]: bump common dep to support RFC7523 3.1

Signed-off-by: Jorge Turrado <jorge.turrado@mail.schwarz>

* Update Prometheus Agent doc (prometheus#17591)

* Add a nav title to fix docs website generator.
* Make it more clear that "Prometheus Agent" is a mode, not a seaparate
  service.
* Add to index.
* Cleanup some wording.
* Add a downsides section.

Signed-off-by: SuperQ <superq@gmail.com>
(cherry picked from commit d0d2699)

* chore(deps): bump github.com/prometheus/common from 0.67.3 to 0.67.4 (prometheus#17594)

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* prepare release v3.8.0-rc.1

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* prepare release v3.8.0

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

* chore: Fix function name typo in createBatchSpan comment

Signed-off-by: zjumathcode <pai314159@2980.com>

* feat: Add flag that blocks lvl 1 compactions until upload is confirmed in an external JSON file (prometheus#17435)

* Delay compactions until Thanos uploads all blocks

Using Thanos sidecar with Prometheus requires us to disable TSDB compactions on Prometheus side by setting --storage.tsdb.min-block-duration and --storage.tsdb.max-block-duration to the same value. See https://thanos.io/tip/components/sidecar.md. The main problem this avoids is that Prometheus might compact given block before Thanos uploads it, creating a gap in Thanos metrics. Thanos does not upload compacted blocks because that would upload the same sample multiple times. You can tell Thanos to upload compacted blocks but that is aimed at one time migrations. This patch creates a bridge between Thanos and Prometheus by allowing Prometheus to read the shipper file Thanos creates, where it tracks which blocks were already uploaded, and using that data delays compaction of blocks until they are marked as uploaded by Thanos. Thanks to this both services can coordinate with each other (in a way) and we can stop disabling compaction on Prometheus side when Thanos uploads are enabled.

The reason to have this is that disabling compactions have very dramatic performance cost. Since most time series exist for longer than a single block duration (2h by default) large chunks of block index will reference the same series, so 10 * 2h blocks will each have an index that is usually fairly big and is almost the same for all 10 blocks. Compaction de-duplicates the index so merging 10 blocks together would leave us with a single index that is around the same size as each of these 10 2h blocks would have (plus some extra for series that only exists in some blocks, but not all). Every range query that iterates over all 10 blocks would then have to read each index and so we're doing 10x more work then if we had a single compacted block.

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>

* Rename structs and functions to make this more generic

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>

* Address review comments

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>

* Cache UploadMeta for 1 minute

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>

---------

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>

* RW2: Allow custom scope in azuread (prometheus#17483)

Signed-off-by: Ben Edmunds <sammybenblue2@gmail.com>

* docs: Describe how time() is set to start at 0 in unit tests

The return value of functions relating to the current time, e.g. time(),
is set by promtool to start at timestamp 0 at the start of a test's
evaluation.

This has the very nice consequence that tests can run reliably without
depending on when they are run.

It does, however, mean that tests will give out results that can be
unexpected by users.

If this behaviour is documented, then users will be empowered to write
tests for their rules that use time-dependent functions.

(Closes: prometheus/docs#1464)

Signed-off-by: Gabriel Filion <lelutin@torproject.org>

* refactor(tsdb): use one test newTestDB constructor (prometheus#17638)

For tests only, we had various ways of opening DB. Reduced to one
instead of:

* Open
* newTestDB
* newTestDBOpts
* openTestDB

This so prometheus#17629 is smaller
and bit easier. Also for test maintainability and consistency.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Add start_timestamp field for unit tests.

This commit adds support for configuring a custom start timestamp
for Prometheus unit tests, allowing tests to use realistic timestamps
instead of starting at Unix epoch 0.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

* Fix serialization for empty `ignoring()` in combination with `group_x()`

Currently both the backend and frontend printers/formatters/serializers
incorrectly transform the following expression:

```
up * ignoring() group_left(__name__) node_boot_time_seconds
```

...into:

```
up * node_boot_time_seconds
```

...which yields a different result (including the metric name in the result
vs. no metric name).

We need to keep empty `ignoring()` modifiers if there is a grouping modifier
present.

Signed-off-by: Julius Volz <julius.volz@gmail.com>

* Simplify StartTime assignment in unit test setup.

Remove redundant IsZero check since promqltest.LazyLoader already
handles zero StartTime by defaulting to Unix epoch.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>

* Update golangci-lint and add modernize check (prometheus#17640)

* add modernize check

Signed-off-by: dongjiang1989 <dongjiang1989@126.com>

* fix golangci lint

Signed-off-by: dongjiang1989 <dongjiang1989@126.com>

---------

Signed-off-by: dongjiang1989 <dongjiang1989@126.com>

* fix lint

---------

Signed-off-by: Mohammad Alavi <m.alavi1986@gmail.com>
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: Jorge Turrado <jorge.turrado@mail.schwarz>
Signed-off-by: SuperQ <superq@gmail.com>
Signed-off-by: zjumathcode <pai314159@2980.com>
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
Signed-off-by: Ben Edmunds <sammybenblue2@gmail.com>
Signed-off-by: Gabriel Filion <lelutin@torproject.org>
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
Signed-off-by: Julius Volz <julius.volz@gmail.com>
Signed-off-by: dongjiang1989 <dongjiang1989@126.com>
Co-authored-by: Mohammad Alavi <m.alavi1986@gmail.com>
Co-authored-by: Jan Fajerski <jfajersk@redhat.com>
Co-authored-by: Jan Fajerski <jan--f@users.noreply.github.com>
Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Jorge Turrado <jorge.turrado@mail.schwarz>
Co-authored-by: Ben Kochie <superq@gmail.com>
Co-authored-by: zjumathcode <pai314159@2980.com>
Co-authored-by: Łukasz Mierzwa <l.mierzwa@gmail.com>
Co-authored-by: Ben Edmunds <Tigger2014@users.noreply.github.com>
Co-authored-by: Julien <291750+roidelapluie@users.noreply.github.com>
Co-authored-by: Gabriel Filion <lelutin@torproject.org>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Co-authored-by: dongjiang <dongjiang1989@126.com>
Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support setting scope for remote_write when authenticating through AzureAD SDK

2 participants