fix(prometheus.remote_write): Fix sent_batch_duration_seconds measuring before the request was sent#5697

Merged
kgeckhart merged 2 commits into main from kgeckhart/fix-sent-batch-duration
Mar 2, 2026

Conversation

@kgeckhart
Contributor

prometheus_remote_storage_sent_batch_duration_seconds was being observed before the HTTP request was sent rather than after it completed, so the metric reflected encoding/serialization time rather than the actual send duration.

Applies the fix from prometheus/prometheus#18214 via a fork replace directive pointing to https://github.com/grafana/prometheus/tree/fix-sent-batch-duration-v0.309.1.

Remove the replace directive when upstream PR #18214 is merged and Prometheus is upgraded.

@kgeckhart kgeckhart requested a review from a team as a code owner March 2, 2026 17:21
@kgeckhart kgeckhart added the backport/v1.14 Backport to release/v1.14 label Mar 2, 2026
@kgeckhart kgeckhart changed the title from "fix(prometheus.remote_write): fix sent_batch_duration_seconds measuring before the request was sent" to "fix(prometheus.remote_write): Fix sent_batch_duration_seconds measuring before the request was sent" Mar 2, 2026
@github-actions
Contributor

github-actions bot commented Mar 2, 2026

🔍 Dependency Review

github.com/prometheus/prometheus v0.309.1 -> github.com/grafana/prometheus v1.8.2-0.20260302171028-8cf60eef5463 — ✅ Safe
  • What changed

  • Impact assessment

    • API stability: No public API or config schema changes to Prometheus packages that are commonly imported by the OpenTelemetry Collector or its Prometheus receiver were introduced by this fork patch. It is a scoped behavioral fix to how an internal metric is timed during remote write.
    • Compatibility: The fork is based on upstream v0.309.1 with the single backported fix. No intermediate upstream breaking changes are included.
    • Runtime/metrics effect: The observed values of sent_batch_duration_seconds are expected to increase, because per the PR description the metric previously covered only encoding/serialization time and now includes the actual send. This affects metric semantics only, not compilation or the runtime behavior of code that depends on Prometheus libraries.
  • Code changes required in this repository

    • None. The change is internal to the Prometheus remote write timing and does not require call-site or import changes.
    • If you maintain dashboards/alerts that rely on sent_batch_duration_seconds, consider validating thresholds due to the corrected timing window. This is not a code change.
  • Evidence

    • Upstream PR description (intent of change): “Fix sent_batch_duration_seconds measuring before the request was sent.” (PR #18214)
    • Grafana fork branch name indicates it is a backport on v0.309.1: fix-sent-batch-duration-v0.309.1.
  • Relevant code snippet (illustrative of the fix; based on the PR description)

    • Per the PR description, the duration was observed before the HTTP send; the fix records it after the send returns:
        start := time.Now()
        req, err := http.NewRequest("POST", url, body)
        if err != nil {
            // ...
        }
      - sentBatchDurationSeconds.Observe(time.Since(start).Seconds())
        resp, err := client.Do(req)
        // ...
      + sentBatchDurationSeconds.Observe(time.Since(start).Seconds())
  • Notes on versioning/replace

    • Although the fork resolves to the pseudo-version v1.8.2-..., the replace directive maps github.com/prometheus/prometheus to github.com/grafana/prometheus, so no import path or module major-version changes are required in this repository.

Notes

  • The other replace directives for contrib receivers remain unchanged; only the Prometheus module is newly replaced to a fork with a single targeted fix.
  • Once upstream PR #18214 is merged and the project updates to a Prometheus release that includes it, the replace can be removed as indicated by your inline comments.

@kgeckhart kgeckhart enabled auto-merge (squash) March 2, 2026 17:36
@kgeckhart kgeckhart merged commit 10cfb6c into main Mar 2, 2026
47 checks passed
@kgeckhart kgeckhart deleted the kgeckhart/fix-sent-batch-duration branch March 2, 2026 17:49
@grafana-alloybot grafana-alloybot bot mentioned this pull request Mar 2, 2026
grafana-alloybot bot pushed a commit that referenced this pull request Mar 2, 2026
…ng before the request was sent (#5697)

(cherry picked from commit 10cfb6c)
kgeckhart added a commit that referenced this pull request Mar 2, 2026
…ng before the request was sent [backport] (#5698)

## Backport of #5697

This PR backports #5697 to release/v1.14.

### Original PR Author
@kgeckhart

---
*This backport was created automatically.*

Co-authored-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>
Labels

backport/v1.14 Backport to release/v1.14

2 participants