Deduplication returning deduped and non-deduped results in 0.31.0+ #6257

@diranged

Thanos, Prometheus and Golang version used:

Thanos Version: 0.31.0
Prometheus Version: 2.42.0

Object Storage Provider:

S3

What happened:

Slack Thread: https://cloud-native.slack.com/archives/CK5RSSC10/p1680534545839619

We recently upgraded Thanos from v0.28.1 to v0.31.0, and some of our queries now behave oddly. Queries that pull data from the last few hours are all correct. Longer-range queries, however, return duplicate data whenever “use deduplication” is checked (which has been the default setting forever). The really weird part is that if we uncheck that box, we suddenly get the right data.
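
For anyone trying to reproduce this outside the UI, here is a minimal sketch that builds the same range query with deduplication on and off. It relies on the `dedup` parameter of the Thanos Query HTTP API; the base URL, the `up` expression, and the time range are placeholders rather than values from our setup, and the helper names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_range_url(base_url, expr, start, end, step, dedup):
    """Build a /api/v1/query_range URL with the Thanos `dedup` parameter set."""
    params = {
        "query": expr,
        "start": start,
        "end": end,
        "step": step,
        "dedup": "true" if dedup else "false",
    }
    return f"{base_url.rstrip('/')}/api/v1/query_range?{urlencode(params)}"


def series_count(response):
    """Number of series in a Prometheus-style range query response."""
    return len(response["data"]["result"])


def fetch(url):
    """Fetch and decode a query response (not called here; needs a live querier)."""
    with urlopen(url) as resp:
        return json.load(resp)
```

Calling `fetch` on the two URLs (one with `dedup=True`, one with `dedup=False`) and comparing `series_count` for a short and a long time range should show whether the deduplicated result contains more series than expected.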

We have the following CLI flags configured on our thanos-query pods:

            - --query.replica-label=replica
            - --query.replica-label=prometheus_replica
            - --query.replica-label=cluster
            - --query.replica-label=__replica__
            - --query.replica-label=tenant_id
            - --query.replica-label=receive
            - --query.replica-label=prometheus
            - --query.replica-label=thanos_ruler_replica
            - --query.auto-downsampling
            - --query.max-concurrent-select=25
            - --store.response-timeout=1m
            - --query.max-concurrent=500
            - --query.metadata.default-time-range=6h
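
To make the intent of the `--query.replica-label` flags above concrete, the following is a conceptual sketch only, not Thanos's actual implementation: series whose label sets are identical once the replica labels are ignored are treated as replicas of one another and should collapse into a single series. The function names and the sample label sets are hypothetical.

```python
from collections import defaultdict

# Replica labels taken from the --query.replica-label flags listed above.
REPLICA_LABELS = {
    "replica", "prometheus_replica", "cluster", "__replica__",
    "tenant_id", "receive", "prometheus", "thanos_ruler_replica",
}


def dedup_key(labels):
    """Label set with replica labels removed, as a hashable, sorted tuple."""
    return tuple(sorted((k, v) for k, v in labels.items() if k not in REPLICA_LABELS))


def group_replicas(series):
    """Group series label sets that differ only in their replica labels."""
    groups = defaultdict(list)
    for labels in series:
        groups[dedup_key(labels)].append(labels)
    return dict(groups)


# Hypothetical input: two replicas of one series plus one unrelated series.
sample = [
    {"__name__": "up", "job": "node", "prometheus_replica": "a"},
    {"__name__": "up", "job": "node", "prometheus_replica": "b"},
    {"__name__": "up", "job": "api", "prometheus_replica": "a"},
]
groups = group_replicas(sample)
```

With deduplication enabled we would expect one result per group; the behavior we see on longer queries looks as if more than one series per group is being returned.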

(Two images were attached here in the original issue.)

What you expected to happen:

I expect to get correct results, as we did with the 0.28.0, 0.29.0, 0.30.0, and 0.30.2 releases.

Anything else we need to know:

After discovering the issue, we rolled back to 0.28.0 and then upgraded one release at a time. The problem only occurs once we upgrade to the 0.31.0 release.
