Thanos, Prometheus and Golang version used:
Thanos Version: 0.31.0
Prometheus Version: 2.42.0
Object Storage Provider:
S3
What happened:
Slack Thread: https://cloud-native.slack.com/archives/CK5RSSC10/p1680534545839619
We recently upgraded Thanos from v0.28.1 to v0.31.0, and we’re now seeing odd behavior in some of our queries. Queries that pull data from the last few hours are all correct. However, longer-range queries return duplicate data as long as “use deduplication” is checked (which has always been the default setting). The really weird part is that if we uncheck that box, we suddenly get the right data.
We have the following CLI flags configured on our thanos-query pods:
- --query.replica-label=replica
- --query.replica-label=prometheus_replica
- --query.replica-label=cluster
- --query.replica-label=__replica__
- --query.replica-label=tenant_id
- --query.replica-label=receive
- --query.replica-label=prometheus
- --query.replica-label=thanos_ruler_replica
- --query.auto-downsampling
- --query.max-concurrent-select=25
- --store.response-timeout=1m
- --query.max-concurrent=500
- --query.metadata.default-time-range=6h
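
For anyone trying to reproduce this outside the UI, here is a minimal sketch of how the two cases could be compared through the Thanos Query HTTP API, which accepts a `dedup` parameter on `/api/v1/query_range`. The base URL, the `up` query, and the helper names are placeholders for illustration, not taken from our setup.

```python
from urllib.parse import urlencode


def build_query_range_url(base_url, query, start, end, step, dedup):
    """Build a query_range URL with deduplication switched on or off."""
    params = {
        "query": query,
        "start": start,
        "end": end,
        "step": step,
        # Equivalent of the "use deduplication" checkbox in the UI.
        "dedup": "true" if dedup else "false",
    }
    return f"{base_url.rstrip('/')}/api/v1/query_range?{urlencode(params)}"


def count_series(response):
    """Count the series in a decoded Prometheus-style JSON response."""
    return len(response["data"]["result"])


if __name__ == "__main__":
    # Hypothetical endpoint and time range; adjust to your environment.
    base = "http://thanos-query.example:9090"
    for dedup in (True, False):
        print(build_query_range_url(base, "up", 1680000000, 1680086400, 300, dedup))
```

Fetching both URLs over a long time range and comparing `count_series` on the decoded responses should show whether the deduplicated result contains series that ought to have been merged.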


What you expected to happen:
I expect to get correct results, as we did with the 0.28.0, 0.29.0, 0.30.0, and 0.30.2 releases.
Anything else we need to know:
After discovering the issue, we rolled back to 0.28.0 and then upgraded one release at a time. The problem only occurs when we upgrade to the 0.31.0 release.