apply max-series-per-req to non-tagged queries#1926
Conversation
1faf7d3 to
b9c3b64
Compare
fetchFunc can use this to determine the ratio of how much data the target peer owns compared to the cluster as a whole. Caveat: this is based on live cluster state. If shardgroups go completely down it'll look like the target peer owns more of the cluster than it actually does. if query limits are set based on this, the limits would loosen up as shards leave the cluster.
we do this by adjusting queryAllShards such that it doesn't assume a simple peer.Post but allows passing in the fetchFunc, such that we can define a custom fetchFunc that can figure out the fraction of the data that our target peer is responsible for
|
tested in docker-cluster-query with these tweaks: and on all metrictank processes demonstration single targetdemonstration of breaching limit across multiple targetsnote: this one works because we dedup identical targets. bit of an (unlikely) loophole. it returns all targets twice. I also verified that the check for http400 leading to aborting the spec-exec works leading to a cancelation. I also made a couple of prints for the findCache and tested querying |
b9c3b64 to
df76f3e
Compare
note that we also now cache limit breaches in the findcache
if the request is bad, there is no point retrying on a different replica of the same shard. breaking the limit on one, will also break it on the 2nd. See also #985
df76f3e to
68637ef
Compare
|
caveat: this all sounds reasonable though. in practice this should typically not have an adverse effect. |
cache results depend on the limit.
68637ef to
596df84
Compare
robert-milan
left a comment
There was a problem hiding this comment.
I've reviewed it, tested it locally. Overall I am pretty sure it is fine, but would like a little more time to look into 1 or 2 things.
robert-milan
left a comment
There was a problem hiding this comment.
I think this should work fine.
max-series-per-reqhas been implemented for tagged queries for a quite a while, but wasn't implemented yet for non-tagged queries. We have seen customers do excessively large queries leading to lots of allocated memory, we want to reject the queries instead. fix #1916This PR does the following:
100 * 2 / 10 = 20.see all individual commits for more details.