Description
The gRPC RateLimitService has only one method, ShouldRateLimit, and hits_addend is assumed to be 1 when no value has been explicitly set. As a result, there is no way to check quota without also incrementing it.
Why would a person want to get quota without incrementing it?
One example is if you're dealing with expensive requests but you don't know how expensive they are until after they've run. Maybe you add up the CPU seconds spent serving a request and increment it at the end of the request. You'd still need to check (but not increment) at the beginning of requests whether the CPU-second quota has been exhausted by previous requests during the current time unit.
A hacky workaround today might be to make sure that expensive requests are measured in large numbers (like hundreds of thousands) and to check quota by incrementing by a very small number (like 1), so that the check effectively acts as if it hadn't incremented anything at all.
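A minimal sketch of that workaround, using an in-memory stand-in rather than the real service. `ToyLimiter`, `SCALE`, and `handle` are hypothetical names, and the increment-then-compare behaviour is an assumption made for illustration:

```python
from dataclasses import dataclass, field

SCALE = 100_000  # hypothetical: quota units charged per CPU-second


@dataclass
class ToyLimiter:
    """In-memory stand-in for a rate limit service (not the real implementation)."""
    limit: int
    counts: dict = field(default_factory=dict)

    def should_rate_limit(self, key: str, hits_addend: int = 1) -> bool:
        # Assumption: every call adds hits_addend and then compares to the
        # limit, so there is no way to ask without adding at least 1.
        self.counts[key] = self.counts.get(key, 0) + hits_addend
        return self.counts[key] > self.limit


limiter = ToyLimiter(limit=10 * SCALE)  # 10 CPU-seconds per time unit


def handle(key: str, cpu_seconds: float) -> str:
    # Probe at the start with the smallest possible addend.
    if limiter.should_rate_limit(key, hits_addend=1):
        return "rejected"
    # ... serve the request and measure its cost (cpu_seconds) ...
    # Charge the real cost once it is known.
    limiter.should_rate_limit(key, hits_addend=int(cpu_seconds * SCALE))
    return "served"
```

Each probe still adds 1 to the counter, so the error grows with the number of checks; the scaling factor only makes that error small relative to real costs.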
How might this be implemented?
Seems like there are a few possible approaches to getting quota without changing it:
- Do something clever and subtle like saying that a `hits_addend` equal to uint32_max means zero
- Add some kind of a `check_only` flag to the `RateLimitRequest` message
- Make a separate `IsRateLimited` method on `RateLimitService`
- Make a separate non-standard grpc service specific to this rate limiter implementation so the `RateLimitService` in envoy proxy doesn't need to change
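To make the `check_only` option concrete, here is a rough sketch of how such a flag might behave. This is a toy in-memory model, not the actual proto or service: the `check_only` field does not exist today, and `ToyRateLimitService` and the choice to treat "count has reached the limit" as exhausted are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RateLimitRequest:
    key: str
    hits_addend: int = 1
    check_only: bool = False  # hypothetical field proposed in this issue


@dataclass
class ToyRateLimitService:
    """In-memory model of the proposed semantics (assumption, not real code)."""
    limit: int
    counts: dict = field(default_factory=dict)

    def should_rate_limit(self, req: RateLimitRequest) -> bool:
        current = self.counts.get(req.key, 0)
        if req.check_only:
            # Read-only path: report whether earlier requests have already
            # used up the quota, without touching the counter.
            return current >= self.limit
        # Normal path: add hits_addend, then compare against the limit.
        current += req.hits_addend
        self.counts[req.key] = current
        return current > self.limit


svc = ToyRateLimitService(limit=10)
before = svc.should_rate_limit(RateLimitRequest("k", check_only=True))
charged = svc.should_rate_limit(RateLimitRequest("k", hits_addend=10))
after = svc.should_rate_limit(RateLimitRequest("k", check_only=True))
```

With this shape, a caller could check at the start of a request and charge the measured cost at the end, and existing clients that never set the flag would see no change in behaviour.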