Description
The request functions in backend_request.func.py set output.success = False whenever a request does not return an HTTP 200 status code. Refused requests are not retried, and failed requests are skipped when metrics are calculated. As a result, an overloaded server scores better on metrics such as E2E latency and TTFT if it refuses requests than if it accepts them and serves them slowly. Because the number of failed requests isn't included in the results JSON, it's hard to tell whether this is affecting any given benchmark.
If one setup refuses requests under load and another accepts them, there doesn't seem to be a fair way to compare these metrics directly. Hopefully this isn't happening, but adding the failure rate to the results output would make it possible to check, and to investigate if it is.
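As a rough illustration of what I mean, here is a minimal sketch of reporting failures alongside the latency metrics. Only the `success` flag comes from the existing code; `RequestOutput`, its `latency` and `ttft` fields, `summarize`, and the result key names are all hypothetical stand-ins and may not match what the benchmark actually uses.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestOutput:
    """Hypothetical stand-in for the per-request output object."""
    success: bool = False
    latency: float = 0.0  # assumed: end-to-end latency in seconds
    ttft: float = 0.0     # assumed: time to first token in seconds


def _mean(values: List[float]) -> Optional[float]:
    # Return None rather than 0.0 so "no data" is not mistaken for "very fast".
    return sum(values) / len(values) if values else None


def summarize(outputs: List[RequestOutput]) -> dict:
    """Compute latency metrics over successful requests and report failures too."""
    total = len(outputs)
    ok = [o for o in outputs if o.success]
    failed = total - len(ok)
    return {
        "total_requests": total,
        "completed": len(ok),
        "failed": failed,
        "failure_rate": failed / total if total else 0.0,
        # Latency metrics still skip failed requests, as they do today,
        # but the failure counts above make that visible in the results.
        "mean_e2el_s": _mean([o.latency for o in ok]),
        "mean_ttft_s": _mean([o.ttft for o in ok]),
    }


if __name__ == "__main__":
    sample = [
        RequestOutput(success=True, latency=1.0, ttft=0.1),
        RequestOutput(success=True, latency=3.0, ttft=0.3),
        RequestOutput(success=False),
    ]
    print(json.dumps(summarize(sample), indent=2))
```

With something like this in the results JSON, two runs with similar latency numbers but very different `failure_rate` values could be flagged as not directly comparable.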