SIGABRT crash on search with huge limit #4321
Description
Current Behavior
When searching with a huge limit such as 99999999999999999, Qdrant crashes with a SIGABRT.
Such a limit is obviously unrealistic, but that doesn't mean we should accept a hard crash.
Steps to Reproduce
- Create a collection with 100k points:

```shell
bfb -d 2
```

- Do a search request with a huge limit:

```
POST collections/benchmark/points/search
{ "limit": 99999999999999999, "vector": [0.1, 0.2] }
```

- Watch Qdrant crash:

```
memory allocation of 800000000000000008 bytes failed
fish: Job 1, './target/perf/qdrant' terminated by signal SIGABRT (Abort)
```
Expected Behavior
No matter the input, we should prevent crashing in all cases. When using this on a cluster, a single request like this could take down 100% of the nodes instantly.
Ideally we'd simply respond with the complete set of points as the search result.
If that isn't possible, we should respond with an error explaining that the specified limit is problematic.
Possible Solution
Here are some potential solutions; we probably want to pick one:

- Memory allocation fails. We either want to test for memory availability or catch the allocation failure somehow, so we can return an error to the user.
- We could internally clamp the limit to the total number of points in the collection. You cannot return more results than you have points. The total point count is somewhat unreliable though, which makes this hard to implement correctly.
- We could implement a different code path for when a huge limit is set, one that sorts search results in a different way and does not require allocating capacity for `limit` items.
- We could also set an artificial upper bound on the limit, but I'd rather see any of the above solutions instead.
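The first option can be sketched with Rust's fallible allocation API. This is a minimal illustration, not Qdrant's actual code; `try_alloc_queue` and the `(f32, u64)` element type are hypothetical stand-ins for the search queue's backing buffer:

```rust
// Sketch: catch the allocation failure instead of aborting the process.
// Assumption: the queue is Vec-backed; `try_reserve_exact` is stable Rust.
fn try_alloc_queue(limit: usize) -> Result<Vec<(f32, u64)>, String> {
    let mut buf: Vec<(f32, u64)> = Vec::new();
    // Unlike `Vec::with_capacity`, this returns an Err on allocation
    // failure rather than raising SIGABRT.
    buf.try_reserve_exact(limit)
        .map_err(|e| format!("limit {limit} is too large to allocate: {e}"))?;
    Ok(buf)
}

fn main() {
    // A huge limit now yields an Err the handler can map to a 4xx response.
    assert!(try_alloc_queue(99_999_999_999_999_999).is_err());
    assert!(try_alloc_queue(1_000).is_ok());
    println!("ok");
}
```

`try_reserve_exact` reports a `TryReserveError` instead of aborting, so the request handler can translate it into a user-facing error.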
Detailed Description
As far as I understand, memory allocation fails in the fixed-length priority queue we use during searches. It tries to allocate capacity for 99999999999999999 items, for which we do not have enough memory.
We had a different implementation before; maybe it is better suited for huge limits, since it did not require such huge up-front allocations.
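For illustration, a top-k selection that never allocates `limit` slots up front could look like the sketch below. This is a hedged example, not Qdrant's implementation; `top_k_scores` is hypothetical, and integer scores stand in for the real float scores:

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// A min-heap keeps only the best `limit` scores seen so far, so memory
// is bounded by min(limit, number of points) instead of `limit` itself.
fn top_k_scores(scores: impl IntoIterator<Item = u64>, limit: usize) -> Vec<u64> {
    let mut heap: BinaryHeap<Reverse<u64>> = BinaryHeap::new();
    for s in scores {
        if heap.len() < limit {
            heap.push(Reverse(s));
        } else if let Some(&Reverse(worst)) = heap.peek() {
            // Replace the current worst entry only if `s` beats it.
            if s > worst {
                heap.pop();
                heap.push(Reverse(s));
            }
        }
    }
    // Drain the heap into descending score order.
    let mut out: Vec<u64> = heap.into_iter().map(|Reverse(s)| s).collect();
    out.sort_unstable_by(|a, b| b.cmp(a));
    out
}

fn main() {
    // Even with an absurd limit, memory stays proportional to the data.
    assert_eq!(top_k_scores([3, 1, 4, 1, 5], 99_999_999_999_999_999), vec![5, 4, 3, 1, 1]);
    assert_eq!(top_k_scores([3, 1, 4, 1, 5], 2), vec![5, 4]);
    println!("ok");
}
```

With this shape, a huge limit degrades into "return everything, sorted" rather than a huge allocation.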