[vector sets] VRANGE implementation#14235
Conversation
🎉 Snyk checks have passed. No issues have been found so far.✅ security/snyk check is complete. No issues have been found. (View Details) ✅ license/snyk check is complete. No issues have been found. (View Details) |
|
New Issues (7)Checkmarx found the following issues in this Pull Request
Fixed Issues (4)Great job! The following issues were fixed in this Pull Request
|
GuyAv46
left a comment
There was a problem hiding this comment.
- Are tests not running on CI? Coverage is 0
- Are we set on
VRANGEas the command name? IMO it's confusing with vector range queries. MaybeVLEXRANGE?
| void *data; | ||
| while ((key_data = RedisModule_DictNextC(iter, &key_len, &data)) != NULL) { |
There was a problem hiding this comment.
Not using data, and it's an optional parameter of the API. Consider removing
| void *data; | |
| while ((key_data = RedisModule_DictNextC(iter, &key_len, &data)) != NULL) { | |
| while ((key_data = RedisModule_DictNextC(iter, &key_len, NULL)) != NULL) { |
| result = self.redis.execute_command('VRANGE', self.test_key, '(apple', '[cherry', '10') | ||
| result = [r.decode() for r in result] | ||
| assert 'apple' not in result, "Exclusive boundary should not include 'apple'" | ||
| assert 'apricot' in result and 'banana' in result and 'cherry' in result, "Should include elements after apple up to cherry" |
There was a problem hiding this comment.
Consider asserting the full reply
| result = self.redis.execute_command('VRANGE', self.test_key, '[banana', '(cherry', '10') | ||
| result = [r.decode() for r in result] | ||
| assert 'banana' in result, "Should include banana" | ||
| assert 'cherry' not in result, "Exclusive end should not include cherry" |
There was a problem hiding this comment.
Consider asserting the full reply.
The same comment applies to the cases listed below.
|
|
||
| # Test 8: Count of 0 returns empty array | ||
| result = self.redis.execute_command('VRANGE', self.test_key, '-', '+', '0') | ||
| assert len(result) == 0, "Count of 0 should return empty array" |
There was a problem hiding this comment.
Consider asserting the returned value, and not its size. If something breaks, we won't know what we have got.
| assert len(result) == 0, "Count of 0 should return empty array" | |
| assert result == [], "Count of 0 should return empty array" |
Same comment for the other cases below
|
Thank you @GuyAv46 I improved the PR with your suggestions. |
GuyAv46
left a comment
There was a problem hiding this comment.
What's the plan for vector range queries? My only question is whether VRANGE should be kept for that command, and we should call this new command VLEXRANGE or another name indicating lexicographic ranges and not of embedding distances
|
@GuyAv46 thank you for the question. No problem for range queries: there is no plan for fetching ranges from vectors, as any total order of vectors in a high dimensional space is basically meaningless, more or less, and the "range" that is useful is basically VSIM: the subset of vectors near something. So it is worth to simplify naming for the range of vector sets. In the past, the names of certain Redis commands were probably a bit too fancy, but at least for a good reason. Here there are no such concerns. |
|
I'm rebasing the changes so that the PR applies again cleanly. |
|
Rebased and force-pushed. |
| "summary": "Return elements in a lexicographical range", | ||
| "complexity": "O(log(K)+M) where K is the number of elements in the start prefix, and M is the number of elements returned. In practical terms, the command is just O(M)", | ||
| "group": "vector_set", | ||
| "since": "8.3.0", |
There was a problem hiding this comment.
Should the next version be 8.4?
In #14235 we forgot to skip vset module command `VRANGE` for schema validator.



This is basically the Vector Set iteration primitive.
It exploits the underlying radix tree implementation.
The usage pattern is strongly reminiscent of other Redis commands doing similar things.
The command usage is straightforward:
The above command returns 10 (or less, if less are available in the specified range) elements from "Redis" (inclusive) to the maximum possible element. The comparison is performed byte by byte, as
memcmp()would do, in this way the elements have a total order. The start and end range can be either a string, prefixedby
[or((the prefix is mandatory) to tell the command if the range is inclusive or exclusive, or can be the special symbols-and+that means the maximum and minimum element.More info can be found in the implementation itself and in the README file change.