-
Notifications
You must be signed in to change notification settings - Fork 24.4k
Add HRANDFIELD and ZRANDMEMBER. improvements to SRANDMEMBER #8297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
20a5084 to
d61dfcd
Compare
d61dfcd to
8b165d0
Compare
|
|
|
@yangbodong22011 Is this PR ready for review? |
yes |
oranagra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yangbodong22011 thanks for the PR.
I made a quick first pass on the code, and added some comments.
for some comments i don't have final conclusions about some of them, we need to consult others and figure out what's the right thing to do.
(specifically talking about random order, and map response type for zset, and single bulk response when count is not provided)
In the original PR i commented that i think we may want to break the code into reusable parts, and use them in the 3 commands as shared code (that means modify the srandmemberCommand).
I think what i had in mind is that each both ziplist and dict will each have a function that implements all the logic of the 4 cases, and then Hash, Zset, and Set can just call the right one according to the encoding.
i'm not at all sure if this approach is better than what you implemented, just want to put it on the table again for re-consideration.
I Haven't reviewed the test code yet, but i think that all 4 cases (different code paths) must be covered by the (for both ziplist and non-ziplist encoding).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yangbodong22011 i think i have all the answers.
there are quite a few changes to make but most of them are not a lot of work.
in addition as i asked earlier, i think we need to have tests to cover all the cases (different code paths).
I tried to reply to the original threads, but i see Github shows it below as new threads.
however, all the comments are also shown above in the original threads too.
so please don't look at the comment of this review (blow), instead us the ones above, or use the "files" view
* Add optional WITHSCORES and WITHVALUES * Add variant without count argument * Use array response instead of map * Efficiency fixes in ziplistRandomPair and ziplistRandomPairs * Fix type of empty reply * Other cleanup and fixes
* ziplistRandomPairs returns sds (with length) rather than use strlen so they can contain null chars. * solve leaks, uninitialized memory access, double free
it intended to test a positive count (case 3 or case 4) and by accident used a negative count
|
@redis/core-team please approve |
|
High level comment, we call hashes a collection of fields and values, not members and values. Shouldn't this be HRANDOMFIELD? |
|
Interesting.. I guess you're right. |
|
Agreed, from the highest level, it should be called |
|
the doc is inconsistent. command arguments are saying while the one command name has also what about zset? should it rremain ZRANDMEMBER? |
|
i think that HKEYS is the exception.. everywhere else refers to them as |
|
I upvote HRANDFIELD |
|
Docs ready for inspection at redis/redis-doc#1504 |
madolson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing blocking from me, looks pretty good
* when count is 1, we can use the more efficient algo of non-unique rand * [hash|zset]TypeRandomElement will not create heap based sds, instead it returns a ziplisetEntry (even if source is a dict) * later we can add the value from the ziplistEntry directly to the reply buffers, without extra heap alloc or memcpy * CASE3 still create heap sds, and the reply loop was moved into it, so that the code of CASE4 can be optimized * CASE4 doesn't allocate heap strings for the value, it only adds heap string for the key, and if not in the dup detection dict, it just replies directly. * minor optimization to SRANDMEMBER: work with sds strings rather than robj * when zset needs to convert string to double, we use safer memcpy (in case the buffer is too small) * add HRANDFIELD and ZRANDMEMBER to the corrupt dump fuzzer
|
@soloestoy @madolson please see last commit for some optimizations you requested and more. note that it's possible to do even better some day.
The challenge is with the mixed ziplist and dict encoding, either we split the implementation to handle each separately. but anyway, all of that inefficiency is already present in SRANDMEMBER, and i'm not sure it's right to add all this complexity to redis for this kinda gain (still O(N)), on a command that's probably not used that much. |
madolson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with this, as you said, always room to improve later.
|
It seems this PR introduced the following compile warnings (GCC 10.2.1): |
|
solved in #8444 |
New commands: `HRANDFIELD [<count> [WITHVALUES]]` `ZRANDMEMBER [<count> [WITHSCORES]]` Algorithms are similar to the one in SRANDMEMBER. Both return a simple bulk response when no arguments are given, and an array otherwise. In case values/scores are requested, RESP2 returns a long array, and RESP3 a nested array. note: in all 3 commands, the only option that also provides random order is the one with negative count. Changes to SRANDMEMBER * Optimization when count is 1, we can use the more efficient algorithm of non-unique random * optimization: work with sds strings rather than robj Other changes: * zzlGetScore: when zset needs to convert string to double, we use safer memcpy (in case the buffer is too small) * Solve a "bug" in SRANDMEMBER test: it intended to test a positive count (case 3 or case 4) and by accident used a negative count Co-authored-by: xinluton <xinluton@qq.com> Co-authored-by: Oran Agra <oran@redislabs.com>
New commands:
HRANDFIELD [<count> [WITHVALUES]]ZRANDMEMBER [<count> [WITHSCORES]]Algorithms are similar to the one in SRANDMEMBER.
Both return a simple bulk response when no arguments are given, and an array otherwise.
In case values/scores are requested, RESP2 returns a long array, and RESP3 a nested array.
note: in all 3 commands, the only option that also provides random order is the one with negative count.
Changes to SRANDMEMBER
Other changes:
case the buffer is too small)
case 4) and by accident used a negative count