Add HRANDFIELD and ZRANDMEMBER. improvements to SRANDMEMBER #8297

yangbodong22011 · 2021-01-07T12:40:49Z

New commands:
HRANDFIELD [<count> [WITHVALUES]]
ZRANDMEMBER [<count> [WITHSCORES]]
Algorithms are similar to the one in SRANDMEMBER.

Both return a single bulk reply when no count is given, and an array otherwise.
When values/scores are requested, RESP2 returns one long flat array, and RESP3 returns a nested array.
Note: in all 3 commands, the only variant that also returns the elements in random order is the one with a negative count.
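
To make the two reply layouts concrete, here is a minimal Python sketch of how field/value (or member/score) pairs would be arranged under each protocol. The function name `shape_reply` and the use of plain Python lists are illustrative assumptions, not Redis code.

```python
from typing import List, Tuple, Union

Pair = Tuple[str, str]


def shape_reply(pairs: List[Pair], resp3: bool) -> List[Union[str, List[str]]]:
    """Lay out (field, value) pairs the way the description says each protocol does.

    RESP2: one long flat array [f1, v1, f2, v2, ...].
    RESP3: an array of two-element arrays [[f1, v1], [f2, v2], ...].
    """
    if resp3:
        return [[field, value] for field, value in pairs]
    flat: List[Union[str, List[str]]] = []
    for field, value in pairs:
        flat.extend([field, value])
    return flat
```

A RESP2 client therefore has to re-pair consecutive elements itself, while a RESP3 client receives the pairs already grouped.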

Changes to SRANDMEMBER

* Optimization: when count is 1, we can use the more efficient algorithm of non-unique random
* Optimization: work with sds strings rather than robj

Other changes:

* zzlGetScore: when zset needs to convert string to double, we use safer memcpy (in case the buffer is too small)
* Solve a "bug" in SRANDMEMBER test: it intended to test a positive count (case 3 or case 4) and by accident used a negative count

yangbodong22011 · 2021-01-12T03:29:50Z

The O(n) random algorithm for ziplist is done, so I'm opening the PR for review. Comments are welcome.
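
As a rough illustration of how random picks (repeats allowed) can be served from a sequentially stored list in a single pass: draw the random indices first, sort them, then walk the list once and write each hit back into its original draw position. This is a Python sketch of the idea under those assumptions, not the ziplist.c code; `random_pairs` and its parameters are hypothetical names.

```python
import random
from typing import List, Optional, Sequence, Tuple


def random_pairs(flat: Sequence[str], count: int,
                 rng: Optional[random.Random] = None) -> List[Tuple[str, str]]:
    """Pick `count` random (key, value) pairs, repeats allowed, in one forward walk.

    `flat` stands in for a ziplist-like layout: [k0, v0, k1, v1, ...].
    """
    rng = rng or random.Random()
    total = len(flat) // 2
    if total == 0 or count <= 0:
        return []
    # Draw all indices up front, remembering the order in which they were drawn.
    picks = sorted((rng.randrange(total), order) for order in range(count))
    result: List[Optional[Tuple[str, str]]] = [None] * count
    it = iter(flat)
    pick_i = 0
    # Single pass over the list: consume key/value pairs until all picks are served.
    for pair_index, (key, value) in enumerate(zip(it, it)):
        while pick_i < count and picks[pick_i][0] == pair_index:
            # Store at the original draw position so the output order stays random.
            result[picks[pick_i][1]] = (key, value)
            pick_i += 1
        if pick_i == count:
            break
    return [p for p in result if p is not None]
```

The sort costs O(count log count), but the list itself is traversed only once, which matters for an encoding that has no random access.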

madolson · 2021-01-19T07:06:00Z

@yangbodong22011 Is this PR ready for review?

yangbodong22011 · 2021-01-19T07:11:37Z

> @yangbodong22011 Is this PR ready for review?

yes

oranagra

@yangbodong22011 thanks for the PR.

I made a quick first pass on the code, and added some comments.

For some of the comments I don't have final conclusions yet; we need to consult others and figure out what's the right thing to do.
(specifically: random order, the map response type for zset, and the single bulk response when count is not provided)

In the original PR I commented that we may want to break the code into reusable parts and use them in the 3 commands as shared code (that means modifying srandmemberCommand).
What I had in mind is that ziplist and dict would each have a function that implements all the logic of the 4 cases, and then Hash, Zset, and Set can just call the right one according to the encoding.
I'm not at all sure this approach is better than what you implemented; I just want to put it on the table again for reconsideration.

I haven't reviewed the test code yet, but I think all 4 cases (different code paths) must be covered by the tests (for both ziplist and non-ziplist encodings).
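
For readers unfamiliar with the "4 cases", the following Python sketch shows the general shape of such a dispatch, loosely modeled on how SRANDMEMBER is usually described: a negative count allows repeats, a count covering the whole collection returns everything, a count close to the size removes random elements from a copy, and a small count samples until enough unique elements are found. The threshold constant `SUB_STRATEGY_MUL = 3` and all names here are assumptions for illustration, not taken from this PR.

```python
import random
from typing import Dict, List, Optional, Sequence

SUB_STRATEGY_MUL = 3  # assumed threshold for choosing case 3 over case 4


def rand_members(items: Sequence[str], count: int,
                 rng: Optional[random.Random] = None) -> List[str]:
    """Return random members of `items` following a four-case strategy."""
    rng = rng or random.Random()
    size = len(items)
    if size == 0 or count == 0:
        return []
    # Case 1: negative count, repeats allowed, result order is random.
    if count < 0:
        return [items[rng.randrange(size)] for _ in range(-count)]
    # Case 2: count covers the whole collection, return everything.
    if count >= size:
        return list(items)
    # Case 3: count is close to size, so copy and drop random elements.
    if count * SUB_STRATEGY_MUL > size:
        copy = list(items)
        while len(copy) > count:
            copy.pop(rng.randrange(len(copy)))
        return copy
    # Case 4: count is small, so sample until we have enough unique elements.
    chosen: Dict[int, str] = {}
    while len(chosen) < count:
        idx = rng.randrange(size)
        chosen.setdefault(idx, items[idx])
    return list(chosen.values())
```

Covering "all 4 cases" in tests then means choosing counts that hit each branch, for example -50, 20, 15, and 3 against a 20-element collection, once per encoding.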

oranagra

@yangbodong22011 I think I have all the answers.
There are quite a few changes to make, but most of them are not a lot of work.
In addition, as I asked earlier, I think we need tests that cover all the cases (different code paths).

I tried to reply to the original threads, but I see GitHub shows the replies below as new threads.
However, all the comments are also shown above in the original threads.
So please don't look at the comments of this review (below); instead use the ones above, or use the "files" view.

oranagra · 2021-01-27T13:54:14Z

@redis/core-team please approve

madolson · 2021-01-27T21:48:32Z

High level comment, we call hashes a collection of fields and values, not members and values. Shouldn't this be HRANDOMFIELD?

oranagra · 2021-01-27T22:08:56Z

Interesting.. I guess you're right.
Sits well with the fact HLEN != SCARD
@itamarhaber WDYT?

itamarhaber · 2021-01-28T11:51:16Z

Agreed, from the highest level, it should be called HRANDKEY following the HKEYS convention.

oranagra · 2021-01-28T12:03:16Z

The docs are inconsistent. Command arguments say "field":

HMSET key field value [field value ...]

while one command name says "keys":

HKEYS key

Also, what about zset? Should it remain ZRANDMEMBER?

ZADD key [NX|XX] [GT|LT] [CH] [INCR] score member [score member ...]

oranagra · 2021-01-28T12:05:40Z

i think that HKEYS is the exception.. everywhere else refers to them as fields.
so i vote for HRANDFIELD and ZRANDMEMBER

itamarhaber · 2021-01-28T16:51:16Z

I upvote HRANDFIELD

itamarhaber · 2021-01-28T17:16:50Z

Docs ready for inspection at redis/redis-doc#1504

madolson

Nothing blocking from me, looks pretty good

* when count is 1, we can use the more efficient algo of non-unique rand
* [hash|zset]TypeRandomElement will not create heap based sds, instead it returns a ziplistEntry (even if source is a dict)
* later we can add the value from the ziplistEntry directly to the reply buffers, without extra heap alloc or memcpy
* CASE3 still creates heap sds, and the reply loop was moved into it, so that the code of CASE4 can be optimized
* CASE4 doesn't allocate heap strings for the value, it only adds a heap string for the key, and if it is not in the dup detection dict, it just replies directly
* minor optimization to SRANDMEMBER: work with sds strings rather than robj
* when zset needs to convert string to double, we use safer memcpy (in case the buffer is too small)
* add HRANDFIELD and ZRANDMEMBER to the corrupt dump fuzzer

oranagra · 2021-01-28T23:00:37Z

@soloestoy @madolson please see last commit for some optimizations you requested and more.

note that it's possible to do even better some day.

CASE3 doesn't need to duplicate all the data, it can create a shallow copy.
CASE4 doesn't need to duplicate the keys, it can use references to the keys of the main object.

The challenge is with the mixed ziplist and dict encoding: either we split the implementation to handle each separately,
or we add the ziplist entry pointer (p) to ziplistEntry and create a special type of dict with a hash function and a compare function that work on the ziplistEntry.
The compare function only needs to match the pointers, since they're a shallow copy of either the ziplist or the dict of the original object.
The hash function may need to convert a long to a string.

but anyway, all of that inefficiency is already present in SRANDMEMBER, and i'm not sure it's right to add all this complexity to redis for this kinda gain (still O(N)), on a command that's probably not used that much.

madolson

I'm good with this, as you said, always room to improve later.

zuiderkwast · 2021-02-05T17:45:07Z

It seems this PR introduced the following compile warnings (GCC 10.2.1):

ziplist.c: In function ‘ziplistRandomPairs’:
ziplist.c:1534:16: warning: ‘vlval’ may be used uninitialized in this function [-Wmaybe-uninitialized]
 1534 |     dest->lval = lval;
      |     ~~~~~~~~~~~^~~~~~
ziplist.c:1543:22: note: ‘vlval’ was declared here
 1543 |     long long klval, vlval;
      |                      ^~~~~
ziplist.c:1534:16: warning: ‘klval’ may be used uninitialized in this function [-Wmaybe-uninitialized]
 1534 |     dest->lval = lval;
      |     ~~~~~~~~~~~^~~~~~
ziplist.c:1543:15: note: ‘klval’ was declared here
 1543 |     long long klval, vlval;
      |               ^~~~~
ziplist.c:1533:16: warning: ‘vlen’ may be used uninitialized in this function [-Wmaybe-uninitialized]
 1533 |     dest->slen = len;
      |     ~~~~~~~~~~~^~~~~
ziplist.c:1542:24: note: ‘vlen’ was declared here
 1542 |     unsigned int klen, vlen;
      |                        ^~~~
ziplist.c:1533:16: warning: ‘klen’ may be used uninitialized in this function [-Wmaybe-uninitialized]
 1533 |     dest->slen = len;
      |     ~~~~~~~~~~~^~~~~
ziplist.c:1542:18: note: ‘klen’ was declared here
 1542 |     unsigned int klen, vlen;
      |                  ^~~~
ziplist.c:1532:16: warning: ‘value’ may be used uninitialized in this function [-Wmaybe-uninitialized]
 1532 |     dest->sval = val;
      |     ~~~~~~~~~~~^~~~~
ziplist.c:1541:30: note: ‘value’ was declared here
 1541 |     unsigned char *p, *key, *value;
      |                              ^~~~~

oranagra · 2021-02-05T18:20:20Z

solved in #8444

New commands:
`HRANDFIELD [<count> [WITHVALUES]]`
`ZRANDMEMBER [<count> [WITHSCORES]]`
Algorithms are similar to the one in SRANDMEMBER.

Both return a simple bulk response when no arguments are given, and an array otherwise.
In case values/scores are requested, RESP2 returns a long array, and RESP3 a nested array.
note: in all 3 commands, the only option that also provides random order is the one with negative count.

Changes to SRANDMEMBER
* Optimization when count is 1, we can use the more efficient algorithm of non-unique random
* optimization: work with sds strings rather than robj

Other changes:
* zzlGetScore: when zset needs to convert string to double, we use safer memcpy (in case the buffer is too small)
* Solve a "bug" in SRANDMEMBER test: it intended to test a positive count (case 3 or case 4) and by accident used a negative count

Co-authored-by: xinluton <xinluton@qq.com>
Co-authored-by: Oran Agra <oran@redislabs.com>

yangbodong22011 marked this pull request as draft January 7, 2021 12:41

yangbodong22011 force-pushed the hrandmember branch from 20a5084 to d61dfcd Compare January 8, 2021 03:54

dewxin and others added 4 commits January 12, 2021 10:50

add hrandmember command

bead09b

hrandmember support uniq for ziplist

d855c86

add zrandmember command

0a7a9d3

Ziplist O(n) random algorithm

8b165d0

yangbodong22011 dismissed a stale review via 8b165d0 January 12, 2021 03:24

yangbodong22011 force-pushed the hrandmember branch from d61dfcd to 8b165d0 Compare January 12, 2021 03:24

yangbodong22011 marked this pull request as ready for review January 12, 2021 03:32

oranagra reviewed Jan 21, 2021

oranagra reviewed Jan 24, 2021

oranagra added 5 commits January 26, 2021 09:55

Code review fixes by Oran

a06d290

* Add optional WITHSCORES and WITHVALUES
* Add variant without count argument
* Use array response instead of map
* Efficiency fixes in ziplistRandomPair and ziplistRandomPairs
* Fix type of empty reply
* Other cleanup and fixes

Merge remote-tracking branch 'origin/unstable' into hrandmember

2bb7396

Solve issues with binary safe strings, and valgrind warnings

09636f9

* ziplistRandomPairs returns sds (with length) rather than use strlen so they can contain null chars.
* solve leaks, uninitialized memory access, double free

Solve "bug" in SRANDMEMBER test.

2329e32

it intended to test a positive count (case 3 or case 4) and by accident used a negative count

Add full test coverage for all code paths

db4a405

oranagra mentioned this pull request Jan 27, 2021

add hrandmember command #8219

Closed

oranagra added the approval-needed (Waiting for core team approval to be merged), release-notes (indication that this issue needs to be mentioned in the release notes), state:major-decision (Requires core team consensus), and state:needs-doc-pr (requires a PR to redis-doc repository) labels Jan 27, 2021

itamarhaber mentioned this pull request Jan 27, 2021

Adds HRANDMEMBER and ZRANDMEMBER, revamps SRANDMEMBER redis/redis-doc#1504

Merged

itamarhaber previously approved these changes Jan 27, 2021

yossigo previously approved these changes Jan 27, 2021

rename HRANDMEMBER to HRANDFIELD

daca897

yossigo previously approved these changes Jan 28, 2021

oranagra changed the title ~~Add hrandmember and zrandmember~~ Add hrandfield and zrandmember Jan 28, 2021

madolson previously approved these changes Jan 28, 2021

oranagra dismissed stale reviews from madolson and yossigo via b145d25 January 28, 2021 22:49

madolson previously approved these changes Jan 28, 2021

soloestoy reviewed Jan 29, 2021

changes suggested by Zhao

c456e03

oranagra dismissed madolson’s stale review via c456e03 January 29, 2021 08:33

oranagra requested a review from soloestoy January 29, 2021 08:33

soloestoy approved these changes Jan 29, 2021

oranagra merged commit b9a0500 into redis:unstable Jan 29, 2021

oranagra changed the title ~~Add hrandfield and zrandmember~~ Add HRANDFIELD and ZRANDMEMBER. improvements to SRANDMEMBER Jan 29, 2021

itamarhaber added a commit to redis/redis-doc that referenced this pull request Jan 29, 2021

Adds HRANDMEMBER and ZRANDMEMBER, revamps SRANDMEMBER (#1504)

47c4105

Ref: redis/redis#8297

oranagra mentioned this pull request Jan 31, 2021

6.2 RC3 #8427

Merged

oranagra reviewed Feb 1, 2021

mp911de mentioned this pull request Feb 2, 2021

Add support for HRANDFIELD and ZRANDMEMBER commands redis/lettuce#1605

Closed

madolson mentioned this pull request Feb 16, 2021

SRANDMEMBER RESP3 return should be Array, not Set #8504

Merged

oranagra mentioned this pull request Jul 15, 2021

New HRANDKEY command for getting random keys from a hash #2646

Closed
