Replace slots_to_channels radix tree with slot specific dictionaries for shard channels. #12804
Conversation
Yeah, this was the discussion we had at the time related to sharded pubsub. We also expect the number of channels to remain relatively low compared to the number of keys (although maybe this is a bad assumption). At the very least I agree with Viktor's suggestion that we should lazily create the dictionaries so we aren't always paying the cost.
Sounds reasonable. Please follow up, @CharlesChen888.
zuiderkwast left a comment:
OK, I'm happy now. 😀
We discussed this in the core team meeting and think it is ready to be merged. @CharlesChen888, please update the top comment to describe the change. Thanks.
Seems we should add more test cases...
@soloestoy I see this doesn't come with any interface changes, but maybe there's another reason to mention it in the release notes?
Maybe we can mention it like this: the `slots_to_channels` radix tree is dropped, so shard channel names no longer require additional storage, which saves memory. And in cluster mode, `server.pubsubshard_channels` is split into 16384 dicts, so counting the shard channels in a specific slot goes from O(n) to O(1).
Correct me if I'm wrong: this isn't exposed in any specific command; it's (a small) part of slot migration (resharding), right?
You are right.
…ing the client (#12896) The code did not delete the corresponding node when traversing the clients, resulting in an infinite loop and causing the `dictDelete() == DICT_OK` assertion to fail. In addition, did a cleanup: in the `dictCreate` scenario, we can avoid a `dictFind` call since the dict is empty. The issue was introduced in #12804.
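As a self-contained illustration of the bug class fixed here (a minimal sketch, not the actual Redis code): when a loop is supposed to consume entries as it traverses, forgetting to unlink the current node makes the loop revisit the same entry, so a second `dictDelete()` on the same key fails and an assertion like `dictDelete(...) == DICT_OK` fires.

```c
#include <stdlib.h>

typedef struct node { int id; struct node *next; } node;

/* Buggy shape: the head is never unlinked, so the loop spins on the
 * first element forever, and any per-entry dictDelete() would run a
 * second time on the same key and fail its DICT_OK assertion. */
void drain_buggy(node **head) {
    while (*head) {
        /* ... process (*head)->id, delete it from a side dict ... */
        /* BUG: missing unlink; *head never advances. */
    }
}

/* Fixed shape: unlink and free the node we just handled before
 * moving on, so each entry is visited (and deleted) exactly once. */
void drain_fixed(node **head) {
    while (*head) {
        node *cur = *head;
        /* ... process cur->id, delete it from a side dict ... */
        *head = cur->next;   /* unlink before freeing */
        free(cur);
    }
}
```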
…re (#12822)

# Description
Gather most of the scattered `redisDb`-related code from the per-slot dict PR (#11695) and turn it into a new data structure, `kvstore`, i.e. a class that represents an array of dictionaries.

# Motivation
The main motivation is code cleanliness: the idea of using an array of dictionaries is very well-suited to becoming a self-contained data structure. This allowed cleaning up some ugly code, among others: loops that run twice, on the main dict and the expires dict, and duplicate code for allocating and releasing this data structure.

# Notes
1. This PR reverts the part of #12848 where the `rehashing` list is global (handling rehashing `dict`s is the responsibility of `kvstore`, and should not be managed by the server).
2. This PR also replaces the type of `server.pubsubshard_channels` from `dict**` to `kvstore` (original PR: #12804). After that was done, `server.pubsub_channels` was also made a `kvstore` (with only one `dict`, which seems odd) just to make the code cleaner by giving it the same type as `server.pubsubshard_channels`; see `pubsubtype.serverPubSubChannels`.
3. The keys and expires kvstores are currently configured to allocate the individual dicts only when the first key is added (unlike before, when they were allocated in advance), but they won't release them when the last key is deleted.

Worth mentioning that due to this change, the reply of DEBUG HTSTATS changed in case no keys were ever added to the db.

before:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
[Expires HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
```
after:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
[Expires HT]
```
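For orientation, a rough C sketch of the kvstore shape described above; the field names here are assumptions for illustration, not the actual `kvstore.h` layout, and `dict`/`list` are Redis's internal types:

```c
/* Illustrative sketch of the kvstore idea: one self-contained struct
 * owning an array of dicts plus aggregate bookkeeping. Field names
 * are assumptions, not the real kvstore.h API. */
typedef struct kvstore {
    dict **dicts;          /* the underlying hash tables, created lazily */
    int num_dicts;         /* 1 in standalone mode, 16384 in cluster mode */
    unsigned long long key_count;  /* total entries across all dicts */
    list *rehashing;       /* dicts currently rehashing, owned by kvstore,
                            * not by the server (reverting part of #12848) */
} kvstore;
```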
…for shard channels. (redis#12804)

We have already replaced the `slots_to_keys` radix tree with a key->slot linked list (redis#9356), and then replaced the list with slot-specific dictionaries for keys (redis#11695).

Shard channels behave just like keys in many ways, and we also need a slots->channels mapping. Currently this is still done with a radix tree. So we should split `server.pubsubshard_channels` into 16384 dicts and drop the radix tree, just like what we did to the DBs.

Some benefits (basically the benefits of what we've done to the DBs):
1. Optimize counting the channels in a slot. This is currently used only when removing the channels in a slot, but it is potentially more useful: sometimes we need to know how many channels there are in a specific slot when doing slot migration. Counting is currently implemented by traversing the radix tree; with this PR it is as simple as calling `dictSize`, going from O(n) to O(1).
2. The radix tree in the cluster is removed. The shard channel names no longer require additional storage, which saves memory.
3. Potentially useful in slot migration, as shard channels are logically split by slots, making it easier to migrate, remove or add them as a whole.
4. Avoid rehashing one big dict when there is a large number of channels.

Drawbacks:
1. Takes more memory than the radix tree when there are relatively few shard channels.

What this PR does:
1. In cluster mode, split `server.pubsubshard_channels` into 16384 dicts; in standalone mode, still use only one dict.
2. Drop the `slots_to_channels` radix tree.
3. To save memory (addressing the drawback above), all 16384 dicts are created lazily: a dict is initialized only when a channel is about to be inserted into it, and when all of its channels are deleted, the dict deletes itself.
4. Use `server.shard_channel_count` to keep track of the total number of shard channels.

---------

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
We have already replaced the `slots_to_keys` radix tree with a key->slot linked list (#9356), and then replaced the list with slot-specific dictionaries for keys (#11695).

Shard channels behave just like keys in many ways, and we also need a slots->channels mapping. Currently this is still done with a radix tree. So we should split `server.pubsubshard_channels` into 16384 dicts and drop the radix tree, just like what we did to the DBs.

Some benefits (basically the benefits of what we've done to the DBs):
1. Optimize counting the channels in a slot. This is currently used only when removing the channels in a slot, but it is potentially more useful: sometimes we need to know how many channels there are in a specific slot when doing slot migration. Counting is currently implemented by traversing the radix tree; with this PR it is as simple as calling `dictSize`, going from O(n) to O(1).
2. The radix tree in the cluster is removed. The shard channel names no longer require additional storage, which saves memory.
3. Potentially useful in slot migration, as shard channels are logically split by slots, making it easier to migrate, remove or add them as a whole.
4. Avoid rehashing one big dict when there is a large number of channels.

Drawbacks:
1. Takes more memory than the radix tree when there are relatively few shard channels.

What this PR does:
1. In cluster mode, split `server.pubsubshard_channels` into 16384 dicts; in standalone mode, still use only one dict.
2. Drop the `slots_to_channels` radix tree.
3. To save memory (addressing the drawback above), all 16384 dicts are created lazily: a dict is initialized only when a channel is about to be inserted into it, and when all of its channels are deleted, the dict deletes itself.
4. Use `server.shard_channel_count` to keep track of the total number of shard channels.
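To make the lazy-creation scheme concrete, here is a minimal sketch (not the actual patch) using Redis's internal dict API from dict.h; the helper names and the plain global array are illustrative assumptions, since the real code hangs these fields off `server` and uses its own dict type.

```c
#include "server.h" /* Redis internals: dict API, keylistDictType, ... */

/* Hypothetical stand-ins for the real server fields. */
static dict *pubsubshard_channels[16384]; /* all NULL until first use */
static unsigned long long shard_channel_count = 0;

/* Return the dict for a slot, creating it lazily on first insert. */
dict *shardChannelsForSlot(int slot, int create) {
    if (pubsubshard_channels[slot] == NULL && create)
        pubsubshard_channels[slot] = dictCreate(&keylistDictType);
    return pubsubshard_channels[slot];
}

void shardChannelAdded(int slot) {
    (void)slot;
    shard_channel_count++;
}

/* When the last channel in a slot goes away, release the dict itself
 * so empty slots cost no memory. */
void shardChannelRemoved(int slot) {
    shard_channel_count--;
    dict *d = pubsubshard_channels[slot];
    if (d && dictSize(d) == 0) {
        dictRelease(d);
        pubsubshard_channels[slot] = NULL;
    }
}

/* Counting the channels in a slot is now O(1) instead of a radix-tree
 * traversal. */
unsigned long countChannelsInSlot(int slot) {
    dict *d = pubsubshard_channels[slot];
    return d ? dictSize(d) : 0;
}
```

In standalone mode the same idea degenerates to a single dict (index 0), which is why the array size is the only cluster-specific part of the scheme.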