-
Notifications
You must be signed in to change notification settings - Fork 24.4k
Refactor the per-slot dict-array db.c into a new kvstore data structure #12822
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@vitarb can you please share your thoughts? |
oranagra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
I guess I'm still not convinced that we need a separate entity that differentiates the DB from the Dict Array, to me they still seem one and the same. There is a lot of logic being silo'd out into the Dictarray that is still pretty heavily coupled to Redis, cursors and per-slot information for sampling. |
|
Note that the merge commit Merge remote-tracking branch 'origin/unstable' into dictarray inserts the logical change of moving the |
|
@oranagra worth mentioning that due to the recent change the reply of DEBUG HTSTATS changed, in case of no keys were ever added to the db. before: after: |
|
ok, i don't mind. but please mention it in the top comment. |
| server.db[j].keys = kvstoreCreate(&dbDictType, slot_count_bits, KVSTORE_ALLOCATE_DICTS_ON_DEMAND); | ||
| server.db[j].expires = kvstoreCreate(&dbExpiresDictType, slot_count_bits, KVSTORE_ALLOCATE_DICTS_ON_DEMAND); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@madolson please comment if you think we should drop this flag.
it means that a cluster node will only allocate a dict when the slot is assigned to it and populated, but on the other hand, when the slot is unassigned, the dict will remain allocated.
till now all 16k dicts were always allocated.
i'm worried that someone will find it odd.
p.s. maybe we can add a kvstore interface to explicitly release a specific empty dict, and call it when a slot is unassigned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zuiderkwast maybe you wanna look into that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not that annoying. It's a bit weird that for static configurations there won't be the extra overhead, but there will be if you scale out and add move slots to other nodes. I don't think it's likely a big issue, and now that there is an interface we can improve it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm bothered by two things:
- after resharding there's wasted memory which we can reclaim.
- the memory usage depends on the history and not just current state.
isn't it really simple to solve? some 10 trivial lines of code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see madolson also reacted with a thumbs up emoji. i try to add it in #13072
|
i'd like to merge the PR as is, and maybe address the last remaining comment later. |
When the dict is NULL, we also need to push resize_cursor, otherwise it will keep doing useless continue here, and there is no way to resize the other dict behind it. Introduced in redis#12822.
After redis#12822, when pubsubshard_channels became empty, kvstoreDictDelete will delete the dict (which is the only one currently deleting dicts that become empty) and in the next loop, we will make an invalid call to dictNext. After the dict becomes empty, we break out of the loop without calling dictNext.
…3033) After #12822, when pubsubshard_channels became empty, kvstoreDictDelete will delete the dict (which is the only one currently deleting dicts that become empty) and in the next loop, we will make an invalid call to dictNext. After the dict becomes empty, we break out of the loop without calling dictNext.
…re (redis#12822) # Description Gather most of the scattered `redisDb`-related code from the per-slot dict PR (redis#11695) and turn it to a new data structure, `kvstore`. i.e. it's a class that represents an array of dictionaries. # Motivation The main motivation is code cleanliness, the idea of using an array of dictionaries is very well-suited to becoming a self-contained data structure. This allowed cleaning some ugly code, among others: loops that run twice on the main dict and expires dict, and duplicate code for allocating and releasing this data structure. # Notes 1. This PR reverts the part of redis#12848 where the `rehashing` list is global (handling rehashing `dict`s is under the responsibility of `kvstore`, and should not be managed by the server) 2. This PR also replaces the type of `server.pubsubshard_channels` from `dict**` to `kvstore` (original PR: redis#12804). After that was done, server.pubsub_channels was also chosen to be a `kvstore` (with only one `dict`, which seems odd) just to make the code cleaner by making it the same type as `server.pubsubshard_channels`, see `pubsubtype.serverPubSubChannels` 3. the keys and expires kvstores are currenlty configured to allocate the individual dicts only when the first key is added (unlike before, in which they allocated them in advance), but they won't release them when the last key is deleted. Worth mentioning that due to the recent change the reply of DEBUG HTSTATS changed, in case no keys were ever added to the db. before: ``` 127.0.0.1:6379> DEBUG htstats 9 [Dictionary HT] Hash table 0 stats (main hash table): No stats available for empty dictionaries [Expires HT] Hash table 0 stats (main hash table): No stats available for empty dictionaries ``` after: ``` 127.0.0.1:6379> DEBUG htstats 9 [Dictionary HT] [Expires HT] ```
…redis#13031) When the dict is NULL, we also need to push resize_cursor, otherwise it will keep doing useless continue here, and there is no way to resize the other dict behind it. Introduced in redis#12822. --------- Co-authored-by: Oran Agra <oran@redislabs.com>
…dis#13033) After redis#12822, when pubsubshard_channels became empty, kvstoreDictDelete will delete the dict (which is the only one currently deleting dicts that become empty) and in the next loop, we will make an invalid call to dictNext. After the dict becomes empty, we break out of the loop without calling dictNext.
…13072) Currently (following #11695, and #12822), keys kvstore and expires kvstore both flag with ON_DEMAND, it means that a cluster node will only allocate a dict when the slot is assigned to it and populated, but on the other hand, when the slot is unassigned, the dict will remain allocated. We considered releasing the dict when the slot is unassigned, but it causes complications on replicas. On the other hand, from benchmarks we conducted, it looks like the performance impact of releasing the dict when it becomes empty and re-allocate it when a key is added again, isn't huge. This PR add KVSTORE_FREE_EMPTY_DICTS to cluster mode keys / expires kvstore. The impact is about about 2% performance drop, for this hopefully uncommon scenario. --------- Co-authored-by: Oran Agra <oran@redislabs.com>
…re (redis#12822) # Description Gather most of the scattered `redisDb`-related code from the per-slot dict PR (redis#11695) and turn it to a new data structure, `kvstore`. i.e. it's a class that represents an array of dictionaries. # Motivation The main motivation is code cleanliness, the idea of using an array of dictionaries is very well-suited to becoming a self-contained data structure. This allowed cleaning some ugly code, among others: loops that run twice on the main dict and expires dict, and duplicate code for allocating and releasing this data structure. # Notes 1. This PR reverts the part of redis#12848 where the `rehashing` list is global (handling rehashing `dict`s is under the responsibility of `kvstore`, and should not be managed by the server) 2. This PR also replaces the type of `server.pubsubshard_channels` from `dict**` to `kvstore` (original PR: redis#12804). After that was done, server.pubsub_channels was also chosen to be a `kvstore` (with only one `dict`, which seems odd) just to make the code cleaner by making it the same type as `server.pubsubshard_channels`, see `pubsubtype.serverPubSubChannels` 3. the keys and expires kvstores are currenlty configured to allocate the individual dicts only when the first key is added (unlike before, in which they allocated them in advance), but they won't release them when the last key is deleted. Worth mentioning that due to the recent change the reply of DEBUG HTSTATS changed, in case no keys were ever added to the db. before: ``` 127.0.0.1:6379> DEBUG htstats 9 [Dictionary HT] Hash table 0 stats (main hash table): No stats available for empty dictionaries [Expires HT] Hash table 0 stats (main hash table): No stats available for empty dictionaries ``` after: ``` 127.0.0.1:6379> DEBUG htstats 9 [Dictionary HT] [Expires HT] ```
…redis#13031) When the dict is NULL, we also need to push resize_cursor, otherwise it will keep doing useless continue here, and there is no way to resize the other dict behind it. Introduced in redis#12822. --------- Co-authored-by: Oran Agra <oran@redislabs.com>
…dis#13033) After redis#12822, when pubsubshard_channels became empty, kvstoreDictDelete will delete the dict (which is the only one currently deleting dicts that become empty) and in the next loop, we will make an invalid call to dictNext. After the dict becomes empty, we break out of the loop without calling dictNext.
…edis#13072) Currently (following redis#11695, and redis#12822), keys kvstore and expires kvstore both flag with ON_DEMAND, it means that a cluster node will only allocate a dict when the slot is assigned to it and populated, but on the other hand, when the slot is unassigned, the dict will remain allocated. We considered releasing the dict when the slot is unassigned, but it causes complications on replicas. On the other hand, from benchmarks we conducted, it looks like the performance impact of releasing the dict when it becomes empty and re-allocate it when a key is added again, isn't huge. This PR add KVSTORE_FREE_EMPTY_DICTS to cluster mode keys / expires kvstore. The impact is about about 2% performance drop, for this hopefully uncommon scenario. --------- Co-authored-by: Oran Agra <oran@redislabs.com>
Description
Gather most of the scattered
redisDb-related code from the per-slot dict PR (#11695) and turn it to a new data structure,kvstore. i.e. it's a class that represents an array of dictionaries.Motivation
The main motivation is code cleanliness, the idea of using an array of dictionaries is very well-suited to becoming a self-contained data structure.
This allowed cleaning some ugly code, among others: loops that run twice on the main dict and expires dict, and duplicate code for allocating and releasing this data structure.
Notes
rehashinglist is global (handling rehashingdicts is under the responsibility ofkvstore, and should not be managed by the server)server.pubsubshard_channelsfromdict**tokvstore(original PR: Replace slots_to_channels radix tree with slot specific dictionaries for shard channels. #12804). After that was done, server.pubsub_channels was also chosen to be akvstore(with only onedict, which seems odd) just to make the code cleaner by making it the same type asserver.pubsubshard_channels, seepubsubtype.serverPubSubChannelsWorth mentioning that due to the recent change the reply of DEBUG HTSTATS changed, in case no keys were ever added to the db.
before:
after: