Skip to content

Conversation

@soloestoy
Copy link
Contributor

@soloestoy soloestoy commented Dec 8, 2023

After #11695, we added two functions rehashingStarted and rehashingCompleted to the dict structure. We also registered two handlers for the main database's dict and expire structures. This allows the main database to record the dict in rehashing list when rehashing starts. Later, in serverCron, the incrementallyRehash function is continuously called to perform the rehashing operation. However, currently, when rehashing is completed, rehashingCompleted does not remove the dict from the rehashing list. This results in the rehashing list containing many invalid dicts. Although subsequent cron checks and removes dicts that don't require rehashing, it is still inefficient.

This PR implements the functionality to remove the dict from the rehashing list in rehashingCompleted. This is achieved by adding metadata to the dict structure, which keeps track of its position in the rehashing list, allowing for quick removal. This approach avoids storing duplicate dicts in the rehashing list.

Additionally, there are other modifications:

  1. Whether in standalone or cluster mode, the dict in database is inserted into the rehashing linked list when rehashing starts. This eliminates the need to distinguish between standalone and cluster mode in incrementallyRehash. The function only needs to focus on the dicts in the rehashing list that require rehashing.
  2. rehashing list is moved from per-database to Redis server level. This decouples incrementallyRehash from the database ID, and in standalone mode, there is no need to iterate over all databases, avoiding unnecessary access to databases that do not require rehashing. In the future, even if unsharded-cluster mode supports multiple databases, there will be no risk involved.
  3. The insertion and removal operations of dict structures in the rehashing list are decoupled from activerehashing config. activerehashing only controls whether incrementallyRehash is executed in serverCron. There is no need for additional steps when modifying the activerehashing switch, as in Update active rehashing list on change in activerehashing config #12705.

Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally LGTM, with a small improvement suggestion.

Copy link
Contributor

@hpatro hpatro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments.

@oranagra
Copy link
Member

@soloestoy can we merge this? anything outdated in the top comment?

@soloestoy soloestoy merged commit d8a21c5 into redis:unstable Dec 15, 2023
oranagra pushed a commit that referenced this pull request Feb 5, 2024
…re (#12822)

# Description
Gather most of the scattered `redisDb`-related code from the per-slot
dict PR (#11695) and turn it to a new data structure, `kvstore`. i.e.
it's a class that represents an array of dictionaries.

# Motivation
The main motivation is code cleanliness, the idea of using an array of
dictionaries is very well-suited to becoming a self-contained data
structure.
This allowed cleaning some ugly code, among others: loops that run twice
on the main dict and expires dict, and duplicate code for allocating and
releasing this data structure.

# Notes
1. This PR reverts the part of #12848
where the `rehashing` list is global (handling rehashing `dict`s is
under the responsibility of `kvstore`, and should not be managed by the
server)
2. This PR also replaces the type of `server.pubsubshard_channels` from
`dict**` to `kvstore` (original PR:
#12804). After that was done,
server.pubsub_channels was also chosen to be a `kvstore` (with only one
`dict`, which seems odd) just to make the code cleaner by making it the
same type as `server.pubsubshard_channels`, see
`pubsubtype.serverPubSubChannels`
3. the keys and expires kvstores are currenlty configured to allocate
the individual dicts only when the first key is added (unlike before, in
which they allocated them in advance), but they won't release them when
the last key is deleted.

Worth mentioning that due to the recent change the reply of DEBUG
HTSTATS changed, in case no keys were ever added to the db.

before:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
[Expires HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
```

after:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
[Expires HT]
```
roggervalf pushed a commit to roggervalf/redis that referenced this pull request Feb 11, 2024
After redis#11695, we added two functions `rehashingStarted` and
`rehashingCompleted` to the dict structure. We also registered two
handlers for the main database's dict and expire structures. This allows
the main database to record the dict in `rehashing` list when rehashing
starts. Later, in `serverCron`, the `incrementallyRehash` function is
continuously called to perform the rehashing operation. However,
currently, when rehashing is completed, `rehashingCompleted` does not
remove the dict from the `rehashing` list. This results in the
`rehashing` list containing many invalid dicts. Although subsequent cron
checks and removes dicts that don't require rehashing, it is still
inefficient.

This PR implements the functionality to remove the dict from the
`rehashing` list in `rehashingCompleted`. This is achieved by adding
`metadata` to the dict structure, which keeps track of its position in
the `rehashing` list, allowing for quick removal. This approach avoids
storing duplicate dicts in the `rehashing` list.

Additionally, there are other modifications:

1. Whether in standalone or cluster mode, the dict in database is
inserted into the rehashing linked list when rehashing starts. This
eliminates the need to distinguish between standalone and cluster mode
in `incrementallyRehash`. The function only needs to focus on the dicts
in the `rehashing` list that require rehashing.
2. `rehashing` list is moved from per-database to Redis server level.
This decouples `incrementallyRehash` from the database ID, and in
standalone mode, there is no need to iterate over all databases,
avoiding unnecessary access to databases that do not require rehashing.
In the future, even if unsharded-cluster mode supports multiple
databases, there will be no risk involved.
3. The insertion and removal operations of dict structures in the
`rehashing` list are decoupled from `activerehashing` config.
`activerehashing` only controls whether `incrementallyRehash` is
executed in serverCron. There is no need for additional steps when
modifying the `activerehashing` switch, as in redis#12705.
roggervalf pushed a commit to roggervalf/redis that referenced this pull request Feb 11, 2024
…re (redis#12822)

# Description
Gather most of the scattered `redisDb`-related code from the per-slot
dict PR (redis#11695) and turn it to a new data structure, `kvstore`. i.e.
it's a class that represents an array of dictionaries.

# Motivation
The main motivation is code cleanliness, the idea of using an array of
dictionaries is very well-suited to becoming a self-contained data
structure.
This allowed cleaning some ugly code, among others: loops that run twice
on the main dict and expires dict, and duplicate code for allocating and
releasing this data structure.

# Notes
1. This PR reverts the part of redis#12848
where the `rehashing` list is global (handling rehashing `dict`s is
under the responsibility of `kvstore`, and should not be managed by the
server)
2. This PR also replaces the type of `server.pubsubshard_channels` from
`dict**` to `kvstore` (original PR:
redis#12804). After that was done,
server.pubsub_channels was also chosen to be a `kvstore` (with only one
`dict`, which seems odd) just to make the code cleaner by making it the
same type as `server.pubsubshard_channels`, see
`pubsubtype.serverPubSubChannels`
3. the keys and expires kvstores are currenlty configured to allocate
the individual dicts only when the first key is added (unlike before, in
which they allocated them in advance), but they won't release them when
the last key is deleted.

Worth mentioning that due to the recent change the reply of DEBUG
HTSTATS changed, in case no keys were ever added to the db.

before:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
[Expires HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
```

after:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
[Expires HT]
```
funny-dog pushed a commit to funny-dog/redis that referenced this pull request Sep 17, 2025
…re (redis#12822)

# Description
Gather most of the scattered `redisDb`-related code from the per-slot
dict PR (redis#11695) and turn it to a new data structure, `kvstore`. i.e.
it's a class that represents an array of dictionaries.

# Motivation
The main motivation is code cleanliness, the idea of using an array of
dictionaries is very well-suited to becoming a self-contained data
structure.
This allowed cleaning some ugly code, among others: loops that run twice
on the main dict and expires dict, and duplicate code for allocating and
releasing this data structure.

# Notes
1. This PR reverts the part of redis#12848
where the `rehashing` list is global (handling rehashing `dict`s is
under the responsibility of `kvstore`, and should not be managed by the
server)
2. This PR also replaces the type of `server.pubsubshard_channels` from
`dict**` to `kvstore` (original PR:
redis#12804). After that was done,
server.pubsub_channels was also chosen to be a `kvstore` (with only one
`dict`, which seems odd) just to make the code cleaner by making it the
same type as `server.pubsubshard_channels`, see
`pubsubtype.serverPubSubChannels`
3. the keys and expires kvstores are currenlty configured to allocate
the individual dicts only when the first key is added (unlike before, in
which they allocated them in advance), but they won't release them when
the last key is deleted.

Worth mentioning that due to the recent change the reply of DEBUG
HTSTATS changed, in case no keys were ever added to the db.

before:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
[Expires HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
```

after:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
[Expires HT]
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants