
Conversation

@sundb
Collaborator

@sundb sundb commented Feb 18, 2024

After #13013

This PR makes an effort to defrag the pubsub kvstore in the following ways:

  1. Until now, server.pubsub(shard)_channels only shared the channel name obj with the first subscribed client; now change it so that all the clients and the pubsub kvstore share the channel name robj.
    This saves a lot of memory when there are many subscribers to the same channel.
    It also means that we only need to defrag the channel name robj in the pubsub kvstore, and then update
    all client references for the current channel, avoiding the need to iterate through all the clients to do the same thing.

  2. Refactor the code to defragment pubsub(shard) in the same way as keys and expires are defragmented, with the exception that we only defragment pubsub (without shard) when the slot is zero.

Other

Fix an oversight in #11695: if defragmentation doesn't reach the end time, we should wait for the current
db's keys and expires, pubsub and pubsubshard to finish before leaving; currently it is possible to exit
early once the keys are defragmented.

@sundb sundb requested a review from oranagra February 18, 2024 02:44
Member

@oranagra oranagra left a comment


i wonder if we now need a better test? (one that guarantees we really did re-locate something in pubsub) or do you think the one we have is sufficient?

sundb and others added 2 commits February 26, 2024 23:42
without timeout

Fix an oversight in redis#11695: if defragmentation doesn't reach the end time, we should wait for the current db's keys and expires to finish before leaving; currently it is possible to exit early once the keys are defragmented.
@sundb sundb requested a review from oranagra February 28, 2024 01:17
Before this PR we always iterated from stage 0 to 3, but now we
start from the last stage, so when defrag_stage is not equal to i, it
means that the last stage isn't finished, so there is no point in
iterating further.
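The resume logic described in that commit message can be modeled with a toy sketch. This is not the actual Redis code; `process_stages`, the unit-based `budget`, and `work[]` are illustrative, but they show why saving the stage index lets the next cycle pick up exactly where the previous one stopped instead of rescanning from stage 0.

```c
#include <assert.h>

#define NUM_STAGES 4

/* Process stages starting from the saved index in *stage. Each unit of
 * work consumes one unit of budget. Returns 1 when all stages finished,
 * 0 when the budget ran out; in the latter case *stage records where to
 * resume, so earlier (finished) stages are never revisited. */
static int process_stages(int *stage, int work[NUM_STAGES], int budget) {
    for (int i = *stage; i < NUM_STAGES; i++) {
        while (work[i] > 0) {
            if (budget-- == 0) { *stage = i; return 0; } /* out of time: save progress */
            work[i]--;
        }
    }
    *stage = NUM_STAGES; /* everything finished */
    return 1;
}
```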
Member

@oranagra oranagra left a comment


LGTM

@sundb sundb requested a review from oranagra March 2, 2024 11:21
@sundb
Collaborator Author

sundb commented Mar 2, 2024

@oranagra recent changes from last review:

  1. always process defrag stages from slot 0, since we need to process pubsub with slot 0 and db 0.
  2. when all defrag stages are finished, look for the next closest slot across all stages, instead of looking only at keys as before.
  3. optimize the defrag pubsub test, using the same client to intermittently populate pubsub and pubsubshard.
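Point 2 above can be sketched with a toy model. This is not the actual Redis code; `next_closest_slot` and the `pending[]` array are illustrative, but they show the idea of picking the smallest still-pending slot across every stage rather than reading it from the keys stage alone.

```c
#include <assert.h>

/* Each stage records the next slot it still has to process, or -1 when
 * that stage is done. The scheduler picks the smallest pending slot
 * across all stages, so no stage's remaining work is skipped. */
static int next_closest_slot(const int pending[], int nstages) {
    int best = -1;
    for (int i = 0; i < nstages; i++) {
        if (pending[i] < 0) continue;                /* stage finished */
        if (best < 0 || pending[i] < best) best = pending[i];
    }
    return best; /* -1 means nothing left to defrag */
}
```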

sundb and others added 2 commits March 4, 2024 11:51
Co-authored-by: oranagra <oran@redislabs.com>
Member

@oranagra oranagra left a comment


are you happy with the new approach (scanning all the slots in each step before moving to the next one)?
[edit] only now i see your comment above. i'll read and respond.

@sundb
Collaborator Author

sundb commented Mar 4, 2024

@oranagra It's ready.

@oranagra oranagra merged commit ad12730 into redis:unstable Mar 4, 2024
@sundb sundb deleted the defrag_pubsub branch March 4, 2024 15:06
sundb added a commit that referenced this pull request Jun 18, 2024
… reset (#13315)

This PR fixes two crashes:

1. Fix missing slotToKeyInit() when using `flushdb async` under cluster
mode.
    #13205

2. Fix missing expires_cursor reset when stopping active defrag in the
middle of defragmentation.
    #13307
If we stop active defrag in the middle of defragging db->expires and
`expires_cursor` is not reset to 0, then the next time we enable active
defrag, defragLaterStep(db, ...) will be entered. However, by that
time `db` has been reset to NULL, which results in a crash.

The affected code was removed by #11695 and #13058 in unstable, so we
just need to backport this to 7.2.
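The crash in point 2 can be modeled with a small sketch. This is not the actual Redis code; `stop_active_defrag`, `defrag_later_db`, and `would_resume_later_step` are illustrative names, but they show why the cursor must be reset together with the db pointer.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int dummy; } redisDb; /* stand-in for the real struct */

static unsigned long expires_cursor = 0;
static redisDb *defrag_later_db = NULL;

/* Disabling active defrag mid-cycle clears the db pointer. Without also
 * resetting expires_cursor, a later activation would see a nonzero
 * cursor, try to resume the "defrag later" step, and dereference NULL. */
static void stop_active_defrag(void) {
    defrag_later_db = NULL;
    expires_cursor = 0; /* the fix: reset the cursor together with the db */
}

static int would_resume_later_step(void) {
    return expires_cursor != 0; /* nonzero cursor means "resume defragLaterStep" */
}
```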
sundb added a commit that referenced this pull request Aug 14, 2025
Fix #13612
This bug was introduced by #11465

This PR added incremental defragmentation for db->expires.
If the end time of the defragmentation cycle has not arrived, it should
not exit unless both cursor and expires_cursor are 0, meaning both keys
and expires have been fully defragmented.
However, the expires_cursor check was overlooked, so we now exit
immediately even if defragmentation hasn't reached the end time, which
makes defragmentation very inefficient.

Note that this bug was already fixed in version 7.4 (#13058).
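The corrected exit condition can be sketched as a tiny predicate. This is a toy model, not the actual Redis code: the point is that an early exit is only allowed once both the keys cursor and the expires cursor have wrapped back to 0, whereas the buggy version ignored expires_cursor and stopped as soon as the keys scan was done.

```c
#include <assert.h>

/* May the defrag cycle stop before its end time? Only if both the keys
 * scan and the expires scan have completed (both cursors wrapped to 0). */
static int may_exit_early(unsigned long cursor, unsigned long expires_cursor) {
    return cursor == 0 && expires_cursor == 0;
}
```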
sundb added a commit that referenced this pull request Sep 7, 2025
This PR fixes two defrag issues.

1. Fix a use-after-free issue caused by updating dictionary keys after a
pubsub channel is reallocated.
This issue was introduced by #13058

2. Fix a potential use-after-free for Lua during AOF loading with defrag.
This issue was introduced by #13058
   This fix follows #14319
This PR updates the LuaScript LRU list before script execution to
prevent accessing a potentially invalidated pointer after long-running
scripts.
funny-dog pushed a commit to funny-dog/redis that referenced this pull request Sep 17, 2025
YaacovHazan pushed a commit to YaacovHazan/redis that referenced this pull request Sep 28, 2025
sundb added a commit to YaacovHazan/redis that referenced this pull request Sep 30, 2025
sundb added a commit to YaacovHazan/redis that referenced this pull request Sep 30, 2025