Add defragment support for HFE by sundb · Pull Request #13229 · redis/redis

sundb · 2024-04-19T08:44:14Z

Background

All hash objects that contain HFE are referenced by db->hexpires.
All fields in a dict hash object with HFE are referenced by an ebucket.

So when we defrag the hash object or the field in a dict with HFE, we also need to update the references in them.

Interface

Add a new interface ebDefragItem, which can accept a defrag callback to defrag items in ebuckets, and simultaneously update their references in the ebucket.
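To illustrate the idea behind this interface, here is a minimal, self-contained sketch. It is not the code from src/ebuckets.c: `toy_bucket`, `toy_item`, `toy_defrag_item` and `always_move` are hypothetical names, the container is a plain pointer array rather than a real ebucket, and the "return NULL when nothing moved" convention is an assumption modelled on how defrag callbacks are usually written.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy item: stands in for an hfield or hash robj that carries expiry metadata. */
typedef struct toy_item {
    long long expire_at;
    char name[16];
} toy_item;

/* Toy container: stands in for an ebucket; it holds raw pointers to its items. */
typedef struct toy_bucket {
    toy_item *items[8];
    int count;
} toy_bucket;

/* Defrag callback: returns the new address if the allocation was moved,
 * or NULL if it was left where it is. */
typedef void *toy_defrag_fn(void *ptr);

/* Hypothetical callback that always moves the allocation. A real
 * defragmenter would move it only when that reduces fragmentation. */
static void *always_move(void *ptr) {
    toy_item *copy = malloc(sizeof(*copy));
    if (copy == NULL) return NULL;
    memcpy(copy, ptr, sizeof(*copy));
    free(ptr);
    return copy;
}

/* Defrag one item and, if it moved, repoint the container's reference to it.
 * Returns the new address, or NULL when the item was not moved. */
static toy_item *toy_defrag_item(toy_bucket *b, toy_item *item, toy_defrag_fn *fn) {
    for (int i = 0; i < b->count; i++) {
        if (b->items[i] != item) continue;
        toy_item *moved = fn(item);
        if (moved != NULL) b->items[i] = moved; /* keep the reference valid */
        return moved;
    }
    return NULL; /* item is not referenced by this container */
}
```

The point of the sketch is that the move and the reference update happen in one call, so the container never holds a pointer to freed memory.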

Main changes

The key type of a hash object's dict is no longer sds, so add a new activeDefragHfieldDict() to defrag the dict instead of activeDefragSdsDict().
When we defrag the dict of a hash object using dictScanDefrag(), we always set the defragKey callback of dictDefragFunctions to NULL, because we can't reallocate a field without updating its reference in the ebuckets.
Instead, we defrag the field of the dict and update its reference in the scan callback (a dictScanFunction) passed to dictScanDefrag().
When we defrag a hash robj with HFE, we use ebDefragItem to defrag the robj and update the reference in db->hexpires.
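The reason the key cannot be reallocated by a generic key callback is that the same field allocation is referenced from two places. The sketch below shows that situation with hypothetical types (`toy_hash`, `toy_field`) and hypothetical helpers (`toy_realloc_field`, `toy_scan_defrag_cb`, `toy_defrag_hash`); it is a simplified model under those assumptions, not the code in src/defrag.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_SLOTS 4

typedef struct toy_field {
    char name[16];
    long long expire_at;
} toy_field;

/* Two structures reference the same field allocations. */
typedef struct toy_hash {
    toy_field *dict_keys[TOY_SLOTS];    /* stands in for the dict's stored keys */
    toy_field *expiry_index[TOY_SLOTS]; /* stands in for the ebucket's references */
    int count;
} toy_hash;

/* Move a field to a fresh allocation; returns NULL if it could not be moved. */
static toy_field *toy_realloc_field(toy_field *f) {
    toy_field *copy = malloc(sizeof(*copy));
    if (copy == NULL) return NULL;
    memcpy(copy, f, sizeof(*copy));
    free(f);
    return copy;
}

/* Per-entry scan callback: defrag the field and fix BOTH references.
 * Returns 1 if the field was moved, 0 otherwise. */
static int toy_scan_defrag_cb(toy_hash *h, int slot) {
    toy_field *old = h->dict_keys[slot];
    int idx = -1;
    /* Locate the second reference before the old pointer becomes invalid. */
    for (int i = 0; i < h->count; i++) {
        if (h->expiry_index[i] == old) { idx = i; break; }
    }
    toy_field *moved = toy_realloc_field(old);
    if (moved == NULL) return 0;
    h->dict_keys[slot] = moved;
    if (idx >= 0) h->expiry_index[idx] = moved;
    return 1;
}

/* Visit every dict slot; returns how many fields were moved. */
static int toy_defrag_hash(toy_hash *h) {
    int moved = 0;
    for (int slot = 0; slot < h->count; slot++) moved += toy_scan_defrag_cb(h, slot);
    return moved;
}
```

If only `dict_keys` were updated, `expiry_index` would be left pointing at freed memory, which is the failure this change avoids by leaving the key callback unset and doing the work in the scan callback.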

TODO:

Defrag the ebucket structure incrementally, which will be handled in a future PR.

src/defrag.c

src/ebuckets.c

Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>

Co-authored-by: Moti Cohen <moti.cohen@redis.com>

src/server.h

sundb · 2024-04-28T09:41:57Z

tests/unit/type/hash-field-expire.tcl


        # Wait for active expire
-        wait_for_condition 50 20 { [r EXISTS same40] == 0 } else { fail "hash `same40` should be expired" }
+        wait_for_condition 500 20 { [r EXISTS same40] == 0 } else { fail "hash `same40` should be expired" }


Changed this because of https://github.com/redis/redis/actions/runs/8866018951/job/24342931827?pr=13229

src/server.h

src/ebuckets.c

sundb · 2024-05-09T08:50:14Z

src/ebuckets.c

        ebValidateRax(ebGetRaxPtr(eb), type);
 }

+eItem ebDefragItem(ebuckets *eb, EbucketsType *type, eItem item, ebDefragFunction *fn) {


I added a comment (including a note) for this function.

sundb · 2024-05-09T08:53:45Z

src/defrag.c

+    if (unlikely(ob->type == OBJ_HASH && hashTypeHasExpireField(ob))) {
+        /* Update its reference in the ebucket while defragging it. */
+        newob = ebDefragItem(&db->hexpires, &hashExpireBucketsType, ob, (ebDefragFunction *)activeDefragStringOb);


Done in 5bf9ddb (#13229).

src/ebuckets.c

sundb · 2024-05-10T14:21:31Z

src/ebuckets.c

        ebValidateRax(ebGetRaxPtr(eb), type);
 }

+eItem ebDefragItem(ebuckets *eb, EbucketsType *type, eItem item, ebDefragFunction *fn) {


After some deliberation, I decided to add the assertion.

src/ebuckets.c

tezc · 2024-05-13T08:05:40Z

I see the comment that you'll add test cases for eblist and listpack later. I didn't do deep bug hunting, but overall the PR seems good to me.

sundb · 2024-05-14T01:50:40Z

@tezc Please have a look; I made the test cover listpackex and eblist.

sundb · 2024-05-14T06:52:26Z

@tezc Because defragmentation of the ebucket structure itself is not complete yet, I increased the threshold to make the test stable. For now, the test only ensures that references are updated correctly.

tests/unit/memefficiency.tcl

guybe7 · 2025-02-04T07:21:51Z

@sundb @YaacovHazan did we handle this?

Defrag the ebucket structure incrementally, which will be handled in a future PR.

sundb · 2025-02-04T13:34:50Z

@guybe7 Sure, but the biggest obstacle is that every stage of the defragmentation process currently has to be a kvstore. Before we can do this, we need to refactor the code so that a stage doesn't have to be a kvstore.

In PR #13229, we introduced the ebucket for HFE. Before this PR, updating an eitem stored in an ebucket required a reverse lookup of its position in the ebucket, because incremental defragmentation was not supported for non-kvstore data structures (until PR #13814). This approach was inefficient, as it often required frequent traversals of the segment list to locate and update the item.

To address this, this PR implements incremental defragmentation for hash dict ebuckets and server.hexpires. By defragmenting the ebuckets incrementally, we also defragment the associated items, which eliminates the need for frequent traversals of the segment list when defragging an eitem.

---------

Co-authored-by: Moti Cohen <moticless@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
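The incremental approach described in that commit message can be pictured as a resumable walk over the segment list that moves items as it visits them, so no item ever has to be searched for. The following is a rough sketch under simplifying assumptions: `toy_segment`, `toy_item`, `toy_move_item` and `toy_defrag_step` are hypothetical names, the list is not modified between steps, and every item is moved unconditionally.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SEG_ITEMS 4

typedef struct toy_item {
    long long expire_at;
} toy_item;

/* Toy segment: a small fixed array of item pointers, chained into a list. */
typedef struct toy_segment {
    toy_item *items[SEG_ITEMS];
    int count;
    struct toy_segment *next;
} toy_segment;

/* Move an item to a fresh allocation; returns NULL if it could not be moved. */
static toy_item *toy_move_item(toy_item *it) {
    toy_item *copy = malloc(sizeof(*copy));
    if (copy == NULL) return NULL;
    memcpy(copy, it, sizeof(*copy));
    free(it);
    return copy;
}

/* Process at most `budget` segments starting at *cursor.
 * Each item is moved while its segment is being visited, so the reference
 * is updated in place without any lookup. On return, *cursor points at the
 * next unvisited segment, or is NULL when the walk is complete.
 * Returns the number of items moved in this step. */
static int toy_defrag_step(toy_segment **cursor, int budget) {
    int moved = 0;
    toy_segment *seg = *cursor;
    while (seg != NULL && budget-- > 0) {
        for (int i = 0; i < seg->count; i++) {
            toy_item *n = toy_move_item(seg->items[i]);
            if (n != NULL) {
                seg->items[i] = n;
                moved++;
            }
        }
        seg = seg->next;
    }
    *cursor = seg;
    return moved;
}
```

A caller would invoke `toy_defrag_step` repeatedly with a small budget until the cursor becomes NULL, which bounds the work done per call. A real implementation also has to cope with the structure changing between steps, which this sketch ignores.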


Hash Field Expiration Defragmentation

5ee153e

sundb requested a review from moticless April 19, 2024 08:44

sundb added 18 commits April 20, 2024 15:21

Optimize ebScanDefrag

53dbff1

Spell

3162c25

Add notification tests for HFE

9117b42

New way to skip defraging the fields with TTL

4986870

Remove activeDefragHfieldDict

f7bf686

Use dictUseStoredKeyApi to get hash from hfield

99b3bad

Revert the code for defraging hash obj with TTL

09562c4

Add commands.def

761b3a4

Add activeDefragHfieldSkipTTL

5a86b46

add new ebDefragItem

6942c4a

Revert some code

fdbc69e

Simplify code

af9eadc

Fix compile warning

871ec6f

Fix crash due to forget skip defraging hfield with TTL in scanLaterHash

b19f6f5

Unify the defragment of hfield in the callback of dictScanDefrag

5e8dfd1

Add comment

2f988f0

Cleanup

f8bbf7f

Use zmalloc_usable_size instead of zmalloc_size

acfc293

sundb marked this pull request as ready for review April 25, 2024 12:48

sundb added 2 commits April 25, 2024 21:25

Remove unused code

4a9f8f6

Remove unused code

d701bf8

sundb changed the title ~~WIP: Hash Field Expiration Defragmentation~~ AddHash Field Expiration Defragmentation Apr 25, 2024

sundb changed the title ~~AddHash Field Expiration Defragmentation~~ Add defragment support for HFE Apr 25, 2024

sundb added 5 commits April 26, 2024 09:22

Add missing raxStop

dcc5da0

Merge branch 'hfe' into hfe-defrag

35ec1a3

Add individual test for HFE defragment

1f62c55

For CI test

c0ec94c

Fix defrag HFE test failed

c270ca4

sundb added 2 commits May 9, 2024 16:49

Fix CR

5bf9ddb

Format

0936517

sundb requested review from moticless and tezc May 9, 2024 09:23

tezc reviewed May 10, 2024

src/defrag.c (outdated, resolved)

tezc reviewed May 10, 2024

src/ebuckets.c (resolved)

tezc reviewed May 10, 2024

src/ebuckets.c (resolved)

sundb and others added 2 commits May 10, 2024 21:39

Update src/defrag.c

1bea6c9

Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>

Add a assertion for ExpireMeta's trash in ebDefragItem()

b619ea5

Co-authored-by: Moti Cohen <moti.cohen@redis.com>

sundb commented May 10, 2024


Fix a else if mistake

5023c03

tezc reviewed May 13, 2024

src/ebuckets.c (resolved)

tezc previously approved these changes May 13, 2024


sundb added 2 commits May 13, 2024 22:20

Merge branch 'hash-field-expiry-integ' into hfe-defrag

ebbb291

Improve the HFE deframent test to cover listpackex and eblist

8db938b

sundb dismissed tezc’s stale review via 8db938b May 14, 2024 01:49

Increase the threshold to make test more stable

8363b2c

tezc reviewed May 14, 2024

tests/unit/memefficiency.tcl (outdated, resolved)

Improve test

79cdbb0

tezc approved these changes May 14, 2024


sundb merged commit 80be2cc into redis:hash-field-expiry-integ May 14, 2024

sundb deleted the hfe-defrag branch May 14, 2024 09:32

sundb mentioned this pull request Mar 7, 2025

Add support to defrag ebuckets incrementally #13842

Merged
