Add defragment support for HFE #13229

Merged
sundb merged 87 commits into redis:hash-field-expiry-integ from sundb:hfe-defrag
May 14, 2024

Conversation

@sundb
Collaborator

@sundb sundb commented Apr 19, 2024

Background

  1. All hash objects that contain HFE are referenced by db->hexpires.
  2. All fields in a dict hash object with HFE are referenced by an ebucket.

So when we defrag the hash object or the field in a dict with HFE, we also need to update the references in them.

Interface

  1. Add a new interface ebDefragItem, which can accept a defrag callback to defrag items in ebuckets, and simultaneously update their references in the ebucket.
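
The idea behind such an interface can be sketched in a few lines of C. This is only an illustration under stated assumptions: `ToyItem`, `ToyBucket`, `toyDefragFn`, `toyAlwaysMove` and `toyDefragItem` are hypothetical names, not the real ebuckets API, and the "return NULL when the allocation was not moved" convention is assumed to mirror the one used by the defrag helpers in Redis.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are the real Redis types. */
typedef struct ToyItem { int id; } ToyItem;

/* A "bucket" that only stores pointers to items it does not own. */
typedef struct ToyBucket {
    ToyItem *slots[8];
    size_t count;
} ToyBucket;

/* Assumed callback contract: return the new address if the allocation was
 * moved, or NULL if it stayed where it was. */
typedef void *toyDefragFn(void *ptr);

/* Always moves the item so the effect is visible in a test. */
static void *toyAlwaysMove(void *ptr) {
    ToyItem *moved = malloc(sizeof(*moved));
    if (moved == NULL) return NULL; /* allocation failure: report "not moved" */
    memcpy(moved, ptr, sizeof(*moved));
    free(ptr);
    return moved;
}

/* Locate the slot that references `item`, run the callback, and patch the
 * slot if the item moved. Returns the new address, or NULL if nothing
 * changed or the item is not referenced by this bucket. */
static ToyItem *toyDefragItem(ToyBucket *b, ToyItem *item, toyDefragFn *fn) {
    for (size_t i = 0; i < b->count; i++) {
        if (b->slots[i] != item) continue;
        ToyItem *moved = fn(item);
        if (moved != NULL) b->slots[i] = moved;
        return moved;
    }
    return NULL;
}
```

The slot is located before the callback runs, so the stale pointer is never read after it has been freed.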

Main changes

  1. The key type of a hash object's dict is no longer sds, so add a new activeDefragHfieldDict() to defrag the dict instead of using activeDefragSdsDict().
  2. When we defrag the dict of a hash object using dictScanDefrag(), we always set the defragKey callback of dictDefragFunctions to NULL, because we can't reallocate a field without updating its reference in ebuckets.
    Instead, we defrag the field of the dict and update its reference in the dictScanFunction callback passed to dictScanDefrag().
  3. When we defrag a hash robj with HFE, we use ebDefragItem to defrag the robj and update the reference in db->hexpires.
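
Point 2 above can be illustrated with a toy model. The sketch below is not the Redis dict or ebuckets code: `ToyField`, `ToyEntry`, `ToyHash`, `toyScanDefrag` and `toyHfieldCallback` are hypothetical names, and a flat array stands in for the ebucket that references each field. It shows why the generic key-defrag hook is left as NULL and the scan callback moves the field itself, patching both references together.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for a hash field, the dict entry that holds it, and
 * an expiry index (the role ebuckets play) referencing the same field. */
typedef struct ToyField { char name[16]; long long expire_at; } ToyField;
typedef struct ToyEntry { ToyField *key; } ToyEntry;
typedef struct ToyHash {
    ToyEntry entries[4];
    ToyField *expiry_index[4];
    size_t count;
} ToyHash;

typedef void *toyDefragFn(void *ptr);
typedef void toyScanFn(ToyHash *h, ToyEntry *e);

/* Moves a field to a fresh allocation; returns NULL if it was not moved. */
static ToyField *toyMoveField(ToyField *f) {
    ToyField *moved = malloc(sizeof(*moved));
    if (moved == NULL) return NULL;
    memcpy(moved, f, sizeof(*moved));
    free(f);
    return moved;
}

/* Generic scan. A non-NULL `defragKey` would move the key and patch only the
 * entry, leaving expiry_index dangling, so callers with an expiry index are
 * expected to pass NULL and do the work in `fn`. */
static void toyScanDefrag(ToyHash *h, toyScanFn *fn, toyDefragFn *defragKey) {
    for (size_t i = 0; i < h->count; i++) {
        ToyEntry *e = &h->entries[i];
        if (defragKey != NULL) {
            void *moved = defragKey(e->key);
            if (moved != NULL) e->key = moved;
        }
        fn(h, e);
    }
}

/* Scan callback: find the index slot first, move the field, then patch the
 * dict entry and the index slot together. */
static void toyHfieldCallback(ToyHash *h, ToyEntry *e) {
    ToyField **ref = NULL;
    for (size_t i = 0; i < h->count; i++)
        if (h->expiry_index[i] == e->key) ref = &h->expiry_index[i];
    ToyField *moved = toyMoveField(e->key);
    if (moved == NULL) return;
    e->key = moved;
    if (ref != NULL) *ref = moved;
}
```

A caller would invoke `toyScanDefrag(&h, toyHfieldCallback, NULL)`, which corresponds to setting the key-defrag callback to NULL as described in the list.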

TODO:

Defrag the ebucket structure incrementally, which will be handled in a future PR.

@sundb sundb requested a review from moticless April 19, 2024 08:44
@sundb sundb marked this pull request as ready for review April 25, 2024 12:48
@sundb sundb changed the title from "WIP: Hash Field Expiration Defragmentation" to "AddHash Field Expiration Defragmentation" Apr 25, 2024
@sundb sundb changed the title from "AddHash Field Expiration Defragmentation" to "Add defragment support for HFE" Apr 25, 2024
@sundb sundb requested review from moticless and tezc May 9, 2024 09:23
sundb and others added 2 commits May 10, 2024 21:39
Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
Co-authored-by: Moti Cohen <moti.cohen@redis.com>

# Wait for active expire
- wait_for_condition 50 20 { [r EXISTS same40] == 0 } else { fail "hash `same40` should be expired" }
+ wait_for_condition 500 20 { [r EXISTS same40] == 0 } else { fail "hash `same40` should be expired" }
src/ebuckets.c Outdated
ebValidateRax(ebGetRaxPtr(eb), type);
}

eItem ebDefragItem(ebuckets *eb, EbucketsType *type, eItem item, ebDefragFunction *fn) {
Collaborator Author

I added a comment (including a note) for this function.

src/defrag.c Outdated
Comment on lines +754 to +756
if (unlikely(ob->type == OBJ_HASH && hashTypeHasExpireField(ob))) {
/* Update its reference in the ebucket while defragging it. */
newob = ebDefragItem(&db->hexpires, &hashExpireBucketsType, ob, (ebDefragFunction *)activeDefragStringOb);
Collaborator Author

Done in 5bf9ddb (#13229).

src/ebuckets.c Outdated
ebValidateRax(ebGetRaxPtr(eb), type);
}

eItem ebDefragItem(ebuckets *eb, EbucketsType *type, eItem item, ebDefragFunction *fn) {
Collaborator Author

After struggling with it, I decided to add the assertion.

@tezc
Collaborator

tezc commented May 13, 2024

I see the comment that you'll add test cases for eblist and listpack later. I didn't do deep bug hunting but overall, PR seems good to me.

tezc previously approved these changes May 13, 2024
@sundb
Collaborator Author

sundb commented May 14, 2024

@tezc Please have a look; I made the tests cover listpackex and eblist.

@sundb
Collaborator Author

sundb commented May 14, 2024

@tezc Because defragmentation of the ebucket structure itself is not complete yet, I increased the threshold to make the test stable. For now the test only ensures that references are updated correctly.

@sundb sundb merged commit 80be2cc into redis:hash-field-expiry-integ May 14, 2024
@sundb sundb deleted the hfe-defrag branch May 14, 2024 09:32
@guybe7
Collaborator

guybe7 commented Feb 4, 2025

@sundb @YaacovHazan Did we handle this?

"Defrag the ebucket structure incrementally, which will be handled in a future PR."

@sundb
Collaborator Author

sundb commented Feb 4, 2025

@guybe7 Sure, but the biggest obstacle is that every stage of the defragmentation process currently has to be kvstore-based. Before we can do this, we need to refactor the code so that a stage doesn't have to be a kvstore.

sundb added a commit that referenced this pull request May 18, 2025
In PR #13229, we introduced the ebucket for HFE.
Before this PR, when updating eitems stored in ebuckets, the lack of
incremental defragmentation support for non-kvstore data structures (until
PR #13814) meant that we had to reverse lookup the position of the eitem
in the ebucket and then perform the update.
This approach was inefficient as it often required frequent traversals
of the segment list to locate and update the item.

To address this issue, this PR implements incremental
defragmentation for hash dict ebuckets and server.hexpires.
By incrementally defragging the ebuckets, we also defragment
the associated items, which eliminates the need for frequent traversals of
the segment list when defragging an eitem.
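
The cost difference described above can be seen in a small model. This is a rough sketch, not the ebuckets layout: `ToySegment`, `ToyCursor`, `toyReverseLookup`, `toyIncrementalStep` and `toyCountVisit` are hypothetical names, and a plain linked list of fixed-size segments stands in for the real structure.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SEG_CAP 4

/* Hypothetical segment: a small array of item pointers plus a next link. */
typedef struct ToySegment {
    int *items[TOY_SEG_CAP];
    size_t used;
    struct ToySegment *next;
} ToySegment;

/* Reverse lookup: walk from the head until `item` is found. Stores the slot
 * address in `*slot` and returns the number of slots examined, or stores
 * NULL and returns 0 if the item is absent. */
static size_t toyReverseLookup(ToySegment *head, int *item, int ***slot) {
    size_t examined = 0;
    for (ToySegment *s = head; s != NULL; s = s->next) {
        for (size_t i = 0; i < s->used; i++) {
            examined++;
            if (s->items[i] == item) {
                *slot = &s->items[i];
                return examined;
            }
        }
    }
    *slot = NULL;
    return 0;
}

/* Incremental pass: each call handles one segment and advances the cursor,
 * so the slot of every item is already at hand when it is visited. */
typedef struct ToyCursor { ToySegment *seg; } ToyCursor;
typedef void toyVisitFn(int **slot, void *ctx);

/* Returns 1 while more segments remain, 0 when the pass is finished. */
static int toyIncrementalStep(ToyCursor *c, toyVisitFn *visit, void *ctx) {
    if (c->seg == NULL) return 0;
    for (size_t i = 0; i < c->seg->used; i++) visit(&c->seg->items[i], ctx);
    c->seg = c->seg->next;
    return c->seg != NULL;
}

/* Example visitor that only counts the slots it is handed. */
static void toyCountVisit(int **slot, void *ctx) {
    (void)slot;
    (*(size_t *)ctx)++;
}
```

With eight items spread over two full segments, looking up each item from the head examines 36 slots in total, while the incremental pass touches each of the eight slots once.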

---------

Co-authored-by: Moti Cohen <moticless@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
funny-dog pushed a commit to funny-dog/redis that referenced this pull request Sep 17, 2025