Skip to content

Fixed passing incorrect endtime value for module context#13822

Merged
sundb merged 12 commits intoredis:unstablefrom
sundb:incorrect_endtime
Feb 23, 2025
Merged

Fixed passing incorrect endtime value for module context#13822
sundb merged 12 commits intoredis:unstablefrom
sundb:incorrect_endtime

Conversation

@sundb
Copy link
Copy Markdown
Collaborator

@sundb sundb commented Feb 21, 2025

  1. Fix a bug that passing an incorrect endtime to module.
    This bug was found by @ShooterIT.
    After Refactor of ActiveDefrag to reduce latencies #13814, all endtime will be monotonic time, and we should no longer convert it to ustime relative.
    Add assertions to prevent endtime from being much larger thatn the current time.

  2. Fix a race in test Reduce defrag CPU usage when module data can't be defragged

@sundb sundb requested a review from ShooterIT February 21, 2025 06:29
Co-authored-by: ShooterIT <wangyuancode@163.com>
Copy link
Copy Markdown
Member

@ShooterIT ShooterIT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Member

@ShooterIT ShooterIT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's check before using?

maybe we don't need this assert since normal data defrags don't check yet

sundb and others added 2 commits February 21, 2025 19:28
Co-authored-by: Yuan Wang <wangyuancode@163.com>
Co-authored-by: Yuan Wang <wangyuancode@163.com>
@sundb
Copy link
Copy Markdown
Collaborator Author

sundb commented Feb 22, 2025

http://github.com/sundb/redis/actions/runs/13457985264/job/37606186414
there is a chance that defragtest_global_strings_attempts] > 0 (context: type eval line 21 cmd {assert {[getInfoProperty $info defragtest_global_strings_attempts] > 0}} proc ::test) failed.
This is because we are defragmenting strings first and then following dicts, and strings defragmenting is faster, which results in 25,000 RedisStrings (half of which are deleted to create fragmentation) being defragmented very quickly.
However, we have verified similar defragmentation break in dict defragmenting, so we can remove this validation.

@YaacovHazan
Copy link
Copy Markdown
Collaborator

http://github.com/sundb/redis/actions/runs/13457985264/job/37606186414 there is a chance that defragtest_global_strings_attempts] > 0 (context: type eval line 21 cmd {assert {[getInfoProperty $info defragtest_global_strings_attempts] > 0}} proc ::test) failed. This is because we are defragmenting strings first and then following dicts, and strings defragmenting is faster, which results in 25,000 RedisStrings (half of which are deleted to create fragmentation) being defragmented very quickly. However, we have verified similar defragmentation break in dict defragmenting, so we can remove this validation.

@sundb are you going to push a another change to remove this validation?

@sundb
Copy link
Copy Markdown
Collaborator Author

sundb commented Feb 23, 2025

@YaacovHazan It has been deleted in this PR, I will merge this PR when daily ci passes.

@sundb
Copy link
Copy Markdown
Collaborator Author

sundb commented Feb 23, 2025

Because embedded string robj consumes 24 bits under 32-bit, this bin is almost the most frequently used bin, so the robj of strings can hardly be defragmented in this test, because its total number of robjs is equivalent to the total number is very small. je_get_defrag_hint() always returns no.
from the bin 24 in the malloc-stats report we can see that the interest rate of bin 24 has always been very high (>95%), so the defragmentation of strings has never been successful

after create 50000 strings

bins:           size ind    allocated      nmalloc (#/sec)      ndalloc (#/sec)    nrequests   (#/sec)  nshards      curregs     curslabs  nonfull_slabs regs pgs   util       nfills (#/sec)     nflushes (#/sec)       nslabs     nreslabs (#/sec)      n_lock_ops (#/sec)       n_waiting (#/sec)      n_spin_acq (#/sec)  n_owner_switch (#/sec)   total_wait_ns   (#/sec)     max_wait_ns  max_n_thds
                   8   0      4409584       551200  551200            2       2       601751    601751        1       551198         1078              1  512   1  0.998         5512    5512            1       1         1078            1       1            6600    6600               0       0               0       0               4       4               0         0               0           0
                  16   1       204160        12848   12848           88      88        62982     62982        1        12760           51              0  256   1  0.977          136     136            5       5           51            0       0             201     201               0       0               0       0               4       4               0         0               0           0
                  24   2     14431152       601300  601300            2       2       651170    651170        1       601298         1175              1  512   3  0.999         6013    6013            1       1         1175            1       1            7198    7198               0       0               0       0               4       4               0         0               0           0

aftert delete half of the strings

bins:           size ind    allocated      nmalloc (#/sec)      ndalloc (#/sec)    nrequests   (#/sec)  nshards      curregs     curslabs  nonfull_slabs regs pgs   util       nfills (#/sec)     nflushes (#/sec)       nslabs     nreslabs (#/sec)      n_lock_ops (#/sec)       n_waiting (#/sec)      n_spin_acq (#/sec)  n_owner_switch (#/sec)   total_wait_ns   (#/sec)     max_wait_ns  max_n_thds
                   8   0      4409584       551200  551200            2       2       601751    601751        1       551198         1078              1  512   1  0.998         5512    5512            1       1         1078            1       1            6602    6602               0       0               0       0               4       4               0         0               0           0
                  16   1       204160        12848   12848           88      88        62982     62982        1        12760           51              0  256   1  0.977          136     136            5       5           51            0       0             203     203               0       0               0       0               4       4               0         0               0           0
                  24   2     13833552       601300  601300        24902   24902       651255    651255        1       576398         1175             98  512   3  0.958         6013    6013          250     250         1175            1       1            7449    7449               0       0               0       0               4       4               0         0               0           0

after commit: b15207c
We no longer use embedded string obj to avoid storing robj in bin24 to make it more likely to be defragged.
This change will cause 64-bit failure, removing the test that this PR introduced.

@sundb
Copy link
Copy Markdown
Collaborator Author

sundb commented Feb 23, 2025

@sundb sundb merged commit ee933d9 into redis:unstable Feb 23, 2025
fcostaoliveira pushed a commit to filipecosta90/redis that referenced this pull request Feb 24, 2025
1) Fix a bug that passing an incorrect endtime to module.
   This bug was found by @ShooterIT.
After redis#13814, all endtime will be monotonic time, and we should no
longer convert it to ustime relative.
Add assertions to prevent endtime from being much larger thatn the
current time.

2) Fix a race in test `Reduce defrag CPU usage when module data can't be
defragged`

---------

Co-authored-by: ShooterIT <wangyuancode@163.com>
funny-dog pushed a commit to funny-dog/redis that referenced this pull request Sep 17, 2025
1) Fix a bug that passing an incorrect endtime to module.
   This bug was found by @ShooterIT.
After redis#13814, all endtime will be monotonic time, and we should no
longer convert it to ustime relative.
Add assertions to prevent endtime from being much larger thatn the
current time.

2) Fix a race in test `Reduce defrag CPU usage when module data can't be
defragged`

---------

Co-authored-by: ShooterIT <wangyuancode@163.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants