[trival] [fix] missing dlclose when error with module load#19

Closed
charsyam wants to merge 1 commit into valkey-io:unstable from charsyam:feature/fix-dlclose-leak

@charsyam

Contributor

Just a simple fix to add a `dlclose` call when a module load fails.

Comment thread src/module.c

    if (post_load_err) {
        moduleUnload(ctx.module->name, NULL);
        moduleFreeContext(&ctx);
        dlclose(handle);

Member

It looks like moduleUnload handles dlclose today, so I think this will have already been closed.
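
The ownership point in this review comment can be sketched with a toy model. This is only an illustration under stated assumptions: `toy_handle`, `toy_close`, and the `toy_*` functions are hypothetical stand-ins for a `dlopen` handle, `dlclose`, and the unload path, and are not Valkey code.

```c
/* Hypothetical stand-in for a dlopen() handle; not Valkey code. */
typedef struct {
    int open_count;  /* 1 while loaded, 0 once released */
    int close_calls; /* how many times the handle was closed */
} toy_handle;

/* Models dlclose(): records the call and drops one reference. */
static void toy_close(toy_handle *h) {
    h->close_calls++;
    h->open_count--;
}

/* Models an unload routine that owns the handle and closes it itself. */
static void toy_module_unload(toy_handle *h) {
    toy_close(h);
}

/* Error path without the extra call: the handle is closed exactly once. */
static void toy_load_error_path(toy_handle *h) {
    toy_module_unload(h);
}

/* Error path with the proposed extra call: the handle is closed twice. */
static void toy_load_error_path_with_extra_close(toy_handle *h) {
    toy_module_unload(h);
    toy_close(h); /* second close on an already released handle */
}
```

In this model the extra close drives `open_count` below zero, which is the toy equivalent of closing a handle that the unload routine has already released.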

@charsyam

Contributor Author

@madolson You're right. I missed it. Thanks.

@charsyam charsyam closed this Mar 25, 2024
zuiderkwast pushed a commit that referenced this pull request Jun 25, 2025
**Current state**
During `hashtableScanDefrag`, rehashing is paused to prevent entries
from moving, but the scan callback can still delete entries which
triggers `hashtableShrinkIfNeeded`. For example, the
`expireScanCallback` can delete expired entries.

**Issue**
This can cause the table to be resized and the old memory to be freed
while the scan is still accessing it, resulting in the following memory
access violation:

```
[err]: Sanitizer error: =================================================================
==46774==ERROR: AddressSanitizer: heap-use-after-free on address 0x611000003100 at pc 0x0000004704d3 bp 0x7fffcb062000 sp 0x7fffcb061ff0
READ of size 1 at 0x611000003100 thread T0
    #0 0x4704d2 in isPositionFilled /home/gusakovy/Projects/valkey/src/hashtable.c:422
    #1 0x478b45 in hashtableScanDefrag /home/gusakovy/Projects/valkey/src/hashtable.c:1768
    #2 0x4789c2 in hashtableScan /home/gusakovy/Projects/valkey/src/hashtable.c:1729
    #3 0x47e3ca in kvstoreScan /home/gusakovy/Projects/valkey/src/kvstore.c:402
    #4 0x6d9040 in activeExpireCycle /home/gusakovy/Projects/valkey/src/expire.c:297
    #5 0x4859d2 in databasesCron /home/gusakovy/Projects/valkey/src/server.c:1269
    #6 0x486e92 in serverCron /home/gusakovy/Projects/valkey/src/server.c:1577
    #7 0x4637dd in processTimeEvents /home/gusakovy/Projects/valkey/src/ae.c:370
    #8 0x4643e3 in aeProcessEvents /home/gusakovy/Projects/valkey/src/ae.c:513
    #9 0x4647ea in aeMain /home/gusakovy/Projects/valkey/src/ae.c:543
    #10 0x4a61fc in main /home/gusakovy/Projects/valkey/src/server.c:7291
    #11 0x7f471957c139 in __libc_start_main (/lib64/libc.so.6+0x21139)
    #12 0x452e39 in _start (/local/home/gusakovy/Projects/valkey/src/valkey-server+0x452e39)

0x611000003100 is located 0 bytes inside of 256-byte region [0x611000003100,0x611000003200)
freed by thread T0 here:
    #0 0x7f471a34a1e5 in __interceptor_free (/lib64/libasan.so.4+0xd81e5)
    #1 0x4aefbc in zfree_internal /home/gusakovy/Projects/valkey/src/zmalloc.c:400
    #2 0x4aeff5 in valkey_free /home/gusakovy/Projects/valkey/src/zmalloc.c:415
    #3 0x4707d2 in rehashingCompleted /home/gusakovy/Projects/valkey/src/hashtable.c:456
    #4 0x471b5b in resize /home/gusakovy/Projects/valkey/src/hashtable.c:656
    #5 0x475bff in hashtableShrinkIfNeeded /home/gusakovy/Projects/valkey/src/hashtable.c:1272
    #6 0x47704b in hashtablePop /home/gusakovy/Projects/valkey/src/hashtable.c:1448
    #7 0x47716f in hashtableDelete /home/gusakovy/Projects/valkey/src/hashtable.c:1459
    #8 0x480038 in kvstoreHashtableDelete /home/gusakovy/Projects/valkey/src/kvstore.c:847
    #9 0x50c12c in dbGenericDeleteWithDictIndex /home/gusakovy/Projects/valkey/src/db.c:490
    #10 0x515f28 in deleteExpiredKeyAndPropagateWithDictIndex /home/gusakovy/Projects/valkey/src/db.c:1831
    #11 0x516103 in deleteExpiredKeyAndPropagate /home/gusakovy/Projects/valkey/src/db.c:1844
    #12 0x6d8642 in activeExpireCycleTryExpire /home/gusakovy/Projects/valkey/src/expire.c:70
    #13 0x6d8706 in expireScanCallback /home/gusakovy/Projects/valkey/src/expire.c:139
    #14 0x478bd8 in hashtableScanDefrag /home/gusakovy/Projects/valkey/src/hashtable.c:1770
    #15 0x4789c2 in hashtableScan /home/gusakovy/Projects/valkey/src/hashtable.c:1729
    #16 0x47e3ca in kvstoreScan /home/gusakovy/Projects/valkey/src/kvstore.c:402
    #17 0x6d9040 in activeExpireCycle /home/gusakovy/Projects/valkey/src/expire.c:297
    #18 0x4859d2 in databasesCron /home/gusakovy/Projects/valkey/src/server.c:1269
    #19 0x486e92 in serverCron /home/gusakovy/Projects/valkey/src/server.c:1577
    #20 0x4637dd in processTimeEvents /home/gusakovy/Projects/valkey/src/ae.c:370
    #21 0x4643e3 in aeProcessEvents /home/gusakovy/Projects/valkey/src/ae.c:513
    #22 0x4647ea in aeMain /home/gusakovy/Projects/valkey/src/ae.c:543
    #23 0x4a61fc in main /home/gusakovy/Projects/valkey/src/server.c:7291
    #24 0x7f471957c139 in __libc_start_main (/lib64/libc.so.6+0x21139)

previously allocated by thread T0 here:
    #0 0x7f471a34a753 in __interceptor_calloc (/lib64/libasan.so.4+0xd8753)
    #1 0x4ae48c in ztrycalloc_usable_internal /home/gusakovy/Projects/valkey/src/zmalloc.c:214
    #2 0x4ae757 in valkey_calloc /home/gusakovy/Projects/valkey/src/zmalloc.c:257
    #3 0x4718fc in resize /home/gusakovy/Projects/valkey/src/hashtable.c:645
    #4 0x475bff in hashtableShrinkIfNeeded /home/gusakovy/Projects/valkey/src/hashtable.c:1272
    #5 0x47704b in hashtablePop /home/gusakovy/Projects/valkey/src/hashtable.c:1448
    #6 0x47716f in hashtableDelete /home/gusakovy/Projects/valkey/src/hashtable.c:1459
    #7 0x480038 in kvstoreHashtableDelete /home/gusakovy/Projects/valkey/src/kvstore.c:847
    #8 0x50c12c in dbGenericDeleteWithDictIndex /home/gusakovy/Projects/valkey/src/db.c:490
    #9 0x515f28 in deleteExpiredKeyAndPropagateWithDictIndex /home/gusakovy/Projects/valkey/src/db.c:1831
    #10 0x516103 in deleteExpiredKeyAndPropagate /home/gusakovy/Projects/valkey/src/db.c:1844
    #11 0x6d8642 in activeExpireCycleTryExpire /home/gusakovy/Projects/valkey/src/expire.c:70
    #12 0x6d8706 in expireScanCallback /home/gusakovy/Projects/valkey/src/expire.c:139
    #13 0x478bd8 in hashtableScanDefrag /home/gusakovy/Projects/valkey/src/hashtable.c:1770
    #14 0x4789c2 in hashtableScan /home/gusakovy/Projects/valkey/src/hashtable.c:1729
    #15 0x47e3ca in kvstoreScan /home/gusakovy/Projects/valkey/src/kvstore.c:402
    #16 0x6d9040 in activeExpireCycle /home/gusakovy/Projects/valkey/src/expire.c:297
    #17 0x4859d2 in databasesCron /home/gusakovy/Projects/valkey/src/server.c:1269
    #18 0x486e92 in serverCron /home/gusakovy/Projects/valkey/src/server.c:1577
    #19 0x4637dd in processTimeEvents /home/gusakovy/Projects/valkey/src/ae.c:370
    #20 0x4643e3 in aeProcessEvents /home/gusakovy/Projects/valkey/src/ae.c:513
    #21 0x4647ea in aeMain /home/gusakovy/Projects/valkey/src/ae.c:543
    #22 0x4a61fc in main /home/gusakovy/Projects/valkey/src/server.c:7291
    #23 0x7f471957c139 in __libc_start_main (/lib64/libc.so.6+0x21139)

SUMMARY: AddressSanitizer: heap-use-after-free /home/gusakovy/Projects/valkey/src/hashtable.c:422 in isPositionFilled
Shadow bytes around the buggy address:
  0x0c227fff85d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff85e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff85f0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c227fff8600: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff8610: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
=>0x0c227fff8620:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff8630: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff8640: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c227fff8650: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c227fff8660: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c227fff8670: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==46774==ABORTING
```


**Solution**
The suggested solution is to also pause auto-shrinking during
`hashtableScanDefrag`. There was already a `hashtablePauseAutoShrink`
method and a `pause_auto_shrink` counter, but `hashtableShrinkIfNeeded`
never checked the counter, so I fixed that.
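
A minimal sketch of the idea, assuming a simplified table: the `toy_table` type, the `toy_*` functions, and the one-eighth shrink threshold are hypothetical and only model the behavior described here; only the `pause_auto_shrink` counter name comes from the description.

```c
#include <stddef.h>

/* Hypothetical, simplified table; not the real hashtable struct. */
typedef struct {
    size_t used;
    size_t capacity;
    int pause_auto_shrink; /* shrinking is allowed only while this is 0 */
    int resize_count;      /* how many times the table was resized */
} toy_table;

static void toy_pause_auto_shrink(toy_table *t) { t->pause_auto_shrink++; }
static void toy_resume_auto_shrink(toy_table *t) { t->pause_auto_shrink--; }

/* Returns 1 if the table was shrunk. The counter check is the fix:
 * without it, a delete inside a scan callback can resize the table. */
static int toy_shrink_if_needed(toy_table *t) {
    if (t->pause_auto_shrink > 0) return 0;
    if (t->capacity > 4 && t->used * 8 < t->capacity) { /* assumed threshold */
        t->capacity /= 2;
        t->resize_count++;
        return 1;
    }
    return 0;
}

static void toy_delete(toy_table *t) {
    if (t->used > 0) t->used--;
    toy_shrink_if_needed(t);
}

/* Models a scan whose callback deletes n entries. Returns 1 if the
 * capacity stayed the same for the whole scan. */
static int toy_scan_with_deletes(toy_table *t, size_t n) {
    size_t capacity_before = t->capacity;
    toy_pause_auto_shrink(t);
    for (size_t i = 0; i < n; i++) toy_delete(t);
    int stable = (t->capacity == capacity_before);
    toy_resume_auto_shrink(t);
    return stable;
}
```

With the counter check in place, deletes issued during the scan leave the capacity untouched, and the table shrinks only on a later delete after the scan has resumed auto-shrinking.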

**Testing**
I created a simple Tcl test that triggers this error most of the
time, but it's a little clunky so I didn't add it as part of the PR:

```
start_server {tags {"expire hashtable defrag"}} {
    test {hashtable scan defrag on expiry} {

        r config set hz 100

        set num_keys 20
        for {set i 0} {$i < $num_keys} {incr i} {
            r set "key_$i" "value_$i"
        }

        for {set j 0} {$j < 50} {incr j} {
            set expire_keys 100
            for {set i 0} {$i < $expire_keys} {incr i} {
                # Short expiry time to ensure they expire quickly
                r psetex "expire_key_${i}_${j}" 100 "expire_value_${i}_${j}"
            }

            # Verify keys are set
            set initial_size [r dbsize]
            assert_equal $initial_size [expr $num_keys + $expire_keys]
            
            after 150
            for {set i 0} {$i < 10} {incr i} {
                r get "expire_key_${i}_${j}"
                after 10
            }
        }

        set remaining_keys [r dbsize]
        assert_equal $remaining_keys $num_keys

        # Verify server is still responsive
        assert_equal [r ping] {PONG}
    } {}
}
```
Compiling with ASan using `make noopt SANITIZER=address valkey-server`
and running the test reproduces the error above. Applying the fix
resolves the issue.

Signed-off-by: Yakov Gusakov <yaakov0015@gmail.com>
ranshid pushed a commit to ranshid/valkey that referenced this pull request Sep 30, 2025
…y-io#2257)

zuiderkwast pushed a commit that referenced this pull request Oct 1, 2025
PingXie added a commit to PingXie/valkey that referenced this pull request Jan 26, 2026
Add MIN_LINES threshold (default: 3) to skip trivial changes.
One-liner and two-liner fixes are unlikely to be meaningful copies
and often cause false positives due to similar bug fixes.

PR valkey-io#19 (one-liner dlclose fix) now correctly passes the check.
All control group tests still pass (3088 fails, 60/2340 pass).
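
The threshold rule can be sketched as follows. This is an assumption-laden illustration: `toy_should_skip` and `TOY_MIN_LINES_DEFAULT` are hypothetical names, and the comparison assumes that changes with fewer lines than the threshold are skipped, which matches one-liners and two-liners being skipped at the default of 3.

```c
#include <stdbool.h>

/* Hypothetical default mirroring the MIN_LINES default of 3. */
#define TOY_MIN_LINES_DEFAULT 3

/* Returns true when a change is too small to be checked. */
static bool toy_should_skip(int changed_lines, int min_lines) {
    return changed_lines < min_lines;
}
```

Under this reading, a three-line change is the smallest one that is still checked.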

Signed-off-by: Ping Xie <pingxie@outlook.com>
ikolomi referenced this pull request in ikolomi/valkey May 27, 2026
Removes one of the two time-based hot-key skip knobs. The eligibility
predicate's LRU/noeviction branch now compares against a single
threshold (`compression-min-idle-seconds`) instead of two.

Why this matters
----------------

The dual-knob surface — `compression-settle-seconds` ("recent-write
protection") and `compression-min-idle-seconds` ("recent-access
protection") — was introduced via the Thread #18 walkthrough resolution
on the theory that operators benefit from being able to express two
different intents.

In v1 reality this is documentation theater. Both knobs compare against
the same metric (`lru_idle_secs(o)`) because Valkey's `robj->lru` field
is touched on every read AND every write — gated only by
`LOOKUP_NOTOUCH` and fork. v1 cannot distinguish read-recency from
write-recency from this single signal. The math always reduces to:

    eligible iff lru_idle_secs(o) >= max(settle, min_idle)

Setting `settle=10, min_idle=120` is identical to the single-knob
`min_idle=120`. Tested every scenario I could construct (different
time scales, sweep cron sequences, heterogeneous workloads, forward-
compat with v2) — none give the dual surface operationally distinct
behavior in v1.
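
The reduction above can be checked with a small sketch. The function names are hypothetical and the idle time is passed in as a plain number rather than read from an object; the only assumption taken from the text is that both knobs compare against the same idle-seconds value.

```c
#include <stdbool.h>

/* Dual-knob form: two comparisons against the same idle time. */
static bool eligible_dual(long idle_secs, long settle, long min_idle) {
    return idle_secs >= settle && idle_secs >= min_idle;
}

/* Single-knob form: one comparison against one threshold. */
static bool eligible_single(long idle_secs, long threshold) {
    return idle_secs >= threshold;
}

static long max_long(long a, long b) {
    return a > b ? a : b;
}
```

For any idle time, `eligible_dual(idle, settle, min_idle)` equals `eligible_single(idle, max_long(settle, min_idle))`, so `settle=10, min_idle=120` behaves the same as a single threshold of 120.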

The only non-trivial argument for keeping both was forward-compat: if
v2 adds a per-object write-time field, the dual knob becomes
meaningful. But adding a config in v2 is non-breaking; existing
operators on v1 see no change. Removing later is harder than adding
later.

Plus the dual surface is an active footgun: operators tuning the two
knobs differently expecting different effects get a confusing
no-difference outcome. PR #1 Thread #3 specifically pushed back on
"too many knobs" — that pressure applies here.

Per YAGNI, ship v1 with the single knob. v2 reintroduces a
write-time-specific knob non-breakingly when per-object write-time
tracking lands.

Code changes
------------

- src/server.h: remove `compression_settle_seconds` field.
- src/config.c: remove the `createIntConfig` registration.
- src/compression.c: drop the second `idle_secs >= settle` check
  in `compressionIsEligible`'s LRU branch. Updated the comment block
  to reflect single-signal reality.
- src/unit/test_compression_eligibility.cpp:
    - Drop `LruRejectsBetweenSettleAndMinIdle` (test of dual-knob
      max-wins behavior — no longer applicable).
    - Replace `LruRejectsRecentTouch` / `LruAcceptsBeyondBothThresholds`
      / `LruZeroThresholdsAcceptImmediately` with single-knob
      equivalents (`LruRejectsRecentTouch`, `LruAcceptsBeyondThreshold`,
      `LruAtThresholdAcceptsBoundary`, `LruZeroThresholdAcceptsImmediately`).
    - Drop `compression-settle-seconds` from `LfuTimeKnobsAreInactive`
      and rename to `LfuTimeKnobIsInactive`.
- tests/unit/type/compression.tcl: drop the
  `compression-settle-seconds` config-default assertion; update
  comment from "Advanced (11)" to "Advanced (10)".

Doc changes
-----------

- detailed-design.md §2.2 R2.2 predicate: hot_key_check helper now has
  one comparison in the LRU/noeviction branch instead of two. The
  rationale paragraph below the predicate explains the v1 single-
  signal reality and the YAGNI motivation for dropping the second
  knob; future v2 reintroduction noted.
- detailed-design.md §2.12 advanced config table: 11 → 10 knobs;
  `compression-settle-seconds` row removed; `compression-min-idle-
  seconds` description simplified.
- detailed-design.md §7.1 transparency-mode harness config: drop the
  `--compression-settle-seconds 0` line so the harness doesn't pass
  an unknown option.
- idea-honing.md Q6 baseline filter bullet: collapse the two-bullet
  LRU branch into a single bullet; add an _italicized rationale
  paragraph_ explaining why the second knob was dropped (preserves
  the historical thinking for future readers).
- idea-honing.md Q6 consolidated predicate: matches detailed-design.md.
- idea-honing.md Q6 config table: drop the `compression-settle-seconds`
  row.
- idea-honing.md Q6a answer: rewrite to reflect single-knob reality
  with reference to "S2.2 implementation review" so future readers
  can trace this refinement chain (Thread #18 → Thread #19 → S2.2
  refinement).
- idea-honing.md §7.1 harness config: drop the
  `--compression-settle-seconds 0` line.
- implementation/plan.md S2.2 description: simplify to "policy-aware
  hot-key skip" + the actual operator-facing knobs.
- summary.md: update the eligibility table row + walkthrough-
  highlights bullet to reflect the policy-aware single-knob outcome.

Audit-trail files (DESIGN_TODO.md, pr-feedback.json) intentionally
unchanged — they capture decisions at a point in time. The
walkthrough Thread #18/#19 resolutions stand as written; only the
implementation interpretation in the live design docs is refined.

Verified locally
----------------

- `make -j` builds clean.
- `./runtest --single unit/type/compression` 10/10 passes (the Tcl
  fixture's config-default assertion was updated in lockstep with the
  C-side removal, so the integration test catches any drift between
  src/config.c and tests/unit/type/compression.tcl).

Not verified locally (CI will validate):
- gtest unit tests (no libgtest-dev locally).

Test count delta
----------------
S2.2 gtest: 16 tests → 14 tests (dropped 2, simplified 2 to remove
the dual-knob exercise paths).
ikolomi referenced this pull request in ikolomi/valkey May 27, 2026
* docs(design): align eligibility predicate; document policy-aware hot-key skip

Two corrections discovered while preparing S2.2 implementation:

1. The two design docs had drifted on minor wording (idea-honing.md
   said `obj->encoding == RAW` and `obj->refcount != SHARED`;
   detailed-design.md said `OBJ_ENCODING_RAW` / `OBJ_SHARED_REFCOUNT`).
   Predicates now match exactly across both docs, using the actual C
   constants from src/server.h. Per the agreed convention: keep the
   exact predicate inline in both docs (different audiences both need
   it readable in place) rather than a cross-reference.

2. The Thread #18/#19 walkthrough resolutions made a claim that
   doesn't match the existing Valkey source: "the existing LRU field
   already provides the signal for every eviction policy" / time-based
   checks "work uniformly across LRU, LFU, and noeviction" /
   `estimateObjectIdleTime()` is "a reliable read-hotness signal
   universally."

   Reading src/lrulfu.h:
     - LRU and noeviction policies: robj->lru is seconds-based.
       lru_idle_secs(o) returns real seconds. Time-based thresholds
       work as designed.
     - LFU policy: robj->lru encodes 16 bits "last eval time in
       minutes" + 8 bits approximate freq counter. There is no
       per-second access timestamp. lru_idle_secs(o) would
       misinterpret the bits. The function `estimateObjectIdleTime()`
       referenced by the design does not exist in current Valkey;
       the closest available helper is `lrulfu_getIdleness()` which
       returns `UINT8_MAX - freq` in LFU mode (a 0..255 freq-derived
       heuristic, NOT seconds).

   The fix is policy-aware checks, not policy-uniform:
     - LRU and noeviction: apply settle-seconds AND min-idle-seconds
       against lru_idle_secs(o).
     - LFU: skip the time-based thresholds entirely (the metric is
       wrong unit). Apply compression-lfu-threshold against the freq
       counter — already in the predicate as the LFU branch.

   The dual-knob operator surface (`compression-settle-seconds` and
   `compression-min-idle-seconds`) is preserved across modes; in
   LRU/noeviction both knobs apply to the same metric (since
   robj->lru is touched on every read AND write — v1 cannot
   distinguish source) so the effective threshold is `max(settle,
   min_idle)`. Operators get to express two intents.

   The Thread #18/#19 resolutions stand as written in DESIGN_TODO.md
   (audit trail; the decision to use the existing LRU field rather
   than add a new write-time field is unchanged); only the
   implementation interpretation in the live design docs is refined.
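
The unit mismatch described in point 2 can be illustrated with a small sketch. It assumes a 24-bit field with the 16-bit minutes value in the high bits and the 8-bit frequency counter in the low bits; that bit order and all helper names below are assumptions for illustration, not the functions in src/lrulfu.h:

```c
#include <stdint.h>

/* Assumed layout of the 24-bit field in LFU mode:
 *   bits 23..8  last-evaluation time, in minutes
 *   bits  7..0  approximate frequency counter */
static uint32_t lfu_pack(uint16_t minutes, uint8_t freq) {
    return ((uint32_t)minutes << 8) | freq;
}

static uint16_t lfu_minutes(uint32_t field) {
    return (uint16_t)((field >> 8) & 0xFFFF);
}

static uint8_t lfu_freq(uint32_t field) {
    return (uint8_t)(field & 0xFF);
}

/* Frequency-derived idleness in the range 0..255 (not seconds), in the
 * style of the `UINT8_MAX - freq` value described above. */
static uint8_t lfu_idleness(uint32_t field) {
    return (uint8_t)(UINT8_MAX - lfu_freq(field));
}
```

Under this layout, a key last evaluated at minute 1000 with counter 5 packs to 256005. Treating that raw number as an idle-seconds count has no meaning, which is the reason the LFU branch compares the counter instead.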

Updated:
  - detailed-design.md §2.2 R2.2 predicate + new explanatory
    paragraph below it.
  - detailed-design.md §2.12 config table: settle-seconds and
    min-idle-seconds descriptions now correctly note "Inactive in LFU
    mode."
  - idea-honing.md Q6 baseline filter bullet (rewritten as
    "policy-aware" with sub-bullets per policy).
  - idea-honing.md Q6 consolidated predicate (now identical to
    detailed-design.md §2.2).
  - idea-honing.md Q6a answer: rewritten with the policy-aware
    framing; cross-references "S2.2 implementation review" so future
    readers can trace this refinement.

* Inline compression: S2.2 — eligibility predicate

Implements R2.2 / Q6 `compressionIsEligible(robj *o, const sds key)`,
replacing the Phase 0 stub that returned 0 for every value.

The predicate has six gates, evaluated in cheapest-first order so the
master switch short-circuits early when the feature is disabled:

  1. Master switch (server.compression_enabled).
  2. Type + encoding gate. STRING values only (R2.2, Q6c). Of the four
     string encodings, only OBJ_ENCODING_RAW is a candidate:
       - INT      — already memory-optimal.
       - EMBSTR   — ≤44 B, header overhead erases any savings (Threads
                    #17 / #21 explicitly excluded).
       - COMPRESSED — defense-in-depth no-op for double-compress.
  3. Refcount gate. Shared RESP constants are never installed in a db
     (lookupKey asserts this); we mirror the assertion as a safety
     check.
  4. Size bounds. compression-min-value-size (lower) prevents wasting
     CPU on values too small to recoup the per-value header (~16 B).
     compression-max-value-size (upper, 0 = disabled) caps worst-case
     sync-decompression latency on the main thread (~1 µs/KB).
  5. Hot-key skip — POLICY-AWARE per the corrected R2.2:
       - LRU and noeviction: robj->lru is seconds-based. Apply both
         compression-settle-seconds (recent-write proxy) and
         compression-min-idle-seconds (recent-read proxy) against
         lru_getIdleSecs(o->lru). v1 cannot distinguish source
         (robj->lru is touched on read AND write); the dual surface
         lets operators express two intents that share an underlying
         signal. Effective threshold is max(settle, min_idle).
       - LFU: robj->lru encodes a freq counter (no per-second
         timestamp). Time-based knobs are inactive in this mode.
         Apply compression-lfu-threshold against the freq counter via
         lfu_getFrequency(), which mirrors the standard Valkey
         decay-on-read pattern (objectGetLFUFrequency in src/object.c).
         The signature is `robj *o` (not `const robj *`) because of
         this in-place decay.
  6. Incompressible-keys retry guard. Stubbed as "always retry-eligible"
     in S2.2; S2.3 lands the side hashtable and wires
     compressionRetryEligible(key) here.
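
The gate ordering above can be sketched as a single early-return function. This is only an illustration: every type, field, and constant below is a hypothetical stand-in for the real `robj` and `server` state, and the inclusive size bounds are an assumption rather than something the commit message states.

```c
#include <stdbool.h>
#include <stddef.h>

/* All types, fields, and constants here are hypothetical stand-ins. */
enum value_type { TYPE_STRING, TYPE_OTHER };
enum value_encoding { ENC_RAW, ENC_INT, ENC_EMBSTR, ENC_COMPRESSED };

struct value {
    enum value_type type;
    enum value_encoding encoding;
    bool shared;        /* shared constant, never a candidate */
    size_t len;         /* payload size in bytes */
    long idle_secs;     /* meaningful only outside LFU mode */
    unsigned lfu_freq;  /* meaningful only in LFU mode */
};

struct settings {
    bool enabled;
    size_t min_size;
    size_t max_size;    /* 0 disables the upper bound */
    bool lfu_mode;
    long settle_secs;
    long min_idle_secs;
    unsigned lfu_threshold;
};

static bool is_eligible(const struct settings *cfg, const struct value *v) {
    /* Gate 1: master switch, checked first so a disabled feature costs
     * one comparison. */
    if (!cfg->enabled) return false;
    /* Gate 2: only raw-encoded strings are candidates. */
    if (v->type != TYPE_STRING || v->encoding != ENC_RAW) return false;
    /* Gate 3: shared constants are never compressed. */
    if (v->shared) return false;
    /* Gate 4: size bounds (assumed inclusive at both ends). */
    if (v->len < cfg->min_size) return false;
    if (cfg->max_size != 0 && v->len > cfg->max_size) return false;
    /* Gate 5: policy-aware hot-key skip. */
    if (cfg->lfu_mode) {
        if (v->lfu_freq >= cfg->lfu_threshold) return false;
    } else {
        if (v->idle_secs < cfg->settle_secs) return false;
        if (v->idle_secs < cfg->min_idle_secs) return false;
    }
    /* Gate 6: retry guard, stubbed as always eligible. */
    return true;
}
```

The early returns keep the cheapest checks first, so a disabled feature or a non-string value never reaches the policy-dependent comparisons.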

Test coverage in src/unit/test_compression_eligibility.cpp (16 tests,
auto-discovered by src/unit/Makefile's `wildcard *.cpp`):

  - Master switch off / on.
  - Each rejection branch:
      * non-STRING type
      * INT, EMBSTR, COMPRESSED encodings
      * shared refcount
      * size below min, above max
  - Size-bound boundary cases (at exact min, at exact max,
    max=0 disables upper bound).
  - LRU branch:
      * recent touch (idle < settle) — rejected
      * idle between settle and min_idle (max wins) — rejected
      * cold key (idle >= max(settle, min_idle)) — accepted
      * zero thresholds — accept immediately
  - noeviction policy: same code path as LRU per R2.2.
  - LFU branch:
      * freq at threshold — rejected (>= comparison)
      * freq above threshold — rejected
      * freq below threshold — accepted
      * time-based knobs are inactive even at INT_MAX values.

The fixture saves and restores `server.compression_*` and
`server.maxmemory_policy`, and re-syncs lrulfu's cached
`is_using_lfu_policy` boolean via `lrulfu_updateClockAndPolicy()` on
both setup and teardown so tests don't leak policy state into each
other.

Verified locally:
  - `make -j` builds clean.
  - `./runtest --single unit/type/compression` 10/10 passes (the Phase 0
    integration fixture exercises feature-off semantics; with
    compression-enabled still 0 by default, eligibility is never
    consulted in the integration server).

Not verified locally (CI will validate):
  - gtest unit tests on Linux/macOS/32-bit (no libgtest-dev locally).

S2.3 (incompressible-keys hashtable) wires the last branch and ships
its own gtest coverage. After that, S2.4–S2.10 wire the rest of the
hot path.

* docs(design): rewrite eligibility predicate's hot-key check in branched form

Per review feedback during S2.2 review: the previous form encoded the
policy split using short-circuit booleans —

  && (lfu_mode  || lru_idle_secs(obj) >= compression-settle-seconds)
  && (lfu_mode  || lru_idle_secs(obj) >= compression-min-idle-seconds)
  && (!lfu_mode || lfu_freq(obj) < compression-lfu-threshold)

— which is logically correct but reads awkwardly. Three lines mention
`lfu_mode` (twice plain, once negated); the reader has to mentally
short-circuit twice to see that lines 1 and 2 fire only in LRU/noeviction
and line 3 fires only in LFU. It also looks at first glance like the
predicate might be using `compression-lfu-threshold` as an LRU-mode
threshold.

Replaced with a branched helper that mirrors the if/else structure of
the C implementation:

  && hot_key_check(obj)                       // policy-aware

  where hot_key_check(obj) is:
      if lfu_mode:
          lfu_freq(obj) < compression-lfu-threshold
      else:
          lru_idle_secs(obj) >= compression-settle-seconds
          AND lru_idle_secs(obj) >= compression-min-idle-seconds

Same behavior; the implementation in src/compression.c
(`compressionIsEligible()`) already uses this exact branching shape —
the docs now match it visually.
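
The "same behavior" claim can be checked by brute force. A minimal sketch, assuming a hypothetical input struct whose field names follow the pseudocode above rather than the Valkey source:

```c
#include <stdbool.h>

/* Hypothetical inputs; names follow the pseudocode, not the Valkey source. */
struct hot_key_inputs {
    bool lfu_mode;
    long idle_secs;
    long settle;
    long min_idle;
    unsigned freq;
    unsigned lfu_threshold;
};

/* Previous form: three short-circuit clauses joined with AND. */
static bool hot_key_check_flat(const struct hot_key_inputs *in) {
    return (in->lfu_mode || in->idle_secs >= in->settle) &&
           (in->lfu_mode || in->idle_secs >= in->min_idle) &&
           (!in->lfu_mode || in->freq < in->lfu_threshold);
}

/* Branched form: one branch per eviction-policy family. */
static bool hot_key_check_branched(const struct hot_key_inputs *in) {
    if (in->lfu_mode) {
        return in->freq < in->lfu_threshold;
    }
    return in->idle_secs >= in->settle && in->idle_secs >= in->min_idle;
}

/* Compares both forms over a small exhaustive grid of inputs. */
static bool forms_agree(void) {
    for (int lfu = 0; lfu <= 1; lfu++)
        for (long idle = 0; idle < 8; idle++)
            for (long settle = 0; settle < 8; settle++)
                for (long min_idle = 0; min_idle < 8; min_idle++)
                    for (unsigned freq = 0; freq < 4; freq++)
                        for (unsigned thr = 0; thr < 4; thr++) {
                            struct hot_key_inputs in = {lfu != 0, idle, settle,
                                                        min_idle, freq, thr};
                            if (hot_key_check_flat(&in) !=
                                hot_key_check_branched(&in)) {
                                return false;
                            }
                        }
    return true;
}
```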

Updated:
  - detailed-design.md §2.2 R2.2 predicate.
  - idea-honing.md Q6 consolidated predicate.

Both docs were already aligned (per the previous predicate-alignment
commit); they remain identical with this rewrite. The explanatory
paragraph below the §2.2 predicate (LRU vs LFU lru-field encoding)
already covers the rationale and is unchanged.

* Inline compression: drop compression-settle-seconds knob (YAGNI)

Removes one of the two time-based hot-key skip knobs. The eligibility
predicate's LRU/noeviction branch now compares against a single
threshold (`compression-min-idle-seconds`) instead of two.

Why this matters
----------------

The dual-knob surface — `compression-settle-seconds` ("recent-write
protection") and `compression-min-idle-seconds` ("recent-access
protection") — was introduced via the Thread #18 walkthrough resolution
on the theory that operators benefit from being able to express two
different intents.

In v1 reality this is documentation theater. Both knobs compare against
the same metric (`lru_idle_secs(o)`) because Valkey's `robj->lru` field
is touched on every read AND every write — gated only by
`LOOKUP_NOTOUCH` and fork. v1 cannot distinguish read-recency from
write-recency from this single signal. The math always reduces to:

    eligible iff lru_idle_secs(o) >= max(settle, min_idle)

Setting `settle=10, min_idle=120` is identical to the single-knob
`min_idle=120`. I tested every scenario I could construct (different
time scales, sweep cron sequences, heterogeneous workloads,
forward-compat with v2); none gives the dual surface operationally
distinct behavior in v1.

The only non-trivial argument for keeping both was forward-compat: if
v2 adds a per-object write-time field, the dual knob becomes
meaningful. But adding a config in v2 is non-breaking; existing
operators on v1 see no change. Removing later is harder than adding
later.

Plus the dual surface is an active footgun: operators tuning the two
knobs differently expecting different effects get a confusing
no-difference outcome. PR #1 Thread #3 specifically pushed back on
"too many knobs" — that pressure applies here.

Per YAGNI, ship v1 with the single knob. v2 reintroduces a
write-time-specific knob non-breakingly when per-object write-time
tracking lands.

Code changes
------------

- src/server.h: remove `compression_settle_seconds` field.
- src/config.c: remove the `createIntConfig` registration.
- src/compression.c: drop the second `idle_secs >= settle` check
  in `compressionIsEligible`'s LRU branch. Updated the comment block
  to reflect single-signal reality.
- src/unit/test_compression_eligibility.cpp:
    - Drop `LruRejectsBetweenSettleAndMinIdle` (test of dual-knob
      max-wins behavior — no longer applicable).
    - Replace `LruRejectsRecentTouch` / `LruAcceptsBeyondBothThresholds`
      / `LruZeroThresholdsAcceptImmediately` with single-knob
      equivalents (`LruRejectsRecentTouch`, `LruAcceptsBeyondThreshold`,
      `LruAtThresholdAcceptsBoundary`, `LruZeroThresholdAcceptsImmediately`).
    - Drop `compression-settle-seconds` from `LfuTimeKnobsAreInactive`
      and rename to `LfuTimeKnobIsInactive`.
- tests/unit/type/compression.tcl: drop the
  `compression-settle-seconds` config-default assertion; update
  comment from "Advanced (11)" to "Advanced (10)".

Doc changes
-----------

- detailed-design.md §2.2 R2.2 predicate: hot_key_check helper now has
  one comparison in the LRU/noeviction branch instead of two. The
  rationale paragraph below the predicate explains the v1 single-
  signal reality and the YAGNI motivation for dropping the second
  knob; future v2 reintroduction noted.
- detailed-design.md §2.12 advanced config table: 11 → 10 knobs;
  `compression-settle-seconds` row removed; `compression-min-idle-
  seconds` description simplified.
- detailed-design.md §7.1 transparency-mode harness config: drop the
  `--compression-settle-seconds 0` line so the harness doesn't pass
  an unknown option.
- idea-honing.md Q6 baseline filter bullet: collapse the two-bullet
  LRU branch into a single bullet; add an _italicized rationale
  paragraph_ explaining why the second knob was dropped (preserves
  the historical thinking for future readers).
- idea-honing.md Q6 consolidated predicate: matches detailed-design.md.
- idea-honing.md Q6 config table: drop the `compression-settle-seconds`
  row.
- idea-honing.md Q6a answer: rewrite to reflect single-knob reality
  with reference to "S2.2 implementation review" so future readers
  can trace this refinement chain (Thread #18 → Thread #19 → S2.2
  refinement).
- idea-honing.md §7.1 harness config: drop the
  `--compression-settle-seconds 0` line.
- implementation/plan.md S2.2 description: simplify to "policy-aware
  hot-key skip" + the actual operator-facing knobs.
- summary.md: update the eligibility table row + walkthrough-
  highlights bullet to reflect the policy-aware single-knob outcome.

Audit-trail files (DESIGN_TODO.md, pr-feedback.json) intentionally
unchanged — they capture decisions at a point in time. The
walkthrough Thread #18/#19 resolutions stand as written; only the
implementation interpretation in the live design docs is refined.

Verified locally
----------------

- `make -j` builds clean.
- `./runtest --single unit/type/compression` 10/10 passes (the Tcl
  fixture's config-default assertion was updated in lockstep with the
  C-side removal, so the integration test catches any drift between
  src/config.c and tests/unit/type/compression.tcl).

Not verified locally (CI will validate):
- gtest unit tests (no libgtest-dev locally).

Test count delta
----------------
S2.2 gtest: 16 tests → 14 tests (dropped 2, simplified 2 to remove
the dual-knob exercise paths).

* Cleanup: untrack proposal-issue.md; mark S2.2 complete in plan.md

Two small fixes to the previous commit's collateral:

1. proposal-issue.md was inadvertently committed via `git add -A` in
   the previous commit. The file is a working draft of the upstream
   issue (already tracked in the valkey-io issue tracker) and doesn't
   belong in the planning directory. Removing.

2. plan.md still showed S2.2 as `[ ]`. Marked it `[x]`, following the
   S2.1 marking convention (`[x]` once the task ships); on merge to
   unstable the marking becomes definitive.
GilboaAWS pushed a commit to GilboaAWS/valkey that referenced this pull request Jun 24, 2026
* [S2.7] Compression write-path hook

Wires compressionEnqueueCandidate into dbAddInternal and dbSetValue,
and replaces the TODO(S2.7) placeholder in the drain handler with a
real install path. With this change, writes to eligible STRING values
get queued for background compression and the result is installed back
into the kvstore as an OBJ_ENCODING_COMPRESSED robj.

The decoder (S2.6) is shipped but not yet wired into read paths (S2.8),
so as long as compression-enabled stays no (default), behavior is
unchanged. Once an operator turns the switch on, written values get
compressed, but reads return the compressed bytes until S2.8 lands.
Existing transparency tests verify no regression in the default-off
configuration.

Producer side (compression.c, db.c)

  Two seams in db.c — end of dbAddInternal and end of dbSetValue —
  call compressionEnqueueCandidate(key, value, db->id). The candidate
  function applies four guards:
    1. Master switch (compression_enabled, via compressionIsEligible).
    2. R2.2 eligibility (type/encoding/size/hot-key — also via predicate).
    3. R2.1.5 active-dict check — saves an allocator round-trip when
       compression-enabled=yes but training hasn't completed.
    4. incrRefCount(value) — pins the bytes for the worker AND
       reserves the robj address for the drain handler's pointer-
       equality stale check (ABA-safe per R2.4.4 + the lifetime
       discussion in PR valkey-io#18).

  If the worker pool refuses (not started; future S2.11 inbox full),
  the pin is released immediately. RDB-load enqueue is deliberately
  skipped — TODO(S2.10): the sweep tick will rediscover RDB-loaded
  values without hammering the inbox during load.

API change: compressionWorkersEnqueue

  Old: compressionWorkersEnqueue(sds key, int dbid, uint64_t version, sds src)
  New: compressionWorkersEnqueue(robj *value, int dbid)

  The new form requires a pinned robj; the worker reads
  objectGetVal(value) once at enqueue (captured into job->src) and
  never touches the robj afterwards (R2.11.4 intact). The drain
  handler uses job->value for the kvstore lookup and the pointer-
  equality stale check.

  The version field is gone — pointer equality, made ABA-safe by the
  pin, is sufficient. R2.4.4 explains why: holding incrRefCount(value)
  prevents the allocator from reusing the address while the job is
  in flight.

Drain install (compression_workers.c)

  New compressionInstall() helper:
    1. void **slot = kvstoreHashtableFindRef(db->keys, didx, key_sds);
    2. If slot == NULL OR *slot != job->value: stale (overwrite, expire,
       or COW). Discard.
    3. Else: createCompressedObject(OBJ_STRING, job->dst, job->dst_len);
       dbReplaceValue installs.
    4. compressionRegistryIncRef(job->dict_id) on success.

  dbReplaceValue routes through dbSetValue(..., overwrite=0, ...),
  which does NOT call signalModifiedKey, moduleNotifyKeyUnlink, or
  signalDeletedKeyAsReady. Background compression is a storage-only
  change per R2.9.2 — no WATCH dirty_cas, no client-side-caching
  invalidations, no keyspace notifications.

  Pin released on every drain completion path (success, stale-discard,
  net-savings reject, ZSTD error, no-active-dict). Test-mode jobs
  (job->value == NULL) skip both install and decRef.
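
The pin plus pointer-equality check in steps 1 to 3 can be sketched in isolation. This is a simplified model with hypothetical names (`struct value`, `pin`, `install_if_current`); the real code works on `robj` and a kvstore slot, and object freeing is left out here:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted value object. */
struct value {
    int refcount;
};

/* Taking a reference keeps the object alive, so its address cannot be
 * recycled for a different object while a job is in flight. */
static void pin(struct value *v) { v->refcount++; }
static void unpin(struct value *v) { v->refcount--; }

/* Installs `replacement` only if the slot still holds the exact object that
 * was pinned at enqueue time. Returns true on install, false when stale. */
static bool install_if_current(struct value **slot, struct value *pinned,
                               struct value *replacement) {
    if (slot == NULL || *slot != pinned) {
        return false; /* key gone, overwritten, or copied: discard result */
    }
    *slot = replacement;
    return true;
}
```

Because the pinned object cannot be freed, a matching pointer in the slot can only mean the same object, which is why no separate version counter is needed in this model.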

Test migration

  The 15 existing test-fixture call sites passed raw sds + dummy
  version. Migrated to a new testOnlyCompressionWorkersEnqueueRaw(src,
  dbid) that sets job->value = NULL. Tests extract jobs via
  testOnlyCompressionWorkersDrainOutbox before the production drain
  runs, so production-only paths (install, decRef) are never reached
  by the value=NULL sentinel.

  No new gtest cases for the install path itself — that requires a
  fully-initialized server.db / kvstore that the unit-test environment
  doesn't construct. End-to-end coverage will come from the Tcl
  transparency harness once S2.8 wires the read path.

TODO(S4.1) markers added at:
  - compressionInstall: compression_compressions_per_sec, EMA fold,
    compression_compressed_objects.
  - compressionEnqueueCandidate: compression_candidates_dropped_total
    when S2.11 lands (today the pool-not-started rejection is a
    config state, not back-pressure).

Verified locally:
  - make -j2 -C src              → clean (BUILD_ZSTD=yes default).
  - make -j2 -C src BUILD_ZSTD=no → clean.
  - ./runtest --single unit/type/compression → 10/10 pass.

  gtest unit tests not runnable locally; CI validates.

Diff stat:

  .../implementation/plan.md                |   4 +-
  src/compression.c                         |  35 +++-
  src/compression.h                         |  27 ++-
  src/compression_workers.c                 | 185 +++++++++++++++------
  src/compression_workers.h                 |  56 +++----
  src/db.c                                  |  14 ++
  src/unit/test_compression_workers.cpp     |  31 ++--
  7 files changed, 244 insertions(+), 108 deletions(-)

* [S2.7] PR valkey-io#19 review: assert + design-doc alignment

Two reviewer threads addressed:

Thread #1 (T-3369017721) — production code carrying test concerns

  The drain handler had an `if (job->value == NULL)` branch that only
  existed to handle test-only jobs from
  testOnlyCompressionWorkersEnqueueRaw. Reviewer correctly pointed out
  that production code shouldn't carry test-only branches.

  Fix: replaced with serverAssert(job->value != NULL) at the top of
  the per-job loop. Production drain assumes every job has a real
  pinned robj; tests must extract their value=NULL jobs via
  testOnlyCompressionWorkersDrainOutbox before this drain runs.

  Side effect: removed the conditional `if (job->value != NULL)`
  guards around decrRefCount and the install branch — the top-of-loop
  assert means every code path can assume value is non-NULL.

Thread #2 (T-3356207626): design doc out of sync with implementation

  Design §4.6 still described the original version-counter approach
  for staleness detection (`uint64_t version` field on compressionJob,
  "if version counter moved, discard"). The implementation has used
  pointer equality + the incrRefCount-pin since S2.4 PR valkey-io#13.

  Fix: updated §4.6 to:
    - compressionJob struct: drop `version`, drop `robj *key`, add
      `robj *value` (pinned via incrRefCount), and `sds src` and
      `int dbid` separately, matching the actual struct.
    - Concurrency notes: replaced the "version counter moved" bullet
      with the pointer-equality + ABA-safety reasoning, naming the
      incrRefCount-reserves-the-address invariant as the protection
      mechanism (same property explained in PR valkey-io#18 review).

Verified locally:
  - make -j2 -C src              → clean
  - ./runtest --single unit/type/compression  → 10/10 pass

* [S2.7] Fix CI: remove erroneous & on server.db indexing

build-32bit (and the 30+ downstream cells, all CI cells use -Werror):

  compression_workers.c:531:20: error: initialization of 'serverDb *'
    from incompatible pointer type 'serverDb **'
    [-Werror=incompatible-pointer-types]

`server.db` is `serverDb **` (array of pointers, one per DB), so
`server.db[i]` is already `serverDb *`. The address-of operator was
wrong here: it produced `serverDb **`.

Fix: drop the `&`. Matches the pattern used everywhere else in the
codebase (db.c, server.c, etc.).
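
A miniature of the same type relationship, as a sketch. The struct names and the accessor are hypothetical; only the `serverDb **` shape is taken from the description above:

```c
#include <stddef.h>

/* Hypothetical miniature of the layout described above: `db` is an array of
 * pointers, one per database. */
typedef struct serverDb {
    int id;
} serverDb;

struct miniServer {
    serverDb **db;
    int dbnum;
};

/* Indexing already yields `serverDb *`. Writing `&srv->db[i]` here would
 * yield `serverDb **`, which is the mismatch the compiler reported. */
static serverDb *get_db(const struct miniServer *srv, int i) {
    if (i < 0 || i >= srv->dbnum) return NULL;
    return srv->db[i];
}
```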

Local make didn't catch this — the default SERVER_CFLAGS doesn't
include -Werror. CI does. Built locally with `make SERVER_CFLAGS=-Werror`
to confirm clean.

* [S2.7] Fix CI: tests must use testOnly drain for value=NULL jobs

5 gtest cases failed on build-32bit (and would on every test cell)
with the new production-drain serverAssert(job->value != NULL):

  ASSERTION FAILED: compression_workers.c:591 'job->value != NULL'

  in: SingleJobRoundTrip, BurstOf256JobsOneWorker,
      BurstOf1024JobsFourWorkers, ResizeAcrossEnqueuedJobs,
      NetSavingsGuardRejectsIncompressible

Root cause: the previous commit's reviewer-driven hardening (PR valkey-io#19
review thread #1) made the production drain assert that every job
has a non-NULL pinned robj. The premise was "tests use the testOnly
drain to extract jobs before the production drain runs". That premise
was wrong — many tests ALSO call compressionWorkersDrainOutbox
directly to consume-and-dispose test-mode jobs (the drainUntil helper
is the most-used path).

Fix: add testOnlyCompressionWorkersDrainAndDispose(budget) — pulls
jobs via the existing testOnlyCompressionWorkersDrainOutbox, frees
them via testOnlyCompressionWorkersFreeJob, returns count. Migrate
the test fixture's drainUntil helper and all 8 direct
compressionWorkersDrainOutbox call sites in the test file to the
new helper.

Production drain stays clean — no test concerns. Reviewer thread #1
intent preserved.

Verified locally:
  - make -j2 -C src SERVER_CFLAGS=-Werror   → clean
  - ./runtest --single unit/type/compression → 10/10 pass
GilboaAWS pushed a commit to GilboaAWS/valkey that referenced this pull request Jun 24, 2026
…lkey-io#21)

Adopts the transient-view approach for the read-path hook (S2.8) after
deep analysis of three options (per-site / decompress-in-place /
transient view).

Design doc changes:

- §2.5: rewrote R2.5.2 to note that the single decoder helper is
  called from inside lookupKey* when LOOKUP_READ_BYTES is set; added
  R2.5.7 describing the full transient model (decompress on lookup,
  pin via incrRefCount, save compressed buffer in side-map, flip
  encoding to RAW for the iteration; restore via pointer swap at
  beforeSleep using kvstore-slot pointer-equality for staleness
  detection).

- §2.5.6 updated: cost is now "1 decompress per event-loop iteration
  that touches the key" rather than "1 decompress per read" —
  amortizes naturally for repeat reads.

- §3.2 read-path seam: rewritten to match (lookupKey* with flag
  centralizes; 3 out-of-process paths bypass).

- §4.2 db.c row: rewritten to describe the new flag plumbing.

- New Appendix E: full design exploration including the codebase-
  sweep findings (62 sdslen(objectGetVal) sites, 9 dbUnshareStringValue
  callers, addReplyBulk catches only ~30% of byte access), the
  trade-off matrix across the three approaches, why approach 3 was
  chosen, and limitations / future considerations. Preserved for
  future contributors who may revisit the choice.

Plan.md changes:

- S2.8 description rewritten with the transient-model details, files
  touched per the codebase sweep, and explicit out-of-process paths
  that need separate wiring (AOF rewrite child, RDB-replication save).
  feedReplicationBufferWithObject is now noted as NOT needing wiring
  (operates on argv, not kvstore).

Why approach 3 over the alternatives:

- Approach 1 (per-site) has 62-site leakage surface; future code can
  silently corrupt by missing a decompress call.
- Approach 2 (decompress-in-place forever) violates R2.5.6 — read-hot
  keys lose memory savings; MEMORY USAGE / OBJECT ENCODING change
  after first read (operator surprise); sweep treadmill on read-hot
  patterns.
- Approach 3 (transient view) preserves R2.5.6, is leak-proof for
  lookupKey paths, restoration is a free pointer swap (no
  re-compression cost), bounded memory inflation (≤ uncompressed
  baseline — feature can't make memory worse than the no-compression
  case), and reuses three v1 invariants (R2.4.4 COW, R2.5.2 single
  helper, PR valkey-io#19 pointer equality).

No code changes in this commit — design and plan only. S2.8
implementation lands in a follow-up PR per this design.
GilboaAWS pushed a commit to GilboaAWS/valkey that referenced this pull request Jun 24, 2026
…ess (writes) (2/3) (valkey-io#23)

PR 2 of 3 in the S2.8 split. Wires the read-path transient-view model
from PR valkey-io#21 (R2.5.7) + Appendix E.7 into lookupKey(), AND adds a
write-path permanent-decompress optimization with re-compression
auto-scheduling via signalModifiedKey().

== Read-path: transient view (R2.5.7 + Appendix E) ==

For lookupKeyRead* (no LOOKUP_WRITE), compressed values are decompressed
into a temp sds, the robj is registered in a per-server side-map and
pinned (incrRefCount); encoding flips to RAW for the duration of the
event-loop iteration. At the next compressionBeforeSleep() boundary,
the kvstore slot is re-fetched: pointer match → pointer-swap restore
(zero recompression cost), pointer mismatch → discard (mutation /
overwrite / expire / COW orphan). Memory bound is uncompressed-baseline
(at most the size the dataset would use without compression).
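
The materialize/restore cycle can be modeled in a few lines. This is a sketch under simplifying assumptions: all names are hypothetical, decompression is replaced by a caller-supplied plain buffer, and buffer ownership (freeing the temporary or the saved compressed buffer) is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the Valkey source. */
enum encoding { ENC_RAW, ENC_COMPRESSED };

struct value {
    enum encoding encoding;
    const char *ptr; /* compressed buffer or readable bytes */
    int refcount;
};

/* One side-map entry: remembers what to put back at the end of the
 * event-loop iteration. */
struct transient_view {
    struct value *obj;
    const char *compressed; /* saved compressed buffer */
};

/* Read path: expose readable bytes for the rest of the iteration. */
static struct transient_view materialize(struct value *v, const char *plain) {
    struct transient_view tv = {v, v->ptr};
    v->ptr = plain;
    v->encoding = ENC_RAW;
    v->refcount++; /* pin until the restore pass */
    return tv;
}

/* End of iteration: restore by pointer swap if the slot still holds the same
 * object; otherwise treat the view as stale. Returns true when restored. */
static bool restore(struct transient_view *tv, struct value *slot_now) {
    bool current = (slot_now == tv->obj);
    if (current) {
        tv->obj->ptr = tv->compressed;
        tv->obj->encoding = ENC_COMPRESSED;
    }
    tv->obj->refcount--; /* release the pin on both paths */
    return current;
}
```

In this model the restore path does no compression work at all; it only swaps the saved pointer back, matching the zero-recompression-cost property described above.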

== Write-path: permanent decompress + signalModifiedKey enqueue ==

For lookupKeyWrite* (LOOKUP_WRITE set, no LOOKUP_NO_BYTES), compressed
values are PERMANENTLY decompressed in place: free the compressed
buffer, decRef the dict frame-ref, install fresh sds as val_ptr, flip
encoding to RAW. No side-map entry, no pin. Refcount stays 1.

Subsequent dbUnshareStringValue sees refcount==1 RAW → no COW → mutation
in place. Cheaper than transient view + COW, and avoids the wasted
beforeSleep kvstore re-fetch for a write that always discards.

The post-mutation value is re-enqueued for compression by hooking
signalModifiedKey(). signalModifiedKey is the canonical "logical value
at this key changed" signal — called by every byte-mutating command.
Audit confirmed: every relevant write path fires it; the 3 paths that
don't (RDB load per R2.6.2, worker drain per R2.9.2, module
SETKEY_NO_SIGNAL) are exactly the ones we want to skip.

R2.9.2 invariant intact: the direction is signalModifiedKey →
compression-enqueue, NOT the reverse. Compression infrastructure does
not call signalModifiedKey.

This replaces PR valkey-io#19's compressionEnqueueCandidate calls in
dbAddInternal/dbSetValue, which fired too eagerly (worker-drain
re-enqueue → wasted predicate; empty initial-state values from
hashTypeLookupWriteOrCreate → wasted predicate) AND missed in-place
mutations like APPEND on a permanent-decompressed value (which never
go through dbReplaceValue).

== Implementation ==

src/server.h:
- Renamed LOOKUP_READ_BYTES → LOOKUP_NO_BYTES, with the semantics flipped
  to opt-out. Default behavior is "read bytes (decompress on demand)";
  metadata-only lookups (TYPE, EXISTS, OBJECT ENCODING, TTL, PERSIST,
  TOUCH, DEBUG POPULATE, OBJECT/MEMORY USAGE, setKey existence check)
  pass LOOKUP_NO_BYTES to opt out.
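
As a rough illustration of the opt-out semantics combined with the
read/write split, the flag decoding might look like the sketch below.
The DEMO_ flag values and choose_decode() are hypothetical; the real
bit assignments live in src/server.h.

```c
#include <assert.h>

/* Placeholder bit values; not the real flag assignments. */
#define DEMO_LOOKUP_WRITE    (1 << 0)
#define DEMO_LOOKUP_NO_BYTES (1 << 1)

typedef enum {
    DECODE_NONE,           /* leave the value as it is */
    DECODE_TRANSIENT_VIEW, /* read path: temporary decompress */
    DECODE_PERMANENT       /* write path: decompress in place */
} decode_strategy_t;

decode_strategy_t choose_decode(int flags, int is_compressed) {
    assert(is_compressed == 0 || is_compressed == 1);
    if (!is_compressed) return DECODE_NONE;
    /* Metadata-only callers opt out, even in a write context. */
    if (flags & DEMO_LOOKUP_NO_BYTES) return DECODE_NONE;
    if (flags & DEMO_LOOKUP_WRITE) return DECODE_PERMANENT;
    /* Default: the caller wants the bytes, so decompress on demand. */
    return DECODE_TRANSIENT_VIEW;
}
```

With an opt-out flag, a caller that forgets to pass anything still gets
readable bytes; the cost of forgetting is an unnecessary decompress
rather than a caller seeing compressed data.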

src/compression.{c,h}:
- Side-map: hashtable.c primitive keyed by robj pointer (default
  pointer-bits hash + pointer-equality compare); lazy allocation;
  early-out when empty.
- compressionMaterializeTransientView(): read-path decompress +
  register + pin + flip.
- compressionPermanentlyDecompress(): write-path decompress + free
  buffer + decRef dict + flip; no side-map, no pin.
- transientViewActive(): O(1) presence check for the deferred-capture
  fix (Appendix E.7).
- compressionBeforeSleep(): iterate side-map, restore via
  pointer-equality stale check, free per entry.
- compressionEnqueueModified(): called from signalModifiedKey();
  no-op when feature disabled / key deleted / value ineligible.
- compressionShutdown(): drain side-map (treat as discard).
- testOnlyCompressionDrainTransientViewAsDiscard(),
  testOnlyCompressionTransientViewSize(): test-only helpers.
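
For readers unfamiliar with pointer-keyed side tables, the sketch below
shows the general shape: lazy allocation, early-out when empty, and a
presence check keyed on pointer identity. It is a toy fixed-size
open-addressing set, not the hashtable.c primitive the patch uses; every
name in it is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define VIEW_MAP_SLOTS 64 /* power of two; a real table would grow */

typedef struct {
    const void *slots[VIEW_MAP_SLOTS];
    size_t used;
} view_map_t;

static view_map_t *g_views = NULL; /* allocated on first insert */

/* Hash the pointer bits; fold in higher bits because the low bits of
 * heap pointers are mostly alignment zeros. */
static size_t ptr_hash(const void *p) {
    uintptr_t x = (uintptr_t)p;
    x ^= x >> 4;
    return (size_t)x & (VIEW_MAP_SLOTS - 1);
}

/* Returns 1 if obj is registered after the call, 0 on failure. */
int view_map_add(const void *obj) {
    assert(obj != NULL);
    if (g_views == NULL) {
        /* calloc zero-fills; assumes all-bits-zero is a null pointer. */
        g_views = calloc(1, sizeof(*g_views));
        if (g_views == NULL) return 0;
    }
    size_t i = ptr_hash(obj);
    while (g_views->slots[i] != NULL) {
        if (g_views->slots[i] == obj) return 1; /* already present */
        i = (i + 1) & (VIEW_MAP_SLOTS - 1);
    }
    /* Keep one slot empty so linear probing always terminates. */
    if (g_views->used == VIEW_MAP_SLOTS - 1) return 0;
    g_views->slots[i] = obj;
    g_views->used++;
    return 1;
}

/* Presence check: constant time when empty, short probe otherwise. */
int view_map_contains(const void *obj) {
    if (g_views == NULL || g_views->used == 0) return 0; /* early-out */
    size_t i = ptr_hash(obj);
    while (g_views->slots[i] != NULL) {
        if (g_views->slots[i] == obj) return 1;
        i = (i + 1) & (VIEW_MAP_SLOTS - 1);
    }
    return 0;
}

/* Drop every entry at once, as a drain at shutdown would. */
void view_map_drain(void) {
    free(g_views);
    g_views = NULL;
}
```

Keying on the object pointer rather than the key name means the check
does not depend on which database or key the object came from.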

src/db.c:
- lookupKey() splits decode strategy by LOOKUP_WRITE: write context
  → permanent decompress; read context → transient view.
- signalModifiedKey() now calls compressionEnqueueModified() as the
  third side effect (alongside touchWatchedKey + trackingInvalidateKey).
- Removed compressionEnqueueCandidate from dbAddInternal and
  dbSetValue (PR valkey-io#19's wiring); replaced by the signalModifiedKey hook.
- typeCommand, existsCommand, setKey existence check pass
  LOOKUP_NO_BYTES.

src/object.c, src/expire.c, src/debug.c:
- objectCommandLookup, expire/ttl/persist/touch commands, DEBUG POPULATE
  pass LOOKUP_NO_BYTES.

src/networking.c:
- isCopyAvoidPreferred returns 0 when transientViewActive(obj) is true
  (deferred-capture fix per Appendix E.7).

src/unit/test_compression_transient_view.cpp (new):
- 9 gtest cases covering side-map lifecycle, materialize state machine,
  permanent decompress side-effects, decoder-error recovery.

== Verified locally ==

- make -j2 -C src SERVER_CFLAGS=-Werror: clean build.
- ./runtest --single unit/type/compression: 10/10 pass.
- Manual smoke: server with --compression-enabled yes
  --compression-threads 2; PING/SET/GET/EXISTS/TYPE/OBJECT ENCODING/
  STRLEN/APPEND/TTL/TOUCH/DEL all work; clean shutdown.

== Not verified locally (CI gates) ==

- gtest unit tests (libgtest-dev not installed).
- clang-format-18.

== Diff stat ==

src/compression.c                            | 438 +++++++++++++++++++--
src/compression.h                            |  93 +++++
src/db.c                                     |  87 +++-
src/debug.c                                  |   5 +-
src/expire.c                                 |  16 +-
src/networking.c                             |  12 +
src/object.c                                 |   9 +-
src/server.h                                 |  26 +-
src/unit/test_compression_transient_view.cpp | 397 ++++++++++++++++++
9 files changed, 1037 insertions(+), 46 deletions(-)