Skip to content

Test#1

Closed
madolson wants to merge 3 commits into
unstablefrom
zunion
Closed

Test#1
madolson wants to merge 3 commits into
unstablefrom
zunion

Conversation

@madolson

Copy link
Copy Markdown
Member

No description provided.

antirez added 3 commits June 6, 2014 15:32
The user @kjmph provided excellent ideas to improve speed of ZUNIONSTORE
(in certain cases by many order of magnitude), together with an
implementation of the ideas.

While the ideas were sounding, the implementation could be improved both
in terms of speed and clearness, so that's my attempt at reimplementing
the speedup proposed, trying to improve by directly using just a
dictionary with an embedded score inside, and reusing the single-pass
aggregate + order-later approach.

Note that you can't apply this commit without applying the previous
commit in this branch that adds a double in the dictEntry value union.

Issue #1786.
@madolson madolson requested a review from PingXie March 22, 2024 03:01
@madolson madolson closed this Mar 22, 2024
@madolson madolson deleted the zunion branch April 4, 2024 19:15
naglera pushed a commit to naglera/placeholderkv that referenced this pull request Apr 8, 2024
…is missed cases to redis-server. (#12322)

Observed that the sanitizer reported memory leak as clean up is not done
before the process termination in negative/following cases:

**- when we passed '--invalid' as option to redis-server.**

```
 -vm:~/mem-leak-issue/redis$ ./src/redis-server --invalid

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'invalid'
Bad directive or wrong number of arguments

=================================================================
==865778==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7f0985f65867 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    valkey-io#1 0x558ec86686ec in ztrymalloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:117
    valkey-io#2 0x558ec86686ec in ztrymalloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:135
    valkey-io#3 0x558ec86686ec in ztryrealloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:276
    valkey-io#4 0x558ec86686ec in zrealloc /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:327
    valkey-io#5 0x558ec865dd7e in sdssplitargs /home/ubuntu/mem-leak-issue/redis/src/sds.c:1172
    valkey-io#6 0x558ec87a1be7 in loadServerConfigFromString /home/ubuntu/mem-leak-issue/redis/src/config.c:472
    valkey-io#7 0x558ec87a13b3 in loadServerConfig /home/ubuntu/mem-leak-issue/redis/src/config.c:718
    valkey-io#8 0x558ec85e6f15 in main /home/ubuntu/mem-leak-issue/redis/src/server.c:7258
    valkey-io#9 0x7f09856e5d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s).

```

**- when we pass '--port' as option and missed to add port number to redis-server.**

```
vm:~/mem-leak-issue/redis$ ./src/redis-server --port

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'port'
wrong number of arguments

=================================================================
==865846==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7fdcdbb1f867 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    valkey-io#1 0x557e8b04f6ec in ztrymalloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:117
    valkey-io#2 0x557e8b04f6ec in ztrymalloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:135
    valkey-io#3 0x557e8b04f6ec in ztryrealloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:276
    valkey-io#4 0x557e8b04f6ec in zrealloc /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:327
    valkey-io#5 0x557e8b044d7e in sdssplitargs /home/ubuntu/mem-leak-issue/redis/src/sds.c:1172
    valkey-io#6 0x557e8b188be7 in loadServerConfigFromString /home/ubuntu/mem-leak-issue/redis/src/config.c:472
    valkey-io#7 0x557e8b1883b3 in loadServerConfig /home/ubuntu/mem-leak-issue/redis/src/config.c:718
    valkey-io#8 0x557e8afcdf15 in main /home/ubuntu/mem-leak-issue/redis/src/server.c:7258
    valkey-io#9 0x7fdcdb29fd8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Indirect leak of 10 byte(s) in 1 object(s) allocated from:
    #0 0x7fdcdbb1fc18 in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:164
    valkey-io#1 0x557e8b04f9aa in ztryrealloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:287
    valkey-io#2 0x557e8b04f9aa in ztryrealloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:317
    valkey-io#3 0x557e8b04f9aa in zrealloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:342
    valkey-io#4 0x557e8b033f90 in _sdsMakeRoomFor /home/ubuntu/mem-leak-issue/redis/src/sds.c:271
    valkey-io#5 0x557e8b033f90 in sdsMakeRoomFor /home/ubuntu/mem-leak-issue/redis/src/sds.c:295
    valkey-io#6 0x557e8b033f90 in sdscatlen /home/ubuntu/mem-leak-issue/redis/src/sds.c:486
    valkey-io#7 0x557e8b044e1f in sdssplitargs /home/ubuntu/mem-leak-issue/redis/src/sds.c:1165
    valkey-io#8 0x557e8b188be7 in loadServerConfigFromString /home/ubuntu/mem-leak-issue/redis/src/config.c:472
    valkey-io#9 0x557e8b1883b3 in loadServerConfig /home/ubuntu/mem-leak-issue/redis/src/config.c:718
    valkey-io#10 0x557e8afcdf15 in main /home/ubuntu/mem-leak-issue/redis/src/server.c:7258
    valkey-io#11 0x7fdcdb29fd8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 18 byte(s) leaked in 2 allocation(s).

```

As part analysis found that the sdsfreesplitres is not called when this condition checks are being hit.

Output after the fix:


```
vm:~/mem-leak-issue/redis$ ./src/redis-server --invalid

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'invalid'
Bad directive or wrong number of arguments
vm:~/mem-leak-issue/redis$

===========================================
vm:~/mem-leak-issue/redis$ ./src/redis-server --jdhg

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'jdhg'
Bad directive or wrong number of arguments

---------------------------------------------------------------------------
vm:~/mem-leak-issue/redis$ ./src/redis-server --port

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'port'
wrong number of arguments
```

Co-authored-by: Oran Agra <oran@redislabs.com>
naglera pushed a commit to naglera/placeholderkv that referenced this pull request Apr 8, 2024
## Issues and solutions from #12817
1. Touch ProcessingEventsWhileBlocked and calling moduleCount() without
GIL in afterSleep()
    - Introduced: 
       Version: 7.0.0
       PR: #9963

   - Harm Level: Very High
If the module thread calls `RM_Yield()` before the main thread enters
afterSleep(),
and modifies `ProcessingEventsWhileBlocked`(+1), it will cause the main
thread to not wait for GIL,
which can lead to all kinds of unforeseen problems, including memory
data corruption.

   - Initial / Abandoned Solution:
      * Added `__thread` specifier for ProcessingEventsWhileBlocked.
`ProcessingEventsWhileBlocked` is used to protect against nested event
processing, but event processing
in the main thread and module threads should be completely independent
and unaffected, so it is safer
         to use TLS.
* Adding a cached module count to keep track of the current number of
modules, to avoid having to use `dictSize()`.
    
    - Related Warnings:
```
WARNING: ThreadSanitizer: data race (pid=1136)
  Write of size 4 at 0x0001045990c0 by thread T4 (mutexes: write M0):
    #0 processEventsWhileBlocked networking.c:4135 (redis-server:arm64+0x10006d124)
    valkey-io#1 RM_Yield module.c:2410 (redis-server:arm64+0x10018b66c)
    valkey-io#2 bg_call_worker <null>:83232836 (blockedclient.so:arm64+0x16a8)

  Previous read of size 4 at 0x0001045990c0 by main thread:
    #0 afterSleep server.c:1861 (redis-server:arm64+0x100024f98)
    valkey-io#1 aeProcessEvents ae.c:408 (redis-server:arm64+0x10000fd64)
    valkey-io#2 aeMain ae.c:496 (redis-server:arm64+0x100010f0c)
    valkey-io#3 main server.c:7220 (redis-server:arm64+0x10003f38c)
```

2. aeApiPoll() is not thread-safe
When using RM_Yield to handle events in a module thread, if the main
thread has not yet
entered `afterSleep()`, both the module thread and the main thread may
touch `server.el` at the same time.

    - Introduced: 
       Version: 7.0.0
       PR: #9963

   - Old / Abandoned Solution:
Adding a new mutex to protect timing between after beforeSleep() and
before afterSleep().
Defect: If the main thread enters the ae loop without any IO events, it
will wait until
the next timeout or until there is any event again, and the module
thread will
always hang until the main thread leaves the event loop.

    - Related Warnings:
```
SUMMARY: ThreadSanitizer: data race ae_kqueue.c:55 in addEventMask
==================
==================
WARNING: ThreadSanitizer: data race (pid=14682)
  Write of size 4 at 0x000100b54000 by thread T9 (mutexes: write M0):
    #0 aeApiPoll ae_kqueue.c:175 (redis-server:arm64+0x100010588)
    valkey-io#1 aeProcessEvents ae.c:399 (redis-server:arm64+0x10000fb84)
    valkey-io#2 processEventsWhileBlocked networking.c:4138 (redis-server:arm64+0x10006d3c4)
    valkey-io#3 RM_Yield module.c:2410 (redis-server:arm64+0x10018b66c)
    valkey-io#4 bg_call_worker <null>:16042052 (blockedclient.so:arm64+0x169c)

  Previous write of size 4 at 0x000100b54000 by main thread:
    #0 aeApiPoll ae_kqueue.c:175 (redis-server:arm64+0x100010588)
    valkey-io#1 aeProcessEvents ae.c:399 (redis-server:arm64+0x10000fb84)
    valkey-io#2 aeMain ae.c:496 (redis-server:arm64+0x100010da8)
    valkey-io#3 main server.c:7238 (redis-server:arm64+0x10003f51c)
```

## The final fix as the comments:
redis/redis#12817 (comment)
Optimized solution based on the above comment:

First, we add `module_gil_acquring` to indicate whether the main thread
is currently in the acquiring GIL state.

When the module thread starts to yield, there are two possibilities(we
assume the caller keeps the GIL):
1. The main thread is in the mid of beforeSleep() and afterSleep(), that
is, `module_gil_acquring` is not 1 now.
At this point, the module thread will wake up the main thread through
the pipe and leave the yield,
waiting for the next yield when the main thread may already in the
acquiring GIL state.
    
2. The main thread is in the acquiring GIL state.
The module thread release the GIL, yielding CPU to give the main thread
an opportunity to start
event processing, and then acquire the GIL again until the main thread
releases it.
This is what
redis/redis#12817 (comment)
mentioned direction.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
@hpatro hpatro mentioned this pull request Jun 24, 2024
enjoy-binbin added a commit that referenced this pull request Aug 14, 2024
We are updating this variable in the main thread, and the
child threads can printing the logs at the same time. This
generating a warning in SANITIZER=thread:
```
WARNING: ThreadSanitizer: data race (pid=74208)
  Read of size 4 at 0x000102875c10 by thread T3:
    #0 serverLogRaw <null>:52173615 (valkey-server:x86_64+0x10003c556)
    #1 _serverLog <null>:52173615 (valkey-server:x86_64+0x10003ca89)
    #2 bioProcessBackgroundJobs <null>:52173615 (valkey-server:x86_64+0x1001402c9)

  Previous write of size 4 at 0x000102875c10 by main thread (mutexes: write M0):
    #0 afterSleep <null>:52173615 (valkey-server:x86_64+0x10004989b)
    #1 aeProcessEvents <null>:52173615 (valkey-server:x86_64+0x100031e52)
    #2 main <null>:52173615 (valkey-server:x86_64+0x100064a3c)
    #3 start <null>:52173615 (dyld:x86_64+0xfffffffffff5c365)
    #4 start <null>:52173615 (dyld:x86_64+0xfffffffffff5c365)
```

The refresh of daylight_active is not real time, we update
it in aftersleep, so we don't need a strong synchronization,
so using memory_order_relaxed. But also noted we are doing
load/store operations only for daylight_active, which is an
aligned 32-bit integer, so using memory_order_relaxed will
not provide more consistency than what we have today.

So this is just a cleanup that to clear the warning.

Signed-off-by: Binbin <binloveplay1314@qq.com>
mapleFU pushed a commit to mapleFU/valkey that referenced this pull request Aug 21, 2024
We are updating this variable in the main thread, and the
child threads can printing the logs at the same time. This
generating a warning in SANITIZER=thread:
```
WARNING: ThreadSanitizer: data race (pid=74208)
  Read of size 4 at 0x000102875c10 by thread T3:
    #0 serverLogRaw <null>:52173615 (valkey-server:x86_64+0x10003c556)
    valkey-io#1 _serverLog <null>:52173615 (valkey-server:x86_64+0x10003ca89)
    valkey-io#2 bioProcessBackgroundJobs <null>:52173615 (valkey-server:x86_64+0x1001402c9)

  Previous write of size 4 at 0x000102875c10 by main thread (mutexes: write M0):
    #0 afterSleep <null>:52173615 (valkey-server:x86_64+0x10004989b)
    valkey-io#1 aeProcessEvents <null>:52173615 (valkey-server:x86_64+0x100031e52)
    valkey-io#2 main <null>:52173615 (valkey-server:x86_64+0x100064a3c)
    valkey-io#3 start <null>:52173615 (dyld:x86_64+0xfffffffffff5c365)
    valkey-io#4 start <null>:52173615 (dyld:x86_64+0xfffffffffff5c365)
```

The refresh of daylight_active is not real time, we update
it in aftersleep, so we don't need a strong synchronization,
so using memory_order_relaxed. But also noted we are doing
load/store operations only for daylight_active, which is an
aligned 32-bit integer, so using memory_order_relaxed will
not provide more consistency than what we have today.

So this is just a cleanup that to clear the warning.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: mwish <maplewish117@gmail.com>
mapleFU pushed a commit to mapleFU/valkey that referenced this pull request Aug 22, 2024
We are updating this variable in the main thread, and the
child threads can printing the logs at the same time. This
generating a warning in SANITIZER=thread:
```
WARNING: ThreadSanitizer: data race (pid=74208)
  Read of size 4 at 0x000102875c10 by thread T3:
    #0 serverLogRaw <null>:52173615 (valkey-server:x86_64+0x10003c556)
    valkey-io#1 _serverLog <null>:52173615 (valkey-server:x86_64+0x10003ca89)
    valkey-io#2 bioProcessBackgroundJobs <null>:52173615 (valkey-server:x86_64+0x1001402c9)

  Previous write of size 4 at 0x000102875c10 by main thread (mutexes: write M0):
    #0 afterSleep <null>:52173615 (valkey-server:x86_64+0x10004989b)
    valkey-io#1 aeProcessEvents <null>:52173615 (valkey-server:x86_64+0x100031e52)
    valkey-io#2 main <null>:52173615 (valkey-server:x86_64+0x100064a3c)
    valkey-io#3 start <null>:52173615 (dyld:x86_64+0xfffffffffff5c365)
    valkey-io#4 start <null>:52173615 (dyld:x86_64+0xfffffffffff5c365)
```

The refresh of daylight_active is not real time, we update
it in aftersleep, so we don't need a strong synchronization,
so using memory_order_relaxed. But also noted we are doing
load/store operations only for daylight_active, which is an
aligned 32-bit integer, so using memory_order_relaxed will
not provide more consistency than what we have today.

So this is just a cleanup that to clear the warning.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: mwish <maplewish117@gmail.com>
madolson pushed a commit that referenced this pull request Sep 2, 2024
We are updating this variable in the main thread, and the
child threads can printing the logs at the same time. This
generating a warning in SANITIZER=thread:
```
WARNING: ThreadSanitizer: data race (pid=74208)
  Read of size 4 at 0x000102875c10 by thread T3:
    #0 serverLogRaw <null>:52173615 (valkey-server:x86_64+0x10003c556)
    #1 _serverLog <null>:52173615 (valkey-server:x86_64+0x10003ca89)
    #2 bioProcessBackgroundJobs <null>:52173615 (valkey-server:x86_64+0x1001402c9)

  Previous write of size 4 at 0x000102875c10 by main thread (mutexes: write M0):
    #0 afterSleep <null>:52173615 (valkey-server:x86_64+0x10004989b)
    #1 aeProcessEvents <null>:52173615 (valkey-server:x86_64+0x100031e52)
    #2 main <null>:52173615 (valkey-server:x86_64+0x100064a3c)
    #3 start <null>:52173615 (dyld:x86_64+0xfffffffffff5c365)
    #4 start <null>:52173615 (dyld:x86_64+0xfffffffffff5c365)
```

The refresh of daylight_active is not real time, we update
it in aftersleep, so we don't need a strong synchronization,
so using memory_order_relaxed. But also noted we are doing
load/store operations only for daylight_active, which is an
aligned 32-bit integer, so using memory_order_relaxed will
not provide more consistency than what we have today.

So this is just a cleanup that to clear the warning.

Signed-off-by: Binbin <binloveplay1314@qq.com>
madolson pushed a commit that referenced this pull request Sep 3, 2024
We are updating this variable in the main thread, and the
child threads can printing the logs at the same time. This
generating a warning in SANITIZER=thread:
```
WARNING: ThreadSanitizer: data race (pid=74208)
  Read of size 4 at 0x000102875c10 by thread T3:
    #0 serverLogRaw <null>:52173615 (valkey-server:x86_64+0x10003c556)
    #1 _serverLog <null>:52173615 (valkey-server:x86_64+0x10003ca89)
    #2 bioProcessBackgroundJobs <null>:52173615 (valkey-server:x86_64+0x1001402c9)

  Previous write of size 4 at 0x000102875c10 by main thread (mutexes: write M0):
    #0 afterSleep <null>:52173615 (valkey-server:x86_64+0x10004989b)
    #1 aeProcessEvents <null>:52173615 (valkey-server:x86_64+0x100031e52)
    #2 main <null>:52173615 (valkey-server:x86_64+0x100064a3c)
    #3 start <null>:52173615 (dyld:x86_64+0xfffffffffff5c365)
    #4 start <null>:52173615 (dyld:x86_64+0xfffffffffff5c365)
```

The refresh of daylight_active is not real time, we update
it in aftersleep, so we don't need a strong synchronization,
so using memory_order_relaxed. But also noted we are doing
load/store operations only for daylight_active, which is an
aligned 32-bit integer, so using memory_order_relaxed will
not provide more consistency than what we have today.

So this is just a cleanup that to clear the warning.

Signed-off-by: Binbin <binloveplay1314@qq.com>
zuiderkwast pushed a commit that referenced this pull request Feb 10, 2025
Fix new unittest networking use-after-free error


```
==96611==ERROR: AddressSanitizer: heap-use-after-free on address 0x503000075e00 at pc 0x55e52cbe1495 bp 0x7ffd9e1fc690 sp 0x7ffd9e1fc688
READ of size 8 at 0x503000075e00 thread T0
    #0 0x55e52cbe[149](https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457#step:10:150)4 in freeReplicaReferencedReplBuffer /home/runner/work/valkey/valkey/src/replication.c:401:27
    #1 0x55e52cbe7abf in freeClientReplicationData /home/runner/work/valkey/valkey/src/replication.c:1261:5
    #2 0x55e52cb17a44 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:188:5
    #3 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    #4 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    #5 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #6 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #7 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)

0x503000075e00 is located 16 bytes inside of 24-byte region [0x503000075df0,0x503000075e08)
freed by thread T0 here:
    #0 0x55e52ca50a7a in free (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x212a7a) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
    #1 0x55e52cb905ba in listEmpty /home/runner/work/valkey/valkey/src/adlist.c:64:9
    #2 0x55e52cb179e5 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:179:9
    #3 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    #4 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    #5 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #6 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #7 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)

previously allocated by thread T0 here:
    #0 0x55e52ca50d13 in malloc (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x212d13) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
    #1 0x55e52cbb844f in ztrymalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:[155](https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457#step:10:156):17
    #2 0x55e52cbb844f in valkey_malloc /home/runner/work/valkey/valkey/src/zmalloc.c:184:17
    #3 0x55e52cb90be6 in listAddNodeTail /home/runner/work/valkey/valkey/src/adlist.c:126:17
    #4 0x55e52cb17873 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:167:9
    #5 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    #6 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    #7 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #8 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #9 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
```


https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
xbasel pushed a commit to xbasel/valkey that referenced this pull request Mar 27, 2025
Fix new unittest networking use-after-free error


```
==96611==ERROR: AddressSanitizer: heap-use-after-free on address 0x503000075e00 at pc 0x55e52cbe1495 bp 0x7ffd9e1fc690 sp 0x7ffd9e1fc688
READ of size 8 at 0x503000075e00 thread T0
    #0 0x55e52cbe[149](https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457#step:10:150)4 in freeReplicaReferencedReplBuffer /home/runner/work/valkey/valkey/src/replication.c:401:27
    valkey-io#1 0x55e52cbe7abf in freeClientReplicationData /home/runner/work/valkey/valkey/src/replication.c:1261:5
    valkey-io#2 0x55e52cb17a44 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:188:5
    valkey-io#3 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    valkey-io#4 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    valkey-io#5 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#6 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#7 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)

0x503000075e00 is located 16 bytes inside of 24-byte region [0x503000075df0,0x503000075e08)
freed by thread T0 here:
    #0 0x55e52ca50a7a in free (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x212a7a) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
    valkey-io#1 0x55e52cb905ba in listEmpty /home/runner/work/valkey/valkey/src/adlist.c:64:9
    valkey-io#2 0x55e52cb179e5 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:179:9
    valkey-io#3 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    valkey-io#4 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    valkey-io#5 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#6 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#7 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)

previously allocated by thread T0 here:
    #0 0x55e52ca50d13 in malloc (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x212d13) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
    valkey-io#1 0x55e52cbb844f in ztrymalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:[155](https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457#step:10:156):17
    valkey-io#2 0x55e52cbb844f in valkey_malloc /home/runner/work/valkey/valkey/src/zmalloc.c:184:17
    valkey-io#3 0x55e52cb90be6 in listAddNodeTail /home/runner/work/valkey/valkey/src/adlist.c:126:17
    valkey-io#4 0x55e52cb17873 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:167:9
    valkey-io#5 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    valkey-io#6 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    valkey-io#7 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#8 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#9 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
```


https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
xbasel pushed a commit to xbasel/valkey that referenced this pull request Mar 27, 2025
Fix new unittest networking use-after-free error


```
==96611==ERROR: AddressSanitizer: heap-use-after-free on address 0x503000075e00 at pc 0x55e52cbe1495 bp 0x7ffd9e1fc690 sp 0x7ffd9e1fc688
READ of size 8 at 0x503000075e00 thread T0
    #0 0x55e52cbe[149](https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457#step:10:150)4 in freeReplicaReferencedReplBuffer /home/runner/work/valkey/valkey/src/replication.c:401:27
    valkey-io#1 0x55e52cbe7abf in freeClientReplicationData /home/runner/work/valkey/valkey/src/replication.c:1261:5
    valkey-io#2 0x55e52cb17a44 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:188:5
    valkey-io#3 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    valkey-io#4 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    valkey-io#5 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#6 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#7 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)

0x503000075e00 is located 16 bytes inside of 24-byte region [0x503000075df0,0x503000075e08)
freed by thread T0 here:
    #0 0x55e52ca50a7a in free (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x212a7a) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
    valkey-io#1 0x55e52cb905ba in listEmpty /home/runner/work/valkey/valkey/src/adlist.c:64:9
    valkey-io#2 0x55e52cb179e5 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:179:9
    valkey-io#3 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    valkey-io#4 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    valkey-io#5 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#6 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#7 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)

previously allocated by thread T0 here:
    #0 0x55e52ca50d13 in malloc (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x212d13) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
    valkey-io#1 0x55e52cbb844f in ztrymalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:[155](https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457#step:10:156):17
    valkey-io#2 0x55e52cbb844f in valkey_malloc /home/runner/work/valkey/valkey/src/zmalloc.c:184:17
    valkey-io#3 0x55e52cb90be6 in listAddNodeTail /home/runner/work/valkey/valkey/src/adlist.c:126:17
    valkey-io#4 0x55e52cb17873 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:167:9
    valkey-io#5 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    valkey-io#6 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    valkey-io#7 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#8 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#9 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
```


https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
murphyjacob4 pushed a commit to enjoy-binbin/valkey that referenced this pull request Apr 13, 2025
Fix new unittest networking use-after-free error


```
==96611==ERROR: AddressSanitizer: heap-use-after-free on address 0x503000075e00 at pc 0x55e52cbe1495 bp 0x7ffd9e1fc690 sp 0x7ffd9e1fc688
READ of size 8 at 0x503000075e00 thread T0
    #0 0x55e52cbe[149](https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457#step:10:150)4 in freeReplicaReferencedReplBuffer /home/runner/work/valkey/valkey/src/replication.c:401:27
    valkey-io#1 0x55e52cbe7abf in freeClientReplicationData /home/runner/work/valkey/valkey/src/replication.c:1261:5
    valkey-io#2 0x55e52cb17a44 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:188:5
    valkey-io#3 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    valkey-io#4 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    valkey-io#5 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#6 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#7 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)

0x503000075e00 is located 16 bytes inside of 24-byte region [0x503000075df0,0x503000075e08)
freed by thread T0 here:
    #0 0x55e52ca50a7a in free (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x212a7a) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
    valkey-io#1 0x55e52cb905ba in listEmpty /home/runner/work/valkey/valkey/src/adlist.c:64:9
    valkey-io#2 0x55e52cb179e5 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:179:9
    valkey-io#3 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    valkey-io#4 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    valkey-io#5 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#6 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#7 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)

previously allocated by thread T0 here:
    #0 0x55e52ca50d13 in malloc (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x212d13) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
    valkey-io#1 0x55e52cbb844f in ztrymalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:[155](https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457#step:10:156):17
    valkey-io#2 0x55e52cbb844f in valkey_malloc /home/runner/work/valkey/valkey/src/zmalloc.c:184:17
    valkey-io#3 0x55e52cb90be6 in listAddNodeTail /home/runner/work/valkey/valkey/src/adlist.c:126:17
    valkey-io#4 0x55e52cb17873 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:167:9
    valkey-io#5 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    valkey-io#6 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    valkey-io#7 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#8 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    valkey-io#9 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
```


https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
xbasel added a commit to xbasel/valkey that referenced this pull request May 21, 2025
* add tests

Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>

* fix a bug - return on error

Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>

* disable failing tests

Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>

* rmeove redundant test

Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>

* Update tests/unit/hashexpire.tcl

---------

Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>
Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
hwware pushed a commit that referenced this pull request Jun 18, 2025
When calling the command `EVAL error{} 0`, Valkey crashes with the
following stack trace. This patch ensures we never leave the
`err_info.msg` field null when we fail to extract a proper error
message.

```
=== VALKEY BUG REPORT START: Cut & paste starting from here ===
2595901:M 18 Jun 2025 01:20:12.917 # valkey 8.1.2 crashed by signal: 11, si_code: 1
2595901:M 18 Jun 2025 01:20:12.917 # Accessing address: (nil)
2595901:M 18 Jun 2025 01:20:12.917 # Crashed running the instruction at: 0x726f8e57ed1d

------ STACK TRACE ------
EIP:
/usr/lib/libc.so.6(+0x16ed1d) [0x726f8e57ed1d]

2595905 bio_aof
/usr/lib/libc.so.6(+0x9de22) [0x726f8e4ade22]
/usr/lib/libc.so.6(+0x91fda) [0x726f8e4a1fda]
/usr/lib/libc.so.6(+0x9264c) [0x726f8e4a264c]
/usr/lib/libc.so.6(pthread_cond_wait+0x14e) [0x726f8e4a4d1e]
valkey-server *:6379(bioProcessBackgroundJobs+0x1b4) [0x6530abb46db4]
/usr/lib/libc.so.6(+0x957eb) [0x726f8e4a57eb]
/usr/lib/libc.so.6(+0x11918c) [0x726f8e52918c]

2595904 bio_close_file
/usr/lib/libc.so.6(+0x9de22) [0x726f8e4ade22]
/usr/lib/libc.so.6(+0x91fda) [0x726f8e4a1fda]
/usr/lib/libc.so.6(+0x9264c) [0x726f8e4a264c]
/usr/lib/libc.so.6(pthread_cond_wait+0x14e) [0x726f8e4a4d1e]
valkey-server *:6379(bioProcessBackgroundJobs+0x1b4) [0x6530abb46db4]
/usr/lib/libc.so.6(+0x957eb) [0x726f8e4a57eb]
/usr/lib/libc.so.6(+0x11918c) [0x726f8e52918c]

2595901 valkey-server *
/usr/lib/libc.so.6(+0x3def0) [0x726f8e44def0]
/usr/lib/libc.so.6(+0x16ed1d) [0x726f8e57ed1d]
valkey-server *:6379(sdscatfmt+0x894) [0x6530abaa24a4]
valkey-server *:6379(luaCallFunction+0x39a) [0x6530abbc66ea]
valkey-server *:6379(+0x1a0992) [0x6530abbc6992]
valkey-server *:6379(scriptingEngineCallFunction+0x98) [0x6530abbc1298]
valkey-server *:6379(+0x11ff55) [0x6530abb45f55]
valkey-server *:6379(call+0x174) [0x6530aba94454]
valkey-server *:6379(processCommand+0x93d) [0x6530aba958dd]
valkey-server *:6379(processCommandAndResetClient+0x21) [0x6530abaa9d11]
valkey-server *:6379(processInputBuffer+0xe3) [0x6530abaaee83]
valkey-server *:6379(readQueryFromClient+0x65) [0x6530abaaef55]
valkey-server *:6379(+0x18e31a) [0x6530abbb431a]
valkey-server *:6379(aeProcessEvents+0x24a) [0x6530aba790ca]
valkey-server *:6379(aeMain+0x2d) [0x6530aba7938d]
valkey-server *:6379(main+0x3f6) [0x6530aba6e7b6]
/usr/lib/libc.so.6(+0x276b5) [0x726f8e4376b5]
/usr/lib/libc.so.6(__libc_start_main+0x89) [0x726f8e437769]
valkey-server *:6379(_start+0x25) [0x6530aba70235]

2595906 bio_lazy_free
/usr/lib/libc.so.6(+0x9de22) [0x726f8e4ade22]
/usr/lib/libc.so.6(+0x91fda) [0x726f8e4a1fda]
/usr/lib/libc.so.6(+0x9264c) [0x726f8e4a264c]
/usr/lib/libc.so.6(pthread_cond_wait+0x14e) [0x726f8e4a4d1e]
valkey-server *:6379(bioProcessBackgroundJobs+0x1b4) [0x6530abb46db4]
/usr/lib/libc.so.6(+0x957eb) [0x726f8e4a57eb]
/usr/lib/libc.so.6(+0x11918c) [0x726f8e52918c]

4/4 expected stacktraces.

------ STACK TRACE DONE ------

------ REGISTERS ------
2595901:M 18 Jun 2025 01:20:12.920 # 
RAX:0000000000000000 RBX:0000726f8dd35663
RCX:0000000000000000 RDX:0000000000000000
RDI:0000000000000000 RSI:0000000000000010
RBP:00007ffc2b821a80 RSP:00007ffc2b821938
R8 :000000000000000c R9 :00006530abc111b8
R10:0000000000000001 R11:0000000000000003
R12:00006530abc49adc R13:00006530abc111b7
R14:0000000000000001 R15:0000000000000001
RIP:0000726f8e57ed1d EFL:0000000000010283
CSGSFS:002b000000000033
2595901:M 18 Jun 2025 01:20:12.921 * hide-user-data-from-log is on, skip logging stack content to avoid spilling user data.

------ INFO OUTPUT ------
# Server
redis_version:7.2.4
server_name:valkey
valkey_version:8.1.2
valkey_release_stage:ga
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:38d65aa7b4148d2c
server_mode:standalone
os:Linux 6.14.6-arch1-1 x86_64
arch_bits:64
monotonic_clock:POSIX clock_gettime
multiplexing_api:epoll
gcc_version:15.1.1
process_id:2595901
process_supervised:no
run_id:a0b75f67a217a81142f17553028c010e86c1ee80
tcp_port:6379
server_time_usec:1750209612917634
uptime_in_seconds:16
uptime_in_days:0
hz:10
configured_hz:10
clients_hz:10
lru_clock:5379148
executable:/home/fusl/valkey-server
config_file:
io_threads_active:0
availability_zone:
listener0:name=tcp,bind=*,bind=-::*,port=6379

# Clients
connected_clients:1
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:0
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
pubsub_clients:0
watching_clients:0
clients_in_timeout_table:0
total_watched_keys:0
total_blocking_keys:0
total_blocking_keys_on_nokey:0
paused_reason:none
paused_actions:none
paused_timeout_milliseconds:0

# Memory
used_memory:911824
used_memory_human:890.45K
used_memory_rss:15323136
used_memory_rss_human:14.61M
used_memory_peak:911824
used_memory_peak_human:890.45K
used_memory_peak_perc:100.29%
used_memory_overhead:892232
used_memory_startup:891824
used_memory_dataset:19592
used_memory_dataset_perc:97.96%
allocator_allocated:1845952
allocator_active:1986560
allocator_resident:6672384
allocator_muzzy:0
total_system_memory:67323842560
total_system_memory_human:62.70G
used_memory_lua:34816
used_memory_vm_eval:34816
used_memory_lua_human:34.00K
used_memory_scripts_eval:184
number_of_cached_scripts:1
number_of_functions:0
number_of_libraries:0
used_memory_vm_functions:33792
used_memory_vm_total:68608
used_memory_vm_total_human:67.00K
used_memory_functions:224
used_memory_scripts:408
used_memory_scripts_human:408B
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.00
allocator_frag_bytes:0
allocator_rss_ratio:3.36
allocator_rss_bytes:4685824
rss_overhead_ratio:2.30
rss_overhead_bytes:8650752
mem_fragmentation_ratio:17.18
mem_fragmentation_bytes:14431168
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_total_replication_buffers:0
mem_clients_slaves:0
mem_clients_normal:0
mem_cluster_links:0
mem_aof_buffer:0
mem_allocator:jemalloc-5.3.0
mem_overhead_db_hashtable_rehashing:0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0

# Persistence
loading:0
async_loading:0
current_cow_peak:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1750209596
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:-1
rdb_current_bgsave_time_sec:-1
rdb_saves:0
rdb_last_cow_size:0
rdb_last_load_keys_expired:0
rdb_last_load_keys_loaded:0
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_rewrites:0
aof_rewrites_consecutive_failures:0
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0

# Stats
total_connections_received:1
total_commands_processed:0
instantaneous_ops_per_sec:0
total_net_input_bytes:34
total_net_output_bytes:0
total_net_repl_input_bytes:0
total_net_repl_output_bytes:0
instantaneous_input_kbps:0.00
instantaneous_output_kbps:0.00
instantaneous_input_repl_kbps:0.00
instantaneous_output_repl_kbps:0.00
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
expired_stale_perc:0.00
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:0
evicted_keys:0
evicted_clients:0
evicted_scripts:0
total_eviction_exceeded_time:0
current_eviction_exceeded_time:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
pubsubshard_channels:0
latest_fork_usec:0
total_forks:0
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
total_active_defrag_time:0
current_active_defrag_time:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:0
dump_payload_sanitizations:0
total_reads_processed:1
total_writes_processed:0
io_threaded_reads_processed:0
io_threaded_writes_processed:0
io_threaded_freed_objects:0
io_threaded_accept_processed:0
io_threaded_poll_processed:0
io_threaded_total_prefetch_batches:0
io_threaded_total_prefetch_entries:0
client_query_buffer_limit_disconnections:0
client_output_buffer_limit_disconnections:0
reply_buffer_shrinks:0
reply_buffer_expands:0
eventloop_cycles:170
eventloop_duration_sum:17739
eventloop_duration_cmd_sum:0
instantaneous_eventloop_cycles_per_sec:9
instantaneous_eventloop_duration_usec:99
acl_access_denied_auth:0
acl_access_denied_cmd:0
acl_access_denied_key:0
acl_access_denied_channel:0

# Replication
role:master
connected_slaves:0
replicas_waiting_psync:0
master_failover_state:no-failover
master_replid:d35a0bb7979f490a60174bb363524431d7eb2428
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:10485760
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

# CPU
used_cpu_sys:0.012543
used_cpu_user:0.016853
used_cpu_sys_children:0.000000
used_cpu_user_children:0.000000
used_cpu_sys_main_thread:0.012440
used_cpu_user_main_thread:0.016714

# Modules

# Commandstats

# Errorstats

# Latencystats

# Cluster
cluster_enabled:0

# Keyspace

------ CLIENT LIST OUTPUT ------
id=2 addr=127.0.0.1:41372 laddr=127.0.0.1:6379 fd=10 name=*redacted* age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=12 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=17060 events=r cmd=eval user=*redacted* redir=-1 resp=2 lib-name= lib-ver= tot-net-in=34 tot-net-out=0 tot-cmds=0

------ CURRENT CLIENT INFO ------
id=2 addr=127.0.0.1:41372 laddr=127.0.0.1:6379 fd=10 name=*redacted* age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=12 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=17060 events=r cmd=eval user=*redacted* redir=-1 resp=2 lib-name= lib-ver= tot-net-in=34 tot-net-out=0 tot-cmds=0
argc: 3
argv[0]: "eval"
argv[1]: 7 bytes
argv[2]: 1 bytes

------ EXECUTING CLIENT INFO ------
id=2 addr=127.0.0.1:41372 laddr=127.0.0.1:6379 fd=10 name=*redacted* age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=12 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=17060 events=r cmd=eval user=*redacted* redir=-1 resp=2 lib-name= lib-ver= tot-net-in=34 tot-net-out=0 tot-cmds=0
argc: 3
argv[0]: "eval"
argv[1]: 7 bytes
argv[2]: 1 bytes

------ MODULES INFO OUTPUT ------

------ CONFIG DEBUG OUTPUT ------
repl-diskless-load disabled
debug-context ""
sanitize-dump-payload no
lazyfree-lazy-user-del yes
lazyfree-lazy-server-del yes
import-mode no
lazyfree-lazy-user-flush yes
list-compress-depth 0
dual-channel-replication-enabled no
repl-diskless-sync yes
activedefrag no
lazyfree-lazy-expire yes
io-threads 1
replica-read-only yes
client-query-buffer-limit 1gb
slave-read-only yes
lazyfree-lazy-eviction yes
proto-max-bulk-len 512mb

------ FAST MEMORY TEST ------
2595901:M 18 Jun 2025 01:20:12.921 # Bio worker thread #0 terminated
2595901:M 18 Jun 2025 01:20:12.921 # Bio worker thread #1 terminated
2595901:M 18 Jun 2025 01:20:12.921 # Bio worker thread #2 terminated
*** Preparing to test memory region 6530abce2000 (212992 bytes)
*** Preparing to test memory region 726f8af7f000 (2621440 bytes)
*** Preparing to test memory region 726f8b200000 (8388608 bytes)
*** Preparing to test memory region 726f8ba00000 (4194304 bytes)
*** Preparing to test memory region 726f8bffe000 (8388608 bytes)
*** Preparing to test memory region 726f8c7ff000 (8388608 bytes)
*** Preparing to test memory region 726f8d000000 (8388608 bytes)
*** Preparing to test memory region 726f8dc00000 (4194304 bytes)
*** Preparing to test memory region 726f8e290000 (16384 bytes)
*** Preparing to test memory region 726f8e3d2000 (20480 bytes)
*** Preparing to test memory region 726f8e5f8000 (32768 bytes)
*** Preparing to test memory region 726f8eb58000 (12288 bytes)
*** Preparing to test memory region 726f8eb5c000 (16384 bytes)
*** Preparing to test memory region 726f8ed63000 (4096 bytes)
*** Preparing to test memory region 726f8eef2000 (397312 bytes)
*** Preparing to test memory region 726f8efc7000 (4096 bytes)
.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O
Fast memory test PASSED, however your memory can still be broken. Please run a memory test for several hours if possible.

------ DUMPING CODE AROUND EIP ------
Symbol: (null) (base: (nil))
Module: /usr/lib/libc.so.6 (base 0x726f8e410000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=(nil) -D -b binary -m i386:x86-64 /tmp/dump.bin
------

=== VALKEY BUG REPORT END. Make sure to include from START to END. ===
```

---------

Signed-off-by: Fusl <fusl@meo.ws>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
xbasel pushed a commit to xbasel/valkey that referenced this pull request Jun 22, 2025
Squashed commit of the following:

commit af11752
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Thu Jun 19 08:56:12 2025 +0300

    more PR comments

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit a39fa20
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 18 19:35:55 2025 +0300

    update comment

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit ec73906
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 18 19:33:59 2025 +0300

    more pr comments being addressed

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit a4aa35c
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 18 18:51:24 2025 +0300

    move parseExtendedCommandArgumentsOrReply to server.c

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit f4a8786
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 18 18:48:49 2025 +0300

    address some more review comments

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 8ecd584
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 18 18:12:03 2025 +0300

    add missing entry.o

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit ee916d8
Merge: 156c4a5 a1f4cd6
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 18 17:28:04 2025 +0300

    Merge remote-tracking branch 'origin/unstable' into ttl-poc-new

commit 156c4a5
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 18 17:10:41 2025 +0300

    address more PR comments

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit de675bc
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 18 13:11:48 2025 +0300

    minot review fixes

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 9d39b40
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon Jun 16 13:10:45 2025 +0300

    Revert " partial work. introduce set expirations"

    This reverts commit 04f2006.

commit 04f2006
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon Jun 16 13:08:25 2025 +0300

     partial work. introduce set expirations

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit cd674be
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Sun Jun 15 14:17:05 2025 +0300

    fix misspel in test

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 25954b3
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Sun Jun 15 13:35:58 2025 +0300

    fix flakey test with EXPIREAT

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit aee670e
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Sun Jun 15 11:36:27 2025 +0300

    fix some more memory leaks

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 2fea8e7
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Fri Jun 13 15:02:13 2025 +0300

    fix memory leak issue

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 56db999
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Thu Jun 12 17:21:51 2025 +0300

    fix bad compilation

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 1b1ce58
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Thu Jun 12 17:17:58 2025 +0300

    add missing files

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 7a89a70
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Thu Jun 12 17:15:38 2025 +0300

    Separate hash entry implementation

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 23bb4a2
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Tue Jun 10 19:25:32 2025 +0300

    extend the comment

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 2a5e9a2
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Tue Jun 10 18:52:53 2025 +0300

    fix merge issues

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit e7683b6
Merge: 12151e5 c41ffc3
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Tue Jun 10 18:52:18 2025 +0300

    Merge remote-tracking branch 'origin/unstable' into ttl-poc-new

commit 12151e5
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Tue Jun 10 18:47:11 2025 +0300

    fix some bugs and added HPERSIST tests to help schema validator

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 846f943
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Tue Jun 10 14:06:58 2025 +0300

    fix new commands json and enable silent tests

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 51f9bdc
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Tue Jun 10 11:40:09 2025 +0300

    better enforce fields number to match the number of provided fields in httl and hpersist commands

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit c1cefec
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Tue Jun 10 10:17:31 2025 +0300

    fix reply schema of commands fetching the hash field ttl

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 4db5b7c
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Tue Jun 10 09:47:05 2025 +0300

    fix hexpire flaky test

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit f4ae1a2
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon Jun 9 22:02:39 2025 +0300

    remove fmacros include from volatile_set

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 3190eb4
Merge: 99d25d3 1941d28
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon Jun 9 21:15:07 2025 +0300

    Merge remote-tracking branch 'origin/unstable' into ttl-poc-new

commit 99d25d3
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon Jun 9 19:56:14 2025 +0300

    completely remove server level access context 9it was mainly used for lazy expiration logic)

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 2e082db
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Thu Jun 5 09:34:16 2025 +0300

    exlude hexpire tests from external tests

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 2c4c312
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 4 15:36:32 2025 +0300

    switch hashtable type only when object has volatile items

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 124acbe
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 4 14:45:19 2025 +0300

    return syntax error when fields is not provided in new API arguments

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit dd071f9
Merge: 12a35e2 5699c8c
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 4 13:02:30 2025 +0300

    Merge remote-tracking branch 'origin/unstable' into ttl-poc-new

commit 12a35e2
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 4 13:02:03 2025 +0300

    remove lazy expiration logic and tests

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 0cadaec
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 4 12:27:39 2025 +0300

    copy hash object should also copy the fields ttl

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 79b7e78
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed Jun 4 12:18:48 2025 +0300

     remove metadata from hash entry

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 8654080
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed May 21 19:41:59 2025 +0300

    make sure to remove the volatile set on hash object detructor

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit eab6fc4
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed May 21 16:01:56 2025 +0300

    fix trackUpdate condition

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 6d9551e
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed May 21 15:16:53 2025 +0300

    fix bad memory access issue on entry tracking update

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit d719dcd
Author: xbasel <103044017+xbasel@users.noreply.github.com>
Date:   Wed May 21 14:50:18 2025 +0300

    Hash TTL - add tests (valkey-io#1)

    * add tests

    Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>

    * fix a bug - return on error

    Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>

    * disable failing tests

    Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>

    * rmeove redundant test

    Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>

    * Update tests/unit/hashexpire.tcl

    ---------

    Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>
    Co-authored-by: Ran Shidlansik <ranshid@amazon.com>

commit e604b37
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed May 21 14:43:49 2025 +0300

    make hashtable call entry destructor on delete access

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 0b8dc03
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Wed May 21 14:31:44 2025 +0300

    centralize keyspace and key signal notifications to the reset context

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 753ba3c
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Tue May 20 14:19:34 2025 +0300

    fix object pass to keyspace notification in HSETEX

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit c5b8d76
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 21:47:42 2025 +0300

    fix formatting issue

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit e72d7a6
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 21:25:17 2025 +0300

    fix build issues

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 36b7356
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 21:13:15 2025 +0300

    allow setting the key object in context

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit a6844ac
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 20:15:18 2025 +0300

    add commands json files

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 20c0d29
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 20:13:23 2025 +0300

    fix hexpire propagation to use hpexpireat

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 5e19c90
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 17:19:41 2025 +0300

    fix HGETEX replication handling

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 31923c5
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 15:46:31 2025 +0300

    make httl functions verify the type

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit b782d44
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 15:36:22 2025 +0300

    fix case of hll command issues on non-existing listpack encoded hash

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 0723625
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 15:24:59 2025 +0300

    Fix HEXPIRE parse limits

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit d97e23f
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 15:19:53 2025 +0300

    fix FNX/FXX logic

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 4301399
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 15:08:02 2025 +0300

    fix wrong assert condition on update entry

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 6465314
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 15:00:20 2025 +0300

    handle negative ttl correctly

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit f62c163
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 14:20:49 2025 +0300

    format fixes

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit a59f31a
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon May 19 14:17:19 2025 +0300

    Add support for HGETEX and HSETEX

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 4a09f3d
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Sun May 18 12:16:44 2025 +0300

    free entry when calling hashTypeDelete

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit dd62037
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Sun May 18 11:12:49 2025 +0300

    remove hashtable redundant log

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit ea039c4
Merge: fce9a43 8d686dd
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Sun May 18 10:40:39 2025 +0300

    Merge remote-tracking branch 'origin/unstable' into ttl-poc-new

commit fce9a43
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Sun May 18 10:39:22 2025 +0300

    fix cmake compilation

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 1f0c933
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Sun May 18 10:34:27 2025 +0300

    avoid extra ref count incrementing in hashTypePropagateDeletion

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 6ee497c
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Thu May 15 21:32:30 2025 +0300

    fix some more format issues

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 90b7536
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Thu May 15 21:30:46 2025 +0300

    fix typo

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit fcce92b
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Thu May 15 21:28:02 2025 +0300

    fix expire propagation

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit cc7c2a3
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Thu May 15 20:48:12 2025 +0300

    handle some format check issues

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 61bd39a
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Thu May 15 20:45:18 2025 +0300

    fix some spelling checks

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit 89f56b0
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Thu May 15 20:38:21 2025 +0300

    fix new introduced commands

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

commit ecdcce0
Author: Ran Shidlansik <ranshid@amazon.com>
Date:   Mon Jan 6 10:46:47 2025 +0200

    Introduce HASH items expiration

    Signed-off-by: Ran Shidlansik <ranshid@amazon.com>

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
xbasel added a commit to xbasel/valkey that referenced this pull request Jun 22, 2025
* add tests

Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>

* fix a bug - return on error

Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>

* disable failing tests

Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>

* rmeove redundant test

Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>

* Update tests/unit/hashexpire.tcl

---------

Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>
Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
cherukum-Amazon pushed a commit to cherukum-Amazon/valkey that referenced this pull request Oct 17, 2025
With valkey-io#1401, we introduced additional filters to CLIENT LIST/KILL
subcommand. The intended behavior was to pick the last value of the
filter. However, we introduced memory leak for all the preceding
filters.

Before this change:
```
> CLIENT LIST IP 127.0.0.1 IP 127.0.0.1
id=4 addr=127.0.0.1:37866 laddr=127.0.0.1:6379 fd=10 name= age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=21 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=16989 events=r cmd=client|list user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=49 tot-net-out=0 tot-cmds=0
```
Leak:
```
Direct leak of 11 byte(s) in 1 object(s) allocated from:
    #0 0x7f2901aa557d in malloc (/lib64/libasan.so.4+0xd857d)
    valkey-io#1 0x76db76 in ztrymalloc_usable_internal /workplace/harkrisp/valkey/src/zmalloc.c:156
    valkey-io#2 0x76db76 in zmalloc_usable /workplace/harkrisp/valkey/src/zmalloc.c:200
    valkey-io#3 0x4c4121 in _sdsnewlen.constprop.230 /workplace/harkrisp/valkey/src/sds.c:113
    valkey-io#4 0x4dc456 in parseClientFiltersOrReply.constprop.63 /workplace/harkrisp/valkey/src/networking.c:4264
    valkey-io#5 0x4bb9f7 in clientListCommand /workplace/harkrisp/valkey/src/networking.c:4600
    valkey-io#6 0x641159 in call /workplace/harkrisp/valkey/src/server.c:3772
    valkey-io#7 0x6431a6 in processCommand /workplace/harkrisp/valkey/src/server.c:4434
    valkey-io#8 0x4bfa9b in processCommandAndResetClient /workplace/harkrisp/valkey/src/networking.c:3571
    valkey-io#9 0x4bfa9b in processInputBuffer /workplace/harkrisp/valkey/src/networking.c:3702
    valkey-io#10 0x4bffa3 in readQueryFromClient /workplace/harkrisp/valkey/src/networking.c:3812
    valkey-io#11 0x481015 in callHandler /workplace/harkrisp/valkey/src/connhelpers.h:79
    valkey-io#12 0x481015 in connSocketEventHandler.lto_priv.394 /workplace/harkrisp/valkey/src/socket.c:301
    valkey-io#13 0x7d3fb3 in aeProcessEvents /workplace/harkrisp/valkey/src/ae.c:486
    valkey-io#14 0x7d4d44 in aeMain /workplace/harkrisp/valkey/src/ae.c:543
    valkey-io#15 0x453925 in main /workplace/harkrisp/valkey/src/server.c:7319
    valkey-io#16 0x7f2900cd7139 in __libc_start_main (/lib64/libc.so.6+0x21139)
```

Note: For filter ID / NOT-ID we group all the option and perform
filtering whereas for remaining filters we only pick the last filter
option.

---------

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
(cherry picked from commit 155b0bb)
Signed-off-by: cherukum-amazon <cherukum@amazon.com>
cherukum-Amazon pushed a commit to cherukum-Amazon/valkey that referenced this pull request Oct 19, 2025
With valkey-io#1401, we introduced additional filters to CLIENT LIST/KILL
subcommand. The intended behavior was to pick the last value of the
filter. However, we introduced memory leak for all the preceding
filters.

Before this change:
```
> CLIENT LIST IP 127.0.0.1 IP 127.0.0.1
id=4 addr=127.0.0.1:37866 laddr=127.0.0.1:6379 fd=10 name= age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=21 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=16989 events=r cmd=client|list user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=49 tot-net-out=0 tot-cmds=0
```
Leak:
```
Direct leak of 11 byte(s) in 1 object(s) allocated from:
    #0 0x7f2901aa557d in malloc (/lib64/libasan.so.4+0xd857d)
    valkey-io#1 0x76db76 in ztrymalloc_usable_internal /workplace/harkrisp/valkey/src/zmalloc.c:156
    valkey-io#2 0x76db76 in zmalloc_usable /workplace/harkrisp/valkey/src/zmalloc.c:200
    valkey-io#3 0x4c4121 in _sdsnewlen.constprop.230 /workplace/harkrisp/valkey/src/sds.c:113
    valkey-io#4 0x4dc456 in parseClientFiltersOrReply.constprop.63 /workplace/harkrisp/valkey/src/networking.c:4264
    valkey-io#5 0x4bb9f7 in clientListCommand /workplace/harkrisp/valkey/src/networking.c:4600
    valkey-io#6 0x641159 in call /workplace/harkrisp/valkey/src/server.c:3772
    valkey-io#7 0x6431a6 in processCommand /workplace/harkrisp/valkey/src/server.c:4434
    valkey-io#8 0x4bfa9b in processCommandAndResetClient /workplace/harkrisp/valkey/src/networking.c:3571
    valkey-io#9 0x4bfa9b in processInputBuffer /workplace/harkrisp/valkey/src/networking.c:3702
    valkey-io#10 0x4bffa3 in readQueryFromClient /workplace/harkrisp/valkey/src/networking.c:3812
    valkey-io#11 0x481015 in callHandler /workplace/harkrisp/valkey/src/connhelpers.h:79
    valkey-io#12 0x481015 in connSocketEventHandler.lto_priv.394 /workplace/harkrisp/valkey/src/socket.c:301
    valkey-io#13 0x7d3fb3 in aeProcessEvents /workplace/harkrisp/valkey/src/ae.c:486
    valkey-io#14 0x7d4d44 in aeMain /workplace/harkrisp/valkey/src/ae.c:543
    valkey-io#15 0x453925 in main /workplace/harkrisp/valkey/src/server.c:7319
    valkey-io#16 0x7f2900cd7139 in __libc_start_main (/lib64/libc.so.6+0x21139)
```

Note: For filter ID / NOT-ID we group all the option and perform
filtering whereas for remaining filters we only pick the last filter
option.

---------

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
(cherry picked from commit 155b0bb)
Signed-off-by: cherukum-amazon <cherukum@amazon.com>
cherukum-Amazon pushed a commit to cherukum-Amazon/valkey that referenced this pull request Oct 21, 2025
With valkey-io#1401, we introduced additional filters to CLIENT LIST/KILL
subcommand. The intended behavior was to pick the last value of the
filter. However, we introduced memory leak for all the preceding
filters.

Before this change:
```
> CLIENT LIST IP 127.0.0.1 IP 127.0.0.1
id=4 addr=127.0.0.1:37866 laddr=127.0.0.1:6379 fd=10 name= age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=21 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=16989 events=r cmd=client|list user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=49 tot-net-out=0 tot-cmds=0
```
Leak:
```
Direct leak of 11 byte(s) in 1 object(s) allocated from:
    #0 0x7f2901aa557d in malloc (/lib64/libasan.so.4+0xd857d)
    valkey-io#1 0x76db76 in ztrymalloc_usable_internal /workplace/harkrisp/valkey/src/zmalloc.c:156
    valkey-io#2 0x76db76 in zmalloc_usable /workplace/harkrisp/valkey/src/zmalloc.c:200
    valkey-io#3 0x4c4121 in _sdsnewlen.constprop.230 /workplace/harkrisp/valkey/src/sds.c:113
    valkey-io#4 0x4dc456 in parseClientFiltersOrReply.constprop.63 /workplace/harkrisp/valkey/src/networking.c:4264
    valkey-io#5 0x4bb9f7 in clientListCommand /workplace/harkrisp/valkey/src/networking.c:4600
    valkey-io#6 0x641159 in call /workplace/harkrisp/valkey/src/server.c:3772
    valkey-io#7 0x6431a6 in processCommand /workplace/harkrisp/valkey/src/server.c:4434
    valkey-io#8 0x4bfa9b in processCommandAndResetClient /workplace/harkrisp/valkey/src/networking.c:3571
    valkey-io#9 0x4bfa9b in processInputBuffer /workplace/harkrisp/valkey/src/networking.c:3702
    valkey-io#10 0x4bffa3 in readQueryFromClient /workplace/harkrisp/valkey/src/networking.c:3812
    valkey-io#11 0x481015 in callHandler /workplace/harkrisp/valkey/src/connhelpers.h:79
    valkey-io#12 0x481015 in connSocketEventHandler.lto_priv.394 /workplace/harkrisp/valkey/src/socket.c:301
    valkey-io#13 0x7d3fb3 in aeProcessEvents /workplace/harkrisp/valkey/src/ae.c:486
    valkey-io#14 0x7d4d44 in aeMain /workplace/harkrisp/valkey/src/ae.c:543
    valkey-io#15 0x453925 in main /workplace/harkrisp/valkey/src/server.c:7319
    valkey-io#16 0x7f2900cd7139 in __libc_start_main (/lib64/libc.so.6+0x21139)
```

Note: For filter ID / NOT-ID we group all the option and perform
filtering whereas for remaining filters we only pick the last filter
option.

---------

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
(cherry picked from commit 155b0bb)
Signed-off-by: cherukum-amazon <cherukum@amazon.com>
madolson pushed a commit that referenced this pull request Oct 21, 2025
With #1401, we introduced additional filters to CLIENT LIST/KILL
subcommand. The intended behavior was to pick the last value of the
filter. However, we introduced memory leak for all the preceding
filters.

Before this change:
```
> CLIENT LIST IP 127.0.0.1 IP 127.0.0.1
id=4 addr=127.0.0.1:37866 laddr=127.0.0.1:6379 fd=10 name= age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=21 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=16989 events=r cmd=client|list user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=49 tot-net-out=0 tot-cmds=0
```
Leak:
```
Direct leak of 11 byte(s) in 1 object(s) allocated from:
    #0 0x7f2901aa557d in malloc (/lib64/libasan.so.4+0xd857d)
    #1 0x76db76 in ztrymalloc_usable_internal /workplace/harkrisp/valkey/src/zmalloc.c:156
    #2 0x76db76 in zmalloc_usable /workplace/harkrisp/valkey/src/zmalloc.c:200
    #3 0x4c4121 in _sdsnewlen.constprop.230 /workplace/harkrisp/valkey/src/sds.c:113
    #4 0x4dc456 in parseClientFiltersOrReply.constprop.63 /workplace/harkrisp/valkey/src/networking.c:4264
    #5 0x4bb9f7 in clientListCommand /workplace/harkrisp/valkey/src/networking.c:4600
    #6 0x641159 in call /workplace/harkrisp/valkey/src/server.c:3772
    #7 0x6431a6 in processCommand /workplace/harkrisp/valkey/src/server.c:4434
    #8 0x4bfa9b in processCommandAndResetClient /workplace/harkrisp/valkey/src/networking.c:3571
    #9 0x4bfa9b in processInputBuffer /workplace/harkrisp/valkey/src/networking.c:3702
    #10 0x4bffa3 in readQueryFromClient /workplace/harkrisp/valkey/src/networking.c:3812
    #11 0x481015 in callHandler /workplace/harkrisp/valkey/src/connhelpers.h:79
    #12 0x481015 in connSocketEventHandler.lto_priv.394 /workplace/harkrisp/valkey/src/socket.c:301
    #13 0x7d3fb3 in aeProcessEvents /workplace/harkrisp/valkey/src/ae.c:486
    #14 0x7d4d44 in aeMain /workplace/harkrisp/valkey/src/ae.c:543
    #15 0x453925 in main /workplace/harkrisp/valkey/src/server.c:7319
    #16 0x7f2900cd7139 in __libc_start_main (/lib64/libc.so.6+0x21139)
```

Note: For filter ID / NOT-ID we group all the option and perform
filtering whereas for remaining filters we only pick the last filter
option.

---------

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
(cherry picked from commit 155b0bb)
Signed-off-by: cherukum-amazon <cherukum@amazon.com>
yang-z-o referenced this pull request in yang-z-o/valkey Jan 12, 2026
Signed-off-by: Yang Zhao <zymy701@gmail.com>
PingXie added a commit to PingXie/valkey that referenced this pull request Jan 27, 2026
Issue: PR valkey-io#1 falsely matched Redis #13157 due to patch-id collision.
- Valkey PR valkey-io#1: 2-file test PR touching dict.h and t_zset.c
- Redis PR #13157: 36-file license change PR
- Both happened to have same patch-id (efdd4d9b5640) for t_zset.c

Fix: Add file count ratio check for patch-id matches.
- If file counts differ by >3x, treat as collision (suppress match)
- Example: 2 files vs 36 files = 18x ratio -> filtered out
- Example: 2 files vs 2 files = 1x ratio -> kept

Result:
- PR valkey-io#1 now passes (no false positive from license change PR)
- PR valkey-io#3088 still correctly detects #14243 (2 files vs 2 files)

Signed-off-by: Ping Xie <pingxie@outlook.com>
PingXie added a commit to PingXie/valkey that referenced this pull request Jan 27, 2026
Implements line-by-line comparison of actual diff content after initial
fingerprint matching to validate matches are truly similar.

Features:
- deep_compare_diffs(): Compares normalized diffs using Jaccard + sequence similarity
- Validates top candidates only (up to 2x report limit) to minimize API calls
- Adaptive thresholds: 50% for patch-id matches, 70% for simhash matches
- Graceful fallback: accepts fingerprint match if deep comparison fails

Benefits:
- Filters patch-id hash collisions (different changes, same hash)
- Validates simhash matches are semantically similar
- PR valkey-io#3088: correctly detects #14243, rejects 40+ collision false positives
- PR valkey-io#1: still passes (no false matches)

Performance:
- Before: checked every match (~60+ API calls for PR valkey-io#3088)
- After: checks top 10 candidates (~10 API calls)
- Speedup: 9s vs 60+s for PR valkey-io#3088

Signed-off-by: Ping Xie <pingxie@outlook.com>
PingXie added a commit to PingXie/valkey that referenced this pull request Jan 27, 2026
Issue: PR valkey-io#3105 was passing even though it's a subset of Redis #14435.
- Valkey: Adds IFNE option (132 lines, 7 files)
- Redis: Adds IFEQ/IFNE/IFNEQ + xxhash + digest (9843 lines, 18 files)
- Jaccard similarity only 20% due to size difference
- But 90-100% of Valkey tokens exist in Redis (subset relationship)

Fix: Check subset ratio in addition to Jaccard similarity.
- Subset ratio = |Valkey ∩ Redis| / |Valkey|
- Detects when smaller PR cherry-picks from larger PR
- Final similarity = max(weighted_jaccard, subset_ratio)

Per-file analysis of PR valkey-io#3105 vs #14435:
- src/commands.def: 92.9% subset ratio
- src/commands/set.json: 93.8% subset ratio
- src/t_string.c: 95.5% subset ratio
- tests/unit/type/string.tcl: 100% subset ratio

Result:
- PR valkey-io#3105 now correctly detects Redis #14435 match
- PR valkey-io#3088 still works (detects #14243)
- PR valkey-io#1 still passes (no false positives)

Signed-off-by: Ping Xie <pingxie@outlook.com>
enjoy-binbin pushed a commit that referenced this pull request Feb 5, 2026
Address #2683 

For the test, I found two problems.

### 1. Test Assumes Winner Has Rank #0
In my test run, in the failing case:
- Replica -3 (rank 0) won epoch 10 first
- Replica -6 (rank 1) won epoch 11 second
- Replica -3 saw higher epoch and stepped down
- Final result: rank 1 became master

Rank #0 doesn't guarantee being the final master.

### 2. Pattern Matching Bug
The pattern `*Start of election*rank #0*` incorrectly matches "primary
rank #0":

Log: `Start of election delayed for 350 milliseconds (rank #1, primary
rank #0, offset 2172)`

This line has rank `#1`, but the pattern matches because of "primary
rank `#0`" at the end.

## Solution
- Fixed the format problem by checking if `(rank #0` or `(rank #1` so
that we won't accidentally match the primary rank
- Only check to make sure replica 3 and replica 6 have different ranks
without assuming the replica with rank 0 will become the master.

Signed-off-by: Hanxi Zhang <hanxizh@amazon.com>
enjoy-binbin pushed a commit that referenced this pull request Feb 8, 2026
)

I was working on ASAN large memory tests when I countered this issue.

The issue was that the hardcoded `999` key could land in an early
bucket. Then shrink rehash could finish early, and later inserts could
trigger a new expansion rehash, resetting rehash_idx low. The test now
picks the survivor key dynamically as the key mapped to the highest
bucket index.

```
[test_hashtable.c] Memory leak detected of 336 bytes
=================================================================
==3901==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 80 byte(s) in 1 object(s) allocated from:
    #0 0x7fb0556fd9c7 in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x563bfdf4c47d in ztrymalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:156
    #2 0x563bfdf4c47d in valkey_malloc /home/runner/work/valkey/valkey/src/zmalloc.c:185
    #3 0x563bfdd42eaf in hashtableCreate /home/runner/work/valkey/valkey/src/hashtable.c:1217
    #4 0x563bfdaa1cbf in test_empty_buckets_rehashing unit/test_hashtable.c:232
    #5 0x563bfdae772b in runTestSuite unit/test_main.c:36
    #6 0x563bfda86b20 in main unit/test_main.c:108
    #7 0x7fb05522a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    #8 0x7fb05522a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    #9 0x563bfda8a5c4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x17c5c4) (BuildId: 44cfc183e6e82e499bcc9f6adc094d7f774ee9d2)

Indirect leak of 128 byte(s) in 1 object(s) allocated from:
    #0 0x7fb0556fd340 in calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x563bfdf4c922 in ztrycalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:214
    #2 0x563bfdf4c922 in valkey_calloc /home/runner/work/valkey/valkey/src/zmalloc.c:257
    #3 0x563bfdd40967 in resize /home/runner/work/valkey/valkey/src/hashtable.c:741
    #4 0x563bfdd45eb1 in hashtableExpandIfNeeded /home/runner/work/valkey/valkey/src/hashtable.c:1446
    #5 0x563bfdd45eb1 in hashtableExpandIfNeeded /home/runner/work/valkey/valkey/src/hashtable.c:1433
    #6 0x563bfdd45eb1 in insert /home/runner/work/valkey/valkey/src/hashtable.c:1041
    #7 0x563bfdd45eb1 in hashtableAddOrFind /home/runner/work/valkey/valkey/src/hashtable.c:1554
    #8 0x563bfdd45eb1 in hashtableAdd /home/runner/work/valkey/valkey/src/hashtable.c:1539
    #9 0x563bfdaa1e3b in test_empty_buckets_rehashing unit/test_hashtable.c:254
    #10 0x563bfdae772b in runTestSuite unit/test_main.c:36
    #11 0x563bfda86b20 in main unit/test_main.c:108
    #12 0x7fb05522a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    #13 0x7fb05522a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    #14 0x563bfda8a5c4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x17c5c4) (BuildId: 44cfc183e6e82e499bcc9f6adc094d7f774ee9d2)

Indirect leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7fb0556fd340 in calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x563bfdf4c922 in ztrycalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:214
    #2 0x563bfdf4c922 in valkey_calloc /home/runner/work/valkey/valkey/src/zmalloc.c:257
    #3 0x563bfdd3f553 in bucketConvertToChained /home/runner/work/valkey/valkey/src/hashtable.c:908
    #4 0x563bfdd3f553 in findBucketForInsert /home/runner/work/valkey/valkey/src/hashtable.c:1021
    #5 0x563bfdd45d9e in insert /home/runner/work/valkey/valkey/src/hashtable.c:1045
    #6 0x563bfdd45d9e in hashtableAddOrFind /home/runner/work/valkey/valkey/src/hashtable.c:1554
    #7 0x563bfdd45d9e in hashtableAdd /home/runner/work/valkey/valkey/src/hashtable.c:1539
    #8 0x563bfdaa1e3b in test_empty_buckets_rehashing unit/test_hashtable.c:254
    #9 0x563bfdae772b in runTestSuite unit/test_main.c:36
    #10 0x563bfda86b20 in main unit/test_main.c:108
    #11 0x7fb05522a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    #12 0x7fb05522a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    #13 0x563bfda8a5c4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x17c5c4) (BuildId: 44cfc183e6e82e499bcc9f6adc094d7f774ee9d2)

Indirect leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7fb0556fd340 in calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x563bfdf4c922 in ztrycalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:214
    #2 0x563bfdf4c922 in valkey_calloc /home/runner/work/valkey/valkey/src/zmalloc.c:257
    #3 0x563bfdd40967 in resize /home/runner/work/valkey/valkey/src/hashtable.c:741
    #4 0x563bfdaa1df8 in test_empty_buckets_rehashing unit/test_hashtable.c:248
    #5 0x563bfdae772b in runTestSuite unit/test_main.c:36
    #6 0x563bfda86b20 in main unit/test_main.c:108
    #7 0x7fb05522a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    #8 0x7fb05522a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 274eec488d230825a136fa9c4d85370fed7a0a5e)
    #9 0x563bfda8a5c4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x17c5c4) (BuildId: 44cfc183e6e82e499bcc9f6adc094d7f774ee9d2)

SUMMARY: AddressSanitizer: 336 byte(s) leaked in 4 allocation(s).
```

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
enjoy-binbin pushed a commit that referenced this pull request May 7, 2026
CI caught ip and name SDS allocations being leaked in
fetchClusterConfiguration. The ip SDS was copied again via sdsnew()
before being passed to createClusterNode(), leaking the original. The
name SDS was leaked when the node already existed in the dict.

Free ip and name on all exit paths in fetchClusterConfiguration. Remove
stale guard in freeClusterNode, no longer needed since #1392

CI Error -
```
Direct leak of 33 byte(s) in 3 object(s) allocated from:
      #0 0x7f4c3a0fd9c7 in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
      #1 0x5564620c124a in ztrymalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:172
      #2 0x5564620c124a in zmalloc_usable /home/runner/work/valkey/valkey/src/zmalloc.c:268
      #3 0x5564620dfbe6 in _sdsnewlen.constprop.0 /home/runner/work/valkey/valkey/src/sds.c:102
      #4 0x556462050996 in sdsnewlen /home/runner/work/valkey/valkey/src/sds.c:169
      #5 0x556462050996 in sdsnew /home/runner/work/valkey/valkey/src/sds.c:185
      #6 0x556462050996 in fetchClusterConfiguration /home/runner/work/valkey/valkey/src/valkey-benchmark.c:1477
```

Issue was reproduceable locally using `leaks --atExit`

Signed-off-by: nmvk <r@nmvk.com>
lucasyonge pushed a commit that referenced this pull request May 11, 2026
CI caught ip and name SDS allocations being leaked in
fetchClusterConfiguration. The ip SDS was copied again via sdsnew()
before being passed to createClusterNode(), leaking the original. The
name SDS was leaked when the node already existed in the dict.

Free ip and name on all exit paths in fetchClusterConfiguration. Remove
stale guard in freeClusterNode, no longer needed since #1392

CI Error -
```
Direct leak of 33 byte(s) in 3 object(s) allocated from:
      #0 0x7f4c3a0fd9c7 in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
      #1 0x5564620c124a in ztrymalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:172
      #2 0x5564620c124a in zmalloc_usable /home/runner/work/valkey/valkey/src/zmalloc.c:268
      #3 0x5564620dfbe6 in _sdsnewlen.constprop.0 /home/runner/work/valkey/valkey/src/sds.c:102
      #4 0x556462050996 in sdsnewlen /home/runner/work/valkey/valkey/src/sds.c:169
      #5 0x556462050996 in sdsnew /home/runner/work/valkey/valkey/src/sds.c:185
      #6 0x556462050996 in fetchClusterConfiguration /home/runner/work/valkey/valkey/src/valkey-benchmark.c:1477
```

Issue was reproduceable locally using `leaks --atExit`

Signed-off-by: nmvk <r@nmvk.com>
lucasyonge pushed a commit that referenced this pull request May 12, 2026
CI caught ip and name SDS allocations being leaked in
fetchClusterConfiguration. The ip SDS was copied again via sdsnew()
before being passed to createClusterNode(), leaking the original. The
name SDS was leaked when the node already existed in the dict.

Free ip and name on all exit paths in fetchClusterConfiguration. Remove
stale guard in freeClusterNode, no longer needed since #1392

CI Error -
```
Direct leak of 33 byte(s) in 3 object(s) allocated from:
      #0 0x7f4c3a0fd9c7 in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
      #1 0x5564620c124a in ztrymalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:172
      #2 0x5564620c124a in zmalloc_usable /home/runner/work/valkey/valkey/src/zmalloc.c:268
      #3 0x5564620dfbe6 in _sdsnewlen.constprop.0 /home/runner/work/valkey/valkey/src/sds.c:102
      #4 0x556462050996 in sdsnewlen /home/runner/work/valkey/valkey/src/sds.c:169
      #5 0x556462050996 in sdsnew /home/runner/work/valkey/valkey/src/sds.c:185
      #6 0x556462050996 in fetchClusterConfiguration /home/runner/work/valkey/valkey/src/valkey-benchmark.c:1477
```

Issue was reproduceable locally using `leaks --atExit`

Signed-off-by: nmvk <r@nmvk.com>
GilboaAWS referenced this pull request in GilboaAWS/valkey Jun 24, 2026
Walk through 31 review threads sequentially; integrate all decisions
into design artifacts. 22 threads were @ikolomi self-review,
9 from @GilboaAWS.

Substantive design changes:
- Default compression-max-value-size lowered 1 MiB -> 128 KiB
- Full-sync replication RDB always uncompressed in v1 (new R2.6.8);
  disk RDB still compressed; wire negotiation deferred to v2
- Config surface split into 5 primary + 11 advanced knobs
- SDS immutability invariant formalized via existing dbUnshareStringValue
  COW discipline; added R2.4.4-R2.4.6 audit + compression-cow-invariant.tcl
  as a merge blocker
- New compression-aware benchmark suite (sect. 7.5) extending
  valkey-benchmark with --key-distribution, --value-size-distribution,
  --value-data and six canonical scenarios
- EMBSTR dropped from eligibility in 6 locations
- Hotness checks decoupled from maxmemory-policy: universal write-age
  and idle-seconds gates; LFU-freq only when LFU active; rename
  compression-lru-idle-seconds -> compression-min-idle-seconds
- Retry-guard scoped by dict ID via new incompressibleKeys{} hashtable
- Training flow rewritten: main-thread iteration copies samples into
  a contiguous buffer; bio runs ZDICT_trainFromBuffer on immutable
  main-thread-owned bytes; bio never touches robj/kvstore/refcounts
- Appendix C.7 added: io-threads for decompression rejected in v1
- 'Block the client' clarified as BLPOP-style (main thread free)

New artifacts:
- implementation/plan.md: phased parallel-ownership implementation plan
  (7 subsystems, ~11 weeks with @ikolomi and @GilboaAWS in parallel)
- summary.md: feature summary for reviewers
- DESIGN_TODO.md: per-thread audit trail with status/decision/resolved_by
- pr-feedback.json: machine-readable sidecar
- tools/fetch-pr-comments.sh + normalize-pr-comments.py + post-pr-replies.py:
  walkthrough tooling (GitHub REST + GraphQL, paginated, cached)

GitHub round-trip complete: per-thread replies posted and all 31
review threads resolved; top-level PR comment issuecomment-4420021714
links to DESIGN_TODO.md.
GilboaAWS referenced this pull request in GilboaAWS/valkey Jun 24, 2026
Inline compression design (initial draft)
GilboaAWS referenced this pull request in GilboaAWS/valkey Jun 24, 2026
Addresses 5 of 6 review comments on the QSBR design. Comment valkey-io#6
(`compressionJob.key` extra-lookup concern) is explicitly deferred to
a follow-up PR per reviewer guidance.

Comment #1 (line 428) and valkey-io#5 (line 544) — drop language-comparison
framing:
  Removed all references to Rust / `Arc<T>` / "memory-safe languages"
  / `shared_ptr` from §4.4 intro, the "Why QSBR" bullet list, and the
  §4.6 "Why the worker loads the active dict itself" paragraph. The
  rationale now stands on its own technical merit (decoupling the
  registry from worker hot paths; minimal worker contract;
  safe-directional failure modes) rather than via comparison to
  another language's type system. C with explicit protocols is the
  right tool for this problem; the comparison added rhetorical
  weight without adding signal.

Comment valkey-io#2 (line 326) — duplication with R2.11.4:
  §3.3 Separation invariants restated the worker contract that
  R2.11.4 already specifies authoritatively. Slimmed the §3.3 bullet
  to a one-liner that points at §2.11 R2.11.4 and §4.4. Eliminates
  drift risk between the two places.

Comment valkey-io#3 (line 439) — bound the retiring list, block on cap:
  Added new step 7 to the QSBR section explaining the cap interaction
  with R2.3.3. The retiring list is a subset of `dicts[]`, capped at
  `compression-dict-max-versions`. When grace-barrier draining cannot
  keep up (worker starvation, persistent `frame_refs > 0`), the cap
  is reached and BOTH training AND promotion are refused per R2.3.3:
  `LL_WARNING` log entry, `compression_dict_cap_reached` set in INFO,
  operator intervention required (raise cap or run COMPRESSION SWEEP).

Comment valkey-io#4 (line 449) — grace-barrier wake-up via cond_broadcast:
  The original step 6 proposed enqueueing barrier jobs into the SPMC
  inbox to force idle workers to advance generations. This doesn't
  actually work: under work-stealing semantics a single fast worker
  can drain all barrier jobs while siblings stay asleep on the cond
  var. Rewrote step 6 to use a wake-all primitive built on
  `pthread_cond_broadcast`, and added a "Wake-all primitive"
  paragraph to §4.6 that describes extending `mutexqueue.h` with two
  new APIs: a broadcast wake-all (for QSBR grace barriers, config
  changes, etc.) and a shutdown-signal variant (for pool teardown).
  Step 6 cross-references §4.6 for the mechanism.

Comment valkey-io#6 (line 513) — DEFERRED:
  Reviewer flagged that `compressionJob.key` (a `robj *` carried in
  the job) implies the main thread does an additional lookup at
  install time, doubling the per-write lookup cost. The reviewer
  explicitly tagged this as "follow up PR" — addressing it would
  require a redesign of the install-side data flow and is out of
  scope for the QSBR design change. Tracked as an open item; will
  be addressed before code lands for the install path (S2.7 in
  the implementation plan).
GilboaAWS referenced this pull request in GilboaAWS/valkey Jun 24, 2026
* docs(design): align eligibility predicate; document policy-aware hot-key skip

Two corrections discovered while preparing S2.2 implementation:

1. The two design docs had drifted on minor wording (idea-honing.md
   said `obj->encoding == RAW` and `obj->refcount != SHARED`;
   detailed-design.md said `OBJ_ENCODING_RAW` / `OBJ_SHARED_REFCOUNT`).
   Predicates now match exactly across both docs, using the actual C
   constants from src/server.h. Per the agreed convention: keep the
   exact predicate inline in both docs (different audiences both need
   it readable in place) rather than a cross-reference.

2. The Thread valkey-io#18/valkey-io#19 walkthrough resolutions made a claim that
   doesn't match the existing Valkey source: "the existing LRU field
   already provides the signal for every eviction policy" / time-based
   checks "work uniformly across LRU, LFU, and noeviction" /
   `estimateObjectIdleTime()` is "a reliable read-hotness signal
   universally."

   Reading src/lrulfu.h:
     - LRU and noeviction policies: robj->lru is seconds-based.
       lru_idle_secs(o) returns real seconds. Time-based thresholds
       work as designed.
     - LFU policy: robj->lru encodes 16 bits "last eval time in
       minutes" + 8 bits approximate freq counter. There is no
       per-second access timestamp. lru_idle_secs(o) would
       misinterpret the bits. The function `estimateObjectIdleTime()`
       referenced by the design does not exist in current Valkey;
       the closest available helper is `lrulfu_getIdleness()` which
       returns `UINT8_MAX - freq` in LFU mode (a 0..255 freq-derived
       heuristic, NOT seconds).

   The fix is policy-aware checks, not policy-uniform:
     - LRU and noeviction: apply settle-seconds AND min-idle-seconds
       against lru_idle_secs(o).
     - LFU: skip the time-based thresholds entirely (the metric is
       wrong unit). Apply compression-lfu-threshold against the freq
       counter — already in the predicate as the LFU branch.

   The dual-knob operator surface (`compression-settle-seconds` and
   `compression-min-idle-seconds`) is preserved across modes; in
   LRU/noeviction both knobs apply to the same metric (since
   robj->lru is touched on every read AND write — v1 cannot
   distinguish source) so the effective threshold is `max(settle,
   min_idle)`. Operators get to express two intents.

   The Thread valkey-io#18/valkey-io#19 resolutions stand as written in DESIGN_TODO.md
   (audit trail; the decision to use the existing LRU field rather
   than add a new write-time field is unchanged); only the
   implementation interpretation in the live design docs is refined.

Updated:
  - detailed-design.md §2.2 R2.2 predicate + new explanatory
    paragraph below it.
  - detailed-design.md §2.12 config table: settle-seconds and
    min-idle-seconds descriptions now correctly note "Inactive in LFU
    mode."
  - idea-honing.md Q6 baseline filter bullet (rewritten as
    "policy-aware" with sub-bullets per policy).
  - idea-honing.md Q6 consolidated predicate (now identical to
    detailed-design.md §2.2).
  - idea-honing.md Q6a answer: rewritten with the policy-aware
    framing; cross-references "S2.2 implementation review" so future
    readers can trace this refinement.

* Inline compression: S2.2 — eligibility predicate

Implements R2.2 / Q6 `compressionIsEligible(robj *o, const sds key)`,
replacing the Phase 0 stub that returned 0 for every value.

The predicate has six gates, evaluated in cheapest-first order so the
master switch short-circuits early when the feature is disabled:

  1. Master switch (server.compression_enabled).
  2. Type + encoding gate. STRING values only (R2.2, Q6c). Of the four
     string encodings, only OBJ_ENCODING_RAW is a candidate:
       - INT      — already memory-optimal.
       - EMBSTR   — ≤44 B, header overhead erases any savings (Threads
                    valkey-io#17 / valkey-io#21 explicitly excluded).
       - COMPRESSED — defense-in-depth no-op for double-compress.
  3. Refcount gate. Shared RESP constants are never installed in a db
     (lookupKey asserts this); we mirror the assertion as a safety
     check.
  4. Size bounds. compression-min-value-size (lower) prevents wasting
     CPU on values too small to recoup the per-value header (~16 B).
     compression-max-value-size (upper, 0 = disabled) caps worst-case
     sync-decompression latency on the main thread (~1 µs/KB).
  5. Hot-key skip — POLICY-AWARE per the corrected R2.2:
       - LRU and noeviction: robj->lru is seconds-based. Apply both
         compression-settle-seconds (recent-write proxy) and
         compression-min-idle-seconds (recent-read proxy) against
         lru_getIdleSecs(o->lru). v1 cannot distinguish source
         (robj->lru is touched on read AND write); the dual surface
         lets operators express two intents that share an underlying
         signal. Effective threshold is max(settle, min_idle).
       - LFU: robj->lru encodes a freq counter (no per-second
         timestamp). Time-based knobs are inactive in this mode.
         Apply compression-lfu-threshold against the freq counter via
         lfu_getFrequency(), which mirrors the standard Valkey
         decay-on-read pattern (objectGetLFUFrequency in src/object.c).
         The signature is `robj *o` (not `const robj *`) because of
         this in-place decay.
  6. Incompressible-keys retry guard. Stubbed as "always retry-eligible"
     in S2.2; S2.3 lands the side hashtable and wires
     compressionRetryEligible(key) here.

Test coverage in src/unit/test_compression_eligibility.cpp (16 tests,
auto-discovered by src/unit/Makefile's `wildcard *.cpp`):

  - Master switch off / on.
  - Each rejection branch:
      * non-STRING type
      * INT, EMBSTR, COMPRESSED encodings
      * shared refcount
      * size below min, above max
  - Size-bound boundary cases (at exact min, at exact max,
    max=0 disables upper bound).
  - LRU branch:
      * recent touch (idle < settle) — rejected
      * idle between settle and min_idle (max wins) — rejected
      * cold key (idle >= max(settle, min_idle)) — accepted
      * zero thresholds — accept immediately
  - noeviction policy: same code path as LRU per R2.2.
  - LFU branch:
      * freq at threshold — rejected (>= comparison)
      * freq above threshold — rejected
      * freq below threshold — accepted
      * time-based knobs are inactive even at INT_MAX values.

The fixture saves and restores `server.compression_*` and
`server.maxmemory_policy`, and re-syncs lrulfu's cached
`is_using_lfu_policy` boolean via `lrulfu_updateClockAndPolicy()` on
both setup and teardown so tests don't leak policy state into each
other.

Verified locally:
  - `make -j` builds clean.
  - `./runtest --single unit/type/compression` 10/10 passes (the Phase 0
    integration fixture exercises feature-off semantics; with
    compression-enabled still 0 by default, eligibility is never
    consulted in the integration server).

Not verified locally (CI will validate):
  - gtest unit tests on Linux/macOS/32-bit (no libgtest-dev locally).

S2.3 (incompressible-keys hashtable) wires the last branch and ships
its own gtest coverage. After that, S2.4–S2.10 wire the rest of the
hot path.

* docs(design): rewrite eligibility predicate's hot-key check in branched form

Per review feedback during S2.2 review: the previous form encoded the
policy split using short-circuit booleans —

  && (lfu_mode  || lru_idle_secs(obj) >= compression-settle-seconds)
  && (lfu_mode  || lru_idle_secs(obj) >= compression-min-idle-seconds)
  && (!lfu_mode || lfu_freq(obj) < compression-lfu-threshold)

— which is logically correct but reads awkwardly. Three lines mention
`lfu_mode` (twice unprimed, once primed); the reader has to mentally
short-circuit twice to see that line 1+2 fire only in LRU/noeviction
and line 3 fires only in LFU. It also looks at first glance like the
predicate might be using `compression-lfu-threshold` as an LRU-mode
threshold.

Replaced with a branched helper that mirrors how the C
implementation's if/else branches:

  && hot_key_check(obj)                       // policy-aware

  where hot_key_check(obj) is:
      if lfu_mode:
          lfu_freq(obj) < compression-lfu-threshold
      else:
          lru_idle_secs(obj) >= compression-settle-seconds
          AND lru_idle_secs(obj) >= compression-min-idle-seconds

Same behavior; the implementation in src/compression.c
(`compressionIsEligible()`) already uses this exact branching shape —
the docs now match it visually.

Updated:
  - detailed-design.md §2.2 R2.2 predicate.
  - idea-honing.md Q6 consolidated predicate.

Both docs were already aligned (per the previous predicate-alignment
commit); they remain identical with this rewrite. The explanatory
paragraph below the §2.2 predicate (LRU vs LFU lru-field encoding)
already covers the rationale and is unchanged.

* Inline compression: drop compression-settle-seconds knob (YAGNI)

Removes one of the two time-based hot-key skip knobs. The eligibility
predicate's LRU/noeviction branch now compares against a single
threshold (`compression-min-idle-seconds`) instead of two.

Why this matters
----------------

The dual-knob surface — `compression-settle-seconds` ("recent-write
protection") and `compression-min-idle-seconds` ("recent-access
protection") — was introduced via the Thread valkey-io#18 walkthrough resolution
on the theory that operators benefit from being able to express two
different intents.

In v1 reality this is documentation theater. Both knobs compare against
the same metric (`lru_idle_secs(o)`) because Valkey's `robj->lru` field
is touched on every read AND every write — gated only by
`LOOKUP_NOTOUCH` and fork. v1 cannot distinguish read-recency from
write-recency from this single signal. The math always reduces to:

    eligible iff lru_idle_secs(o) >= max(settle, min_idle)

Setting `settle=10, min_idle=120` is identical to the single-knob
`min_idle=120`. Tested every scenario I could construct (different
time scales, sweep cron sequences, heterogeneous workloads, forward-
compat with v2) — none give the dual surface operationally distinct
behavior in v1.

The only non-trivial argument for keeping both was forward-compat: if
v2 adds a per-object write-time field, the dual knob becomes
meaningful. But adding a config in v2 is non-breaking; existing
operators on v1 see no change. Removing later is harder than adding
later.

Plus the dual surface is an active footgun: operators tuning the two
knobs differently expecting different effects get a confusing
no-difference outcome. PR #1 Thread valkey-io#3 specifically pushed back on
"too many knobs" — that pressure applies here.

Per YAGNI, ship v1 with the single knob. v2 reintroduces a
write-time-specific knob non-breakingly when per-object write-time
tracking lands.

Code changes
------------

- src/server.h: remove `compression_settle_seconds` field.
- src/config.c: remove the `createIntConfig` registration.
- src/compression.c: drop the second `idle_secs >= settle` check
  in `compressionIsEligible`'s LRU branch. Updated the comment block
  to reflect single-signal reality.
- src/unit/test_compression_eligibility.cpp:
    - Drop `LruRejectsBetweenSettleAndMinIdle` (test of dual-knob
      max-wins behavior — no longer applicable).
    - Replace `LruRejectsRecentTouch` / `LruAcceptsBeyondBothThresholds`
      / `LruZeroThresholdsAcceptImmediately` with single-knob
      equivalents (`LruRejectsRecentTouch`, `LruAcceptsBeyondThreshold`,
      `LruAtThresholdAcceptsBoundary`, `LruZeroThresholdAcceptsImmediately`).
    - Drop `compression-settle-seconds` from `LfuTimeKnobsAreInactive`
      and rename to `LfuTimeKnobIsInactive`.
- tests/unit/type/compression.tcl: drop the
  `compression-settle-seconds` config-default assertion; update
  comment from "Advanced (11)" to "Advanced (10)".

Doc changes
-----------

- detailed-design.md §2.2 R2.2 predicate: hot_key_check helper now has
  one comparison in the LRU/noeviction branch instead of two. The
  rationale paragraph below the predicate explains the v1 single-
  signal reality and the YAGNI motivation for dropping the second
  knob; future v2 reintroduction noted.
- detailed-design.md §2.12 advanced config table: 11 → 10 knobs;
  `compression-settle-seconds` row removed; `compression-min-idle-
  seconds` description simplified.
- detailed-design.md §7.1 transparency-mode harness config: drop the
  `--compression-settle-seconds 0` line so the harness doesn't pass
  an unknown option.
- idea-honing.md Q6 baseline filter bullet: collapse the two-bullet
  LRU branch into a single bullet; add an _italicized rationale
  paragraph_ explaining why the second knob was dropped (preserves
  the historical thinking for future readers).
- idea-honing.md Q6 consolidated predicate: matches detailed-design.md.
- idea-honing.md Q6 config table: drop the `compression-settle-seconds`
  row.
- idea-honing.md Q6a answer: rewrite to reflect single-knob reality
  with reference to "S2.2 implementation review" so future readers
  can trace this refinement chain (Thread valkey-io#18 → Thread valkey-io#19 → S2.2
  refinement).
- idea-honing.md §7.1 harness config: drop the
  `--compression-settle-seconds 0` line.
- implementation/plan.md S2.2 description: simplify to "policy-aware
  hot-key skip" + the actual operator-facing knobs.
- summary.md: update the eligibility table row + walkthrough-
  highlights bullet to reflect the policy-aware single-knob outcome.

Audit-trail files (DESIGN_TODO.md, pr-feedback.json) intentionally
unchanged — they capture decisions at a point in time. The
walkthrough Thread valkey-io#18/valkey-io#19 resolutions stand as written; only the
implementation interpretation in the live design docs is refined.

Verified locally
----------------

- `make -j` builds clean.
- `./runtest --single unit/type/compression` 10/10 passes (the Tcl
  fixture's config-default assertion was updated in lockstep with the
  C-side removal, so the integration test catches any drift between
  src/config.c and tests/unit/type/compression.tcl).

Not verified locally (CI will validate):
- gtest unit tests (no libgtest-dev locally).

Test count delta
----------------
S2.2 gtest: 16 tests → 14 tests (dropped 2, simplified 2 to remove
the dual-knob exercise paths).

* Cleanup: untrack proposal-issue.md; mark S2.2 complete in plan.md

Two small fixes to the previous commit's collateral:

1. proposal-issue.md was inadvertently committed via `git add -A` in
   the previous commit. The file is a working draft of the upstream
   issue (already tracked in the valkey-io issue tracker) and doesn't
   belong in the planning directory. Removing.

2. plan.md still showed S2.2 as `[ ]`. Implementation-complete state
   matches the S2.1 marking convention (`[x]` once the task ships); on
   merge to unstable the marking becomes definitive.
GilboaAWS referenced this pull request in GilboaAWS/valkey Jun 24, 2026
* [S2.7] Compression write-path hook

Wires compressionEnqueueCandidate into dbAddInternal and dbSetValue,
and replaces the TODO(S2.7) placeholder in the drain handler with a
real install path. With this change, writes to eligible STRING values
get queued for background compression and the result is installed back
into the kvstore as an OBJ_ENCODING_COMPRESSED robj.

The decoder (S2.6) is shipped but not yet wired into read paths (S2.8),
so as long as compression-enabled stays no (default), behavior is
unchanged. Once an operator turns the switch on, written values get
compressed, but reads return the compressed bytes until S2.8 lands.
Existing transparency tests verify no regression in the default-off
configuration.

Producer side (compression.c, db.c)

  Two seams in db.c — end of dbAddInternal and end of dbSetValue —
  call compressionEnqueueCandidate(key, value, db->id). The candidate
  function applies four guards:
    1. Master switch (compression_enabled, via compressionIsEligible).
    2. R2.2 eligibility (type/encoding/size/hot-key — also via predicate).
    3. R2.1.5 active-dict check — saves an allocator round-trip when
       compression-enabled=yes but training hasn't completed.
    4. incrRefCount(value) — pins the bytes for the worker AND
       reserves the robj address for the drain handler's pointer-
       equality stale check (ABA-safe per R2.4.4 + the lifetime
       discussion in PR valkey-io#18).

  If the worker pool refuses (not started; future S2.11 inbox full),
  the pin is released immediately. RDB-load enqueue is deliberately
  skipped — TODO(S2.10): the sweep tick will rediscover RDB-loaded
  values without hammering the inbox during load.

API change: compressionWorkersEnqueue

  Old: compressionWorkersEnqueue(sds key, int dbid, uint64_t version, sds src)
  New: compressionWorkersEnqueue(robj *value, int dbid)

  The new form requires a pinned robj; the worker reads
  objectGetVal(value) once at enqueue (captured into job->src) and
  never touches the robj afterwards (R2.11.4 intact). The drain
  handler uses job->value for the kvstore lookup and the pointer-
  equality stale check.

  The version field is gone — pointer equality, made ABA-safe by the
  pin, is sufficient. R2.4.4 explains why: holding incrRefCount(value)
  prevents the allocator from reusing the address while the job is
  in flight.

Drain install (compression_workers.c)

  New compressionInstall() helper:
    1. void **slot = kvstoreHashtableFindRef(db->keys, didx, key_sds);
    2. If slot == NULL OR *slot != job->value: stale (overwrite, expire,
       or COW). Discard.
    3. Else: createCompressedObject(OBJ_STRING, job->dst, job->dst_len);
       dbReplaceValue installs.
    4. compressionRegistryIncRef(job->dict_id) on success.

  dbReplaceValue routes through dbSetValue(..., overwrite=0, ...),
  which does NOT call signalModifiedKey, moduleNotifyKeyUnlink, or
  signalDeletedKeyAsReady. Background compression is a storage-only
  change per R2.9.2 — no WATCH dirty_cas, no client-side-caching
  invalidations, no keyspace notifications.

  Pin released on every drain completion path (success, stale-discard,
  net-savings reject, ZSTD error, no-active-dict). Test-mode jobs
  (job->value == NULL) skip both install and decRef.

Test migration

  The 15 existing test-fixture call sites passed raw sds + dummy
  version. Migrated to a new testOnlyCompressionWorkersEnqueueRaw(src,
  dbid) that sets job->value = NULL. Tests extract jobs via
  testOnlyCompressionWorkersDrainOutbox before the production drain
  runs, so production-only paths (install, decRef) are never reached
  by the value=NULL sentinel.

  No new gtest cases for the install path itself — that requires a
  fully-initialized server.db / kvstore that the unit-test environment
  doesn't construct. End-to-end coverage will come from the Tcl
  transparency harness once S2.8 wires the read path.

TODO(S4.1) markers added at:
  - compressionInstall: compression_compressions_per_sec, EMA fold,
    compression_compressed_objects.
  - compressionEnqueueCandidate: compression_candidates_dropped_total
    when S2.11 lands (today the pool-not-started rejection is a
    config state, not back-pressure).

Verified locally:
  - make -j2 -C src              → clean (BUILD_ZSTD=yes default).
  - make -j2 -C src BUILD_ZSTD=no → clean.
  - ./runtest --single unit/type/compression → 10/10 pass.

  gtest unit tests not runnable locally; CI validates.

Diff stat:

  .../implementation/plan.md                |   4 +-
  src/compression.c                         |  35 +++-
  src/compression.h                         |  27 ++-
  src/compression_workers.c                 | 185 +++++++++++++++------
  src/compression_workers.h                 |  56 +++----
  src/db.c                                  |  14 ++
  src/unit/test_compression_workers.cpp     |  31 ++--
  7 files changed, 244 insertions(+), 108 deletions(-)

* [S2.7] PR valkey-io#19 review: assert + design-doc alignment

Two reviewer threads addressed:

Thread #1 (T-3369017721) — production code carrying test concerns

  The drain handler had a `if (job->value == NULL)` branch that only
  existed to handle test-only jobs from
  testOnlyCompressionWorkersEnqueueRaw. Reviewer correctly pointed out
  that production code shouldn't carry test-only branches.

  Fix: replaced with serverAssert(job->value != NULL) at the top of
  the per-job loop. Production drain assumes every job has a real
  pinned robj; tests must extract their value=NULL jobs via
  testOnlyCompressionWorkersDrainOutbox before this drain runs.

  Side effect: removed the conditional `if (job->value != NULL)`
  guards around decrRefCount and the install branch — the top-of-loop
  assert means every code path can assume value is non-NULL.

Thread valkey-io#2 (T-3356207626) — design doc out of sync with implementation

  Design §4.6 still described the original version-counter approach
  for staleness detection (`uint64_t version` field on compressionJob,
  "if version counter moved, discard"). The implementation has used
  pointer equality + the incrRefCount-pin since S2.4 PR valkey-io#13.

  Fix: updated §4.6 to:
    - compressionJob struct: drop `version`, drop `robj *key`, add
      `robj *value` (pinned via incrRefCount), and `sds src` and
      `int dbid` separately, matching the actual struct.
    - Concurrency notes: replaced the "version counter moved" bullet
      with the pointer-equality + ABA-safety reasoning, naming the
      incrRefCount-reserves-the-address invariant as the protection
      mechanism (same property explained in PR valkey-io#18 review).

Verified locally:
  - make -j2 -C src              → clean
  - ./runtest --single unit/type/compression  → 10/10 pass

* [S2.7] Fix CI: remove erroneous & on server.db indexing

build-32bit (and the 30+ downstream cells, all CI cells use -Werror):

  compression_workers.c:531:20: error: initialization of 'serverDb *'
    from incompatible pointer type 'serverDb **'
    [-Werror=incompatible-pointer-types]

`server.db` is `serverDb **` (array of pointers, one per DB). So
`server.db[i]` is already `serverDb *` — the address-of operator was
redundant and produced `serverDb **`.

Fix: drop the `&`. Matches the pattern used everywhere else in the
codebase (db.c, server.c, etc.).

Local make didn't catch this — the default SERVER_CFLAGS doesn't
include -Werror. CI does. Built locally with `make SERVER_CFLAGS=-Werror`
to confirm clean.

* [S2.7] Fix CI: tests must use testOnly drain for value=NULL jobs

5 gtest cases failed on build-32bit (and would on every test cell)
with the new production-drain serverAssert(job->value != NULL):

  ASSERTION FAILED: compression_workers.c:591 'job->value != NULL'

  in: SingleJobRoundTrip, BurstOf256JobsOneWorker,
      BurstOf1024JobsFourWorkers, ResizeAcrossEnqueuedJobs,
      NetSavingsGuardRejectsIncompressible

Root cause: the previous commit's reviewer-driven hardening (PR valkey-io#19
review thread #1) made the production drain assert that every job
has a non-NULL pinned robj. The premise was "tests use the testOnly
drain to extract jobs before the production drain runs". That premise
was wrong — many tests ALSO call compressionWorkersDrainOutbox
directly to consume-and-dispose test-mode jobs (the drainUntil helper
is the most-used path).

Fix: add testOnlyCompressionWorkersDrainAndDispose(budget) — pulls
jobs via the existing testOnlyCompressionWorkersDrainOutbox, frees
them via testOnlyCompressionWorkersFreeJob, returns count. Migrate
the test fixture's drainUntil helper and all 8 direct
compressionWorkersDrainOutbox call sites in the test file to the
new helper.

Production drain stays clean — no test concerns. Reviewer thread #1
intent preserved.

Verified locally:
  - make -j2 -C src SERVER_CFLAGS=-Werror   → clean
  - ./runtest --single unit/type/compression → 10/10 pass
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants