@hpatro hpatro commented Oct 18, 2023

Fixing issues described in #12672, which started after #11695, when the defrag tests began being executed in cluster mode too.
For some reason the defragmentation finishes too quickly, before the test is able to detect that it's running,
so now, instead of waiting to see that defrag is active, we wait to see that it did some work.

[err]: Active defrag big list: cluster in tests/unit/memefficiency.tcl
defrag not started.
[err]: Active defrag big keys: cluster in tests/unit/memefficiency.tcl
defrag didn't stop.

hpatro commented Oct 19, 2023

Regarding the failure with `defrag not started`: in the failure output, active_defrag_running is 0. However, total_active_defrag_time is greater than 0 (120 in this case), and allocator_frag_ratio has dropped below the threshold (1.05). See the INFO MEMORY / INFO STATS output below.

So the validation can be changed to check that total_active_defrag_time is greater than 0, which is more accurate than checking whether active_defrag_running is set.

Fix: 93b309c

# Memory
used_memory:106934160
used_memory_human:101.98M
used_memory_rss:171999232
used_memory_rss_human:164.03M
used_memory_peak:183308360
used_memory_peak_human:174.82M
used_memory_peak_perc:58.34%
used_memory_overhead:32940064
used_memory_startup:3747088
used_memory_dataset:73994096
used_memory_dataset_perc:71.71%
allocator_allocated:107043576
allocator_active:108261376
allocator_resident:154058752
total_system_memory:99000156160
total_system_memory_human:92.20G
used_memory_lua:18091008
used_memory_vm_eval:18091008
used_memory_lua_human:17.25M
used_memory_scripts_eval:27324288
number_of_cached_scripts:50000
number_of_functions:0
number_of_libraries:0
used_memory_vm_functions:32768
used_memory_vm_total:18123776
used_memory_vm_total_human:17.28M
used_memory_functions:184
used_memory_scripts:27324472
used_memory_scripts_human:26.06M
maxmemory:0
maxmemory_human:0B
maxmemory_policy:allkeys-lru
allocator_frag_ratio:1.01
allocator_frag_bytes:1217800
allocator_rss_ratio:1.42
allocator_rss_bytes:45797376
rss_overhead_ratio:1.12
rss_overhead_bytes:17940480
mem_fragmentation_ratio:1.61
mem_fragmentation_bytes:65065272
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_total_replication_buffers:0
mem_clients_slaves:0
mem_clients_normal:33424
mem_cluster_links:0
mem_aof_buffer:0
mem_allocator:jemalloc-5.3.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
# Stats
total_connections_received:1
total_commands_processed:1100070
instantaneous_ops_per_sec:8
total_net_input_bytes:150701569
total_net_output_bytes:9976882
total_net_repl_input_bytes:0
total_net_repl_output_bytes:0
instantaneous_input_kbps:0.13
instantaneous_output_kbps:51.73
instantaneous_input_repl_kbps:0.00
instantaneous_output_repl_kbps:0.00
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
expired_stale_perc:0.00
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:0
evicted_keys:0
evicted_clients:0
total_eviction_exceeded_time:0
current_eviction_exceeded_time:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
pubsubshard_channels:0
latest_fork_usec:0
total_forks:0
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:109231
active_defrag_misses:360773
active_defrag_key_hits:7
active_defrag_key_misses:0
total_active_defrag_time:120
current_active_defrag_time:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:0
dump_payload_sanitizations:0
total_reads_processed:634334
total_writes_processed:632781
io_threaded_reads_processed:0
io_threaded_writes_processed:0
client_query_buffer_limit_disconnections:0
client_output_buffer_limit_disconnections:0
reply_buffer_shrinks:4
reply_buffer_expands:8
eventloop_cycles:636246
eventloop_duration_sum:6318397
eventloop_duration_cmd_sum:1634310
instantaneous_eventloop_cycles_per_sec:107
instantaneous_eventloop_duration_usec:8
acl_access_denied_auth:0
acl_access_denied_cmd:0
acl_access_denied_key:0
acl_access_denied_channel:0
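Given INFO output like the above, the revised check boils down to looking for evidence that defrag did some work, rather than trying to catch it while it is still running. A minimal Python sketch of that idea (the actual test suite is written in Tcl; `parse_info` and `defrag_did_work` are hypothetical helper names, not Redis APIs):

```python
def parse_info(info_text):
    """Parse Redis INFO output ("key:value" lines) into a dict,
    skipping section headers such as "# Stats"."""
    fields = {}
    for line in info_text.splitlines():
        if ":" in line and not line.startswith("#"):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields

def defrag_did_work(info_text):
    """Flaky check: active_defrag_running may already be back to 0 by the
    time we look. Robust check: total_active_defrag_time > 0 proves the
    defragger ran at all, even if it already finished."""
    fields = parse_info(info_text)
    return int(fields.get("total_active_defrag_time", 0)) > 0
```

With the failing output above, `active_defrag_running` is 0 but `total_active_defrag_time` is 120, so this check would correctly report that defrag ran.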

@oranagra oranagra marked this pull request as ready for review October 20, 2023 18:17
oranagra commented Oct 21, 2023

CI test:
https://github.com/redis/redis/actions/runs/6595540519

So if I understand correctly, you're arguing that because of your changes (a few additional keys, expiry data, and cluster mode), the tests run faster and finish too quickly? Sounds unlikely to me.

I think I saw other errors for certain thresholds, not just the "defrag not started" one. I don't remember which tests I saw failing.

hpatro commented Oct 21, 2023

CI run for that unit: https://github.com/redis/redis/actions/runs/6593855885

So if I understand correctly, you're arguing that because of your changes (a few additional keys, expiry data, and cluster mode), the tests run faster and finish too quickly? Sounds unlikely to me.

Well, the info output suggests otherwise.

I think I saw other errors for certain thresholds, not just the "defrag not started" one. I don't remember which tests I saw failing.

Yes @roshkhatri is digging into the other ones listed on #12672 .

@oranagra oranagra left a comment


It seems odd that the per-slot dict and/or cluster mode would cause the defrag to finish sooner,
but I also don't see anything wrong with that change.

@oranagra oranagra merged commit 26eb4ce into redis:unstable Oct 22, 2023
oranagra pushed a commit that referenced this pull request Oct 27, 2023
Fixing issues described in #12672, started after #11695
Related to #12674

Fixes the `defrag didn't stop` issue.

In some cases, depending on how the keys were stored in memory,
defrag_later_item_in_progress was not getting reset once we finished
defragging the later items and moved to the next slot. This stopped
the scan from happening in the later slots and did not get …
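The missing reset can be illustrated with a toy model of the per-slot scan. Redis itself is C and the real logic is more involved; this Python sketch only mirrors the names, using a "big:" key prefix to stand in for items deferred to defrag-later:

```python
def scan_slots(slots):
    """Toy model of a per-slot defrag scan.

    `slots` is a list of key lists, one per slot. Keys prefixed "big:"
    stand in for items whose defrag is deferred ("defrag later"); while
    the in-progress flag is set, the scan keeps working on that item
    instead of advancing, so a flag that leaks across slot boundaries
    would stall the scan of all later slots.
    """
    defrag_later_item_in_progress = False
    scanned = []
    for slot, keys in enumerate(slots):
        for key in keys:
            if key.startswith("big:"):
                defrag_later_item_in_progress = True  # defrag incrementally
            scanned.append((slot, key))
        # The fix described above: reset the flag once this slot's deferred
        # items are done, so the scan proceeds into the next slot.
        defrag_later_item_in_progress = False
    return scanned
```

For example, `scan_slots([["a", "big:list"], ["b"]])` visits the keys of both slots; without the reset, the model of the bug would never move past the slot holding the big item.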
oranagra added a commit that referenced this pull request Nov 2, 2023
Reverts the skipping of defrag tests in cluster mode (done in #12672);
instead it skips only some defrag tests that are relevant for cluster mode.
The tests now run well after investigating and making the changes in #12674 and #12694.

Co-authored-by: Oran Agra <oran@redislabs.com>
enjoy-binbin added a commit to enjoy-binbin/redis that referenced this pull request Jul 25, 2025
Fixing issues described in redis#12672, started after redis#11695
Related to redis#12674

Fixes the `defrag didn't stop` issue.

In some cases, depending on how the keys were stored in memory,
defrag_later_item_in_progress was not getting reset once we finished
defragging the later items and moved to the next slot. This stopped
the scan from happening in the later slots and did not get …
