Conversation
shahsb
left a comment
There was a problem hiding this comment.
Scalability in Clusters: How does this behave in Redis Cluster? Slot filtering helps, but aggregating across nodes might need future work.
Comparison to Alternatives: How does CHK compare to other sketches like Count-Min or HyperLogLog in this context? A brief rationale would strengthen the case.
|
@minchopaskal I had a quick pass over the PR and left a few minor comments. I assume the API is not final yet so we don't have more details about the commands in the top comment. e.g. what is SLOTS or SAMPLE. I assume we also don't have more tests because of this. Regarding cuckoo impl, maybe it is a good idea to add some comments over each function (only to important ones) to describe what it is doing. So, people who have no idea about the algorithm will be a chance to understand what is going on. |
|
I'm thinking about if we can add a new hotkey.c file and put both the chk.c and hotkey commands in that file. It doesn't seem appropriate to put them in server.c. server.c already has over 7000 lines and is too large. |
❌ Security scan failedSecurity scan failed: Branch hotkeys-detection does not exist in the remote repository 💡 Need to bypass this check? Comment |
❌ Security scan failedSecurity scan failed: Branch hotkeys-detection does not exist in the remote repository 💡 Need to bypass this check? Comment |
❌ Security scan failedSecurity scan failed: Branch hotkeys-detection does not exist in the remote repository 💡 Need to bypass this check? Comment |
❌ Security scan failedSecurity scan failed: Branch hotkeys-detection does not exist in the remote repository 💡 Need to bypass this check? Comment |
oranagra
left a comment
There was a problem hiding this comment.
sorry for the delay. a few suggestions for INFO
| "tracking-active:%d\r\n" | ||
| "used-memory:%zu\r\n" | ||
| "cpu-time:%lld\r\n", |
There was a problem hiding this comment.
the fact these in a "hotkeys" section isn't enough, they should have some prefix.
the sections are just used to filter / ask specific info sections.
but the result is that all fields are then mixed in one dict.
There was a problem hiding this comment.
i wonder if the used-memory should be moved to the memory section.
in any case, it should probably be accounted for in the used_memory_overhead metric.
but being able to see a breakdown there (if it can be big) might be a good idea.
There was a problem hiding this comment.
It may be big only if for some reason key names are really big as we store the top-K's key names in a heap. Do you think we expect that?
I will add it to used_memory_overhead and fix the oversight of the missing prefixes. A bit reserverd about adding it in the memory section though - I think it should be either in "hotkeys" or in "memory" section (not both) and "hotkeys" seems more logical - WDYT, @oranagra
There was a problem hiding this comment.
obviously not in both.
the question is what's the trigger for someone wanting to see the memory overhead of this mechanism. is it that they're looking for hot keys, or that they're looking to explain memory isssues.
e.g. the mem_replication_backlog metric is in memory, not in replication, same goes for mem_aof_buffer.
if we argue that it's very unlikely that it'll be very big, we can just sum it into the overhead, and skip creating a specific metric for it.
There was a problem hiding this comment.
I see what you mean. I'll remove the metric then - I don't see a scenario in which overhead becomes more than 1 MB.
Add the memory overhead of the hotkeyStats structure to `used_memory_overhead`, add `hotkeys-` prefix to hotkey keys in INFO and remove `used_memory` in the hotkeys info section as it's unneeded (too little memory for us to care about). Tnx @oranagra for pointing [this](#14680 (comment)) out.
…s#14711) Add the memory overhead of the hotkeyStats structure to `used_memory_overhead`, add `hotkeys-` prefix to hotkey keys in INFO and remove `used_memory` in the hotkeys info section as it's unneeded (too little memory for us to care about). Tnx @oranagra for pointing [this](redis#14680 (comment)) out.
Description
Introducing a new method for identifying hotkeys inside a redis server during a tracking time period.
Hotkeys in this context are defined by two metrics:
Usage
Although the API is subject to change the general idea is for the user to initiate a hotkeys tracking process which should run for some time. The keys' metrics are recorded inside a probabilistic structure and after that the user is able to fetch the top K of them.
Current API
HOTKEYS START
Start a tracking session if either no is already started, or one was stopped or reset. Return error if one is in progress.
HOTKEYS GET
Return array of the chosen metrics to track and various other metadata. (nil) if no tracking was started or it was reset.
HOTKEYS STOP
Stop tracking session but user can still get results from
HOTKEYS GET.HOTKEYS RESET
Release resources used for hotkeys tracking only when it is stopped. Return error if a tracking is active.
Additional changes
The
INFOcommand now has a "hotkeys" section with 3 fieldsImplementation
Independent of API, implementation is based on a probabilistic structure - Cuckoo Heavy Keeper structure with added min-heap to keep track of top K hotkey's names. CHK is an loosely based on HeavyKeeper which is used in RedisBloom's TopK but has higher throughput.
Random fixed probability sampling via the
HOTKEYS start sample <ratio>param. Each key is sampled with probability1/ratio.Performance implications
With low enough sample rate (controlled by
HOTKEYS start sample <ratio>) there is negligible performance hit. Tracking every key though can incur up to 15% hit in the worst case after running the tests in this bench.