[MP][optimize] optimize evict in lru policy#2740
Conversation
Signed-off-by: idellzheng <idellzheng@tencent.com>
maobaolong
left a comment
@chunxiaozheng Thanks for this fix, LGTM.
@ApostaC @sammshen Would you like to take a look at this PR? Thanks!
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces an optimization to the LRU eviction policy by altering the order in which keys are processed during creation and access. The change targets scenarios where keys have prefix dependencies, ensuring that the eviction logic prioritizes keys in a way that maximizes the likelihood of subsequent cache hits for related data. This refinement is expected to yield more efficient cache utilization and improved performance.
Code Review
This pull request optimizes the LRU eviction policy for a specific prefix-caching use case by reversing the order of keys processed in a batch. While the intention is good, this change breaks existing unit tests that rely on the previous batch processing order. I've pointed this out as a high-severity issue. Additionally, I've made a couple of medium-severity suggestions to improve the clarity and reduce duplication in the comments that were added to explain the change.
```python
# NOTE: for the request, the later keys should be evicted first.
# For example, the request has (key1, key2, key3), if we first
# evict key1, due to prefix match, key2 and key3 will not be hit.
for key in reversed(keys):
```
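To make the reversal concrete, here is a minimal sketch of an LRU policy that applies it. This is illustrative only: the class shape and the `evict` method are assumptions, not the project's actual implementation; only `on_keys_created` and the `_order` dict are named in the PR.

```python
from collections import OrderedDict


class LRUEvictionPolicy:
    """Minimal sketch (illustrative names, not the project's real class)."""

    def __init__(self):
        # Insertion order tracks recency: front = LRU, back = MRU.
        self._order = OrderedDict()

    def on_keys_created(self, keys):
        # Reverse the batch so that earlier keys (shorter prefixes) end up
        # most recently used and are therefore evicted last.
        for key in reversed(keys):
            self._order[key] = None
            self._order.move_to_end(key)  # mark as most recently used

    def evict(self):
        # Pop and return the least recently used key.
        key, _ = self._order.popitem(last=False)
        return key
```

With a request batch `["k1", "k2", "k3"]`, eviction now proceeds `k3`, `k2`, `k1`, so the shortest prefix survives longest.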
This change, while addressing the prefix caching scenario, seems to break existing unit tests. For example, TestLRUEvictionPolicyOrdering.test_touch_moves_key_to_most_recent and test_create_existing_key_moves_to_most_recent in tests/v1/distributed/test_lru_eviction_policy.py will likely fail.
These tests create keys in a batch, e.g., policy.on_keys_created([make_key(1), make_key(2), make_key(3)]), and then assert on the LRU order.
- Before your change: the order of keys in the `_order` dict would be `1, 2, 3` (from LRU to MRU).
- After your change: the order becomes `3, 2, 1`.
This change in ordering breaks the assertions in those tests. Please update the tests to reflect the new intended behavior and ensure no existing functionality is regressed.
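The ordering difference the review describes can be checked in isolation with a small standalone snippet (the `apply_batch` helper is hypothetical, written only to demonstrate the two behaviors):

```python
from collections import OrderedDict


def apply_batch(keys, reverse):
    """Apply a batch of keys to an empty LRU order and return it LRU -> MRU."""
    order = OrderedDict()
    for key in (reversed(keys) if reverse else keys):
        order[key] = None
        order.move_to_end(key)  # most recently used goes to the back
    return list(order)


keys = ["key1", "key2", "key3"]
assert apply_batch(keys, reverse=False) == ["key1", "key2", "key3"]  # old order
assert apply_batch(keys, reverse=True) == ["key3", "key2", "key1"]   # new order
```

Test assertions written against the old `1, 2, 3` ordering would therefore need to be updated to expect `3, 2, 1`.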
```python
# NOTE: for the request, the later keys should be evicted first.
# For example, the request has (key1, key2, key3), if we first
# evict key1, due to prefix match, key2 and key3 will not be hit.
```
This comment is helpful for context, but it's duplicated in on_keys_touched and the explanation could be clearer. To improve maintainability and clarity, consider replacing this comment and the one in on_keys_touched with a more descriptive one.
Suggested alternative:
```python
# When processing a batch of keys from a single request (e.g., for prefix caching),
# keys are often ordered from shortest to longest prefix. To maximize cache hits,
# we should evict longer prefixes before shorter ones. By reversing the list,
# we mark shorter prefixes (which appear earlier in the original list) as more
# recently used, making them less likely to be evicted.
```

The duplicated comment in `on_keys_touched`:

```python
# NOTE: for the request, the later keys should be evicted first.
# The example is the same as `on_keys_created`.
```
Signed-off-by: idellzheng <idellzheng@tencent.com>
Signed-off-by: Aaron Wu <aaron.wu@dell.com>
For one request, the later keys should be evicted first. For example, if the request has (key1, key2, key3) and we first evict key1, then due to prefix matching, key2 and key3 will not be hit. My test result is as follows:
