perf(mooncakestore connector): optimize get/put with zero-copy operations #1269

Merged
maobaolong merged 2 commits into LMCache:dev from xiaguan:enhance_mooncake_connector
Aug 20, 2025
xiaguan (Contributor) commented Aug 7, 2025

Another version of #988.

Although I've run multiple benchmarks, the performance doesn't match that of PR #988. This might be because the remote connector requires additional code paths to handle operations. However, it still provides significant performance improvements.

Optimized connector:

============ Serving Benchmark Result ============
Successful requests:                     50        
Benchmark duration (s):                  5.72      
Total input tokens:                      404689    
Total generated tokens:                  6400      
Request throughput (req/s):              8.74      
Output token throughput (tok/s):         1118.59   
Total Token throughput (tok/s):          71849.95  
---------------Time to First Token----------------
Mean TTFT (ms):                          1540.05   
Median TTFT (ms):                        1477.40   
P99 TTFT (ms):                           2029.28   
-----Time per Output Token (excl. 1st token)------
Mean TPOT (ms):                          32.74     
Median TPOT (ms):                        33.24     
P99 TPOT (ms):                           43.12     
---------------Inter-token Latency----------------
Mean ITL (ms):                           32.74     
Median ITL (ms):                         29.06     
P99 ITL (ms):                            42.61     
==================================================

For comparison, the result in #988: throughput 77494 tok/s, mean TTFT 925 ms, median TTFT 947 ms.
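To make the gap concrete, the relative differences can be computed directly from the figures quoted in this comment. This is only an arithmetic sketch: the dictionary keys and the helper name are illustrative, and the numbers are copied from the benchmark output and the #988 comparison line.

```python
# Figures copied from this comment: the benchmark output for this PR
# and the #988 numbers quoted for comparison.
this_pr = {"throughput_tok_s": 71849.95, "mean_ttft_ms": 1540.05, "median_ttft_ms": 1477.40}
pr_988 = {"throughput_tok_s": 77494.0, "mean_ttft_ms": 925.0, "median_ttft_ms": 947.0}


def relative_change(new: float, old: float) -> float:
    """Return (new - old) / old as a percentage."""
    return (new - old) / old * 100.0


# Relative change of this PR against #988, per metric.
changes = {name: relative_change(this_pr[name], pr_988[name]) for name in this_pr}

for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}% vs #988")
```

With these inputs, throughput comes out about 7.3% lower than #988, while mean and median TTFT are about 66.5% and 56.0% higher.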

xiaguan (Contributor Author) commented Aug 7, 2025

@maobaolong @Shaoting-Feng

Are you available to review this? I've modified it to the connector level based on the feedback, which introduces a slight performance overhead, but overall it still delivers substantial performance improvements.

Hope we can merge either of these optimizations soon. Thanks!

maobaolong (Collaborator) left a comment

@xiaguan Thanks for this PR, left some comments.

"""
raise NotImplementedError

async def batched_get(

This seems to duplicate L209.

Comment thread lmcache/v1/storage_backend/connector/instrumented_connector.py Outdated
Comment thread lmcache/v1/storage_backend/connector/mooncakestore_connector.py
…ions

Signed-off-by: Jinyang Su <751080330@qq.com>
@xiaguan xiaguan force-pushed the enhance_mooncake_connector branch from f7adf01 to 7f759e0 Compare August 11, 2025 05:19
xiaguan (Contributor Author) commented Aug 11, 2025

@maobaolong Thanks for your review. All issues have been addressed. Could you please take another look?

I've only modified the mooncakestore_connector.py file - could we expedite the merge process?

maobaolong (Collaborator) commented

@xiaguan Thanks for helping me and @chunxiaozheng go through the code over Tencent Meeting; it really helped us understand this PR quickly. We will give you some feedback this week.

First of all, this PR really does provide performance improvements through batched operations and zero copy.
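As a rough illustration of what "batched" and "zero copy" mean here, the sketch below uses a purely hypothetical in-memory store. `FakeStore`, `get` and `batch_get_into` are made-up names for illustration, not the Mooncake or LMCache API. The idea is that one call handles all keys, and each value is written straight into a caller-provided buffer instead of being returned as a new object that must then be copied.

```python
from typing import Dict, List


class FakeStore:
    """Hypothetical in-memory stand-in for a remote KV store (not the Mooncake API)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = bytes(value)

    def get(self, key: str) -> bytes:
        # Copying path: the caller receives an object and must copy it
        # into its own buffer afterwards.
        return self._data[key]

    def batch_get_into(self, keys: List[str], buffers: List[memoryview]) -> List[int]:
        # Batched path: one call for all keys, each value written directly
        # into the caller-provided buffer. Returns bytes written, or -1
        # when the key is missing or the buffer is too small.
        results = []
        for key, buf in zip(keys, buffers):
            value = self._data.get(key)
            if value is None or len(value) > len(buf):
                results.append(-1)
                continue
            buf[: len(value)] = value
            results.append(len(value))
        return results


store = FakeStore()
store.put("chunk0", b"a" * 4)
store.put("chunk1", b"b" * 4)

# One preallocated arena; each slice below is a view into it, not a copy.
arena = bytearray(8)
view = memoryview(arena)
sizes = store.batch_get_into(
    ["chunk0", "chunk1", "missing"], [view[0:4], view[4:8], view[0:0]]
)
print(sizes, bytes(arena))
```

In a real connector the destination would be the already-allocated memory object for the KV chunk; the fake store here only shows the calling pattern.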

Comment thread lmcache/v1/storage_backend/connector/mooncakestore_connector.py
maobaolong (Collaborator) left a comment

@xiaguan This PR looks good overall. I left some exception-handling comments inline; otherwise, it LGTM. Thanks for your improvement.

Comment thread lmcache/v1/storage_backend/connector/mooncakestore_connector.py
Comment thread lmcache/v1/storage_backend/connector/mooncakestore_connector.py
Comment thread lmcache/v1/storage_backend/connector/mooncakestore_connector.py
Comment thread lmcache/v1/storage_backend/connector/mooncakestore_connector.py
maobaolong (Collaborator) commented

@xiaguan After an offline discussion with @chunxiaozheng, I realized that we should not raise the exception; returning the mem_object we read, or None, is enough, because related work is currently being done in parallel by others. For this PR, it is fine with me as long as it is no worse than before.
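A minimal sketch of the "return the object or None, do not raise" behavior described above. Everything here is hypothetical: `FlakyStore`, `get_or_none` and `batched_get_or_none` are illustrative names, not the connector's actual code.

```python
import asyncio
import logging
from typing import Dict, List, Optional, Set

logger = logging.getLogger(__name__)


class FlakyStore:
    """Hypothetical backend whose reads may fail (not the Mooncake API)."""

    def __init__(self, data: Dict[str, bytes], broken: Set[str]) -> None:
        self._data = data
        self._broken = broken

    async def read(self, key: str) -> bytes:
        if key in self._broken:
            raise RuntimeError(f"transfer failed for {key}")
        return self._data[key]  # raises KeyError on a miss


async def get_or_none(store: FlakyStore, key: str) -> Optional[bytes]:
    # Swallow backend errors and report a miss instead of raising,
    # so the caller can fall back to recomputing the chunk.
    try:
        return await store.read(key)
    except Exception as exc:
        logger.warning("get failed for %s: %s", key, exc)
        return None


async def batched_get_or_none(store: FlakyStore, keys: List[str]) -> List[Optional[bytes]]:
    # Results keep the order of the requested keys.
    return list(await asyncio.gather(*(get_or_none(store, k) for k in keys)))


store = FlakyStore({"k0": b"kv0", "k1": b"kv1"}, broken={"k1"})
results = asyncio.run(batched_get_or_none(store, ["k0", "k1", "k2"]))
print(results)
```

Here a failed transfer ("k1") and a plain miss ("k2") both surface as None, so one bad key does not fail the whole batch.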

…mooncake_connector

Signed-off-by: Jinyang Su <751080330@qq.com>
@xiaguan xiaguan force-pushed the enhance_mooncake_connector branch from 1fe203a to 74d7c60 Compare August 18, 2025 02:48
xiaguan (Contributor Author) commented Aug 18, 2025

Thank you both for your thorough reviews - all comments have been resolved.

chunxiaozheng (Collaborator) left a comment

lgtm

@maobaolong maobaolong merged commit d1bbb41 into LMCache:dev Aug 20, 2025
11 checks passed
byte-ss commented Aug 20, 2025

Could you please explain why the method batched_submit_put_task() was not included in this commit? I believe it would also bring performance improvements.

xiaguan (Contributor Author) commented Aug 20, 2025

> Could you please explain why the method batched_submit_put_task() was not included in this commit? I believe it would also bring performance improvements.

Write requests are executed asynchronously in the background and aren't on the critical path.

There might be some benefit to this approach - feel free to give it a try.
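The point about writes being off the critical path can be pictured with a small asyncio sketch. This is an assumption-laden illustration, not LMCache code: `BackgroundWriter`, `submit_put` and `drain` are hypothetical names, and the sleep stands in for network latency.

```python
import asyncio
from typing import Dict, Set, Tuple


class BackgroundWriter:
    """Hypothetical writer that performs puts as background tasks."""

    def __init__(self) -> None:
        self.store: Dict[str, bytes] = {}
        self._pending: Set[asyncio.Task] = set()

    async def _put(self, key: str, value: bytes) -> None:
        await asyncio.sleep(0.01)  # stand-in for the remote write
        self.store[key] = value

    def submit_put(self, key: str, value: bytes) -> None:
        # Returns immediately; the write completes later on the event loop.
        task = asyncio.create_task(self._put(key, value))
        self._pending.add(task)
        task.add_done_callback(self._pending.discard)

    async def drain(self) -> None:
        # Wait for all outstanding writes, e.g. at shutdown.
        if self._pending:
            await asyncio.gather(*self._pending)


async def main() -> Tuple[bool, Dict[str, bytes]]:
    writer = BackgroundWriter()
    writer.submit_put("chunk0", b"kv")
    visible_immediately = "chunk0" in writer.store  # the caller did not wait
    await writer.drain()
    return visible_immediately, writer.store


result = asyncio.run(main())
print(result)
```

Because the caller never waits on the write, batching puts would mainly reduce background work rather than request latency, which matches the reasoning above.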

ANormalMan12 commented

Nice

ziruiliu pushed a commit to ziruiliu/LMCache that referenced this pull request Sep 5, 2025
…tions (LMCache#1269)

perf(mooncakestore connector): optimize get/put with zero-copy operations

Signed-off-by: Jinyang Su <751080330@qq.com>
KevinCheung2259 pushed a commit to KevinCheung2259/LMCache that referenced this pull request Nov 5, 2025
…tions (LMCache#1269)

perf(mooncakestore connector): optimize get/put with zero-copy operations

Signed-off-by: Jinyang Su <751080330@qq.com>
DongDongJu pushed a commit to DongDongJu/LMCache that referenced this pull request Feb 22, 2026
…tions (LMCache#1269)

perf(mooncakestore connector): optimize get/put with zero-copy operations

Signed-off-by: Jinyang Su <751080330@qq.com>
sammshen pushed a commit to sammshen/LMCache that referenced this pull request Mar 1, 2026
…tions (LMCache#1269)

perf(mooncakestore connector): optimize get/put with zero-copy operations

Signed-off-by: Jinyang Su <751080330@qq.com>