Add async store and pipelining code by YaoJiayi · Pull Request #4 · LMCache/LMCache

YaoJiayi · 2024-06-17T04:30:57Z

No description provided.

ApostaC · 2024-06-17T18:09:55Z

A general comment on design choices in the code:
If we want to support two different implementations of the same "logical operation", please consider implementing two different classes instead of adding new functions to the same class.
For example, when implementing the pipeline optimization for get, it would be better to derive a new class from LMCBackendInterface and override the batched_get function than to add a new function called batched_get_pipeline to the existing class.

This has at least two benefits:

* You don't need to change the processing logic of the callers: they keep calling the same function instead of calling a new function named batched_get_pipeline.
* If a configuration item controls whether the pipeline is used, you only need to check it once, during initialization. Otherwise, you need to check the configuration at runtime, which makes the code less readable.

Note that this is less about runtime performance and more about the readability and extensibility of the code.
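To make the suggestion concrete, here is a minimal sketch of the pattern. It is not the LMCache code: `BackendInterface` is a hypothetical stand-in for LMCBackendInterface (whose real signatures are not shown in this thread), and `DictBackend`, `PipelinedDictBackend`, `create_backend`, `load_all`, and the `use_pipeline` flag are names invented for illustration. The "pipeline" here is simply overlapping fetches on a thread pool, which is an assumption about what pipelining means in this context.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, Iterator, Optional


class BackendInterface(ABC):
    """Hypothetical stand-in for LMCBackendInterface."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        """Return the value stored under key, or None if it is missing."""

    def batched_get(self, keys: Iterable[str]) -> Iterator[Optional[bytes]]:
        # Default behavior: fetch the keys one at a time, in order.
        for key in keys:
            yield self.get(key)


class DictBackend(BackendInterface):
    """Simple in-memory backend used only for this illustration."""

    def __init__(self, data: Dict[str, bytes]) -> None:
        self._data = dict(data)

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


class PipelinedDictBackend(DictBackend):
    """Same logical operation, different implementation of batched_get."""

    def __init__(self, data: Dict[str, bytes], num_workers: int = 2) -> None:
        super().__init__(data)
        self._pool = ThreadPoolExecutor(max_workers=num_workers)

    def batched_get(self, keys: Iterable[str]) -> Iterator[Optional[bytes]]:
        # Submit every fetch first so they can overlap, then yield the
        # results in the original key order.
        futures = [self._pool.submit(self.get, key) for key in keys]
        for future in futures:
            yield future.result()


def create_backend(data: Dict[str, bytes], use_pipeline: bool) -> BackendInterface:
    # The configuration is checked exactly once, at initialization.
    return PipelinedDictBackend(data) if use_pipeline else DictBackend(data)


def load_all(backend: BackendInterface, keys: Iterable[str]) -> list:
    # The caller is identical for both implementations: no runtime
    # configuration check and no batched_get_pipeline call.
    return list(backend.batched_get(keys))
```

With this layout, switching between the plain and pipelined paths only changes the argument passed to `create_backend`; `load_all` and any other caller stay untouched.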

ApostaC · 2024-06-17T18:30:00Z

Also, we need to update the Dockerfile and the deployment instructions (they currently do not include the installation of torchac_cuda).

* [Add] optimized NIXL backend (WIP)
* [Add] better async primitives
* update cache engine log message
  Signed-off-by: ApostaC <yihua98@uchicago.edu>
* [stash] debug works
  Signed-off-by: ApostaC <yihua98@uchicago.edu>
* [Fix] performance and correctness bugs
  Signed-off-by: ApostaC <yihua98@uchicago.edu>
* fix format checker issues
  Signed-off-by: ApostaC <yihua98@uchicago.edu>
* [fix] isort errors
  Signed-off-by: ApostaC <yihua98@uchicago.edu>
* fix format
* fix format again
* disable debug logs
* update
  Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
* updated
  Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
* merge conflict
  Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
* updated
  Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
* updated
  Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
* updated
  Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------
Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: ApostaC <yihua98@uchicago.edu>
Co-authored-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

…request the current log belongs to (LMCache#4) LMCache#2812 Signed-off-by: baoloongmao <baoloongmao@tencent.com>

YaoJiayi and others added 3 commits June 14, 2024 02:17

add async store and pipeline to test on runpod

ed81641

server seems to crash with multiple net threads, set network thread to 1

6e9a9e8

Merge branch 'main' into main

0ebc39f

YaoJiayi merged commit eff09fd into LMCache:main Jun 17, 2024

NumberWan pushed a commit to NumberWan/LMCache that referenced this pull request Sep 10, 2025

Merge pull request LMCache#4 from echo-zx/add-full-lookup

01e71f5

add full_lookup in api_server

KevinCheung2259 pushed a commit to KevinCheung2259/LMCache that referenced this pull request Nov 5, 2025

Merge pull request LMCache#4 from YaoJiayi/main

59bcd49

Add async store and pipelining code

DongDongJu referenced this pull request in DongDongJu/LMCache Feb 22, 2026

Merge pull request #4 from YaoJiayi/main

5f505f4

Add async store and pipelining code

lavanyabollepalli mentioned this pull request Mar 12, 2026

[BUG] GPU failure during repeated model loading when using --enable-prefix-caching with KV transfer (LMCacheConnectorV1) GPU ERR! state #2756

Open

gemini-code-assist Bot mentioned this pull request Mar 16, 2026

[MP][Observability][1/3] EventBus core infrastructure + OpenTelemetry dependency #2792

Merged

3 tasks

hlin99 mentioned this pull request May 10, 2026

Add global device abstraction to replace torch.cuda with unified torch_dev/torch_device_type #3091

Open

YaoJiayi merged 3 commits into LMCache:main from YaoJiayi:main
