Gemini Backend #9

Merged
Ying1123 merged 10 commits into main from gemini
Jan 17, 2024

Conversation

@caoshiyi
Contributor

Add support for the Gemini backend.

@caoshiyi caoshiyi requested a review from Ying1123 January 16, 2024 00:33
@merrymercy merrymercy force-pushed the main branch 4 times, most recently from 5a84411 to 97b04d4 January 16, 2024 05:44
Comment thread examples/quick_start/gemini_example_multimodal.py Outdated
Comment thread examples/quick_start/gemini_example_multimodal.py Outdated
Comment thread python/sglang/lang/ir.py Outdated
Comment thread python/sglang/backend/gemini.py Outdated
Comment thread examples/quick_start/gemini_example_stream.py Outdated
Comment thread examples/quick_start/images/cat.jpeg Outdated
Comment thread examples/quick_start/images/rat.jpeg Outdated
Comment thread python/sglang/lang/interpreter.py Outdated
Comment thread examples/quick_start/gemini_example_multimodal.py Outdated
@Ying1123
Contributor

Ying1123 commented Jan 17, 2024

All tests in test/lang/run_all.py passed. (https://github.com/sgl-project/sglang/blob/main/docs/test_process.md)

@Ying1123 Ying1123 merged commit fd7c479 into main Jan 17, 2024
@Ying1123 Ying1123 deleted the gemini branch January 17, 2024 06:29
Ying1123 pushed a commit that referenced this pull request Sep 13, 2024
yanbing-j added a commit to yanbing-j/sglang that referenced this pull request May 30, 2025
… and decode attention kernel (sgl-project#9)

* Add intel_amx backend for attention, including extend and decode

* update
pengxin99 pushed a commit to pengxin99/sglang that referenced this pull request Jun 19, 2025
sleepcoo added a commit to shuaills/sglang that referenced this pull request Jun 24, 2025
yichiche pushed a commit to yichiche/sglang that referenced this pull request Jul 30, 2025
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
siuhunh pushed a commit to xing-wenjin/sglang that referenced this pull request Aug 1, 2025
…agregation

add llm-datadist feature to realize pd disaggregate
yichiche pushed a commit to yichiche/sglang that referenced this pull request Aug 7, 2025
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
yichiche pushed a commit to yichiche/sglang that referenced this pull request Aug 11, 2025
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
someoneexistsontheinternet pushed a commit to someoneexistsontheinternet/sglang that referenced this pull request Oct 23, 2025
kalyank007 pushed a commit to kalyank007/sglang that referenced this pull request Nov 7, 2025
amd-youchen referenced this pull request in amd-youchen/sglang Nov 13, 2025
[Fix] fix fuse share expert function in MI35X
yhyang201 pushed a commit that referenced this pull request Dec 13, 2025
* test: add EPD disaggregation integration tests

* fix comment for encoder-only

* revert http_server warmup for vlm
fstandhartinger pushed a commit to fstandhartinger/sglang that referenced this pull request Jan 13, 2026
tpoisonooo pushed a commit to tpoisonooo/sglang that referenced this pull request Feb 12, 2026
chx96642264 pushed a commit to chx96642264/sglang that referenced this pull request Mar 11, 2026
lawrence-harmonic added a commit to lawrence-harmonic/sglang that referenced this pull request Mar 19, 2026
mmangkad pushed a commit to mmangkad-dev/sglang that referenced this pull request Apr 3, 2026
rucnyz added a commit to rucnyz/sglang that referenced this pull request Apr 30, 2026
…s 28 xfers

v9 pool-binding-shift trace produces real differentiation:
- Phase B (KV-bound 8K random): L1+L2 -37% mean TTFT vs stock
- Phase C (mixed 4K random):     L1+L2 -38% median E2E vs stock
- Cross-pool transfers: stock 0, L1-only 0, L2-only 0, L1+L2 28

Two surprising findings documented:
1. Layer 2 alone fires zero transfers — Layer 1 retention is what
   makes Layer 2 cross the firing threshold.
2. Phase A regresses with L1 (-20% TPS) because K_big=8192 hurts on
   prefix-friendly GSP. Consistent with A2's K_big=0-wins finding.
   Adaptive K_big control marked as follow-up.

Settings status: Setting 1 marked **DONE v6 NULL + v9 PASS**.
All 4 user-requested follow-ups (sgl-project#9 Q3.A 4-arm, sgl-project#10 Sweep 1
multi-seed, sgl-project#11 Setting 4 fallback rule, sgl-project#12 Setting 1 v9 trace)
now complete.

3 participants