SemiAnalysisAI / InferenceX Public

Pull requests: SemiAnalysisAI/InferenceX

Labels 32 Milestones 6

21 Open 662 Closed

Pull requests list

Upgrade GLM-5 image to v0.5.10

#1023 opened Apr 10, 2026 by chunfangamd Collaborator

[AMD/ROCM] qwen3.5 fp4 on mi355x, search space TP2/TP4

#1022 opened Apr 10, 2026 by seungrokj Collaborator

Fix config/script consistency: remove bogus ep, add missing env var checks

#1019 opened Apr 10, 2026 by Ankur-singh Collaborator

[WIP] Update Qwen3.5 FP4 B200 SGLang sweep-enabled

#1018 opened Apr 10, 2026 by Ankur-singh Collaborator

[WIP][NV] Update: sglang v2 Qwen3.5 h200 MTP NVIDIA sweep-enabled

#1017 opened Apr 8, 2026 by hshrivastava-droid Collaborator

[AMD] Upgrade DeepSeek-R1 MI35x docker to the latest SGLang version 0.5.10 AMD

#1013 opened Apr 8, 2026 by aarnetalman Collaborator • Draft

[NVIDIA] [WIP] Bump GLM-5 FP8 B200 SGLang concurrency to 256 NVIDIA sweep-enabled

#1012 opened Apr 8, 2026 by Ankur-singh Collaborator

[experimental] Add multinode profiling workflow experimental github_actions

#1007 opened Apr 6, 2026 by hbarclay Collaborator

Multinode evals

#1000 opened Apr 3, 2026 by Oseltamivir Collaborator

[AMD] feat: MiniMax M2.5 PD Disagg (1P2D) + PIECEWISE cudagraph optimization (+20% throughput) AMD vllm/sglang release broken -need to wait

#999 opened Apr 2, 2026 by ChuanLi1101 Contributor • Draft

6 tasks done

feat: MI300X disaggregated inference with Broadcom IBGDA (#982) sweep-enabled

#998 opened Apr 2, 2026 by JordanNanos Collaborator

[AMD/ROCm] qwen3.5 fp8 mi355x SGL performance update AMD

#995 opened Apr 2, 2026 by seungrokj Collaborator • Draft

[WIP][experimental] add agentic trace replay benchmark infrastructure experimental

#993 opened Apr 1, 2026 by cquil11 Collaborator • Draft

[AMD][MI30X]Update Qwen3.5 perf AMD

#986 opened Apr 1, 2026 by zhentaocc Collaborator

[AMD] [code not in mergable state yet][blocker waiting for more nodes to speed up dev iteration speed] mi325 sglang disagg AMD

#985 opened Mar 31, 2026 by JordanNanos Collaborator • Draft

2 of 8 tasks

[AMD] improve dsr1 fp4 disagg perf on mi355x AMD

#983 opened Mar 31, 2026 by billishyahao Collaborator

[AMD][MI35X]Update qwen3.5 perf AMD

#980 opened Mar 30, 2026 by zhentaocc Collaborator

[AMD] [Draft, no merge] MVP for vLLM Disagg AMD

#948 opened Mar 26, 2026 by chunfangamd Collaborator

[NVIDIA] chore: upgrade h200 gptoss to latest trtllm NVIDIA

#854 opened Mar 2, 2026 by cquil11 Collaborator

[AMD] [DNM, still merge in 0.18 as trust_remote_code=True is not passed to quark] Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4) AMD vllm/sglang release broken -need to wait

#827 opened Mar 1, 2026 by functionstackx Contributor

[AMD] Performance Improvements for MI300X with GEMM and FP8 Enhancements AMD sweep-enabled

#811 opened Feb 26, 2026 by chunfangamd Collaborator
