Improve error message & Add vicuna template by merrymercy · Pull Request #57 · sgl-project/sglang

merrymercy · 2024-01-20T01:03:28Z

No description provided.

…roject#50) * [Transpiler] Add config class and interface for the transpiler * [Misc] Add a static checking for STensor::num_elements * [Transpiler] Implement all kernel operators and their runtime * Format code * Add support for broadcast in ElemwiseBinaryKernel * Refine Python interface for transpiler & Add layout resolve for DTensor * Finish layout resolution for threadblock level ops * Add basic support for threadblock level transpiling * Merge Python JIT frontend (sgl-project#40). Add support for Python JIT frontend * temp frontend * bug fix * bug fix * frontend with output shapes/strides * add jit demo * Add checking for MIRAGE_ROOT --------- Co-authored-by: Shengyu Liu <interestingLSY@gmail.com> * Add unit test for the transpiler * Add document for the transpiler * Add threadblock level matmul operator * Add threadblock-level reduction op * clang format * nits * Remove nonexist examples * Merge from main * Add tb scheduling & Add support for forloop accumulator * Add support for tb operator fusion * TB_FORLOOP_ACCUM_OP->TB_FORLOOP_ACCUM_NO_RED_OP * Refine documents * nits * nits * Add support for chunked copy and async copy * fix Python JIT compilation errors * [CUDA Transpiler] Fix Python JIT compilation errors (sgl-project#51) * fix Python JIT compilation errors * Remove __getattr__ from wrapper --------- Co-authored-by: SpiritedAwayCN <541845219@qq.com> * checkpoint * Bugfix in async copy * Optimize matmul * Add support for output chunked copy * Python interface for creating threadblock graph * rename python objects * Optimize ClearAccumulatorKernel * Add testcases for IO * Optimize threadblock input ops * support customized * Small optimization * Change memory alignment to 128B for dtensor and stensors * bug fixes * remove the Py prefix for tensor objects in Python * fix typo * fix typo * Modify lib.h to adapt to PR sgl-project#53 * Bugfix in testcase * Support in-register accumulation for matmul * Rewrite tb scheduling * Add support for advanced memory 
planning interface & algos * Allocate software pipeline buffers in memory planner too * Code formatting * Fix test script * Add doc for TB scheduling and memory planning * Add doc for register-backed accumulator * Refine tb elementwise binary operator for broadcast support * Add test for tb elementwise binary operator with broadcast * Fix a subtle bug in matmul * Add some comments * Refine TB input and output ops (do not rely on stride) * Refine TB reduction kernel to avoid using stride * nits * Refine matmul operator: do not rely on stride * Slightly reorganize procedure in Transpiler * Change matmul perf args to align with tb matmul perf * Rename a file * Add support for swizzling (XOR and SHIFT) (has a bug) * Add doc for swizzling * Add a test for the SHIFT swizzling * Fix issue (sgl-project#56) * Format boolean variable as `true` and `false` for better readability * nits * nits * Hotfix CuTe's bug Ref: NVIDIA/cutlass#1766 * Bump Cutlass's version and remove the workaround in the last commit * Remove debug info * Update doc * Bugfix * [Transpiler] Add more Python examples for transpiler testing (sgl-project#57) * bug fixes * fix compile issue * minor updates --------- Co-authored-by: interestingLSY <interestingLSY@gmail.com> Co-authored-by: Chun'an Shi <44977219+SpiritedAwayCN@users.noreply.github.com> Co-authored-by: SpiritedAwayCN <541845219@qq.com>

* fix * fix AWQ for DSv3 * don't use absorb MLA for AWQ * lint * more fixes * add w4a16 kernel * remove unnecessary name * add note * use prefetch. simplify impl * clean up. add brgemm (WIP) * fix brgemm * fix mismatch BLOCK_N * use at::quint4x2 to signify type better * change type of zero point back to uint8 * add FusedMoE interface * use FusedMoE kernel * fix types * fix MoE * update deepseek.cpp

…t#57)

[Acc] update hash function to speed multimodal preprocess

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

…hidden states generation (sgl-project#57) * add local data path support and more assistant * small refactor * separate out the data-preprocess logic

improve error message

6d261a6

merrymercy merged commit f30abd0 into main Jan 20, 2024

merrymercy deleted the msg branch January 20, 2024 01:03

CSEEduanyu mentioned this pull request Jan 26, 2025

[Bug] NCCL Crash with SIGSEGV Frequently when deploying deepseek v3 #2803

Closed

5 tasks

timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025

Improve error message & Add vicuna template (sgl-project#57)

1229bea

blzheng added a commit to blzheng/sglang that referenced this pull request Aug 9, 2025

add torch_native_sink backend and enable decode with sink (sgl-project#57)

8e2ae3d

gaolaobao mentioned this pull request Aug 25, 2025

[Bug] RTX 5060: RMSNorm failed, same as the #7249 issue, when running qwen2.5-0.5b-instruct model. #9600

Closed

5 tasks

key4ng pushed a commit to key4ng/sglang that referenced this pull request Nov 9, 2025

[Docs] Fix errors in docs (sgl-project#57)

0d64d7e

amd-youchen pushed a commit to amd-youchen/sglang that referenced this pull request Dec 8, 2025

Merge pull request sgl-project#57 from ZLkanyo009/dev/perf

04e95c6

[Acc] update hash function to speed multimodal preprocess

Garrybest pushed a commit to Garrybest/sglang that referenced this pull request Jan 9, 2026

add config (sgl-project#57)

9b8bcfb

vschandramourya pushed a commit to vschandramourya/sglang that referenced this pull request Feb 3, 2026

Copy OSS code from c3c26f7 on 20250915 (sgl-project#57)

a0222aa

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

merrymercy merged 1 commit into main from msg
