Refactor attention backend #1381

Merged
merrymercy merged 10 commits into main from attn_backend on Sep 11, 2024

@merrymercy
Contributor

  • Introduce a base class, AttentionBackend, that abstracts over the different attention kernel backends.
  • Two implementations are provided: FlashInferAttnBackend and TritonAttnBackend.
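
To illustrate the shape of this refactor, here is a minimal sketch of an abstract base class with two interchangeable subclasses. Only the three class names come from this PR. The `forward(q, k, v)` signature, the `_reference_attention` helper, the `BACKENDS` registry, and `create_attention_backend` are assumptions made for illustration, and the plain-Python attention stands in for the real FlashInfer and Triton kernels. It is not the code merged here.

```python
from abc import ABC, abstractmethod
import math


class AttentionBackend(ABC):
    """Common interface the model code calls, whichever kernel library is used."""

    @abstractmethod
    def forward(self, q, k, v):
        """Return the attention output for one query vector (hypothetical signature)."""
        raise NotImplementedError


def _reference_attention(q, k, v):
    # Hypothetical stand-in for a GPU kernel: softmax(q . k / sqrt(d)) applied to v.
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(qi * ki for qi, ki in zip(q, row)) for row in k]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(v[0])
    return [sum(w * row[j] for w, row in zip(weights, v)) / total for j in range(dim)]


class FlashInferAttnBackend(AttentionBackend):
    def forward(self, q, k, v):
        # A real backend would dispatch to FlashInfer kernels here.
        return _reference_attention(q, k, v)


class TritonAttnBackend(AttentionBackend):
    def forward(self, q, k, v):
        # A real backend would launch Triton kernels here.
        return _reference_attention(q, k, v)


# Hypothetical registry; the keys are illustrative, not actual option names.
BACKENDS = {"flashinfer": FlashInferAttnBackend, "triton": TritonAttnBackend}


def create_attention_backend(name):
    """Instantiate a backend by name so callers depend only on the base class."""
    return BACKENDS[name]()
```

With this layout, calling code holds an `AttentionBackend` reference and never branches on the kernel library, so adding another backend means adding one subclass and one registry entry.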

merrymercy force-pushed the attn_backend branch 2 times, most recently from 9183cb2 to 4684e97 on September 11, 2024 14:06
merrymercy merged commit fec185c into main on Sep 11, 2024
merrymercy deleted the attn_backend branch on September 11, 2024 18:44
merrymercy mentioned this pull request on Sep 19, 2024 (3 tasks)
timethink pushed a commit to timethink/sglang that referenced this pull request on Mar 9, 2025