
[Model] Add HyperCLOVAX-SEED-Think-14B language model support#37107

Merged
DarkLight1337 merged 3 commits into
vllm-project:mainfrom
bigshanedogg:feat/hyperclovax
Mar 16, 2026

Conversation

@bigshanedogg bigshanedogg commented Mar 15, 2026

Contributor

Purpose

Add inference support for HyperCLOVA X (HyperCLOVAXForCausalLM), a large language model family developed by NAVER Cloud.

Changes

  • vllm/model_executor/models/hyperclovax.py (new) — HyperCLOVAXForCausalLM model implementation
  • vllm/transformers_utils/configs/hyperclovax.py (new) — HyperCLOVAXConfig configuration class
  • vllm/model_executor/models/registry.py — Register HyperCLOVAXForCausalLM
  • vllm/transformers_utils/configs/__init__.py — Register HyperCLOVAXConfig
  • docs/models/supported_models.md — Add HyperCLOVAXForCausalLM entry
  • tests/models/registry.py — Add test registry entry (naver-hyperclovax/HyperCLOVAX-SEED-Think-14B)
  • tests/models/language/generation/test_common.py — Add HyperCLOVAXForCausalLM to common generation tests

Test Plan

Launch server

  vllm serve naver-hyperclovax/HyperCLOVAX-SEED-Think-14B \
    --max-model-len 32768 \
    --max-num-batched-tokens 16384 \
    --tensor-parallel-size 1 \
    --trust-remote-code \
    --enable-prefix-caching
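
Loading a 14B model can take a while, so it can help to wait for the server before sending test requests. The following is a minimal sketch, not part of this PR: `wait_until_ready` is a hypothetical helper, and it assumes the server exposes the `/health` route of the vLLM OpenAI-compatible server at whatever host and port you launched it on. The `opener` and `sleep` parameters exist only so the function can be exercised without a live server.

```python
import time
import urllib.request


def wait_until_ready(base_url, timeout_s=600.0, interval_s=5.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Poll `{base_url}/health` until it answers 200 or the timeout expires.

    Returns True once the server responds with HTTP 200, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with opener(f"{base_url}/health", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # Connection refused or HTTP error while the model is still
            # loading (urllib.error.URLError is a subclass of OSError).
            pass
        sleep(interval_s)
    return False
```

For example, `wait_until_ready("http://localhost:8000")` would block until a server on the default `vllm serve` port reports healthy; the host and port here are assumptions, since the PR does not state them.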

Test Result

Benchmark validation

| Tasks | Metric | vLLM (this PR) |
|---|---|---|
| hellaswag | acc_norm | 0.6521 |
| gsm8k | flexible-extract | 0.9484 |

Evaluated with lm-evaluation-harness defaults and default sampling params for server validation.

Request

client

import requests

# Assumption: host:port of the server launched above (8000 is the vLLM default port).
url = "localhost:8000"

payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Please briefly explain what you can help with. Think carefully before answering."},
            ],
        }
    ],
    "temperature": 0.2,
    # Keep special tokens such as <|im_end|> visible in the returned text.
    "skip_special_tokens": False,
    "stop": ["<|im_end|><|endofturn|>", "<|im_end|><|stop|>"],
    # Forwarded to the model's chat template.
    "chat_template_kwargs": {"skip_reasoning": True},
}

resp = requests.post(
    f"http://{url}/v1/chat/completions",
    json=payload,
    timeout=300,
)
resp.raise_for_status()

data = resp.json()
print(data["choices"][0]["message"].get("content"))

output

Okay, the user is asking me to briefly explain what I can help with. Let me start by recalling my capabilities. I know I can answer questions, provide explanations, assist with learning, help brainstorm ideas, and offer suggestions. But I should make sure not to overstate what I can do.

Wait, I should also mention that I can't access real-time information or perform physical actions. That's important to set the right expectations. Maybe start by listing the main areas: answering questions, explaining concepts, helping with tasks like writing or coding, and offering recommendations. But keep it concise since they asked for a brief explanation.

Hmm, should I include examples? The user might appreciate a quick list of specific areas. Like, "I can help with homework, language translation, coding problems, creative writing, and more." Also, clarify that I rely on existing knowledge up to my last update in July 2024. Oh right, and I can't browse the internet or access personal data unless shared in the conversation. Privacy is a key point here.

Wait, the user said "think carefully before answering," so maybe I should structure it clearly. Start with a general statement about assisting with information and tasks, then list key areas, mention limitations, and ensure it's all in a few short sentences. Let me check if I missed anything. Oh, yes, I should avoid jargon and keep it simple. Alright, time to put it all together concisely.<|im_end|>
<|im_start|>assistant
I can assist with providing information, explanations, and guidance across a wide range of topics, including:  
- **Answering questions** (science, history, technology, etc.).  
- **Explaining concepts** (math, programming, philosophy, etc.).  
- **Helping with tasks** (writing, editing, coding, problem-solving).  
- **Offering recommendations** (books, learning resources, strategies).  
- **Brainstorming ideas** (creative projects, studies, discussions).  

**Limitations**: I cannot access real-time data, perform physical actions, or retrieve personal information unless shared during our conversation. My knowledge is current up to July 2024. Let me know how I can assist! 😊
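
Because the request sets `skip_special_tokens` to `False`, the raw text above contains the reasoning, then `<|im_end|>` and `<|im_start|>assistant` on consecutive lines, then the final answer. A minimal sketch of separating the two on the client side follows. `split_reasoning` and `REASONING_END` are hypothetical names, and the delimiter is inferred only from this single sample output, so treat it as an assumption rather than a documented format.

```python
# Assumption: delimiter between reasoning and answer, as seen in the sample output.
REASONING_END = "<|im_end|>\n<|im_start|>assistant\n"


def split_reasoning(raw: str) -> tuple[str, str]:
    """Split raw model text into (reasoning, answer).

    If the delimiter is absent, the whole text is treated as the answer.
    """
    reasoning, sep, answer = raw.partition(REASONING_END)
    if not sep:
        return "", raw.strip()
    return reasoning.strip(), answer.strip()
```

With the client above, `split_reasoning(data["choices"][0]["message"]["content"])` would return the reasoning and the answer as separate strings.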

mergify Bot commented Mar 15, 2026

Contributor

Documentation preview: https://vllm--37107.org.readthedocs.build/en/37107/

@mergify mergify Bot added the documentation and new-model labels Mar 15, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request adds support for the HyperCLOVAX-SEED-Think-14B language model. The changes include a new model implementation, a corresponding configuration class, and updates to the model registries, documentation, and tests. The new implementation handles the model's specific architectural features, such as muP scaling and optional Peri-Layer Normalization. The code is well-structured and follows existing patterns in the vLLM codebase. One issue was found in the test registry update, where a redundant and incorrect entry was added.
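
For readers unfamiliar with the Peri-Layer Normalization mentioned in the review, the general idea is to normalize both the input and the output of each sublayer before the residual addition. The sketch below illustrates that idea in plain Python and is not the code in `hyperclovax.py`. The function names, the use of RMS normalization, and the omission of learned weights are all assumptions made for brevity.

```python
import math


def rms_norm(x, eps=1e-6):
    """RMS-normalize a vector (no learned weight, for brevity)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]


def pre_ln_block(x, sublayer):
    """Pre-LN residual block: x + f(norm(x))."""
    out = sublayer(rms_norm(x))
    return [a + b for a, b in zip(x, out)]


def peri_ln_block(x, sublayer):
    """Peri-LN residual block: x + norm(f(norm(x)))."""
    out = rms_norm(sublayer(rms_norm(x)))
    return [a + b for a, b in zip(x, out)]
```

The extra output normalization bounds the size of each residual update regardless of how large the sublayer's raw output is, whereas the Pre-LN variant adds the sublayer output unscaled.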

Comment thread tests/models/registry.py Outdated
Comment thread tests/models/language/generation/test_common.py Outdated

@DarkLight1337 DarkLight1337 left a comment

Member

Otherwise LGTM

Signed-off-by: bigshanedogg <bigshane319@gmail.com>
Signed-off-by: bigshanedogg <bigshane319@gmail.com>
Signed-off-by: bigshanedogg <bigshane319@gmail.com>
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) March 16, 2026 05:00
@github-actions github-actions Bot added the ready label Mar 16, 2026
@DarkLight1337 DarkLight1337 merged commit 2390d44 into vllm-project:main Mar 16, 2026
55 checks passed
Lucaskabela pushed a commit to Lucaskabela/vllm that referenced this pull request Mar 17, 2026
wendyliu235 pushed a commit to wendyliu235/vllm-public that referenced this pull request Mar 18, 2026
khairulkabir1661 pushed a commit to khairulkabir1661/vllm that referenced this pull request Mar 27, 2026
JiantaoXu pushed a commit to JiantaoXu/vllm that referenced this pull request Mar 28, 2026
mtparet pushed a commit to blackfuel-ai/vllm that referenced this pull request Apr 9, 2026
mystous pushed a commit to mystous/vllm_hybrid that referenced this pull request May 10, 2026
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
0826joyce pushed a commit to 0826joyce/vllm-serving-optimization that referenced this pull request May 19, 2026
jp1924 commented Jun 2, 2026

@bigshanedogg
Adding SEED-Think-14B and 32B is a good idea, but to use them properly, a reasoning parser and a tool parser also need to be implemented, right?
However, those components are missing from the current vLLM.
The official documentation says to install a plugin, but because it is installed as a separate dependency package, it is quite inconvenient to use.
Since the vLLM version upgrade, the import structure has changed, which causes the plugin to throw a lot of errors.
So I think work is underway to add the plugin's reasoning and tool parsers. Could you please go to these PRs and leave a review?
#42366
#44171

mvanhorn pushed a commit to mvanhorn/vllm that referenced this pull request Jun 4, 2026
…roject#37107)

Signed-off-by: bigshanedogg <bigshane319@gmail.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>

Labels

documentation, new-model, ready

3 participants