Skip to content

Implement return_hidden_states for the OpenAI API#6137

Merged
zhyncs merged 16 commits intosgl-project:mainfrom
context-labs:openai-hidden-states
May 19, 2025
Merged

Implement return_hidden_states for the OpenAI API#6137
zhyncs merged 16 commits intosgl-project:mainfrom
context-labs:openai-hidden-states

Conversation

@kyle-pena-kuzco
Copy link
Copy Markdown
Contributor

@kyle-pena-kuzco kyle-pena-kuzco commented May 9, 2025

Motivation

The native API supports returning hidden states from the model. This PR implements the same feature for OpenAI.

Returning hidden states is important for model verification and diagnostics purposes. Inference providers that use SGLang as a backend route requests that are in the OpenAI format, so if an inference provider would like to do internal diagnostics and verification, it is much more straightforward to simply include return_hidden_states.

This PR also fixes #5761 and adds appropriate test coverage for that bug.

Modifications

Changes were made to protocol.py to support the return_hidden_states flag as well as returning hidden_states on /v1/completions and /v1/chat/completions for both streaming and non-streaming.

If no hidden states are requested, the hidden states property is omitted from the response instead of including a null field. That way, the responses are completely backwards compatible.

The adapter was changed to include hidden states in responses when requested.

The dictionary n_prev_tokens was not being updated in v1_chat_completions for a streaming response, leading the same top logprobs to being repeated for every chunk. This has been fixed as well.

Checklist

@kyle-pena-kuzco kyle-pena-kuzco marked this pull request as ready for review May 9, 2025 05:14
@kyle-pena-kuzco
Copy link
Copy Markdown
Contributor Author

@zhyncs - let me know if you have any feedback! We'd like to get this feature merged.

@zhaochenyang20
Copy link
Copy Markdown
Collaborator

@Qiaolin-Yu qiaolin could you take a look plz?

@Qiaolin-Yu Qiaolin-Yu self-assigned this May 16, 2025
@Qiaolin-Yu
Copy link
Copy Markdown
Collaborator

@Qiaolin-Yu qiaolin could you take a look plz?

Sure. Very happy to help.

Copy link
Copy Markdown
Collaborator

@Qiaolin-Yu Qiaolin-Yu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! LGTM

@atbe
Copy link
Copy Markdown

atbe commented May 16, 2025

thank you @Qiaolin-Yu @zhaochenyang20 @yinfan98 <3

@zhaochenyang20
Copy link
Copy Markdown
Collaborator

will merge it, no need to rebase

@zhaochenyang20
Copy link
Copy Markdown
Collaborator

nice work @kyle-pena-kuzco

@zhyncs
Copy link
Copy Markdown
Collaborator

zhyncs commented May 18, 2025

@kyle-pena-kuzco May you help resolve the conflicts? Thanks. cc @CatherineSue @ispobock @Qiaolin-Yu

@kyle-pena-kuzco
Copy link
Copy Markdown
Contributor Author

@kyle-pena-kuzco May you help resolve the conflicts? Thanks. cc @CatherineSue @ispobock @Qiaolin-Yu

Yes. Starting on that now. i'll comment when completed.

@kyle-pena-kuzco
Copy link
Copy Markdown
Contributor Author

@kyle-pena-kuzco May you help resolve the conflicts? Thanks. cc @CatherineSue @ispobock @Qiaolin-Yu

Yes. Starting on that now. i'll comment when completed.

@zhyncs - I've fixed the merge conflicts.

Copy link
Copy Markdown
Collaborator

@CatherineSue CatherineSue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zhyncs zhyncs merged commit 4f39bcf into sgl-project:main May 19, 2025
60 of 78 checks passed
zhyncs added a commit that referenced this pull request May 20, 2025
woodx9 pushed a commit to woodx9/sglang that referenced this pull request Jun 8, 2025
Layssy pushed a commit to Layssy/sglang-iaas that referenced this pull request Jun 9, 2025
Layssy pushed a commit to Layssy/sglang-iaas that referenced this pull request Jun 9, 2025
xwu-intel pushed a commit to xwu-intel/sglang that referenced this pull request Jun 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] The TopLogprob are same for each stream token

8 participants