
extproc: token latency stat only when stream=true#470

Merged
yuzisun merged 1 commit into main from tokenlatencyonlyonstreaming on Mar 8, 2025

Conversation

@mathetake
Member

Commit Message

This changes the stat collection behavior so that token latency metrics are recorded only for requests with stream=true. This came up in an offline discussion: for non-streaming requests, these metrics don't make sense.

Related Issues/PRs (if applicable)

#459

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
@mathetake mathetake marked this pull request as ready for review March 8, 2025 01:21
@mathetake mathetake requested a review from a team as a code owner March 8, 2025 01:21
@yuzisun yuzisun merged commit 3f98761 into main Mar 8, 2025
17 checks passed
@yuzisun yuzisun deleted the tokenlatencyonlyonstreaming branch March 8, 2025 15:19

3 participants