backport: completion stream + metrics and assistant content#497

Merged
aabchoo merged 3 commits into release/v0.1 from aaron/fix-openai-fields
Mar 14, 2025

Conversation

@aabchoo
Contributor

@aabchoo aabchoo commented Mar 14, 2025

Commit Message

PR to backport mockChatCompletionMetrics, the chat completion stream fix, and the OpenAI content type change.

Including:

nacx and others added 3 commits March 14, 2025 14:41
**Commit Message**

extproc: add GenAI metrics to track token usage and latency

Adds GenAI metrics according to the OpenTelemetry Semantic Conventions
for Generative AI Metrics [1].
Note that these metrics are still experimental and may be subject to
change.

1: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/

**Related Issues/PRs (if applicable)**

This is a follow-up to #432, implementing the remaining review comments.

---------

Signed-off-by: Huamin Chen <hchen@redhat.com>
Signed-off-by: Ignasi Barrera <ignasi@tetrate.io>
**Commit Message**

This fixes a bug in the extproc when handling stream=true requests.
Previously, mode_override was set in the request body handling phase
but not in the response headers phase. As a result, the entire response
body was buffered, which is clearly not ideal: from the client's point
of view, the whole stream arrives at once instead of line by line. This
refactors the mode override logic and handles it properly.
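
The idea behind the fix can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual extproc code: `bodyMode`, `headersResponse`, and `processor` are hypothetical local types that only mirror the shape of Envoy's ext_proc processing mode and `mode_override` field.

```go
package main

import "fmt"

// bodyMode mirrors the body send modes of Envoy's ext_proc ProcessingMode.
// It is a hypothetical local stand-in, not the go-control-plane type.
type bodyMode int

const (
	bodyModeNone bodyMode = iota
	bodyModeStreamed
	bodyModeBuffered
)

// headersResponse is a hypothetical reply to the response headers phase.
// modeOverride is nil when the configured processing mode is left unchanged.
type headersResponse struct {
	modeOverride *bodyMode
}

// processor keeps per-request state across the ext_proc phases.
type processor struct {
	streaming bool // true when the request body contained "stream": true
}

// onRequestBody remembers whether the client asked for a streaming response.
func (p *processor) onRequestBody(stream bool) {
	p.streaming = stream
}

// onResponseHeaders sets the override in the response headers phase, so the
// response body is delivered chunk by chunk instead of being buffered.
func (p *processor) onResponseHeaders() headersResponse {
	if !p.streaming {
		return headersResponse{}
	}
	mode := bodyModeStreamed
	return headersResponse{modeOverride: &mode}
}

func main() {
	p := &processor{}
	p.onRequestBody(true)
	resp := p.onResponseHeaders()
	fmt.Println("override set:", resp.modeOverride != nil)
}
```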

---------

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
**Commit Message**

The content type should also accept a plain string, so this adds that
for compatibility. This is already done for the other types, so this
keeps things consistent.

---------

Signed-off-by: Aaron Choo <achoo30@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
@mathetake
Member

nice

@aabchoo aabchoo changed the title backport: completion stream + metrics, and openai content type fix backport: completion stream + metrics and assistant content Mar 14, 2025
@aabchoo
Contributor Author

aabchoo commented Mar 14, 2025

edit: ah I have an extra space

@aabchoo aabchoo changed the title backport: completion stream + metrics and assistant content backport: completion stream + metrics and assistant content Mar 14, 2025
@aabchoo aabchoo marked this pull request as ready for review March 14, 2025 19:18
@aabchoo aabchoo requested a review from a team as a code owner March 14, 2025 19:18
@aabchoo aabchoo merged commit 2cc198a into release/v0.1 Mar 14, 2025
14 of 16 checks passed
@aabchoo aabchoo deleted the aaron/fix-openai-fields branch March 14, 2025 19:41
mathetake pushed a commit that referenced this pull request Mar 14, 2025
**Commit Message**

Updating for newest release:
https://github.com/envoyproxy/ai-gateway/releases/tag/v0.1.3

**Related Issues/PRs (if applicable)**


#497

Signed-off-by: Aaron Choo <achoo30@bloomberg.net>