feat: support more delimiter for SSE events in GCP gemini#1536
feat: support more delimiter for SSE events in GCP gemini#1536nacx merged 6 commits intoenvoyproxy:mainfrom
Conversation
…tex AI Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1536 +/- ##
==========================================
+ Coverage 84.15% 84.19% +0.03%
==========================================
Files 150 150
Lines 12984 13005 +21
==========================================
+ Hits 10927 10949 +22
+ Misses 1436 1434 -2
- Partials 621 622 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| } | ||
| lines := bytes.Split(bodyBytes, []byte("\n\n")) | ||
| bodyReader := io.MultiReader(bytes.NewReader(o.bufferedBody), body) | ||
| scanner := bufio.NewScanner(bodyReader) |
There was a problem hiding this comment.
When implementing the mcp sse parser, I got issues with an initial implementation that used the Scanner; for some servers, I got long lines and the parsing failed with bufio.ErrTooLong (bufio.Scanner: token too long). I only caught this when using some real MCP servers.
I don't know if this can be the case as well here, but just a data point to consider not using the scanner if long lines could be expected.
There was a problem hiding this comment.
good point.
as of go 1.25.3 the default for MaxScanTokenSize seems to be 64KB
Since each streaming chunk represents couple of generated tokens, the size of each chunk should not exceed this limit. though i couldn't find docs that guarantee that the chunk will be smaller than a specific size
There was a problem hiding this comment.
What comes from the LLM is inherently unpredictable, and I don't know if we can make that statement (for example, there are some tests that use chunks that have tool calls, and the length of the arguments could be unpredictable...
I agree it is unlikely to happen, but my main concern is that we are introducing a limitation we didn't have before, and I don't want this to hit us in the form of a bug.
There was a problem hiding this comment.
two approaches i can think of
- make the
MaxScanTokenSizeconfigurable - follow the mcp sse parser logic to explicitly parse the chunks
I personally prefer option 1 as relying on scanner to do the parsing aspect is simpler and easier maintain
what do you think @nacx ?
There was a problem hiding this comment.
What about a third option: in the previous implementation, we're loading everything in memory and doing a bytes.Split. Why don't just keep that, but use the right function to split? I mean, test for the presence of the delimiters to figure out the "split characters", then keep the original code just splitting using those.
This way we don't change the behaviour, keep it backwards-compatible, and we don't introduce a new config option that users won't know what value to set confidently and how to set it. Also, if we find loading everything in memory to be an issue (it hasn't been till now), then we can address the stream reader approach, but probably it's not needed for this PR.
Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net> # Conflicts: # internal/translator/openai_gcpvertexai.go
|
unsure about test failures
|
|
/retest |
Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>
|
/retest |
…#1536) **Description** The SSE event spec supports 3 different delimiters (pair of CRLF / LF / CR). This PR updates the stream processing translator for GCP VertexAI to support all 3 delimiters. [SSE docs] about delimiters > **Note:** The docs define an event as `event = *( comment / field ) end-of-line` `end-of-line` is defined as a single CRLF or LF or CR (not a pair) But the `comment` and `field` definitions also end in `end-of-line` implying that an event always ends in a pair of CRLR / CR / LF [SSE docs]: https://html.spec.whatwg.org/multipage/server-sent-events.html#parsing-an-event-stream --------- Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net> Co-authored-by: Ignasi Barrera <ignasi@tetrate.io> Signed-off-by: yxia216 <yxia216@bloomberg.net>
…#1536) **Description** The SSE event spec supports 3 different delimiters (pair of CRLF / LF / CR). This PR updates the stream processing translator for GCP VertexAI to support all 3 delimiters. [SSE docs] about delimiters > **Note:** The docs define an event as `event = *( comment / field ) end-of-line` `end-of-line` is defined as a single CRLF or LF or CR (not a pair) But the `comment` and `field` definitions also end in `end-of-line` implying that an event always ends in a pair of CRLR / CR / LF [SSE docs]: https://html.spec.whatwg.org/multipage/server-sent-events.html#parsing-an-event-stream --------- Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net> Co-authored-by: Ignasi Barrera <ignasi@tetrate.io> Signed-off-by: Erica Hughberg <erica.sundberg.90@gmail.com>
Description
The SSE event spec supports 3 different delimiters (pair of CRLF / LF / CR). This PR updates the stream processing translator for GCP VertexAI to support all 3 delimiters.
SSE docs about delimiters