Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

fix(cody-gateway): streaming google endpoint#63306

Merged
abeatrix merged 6 commits into
mainfrom
bee/always-stream
Jun 18, 2024
Merged

fix(cody-gateway): streaming google endpoint#63306
abeatrix merged 6 commits into
mainfrom
bee/always-stream

Conversation

@abeatrix

@abeatrix abeatrix commented Jun 17, 2024

Copy link
Copy Markdown
Contributor

Issue: Currently, the ShouldStream() method will always returns false because the Stream value is removed before it was passed into the Handler.

To fix this, we will store the original googleRequest.Stream value if it's true so that ShouldStream() will return the correct Stream value. We will also use the transformBody method to remove the Stream value before we send it to Google API.

Here is the expected behaviour after the stream is fixed:

Screen.Recording.2024-06-18.at.9.22.51.AM.mov

Also confirmed it works with both Cody Gateway and BYOK:

image

Test plan

Always stream Cody Gateway's requests for Google Gemini models as we haven't implemented Code Completion feature on the client side.

Non-stream request

❯ curl 'https://sourcegraph.test:3443/.api/completions/code' -i \
-X POST \
-H 'authorization: token LOCALTOKEN' \
--data-raw '{"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":30,"temperature":0,"stopSequences":[],"timeoutMs":5000}'
HTTP/2 200
access-control-allow-credentials: true
access-control-allow-origin:
alt-svc: h3=":3443"; ma=2592000
cache-control: no-cache, max-age=0
content-type: text/plain; charset=utf-8
date: Tue, 18 Jun 2024 21:05:38 GMT
server: Caddy
server: Caddy
set-cookie: sourcegraphDeviceId=d4fa7789-2442-472a-b425-a68372d27944; Expires=Wed, 18 Jun 2025 21:05:36 GMT; Secure
vary: Cookie, Accept-Encoding, Authorization, Cookie, Authorization, X-Requested-With, Cookie
x-content-type-options: nosniff
x-frame-options: DENY
x-powered-by: Express
x-trace: 00f998a2a2e1b6895687ad7cc567b41c
x-trace-span: da9c93d16415b94f
x-trace-url: https://sourcegraph.test:3443/-/debug/jaeger/trace/00f998a2a2e1b6895687ad7cc567b41c
x-xss-protection: 1; mode=block
content-length: 147

{"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* **Large Language Model:** I'm","stopReason":"STOP"}%

Streaming request:

❯ curl 'https://sourcegraph.test:3443/.api/completions/stream' -i \
-X POST \
-H 'authorization: token $LOCALTOKEN' \
--data-raw '{"stream":true,"messages":[{"speaker":"human","text":"Who are you?"}],"maxTokensToSample":1000,"temperature":0,"stopSequences":[],"timeoutMs":5000}'

HTTP/2 200
access-control-allow-credentials: true
access-control-allow-origin:
alt-svc: h3=":3443"; ma=2592000
cache-control: no-cache
content-type: text/event-stream
date: Tue, 18 Jun 2024 21:07:02 GMT
server: Caddy
server: Caddy
set-cookie: sourcegraphDeviceId=38b45f36-d237-4f8d-8242-a63fcc801a32; Expires=Wed, 18 Jun 2025 21:06:59 GMT; Secure
vary: Cookie, Accept-Encoding, Authorization, Cookie, Authorization, X-Requested-With, Cookie
x-accel-buffering: no
x-content-type-options: nosniff
x-frame-options: DENY
x-powered-by: Express
x-trace: 984932973626e14f7cb0ce7e8e470717
x-trace-span: d285179cfb744e08
x-trace-url: https://sourcegraph.test:3443/-/debug/jaeger/trace/984932973626e14f7cb0ce7e8e470717
x-xss-protection: 1; mode=block

event: completion
data: {"completion":"I","stopReason":"STOP"}

event: completion
data: {"completion":"I am a large language model, trained by Google. \n\nHere's what","stopReason":"STOP"}

event: completion
data: {"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* **I am not a person.** I am a computer","stopReason":"STOP"}

event: completion
data: {"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* **I am not a person.** I am a computer program designed to process and generate human-like text. \n* **I learn from data.** I was trained on a massive dataset of text and code,","stopReason":"STOP"}

event: completion
data: {"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* **I am not a person.** I am a computer program designed to process and generate human-like text. \n* **I learn from data.** I was trained on a massive dataset of text and code, which allows me to generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.\n* **I am still","stopReason":"STOP"}

event: completion
data: {"completion":"I am a large language model, trained by Google. \n\nHere's what that means:\n\n* **I am not a person.** I am a computer program designed to process and generate human-like text. \n* **I learn from data.** I was trained on a massive dataset of text and code, which allows me to generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.\n* **I am still under development.** I am constantly learning and improving, but I am not perfect and can sometimes make mistakes.\n\nHow can I help you today? \n","stopReason":"STOP"}

event: done
data: {}

Changelog

…eaming

The changes in this commit fix the `ShouldStream()` method of the `googleRequest` struct to always return `true`, until we support Code Completions for Google models on the client side.

This change ensures that the Cody Gateway always streams responses for Google model completions, regardless of the `Stream` field in the request.
@cla-bot cla-bot Bot added the cla-signed label Jun 17, 2024
@abeatrix abeatrix marked this pull request as draft June 17, 2024 22:13
@abeatrix abeatrix marked this pull request as ready for review June 17, 2024 22:16

@chrsmith chrsmith left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know enough about what this ShouldStream() function does, or how it relates to the way that we send requests to Cody Gateway to approve this...

It seems like this might have some unintentional consequences? 😬

If you know what you are doing, I'm happy to trust you. But perhaps it might be worth adding some unit tests or something to confirm that this does the right thing for the Google upstream provider? e.g. does this impact how we make the API request to Google?

Comment thread cmd/cody-gateway/internal/httpapi/completions/google.go
@abeatrix abeatrix changed the title fix(cody-gateway): Always return true for Google model completion streaming fix(cody-gateway): streaming google endpoint Jun 18, 2024
@abeatrix abeatrix requested a review from emidoots June 18, 2024 21:15
Comment thread cmd/cody-gateway/internal/httpapi/completions/google.go Outdated
@emidoots

Copy link
Copy Markdown
Member

I think the best/easiest way to do this might unfortunately be via the URL detection mechanism we talked about @abeatrix

@emidoots

Copy link
Copy Markdown
Member

Here's a proposed path forward:

In Cody gateway..

  1. Pass a boolean parameter to this function like isGoogleHandler - which is true in this case, false everywhere else.
  2. Inside of that function, e.g. right here you could drop some URL detection code in:
if isGoogleHandler {
    if strings.EndsWith(r.URL.Path, "streamGenerateContent") {
        body.Stream = true
    }
}

Since the Googe backend provider will already (before your changes) send Cody gateway a URL with that gRPC method at the end of the path, you can check it being there ike that^ and you just need a way for your ShouldStream() method to return the value from the URL handler like that.

In the snippet above, I just used body.Stream = true - which before your changes here would refer to that Stream bool json:"-" // This field will not be marshaled into JSON field of yours.

Then I think you would not need most (maybe all?) of the other changes in this PR and it'd be a bit cleaner

@abeatrix abeatrix enabled auto-merge (squash) June 18, 2024 22:20
@abeatrix abeatrix merged commit 0c777ba into main Jun 18, 2024
@abeatrix abeatrix deleted the bee/always-stream branch June 18, 2024 22:25
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants