Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

fix(cody): use client-provided timeout for completions requests#63875

Merged
taras-yemets merged 10 commits into
mainfrom
ty/compl-req-timeout-from-params
Jul 19, 2024
Merged

fix(cody): use client-provided timeout for completions requests#63875
taras-yemets merged 10 commits into
mainfrom
ty/compl-req-timeout-from-params

Conversation

@taras-yemets

@taras-yemets taras-yemets commented Jul 17, 2024

Copy link
Copy Markdown
Contributor

Closes CODY-2775

Enables client control over the request processing timeout on the server (both Sourcegraph backend and Cody Gateway). The context timeout is set to the value provided in the X-Timeout-Ms header of the client request. If the header is not provided, the default context timeout is used (1 minute on both Sourcegraph backend and Cody Gateway).

Previously, we only had a default timeout on the Sourcegraph backend side (8 minutes).

Corresponding client change:

Test plan

  • Manually tested and confirmed that if the request contains the X-Timeout-Ms header, its value is used. If not, the default maximum request duration is applied.
  • CI

Changelog

  • Use the provided timeout from request parameters if available; otherwise use the default maximum request duration (8 minutes)

@cla-bot cla-bot Bot added the cla-signed label Jul 17, 2024
@taras-yemets taras-yemets requested review from a team, rafax and valerybugakov July 17, 2024 11:21
@taras-yemets taras-yemets marked this pull request as ready for review July 17, 2024 11:21

@taras-yemets taras-yemets Jul 17, 2024

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rafax, is it the correct place to perform this check? Do we need to apply this check for all requests or only for those that are going to be sent to the Cody Gateway (as CODY-2775 suggests)?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the correct place to perform this check in the Sourcegraph backend (where it will apply to Cody Gateway and BYOK/BYOLLM deployments). I'm not 100% sure if SG is responsible for the cancellation propagation bugs, so we may need to do the same thing in Cody Gateway.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completion client's Stream/Complete methods are called with this context (with timeout). So I expect the context cancelation to be handled by the corresponding method of a matching client returned by codyGatewayClient.clientForParams.
I'm new to this part of the codebase, so this may be a wrong assumption. WDYT, @rafax?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, we should apply this check for all requests, regardless of the completions provider it is sent to. It would be super-lame if the X-Timeout-Ms HTTP header was only honored by Cody Gateway, and completely ignored if the Sourcegraph backend was configured any other way.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR implements the same X-Tineout-Ms header checks both on the Sourcegraph backend and on the Cody Gateway.

@valerybugakov valerybugakov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fast path support for this task is critical because most of the autocomplete traffic goes directly to Cody Gateway.

Comment thread cmd/frontend/internal/httpapi/completions/handler.go
@taras-yemets

Copy link
Copy Markdown
Contributor Author

@rafax, @valerybugakov, I updated the PR to use the timeout value provided via the X-Timeout-Ms header.
I also added a default maximum duration of 5 minutes on the Gateway side. This value is arbitrary; I'm not sure why we have an 8-minute timeout on the Sourcegraph backend, so I chose a slightly smaller value.
Can you please review the PR again?

@taras-yemets taras-yemets changed the title fix(cody): use client-provided timeout from for completions requests fix(cody): use client-provided timeout for completions requests Jul 18, 2024

// maxRequestDuration is the maximum amount of time a request can take before
// being cancelled as DeadlineExceeded.
const maxRequestDuration = 5 * time.Minute

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, it pains me to see the default timeout so high - we should really default to a shorter timeout (10s) and make long timeouts opt-in, but I know it's not in scope of this change.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want timeout per feature (chat/completions). WDYT, @chrsmith?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having different timeouts per-request might make sense... but even then, 5 minutes is still kinda bonkers high.

Like, for example, I'm not willing to wait five minutes to microwave a piece of pizza. Let alone to receive a chat completion. I'd drop this down all the way to a minute at max. And just keep the same timeout for both completions and chat until we have a more compelling reason to. (i.e. we find ourselves consistently hitting this timeout.)

But again, even at one minute it's still overly generous.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reduced the timeout to 1 minute for Cody Gateway.
8 minutes timeout remains unchanged for the Sourcegraph backend.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reduced the max request duration to 1 minute on the Sourcegraph backend too.

Comment thread cmd/cody-gateway/internal/httpapi/completions/upstream.go Outdated
Comment thread cmd/cody-gateway/internal/httpapi/completions/upstream.go Outdated
Comment thread cmd/cody-gateway/internal/httpapi/completions/upstream.go Outdated

@rafax rafax left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, but I think you really want @slimsag / @chrsmith 👁️ on this

@chrsmith chrsmith left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nits, suggestions, and comments. But this all looks good to me as well.


// maxRequestDuration is the maximum amount of time a request can take before
// being cancelled as DeadlineExceeded.
const maxRequestDuration = 5 * time.Minute

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having different timeouts per-request might make sense... but even then, 5 minutes is still kinda bonkers high.

Like, for example, I'm not willing to wait five minutes to microwave a piece of pizza. Let alone to receive a chat completion. I'd drop this down all the way to a minute at max. And just keep the same timeout for both completions and chat until we have a more compelling reason to. (i.e. we find ourselves consistently hitting this timeout.)

But again, even at one minute it's still overly generous.

Comment thread cmd/cody-gateway/internal/httpapi/completions/upstream.go

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, we should apply this check for all requests, regardless of the completions provider it is sent to. It would be super-lame if the X-Timeout-Ms HTTP header was only honored by Cody Gateway, and completely ignored if the Sourcegraph backend was configured any other way.

taras-yemets referenced this pull request in sourcegraph/cody-public-snapshot Jul 19, 2024
Part of
[CODY-2775](https://linear.app/sourcegraph/issue/CODY-2775/%5Bautocomplete-latency%5D-apply-the-same-timeout-on-the-cody-gateway-side)

Sends the `timeoutMs` config value to the backend using via
`X-Timeout-Ms` header in both default and fast paths.

Backend support added in
https://github.com/sourcegraph/sourcegraph/pull/63875


## Test plan
Manually tested (using a debugger) alongside
https://github.com/sourcegraph/sourcegraph/pull/63875 on both default
and fast paths, using a locally running Sourcegraph instance and Cody
Gateway.


<!-- Required. See
https://sourcegraph.com/docs/dev/background-information/testing_principles.
-->
@taras-yemets taras-yemets enabled auto-merge (squash) July 19, 2024 14:07
@taras-yemets taras-yemets merged commit 26df35a into main Jul 19, 2024
@taras-yemets taras-yemets deleted the ty/compl-req-timeout-from-params branch July 19, 2024 14:17
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants