Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

Implement experimental Azure OpenAI provider for completions and embeddings#55178

Merged
chwarwick merged 5 commits into
mainfrom
es/azure-experiment
Jul 26, 2023
Merged

Implement experimental Azure OpenAI provider for completions and embeddings#55178
chwarwick merged 5 commits into
mainfrom
es/azure-experiment

Conversation

@eseliger

@eseliger eseliger commented Jul 20, 2023

Copy link
Copy Markdown
Member

This adds experimental support for Azure OpenAI as an embeddings and completions provider.

I added some experimental docs for it, and ran through it with a bunch of repos and queries, seems to be on-par with the OpenAI providers mostly. Only thing is that we make more requests for embeddings, because Azure doesn't yet support the batch API.

Test plan

Verified manually using our Azure deployment.

@cla-bot cla-bot Bot added the cla-signed label Jul 20, 2023
@eseliger eseliger changed the title Azure OpenAI experiment Implement experimental Azure OpenAI provider for completions and embeddings Jul 25, 2023
@eseliger eseliger added release-blocker Prevents us from releasing: https://about.sourcegraph.com/handbook/engineering/releases backport 5.1 labels Jul 25, 2023
@eseliger eseliger marked this pull request as ready for review July 25, 2023 19:09

@chwarwick chwarwick left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me and the only potential impact is when using that specific provider which is experimental.

}

func (c *azureOpenaiEmbeddingsClient) requestSingleEmbeddingWithRetryOnNull(ctx context.Context, input string, retries int) (*openaiEmbeddingAPIResponse, error) {
for i := 0; i < retries; i++ {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok for a prototype/demo but it looks like we would retry on all failures including things that shouldn't be retryable like 401 or bad requests etc...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait aren't the retries already built into the http client so this is 3 x 20?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, this is a hack because the OpenAI is flaky and will sometimes return null for a single vector. When you have multiple problematic chunks in a request, the probability that a retry of the whole chunk will succeed decreases exponentially with the size of the chunk, so instead we retry each chunk that failed individually so the probability of failure isn't multiplicative.

So no, this is not actually 3x20 because OpenAI does not return an error status when it returns null for a vector, so the retry layer in the HTTP client won't get triggered.

It's super gross.

Maybe Azure OpenAI won't have this problem 🤞 ?

defer resp.Body.Close()

if resp.StatusCode != http.StatusOK {
respBody, _ := io.ReadAll(io.LimitReader(resp.Body, 1024))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though Azure is a single vector at a time, is 1024 bytes enough I'm not sure the size of a vector.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A vector has 1536 dimensions. But I wouldn't expect an error response to contain a vector, so 1024 seems like enough to catch a meaningful error message to me

@chwarwick chwarwick merged commit 9a01c36 into main Jul 26, 2023
@chwarwick chwarwick deleted the es/azure-experiment branch July 26, 2023 00:29
@github-actions

Copy link
Copy Markdown
Contributor

The backport to 5.1 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-5.1 5.1
# Navigate to the new working tree
cd .worktrees/backport-5.1
# Create a new branch
git switch --create backport-55178-to-5.1
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9a01c364b1810b1a62027c77f0b45dd7f6efbc17
# Push it to GitHub
git push --set-upstream origin backport-55178-to-5.1
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.1

Then, create a pull request where the base branch is 5.1 and the compare/head branch is backport-55178-to-5.1.

chwarwick pushed a commit that referenced this pull request Jul 26, 2023
…etions and embeddings (#55178)

This adds experimental support for Azure OpenAI as an embeddings and
completions provider.

I added some experimental docs for it, and ran through it with a bunch
of repos and queries, seems to be on-par with the OpenAI providers
mostly. Only thing is that we make more requests for embeddings, because
Azure doesn't yet support the batch API.

Verified manually using our Azure deployment.

(cherry picked from commit 9a01c36)

@Beso0111 Beso0111 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F

@Beso0111 Beso0111 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F

MaedahBatool pushed a commit that referenced this pull request Jul 28, 2023
…ddings (#55178)

This adds experimental support for Azure OpenAI as an embeddings and
completions provider.

I added some experimental docs for it, and ran through it with a bunch
of repos and queries, seems to be on-par with the OpenAI providers
mostly. Only thing is that we make more requests for embeddings, because
Azure doesn't yet support the batch API.

## Test plan

Verified manually using our Azure deployment.
davejrt pushed a commit that referenced this pull request Aug 9, 2023
…ddings (#55178)

This adds experimental support for Azure OpenAI as an embeddings and
completions provider.

I added some experimental docs for it, and ran through it with a bunch
of repos and queries, seems to be on-par with the OpenAI providers
mostly. Only thing is that we make more requests for embeddings, because
Azure doesn't yet support the batch API.

## Test plan

Verified manually using our Azure deployment.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

backports cla-signed release-blocker Prevents us from releasing: https://about.sourcegraph.com/handbook/engineering/releases

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants