open_ai: Support fast mode in BYOK via the Responses API service_tier #57412

Merged
tomhoule merged 1 commit into main from tomhoule-uzqlqrtwuzwm
May 27, 2026

Conversation

@tomhoule

@tomhoule tomhoule commented May 21, 2026

Contributor

Maps the existing `Speed::Fast` plumbing to OpenAI's `service_tier: "priority"`, which matches what "fast mode" in Codex does. Relevant docs are [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-service_tier).

As with the existing Anthropic fast mode, there is a `Model::supports_priority` method that returns true for the model variants listed on https://openai.com/api-priority-processing. Pro, nano, and legacy gpt-4 are excluded; Custom defaults to false.
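
A minimal sketch of the mapping described above, not the code in this diff. `Speed::Fast`, `Model::supports_priority`, and `Custom` are named in the description; the other `Speed` and `Model` variants, the `service_tier_for` helper, and the body of `supports_priority` are hypothetical placeholders. The real eligibility list is the one at https://openai.com/api-priority-processing.

```rust
// Illustrative sketch only: variant names and the helper are assumptions,
// not the actual types in Zed's open_ai crate.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Speed {
    Standard, // hypothetical name for the non-fast setting
    Fast,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Model {
    // Hypothetical stand-ins for the real model variants.
    Flagship,
    Pro,
    Nano,
    LegacyGpt4,
    Custom { name: String },
}

impl Model {
    /// Whether requests for this model may use the priority service tier.
    /// Pro, nano, and legacy gpt-4 are excluded; Custom defaults to false.
    pub fn supports_priority(&self) -> bool {
        match self {
            Model::Flagship => true,
            Model::Pro | Model::Nano | Model::LegacyGpt4 => false,
            Model::Custom { .. } => false,
        }
    }
}

/// Returns the `service_tier` value to send, or `None` to omit the field.
pub fn service_tier_for(model: &Model, speed: Speed) -> Option<&'static str> {
    match speed {
        Speed::Fast if model.supports_priority() => Some("priority"),
        _ => None,
    }
}
```

In this sketch, returning `None` for unsupported models means the request simply omits `service_tier` and the provider's default tier applies; whether the actual change omits the field or sends an explicit default is not stated in the description.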

This is gated to staff only for now (not in this diff, but the existing fast mode feature), until we have the mechanism to require confirmation before you enable fast mode.

Release Notes:

  • Added support for Fast Mode (priority service tier) on the OpenAI API provider.

@cla-bot cla-bot Bot added the cla-signed The user has signed the Contributor License Agreement label May 21, 2026
@zed-community-bot zed-community-bot Bot added the staff Pull requests authored by a current member of Zed staff label May 21, 2026
@tomhoule tomhoule force-pushed the tomhoule-uzqlqrtwuzwm branch from 162a135 to 88145a4 May 21, 2026 15:11
@tomhoule

Contributor Author

I tested this manually. It works as it should.

@tomhoule tomhoule marked this pull request as ready for review May 21, 2026 15:25
@Vlaaaaaaad

Vlaaaaaaad commented May 21, 2026

Contributor

@tomhoule hehe, I was just working on something similar to this. I wanted to implement support for Bedrock service tiers but then I saw that in addition to Bedrock supporting 4 service tiers (standard, flex, priority, reserved) there's also Google with support for 3 service tiers (standard, flex, priority), OpenRouter with support for 2 service tiers (flex, priority), and OpenAI which supports 4 service tiers (standard, flex, priority, scale) 😅
My idea was to implement a drop-down similar to the already-existing reasoning effort level option. That would be more flexible (fast mode is on/off which means only 2 tiers) and it would extend this feature to all LLM providers in Zed (a quick search only showed Bedrock, Google, OpenRouter, and OpenAI supporting different service tiers but I didn't properly research the other built-in LLM providers).

I was looking at doing that in addition to the current fast mode, especially since for Anthropic that's an Anthropic-specific thing. This PR seems to merge fast mode with service tiers instead which seems like it will be limiting?
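
The tier lists in this comment can be written down as a small lookup, which is roughly the data a tier drop-down would need. This is a sketch only: the `Provider` and `ServiceTier` enums and both functions are hypothetical names, and the tier sets are copied from the comment above rather than checked against each provider's documentation.

```rust
// Hypothetical types for illustration; none of these exist in Zed as written.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ServiceTier {
    Standard,
    Flex,
    Priority,
    Reserved,
    Scale,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Provider {
    Bedrock,
    Google,
    OpenRouter,
    OpenAi,
}

/// Tiers each provider supports, as listed in the comment above.
pub fn supported_tiers(provider: Provider) -> &'static [ServiceTier] {
    use ServiceTier::*;
    match provider {
        Provider::Bedrock => &[Standard, Flex, Priority, Reserved],
        Provider::Google => &[Standard, Flex, Priority],
        Provider::OpenRouter => &[Flex, Priority],
        Provider::OpenAi => &[Standard, Flex, Priority, Scale],
    }
}

/// An on/off fast mode can only select the priority tier, so it is
/// available exactly where that tier is supported.
pub fn fast_mode_available(provider: Provider) -> bool {
    supported_tiers(provider).contains(&ServiceTier::Priority)
}
```

Under these assumptions, a fast mode toggle covers only one of each provider's tiers, which is the limitation the comment raises.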

@tomhoule

tomhoule commented May 21, 2026

Contributor Author

We can still expose service tiers separately later, but like you noted, that's a much more general and involved feature. I feel that in this specific case it's justified, because Codex itself calls it fast mode.

@Vlaaaaaaad

Contributor

> in this specific case it's justified, because Codex itself calls it fast mode

I think that may be a side-effect of OpenAI being a bit ~~evil~~ mischievous and delaying implementing native support for flex and other service tiers in Codex, but I see why y'all would want consistency!

> We can still expose service tiers separately later, but like you noted, that's a much more general and involved feature.

Valid approach!

@tomhoule

Contributor Author

I definitely don't want to discourage the work on priority tiers :) Once that's fleshed out and implemented, it will be more flexible than just fast mode, and we can switch OpenAI models over to a priority-tiers-based UI. But I'm hoping to get fast mode out very soon, so I think it makes sense to start with the simpler switch (especially since we already have the button etc. implemented for Anthropic's fast mode).

@Vlaaaaaaad

Contributor

> I definitely don't want to discourage the work on priority tiers

No worries, you won't! It's been on my todo list for so long that now I am in "this is annoying me, i wanna get it done" territory 😅 I should have a nice PR opened up by Tuesday 🤞

> But I'm hoping to get fast mode out very soon, so I think it makes sense to start with the simpler switch (especially since we already have the button etc. implemented for Anthropic's fast mode).

Totally legit! I commented here because I thought "here is Fast mode in Zed" followed by a quick "and now there's no more Fast mode, here are service tiers in Zed" might confuse users, but y'all at Zed know the users and how to prioritize features way better than I do! (plus, you know, a service tiers implementation would require a bunch of testing and stabilization work so there would be some non-trivial delay between the two)

@Vlaaaaaaad

Contributor

> > in this specific case it's justified, because Codex itself calls it fast mode
>
> I think that may be a side-effect of OpenAI being a bit ~~evil~~ mischievous and openai/codex#2916, but I see why y'all would want consistency!

I think this statement was wrong? I am confused and doing more research only intensifies that confusion :lolsob:

@bennetbo

Member

We should probably add release notes, but looks good!

@tomhoule tomhoule added this pull request to the merge queue May 27, 2026
Merged via the queue into main with commit 5e717a0 May 27, 2026
34 checks passed
@tomhoule tomhoule deleted the tomhoule-uzqlqrtwuzwm branch May 27, 2026 09:21
TomPlanche pushed a commit to TomPlanche/zed that referenced this pull request Jun 2, 2026
…zed-industries#57412)

Maps the existing `Speed::Fast` plumbing to OpenAI's `service_tier:
"priority"`, which matches what "fast mode" in Codex does. Relevant docs
[here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-service_tier).

Like for the existing Anthropic fast mode we have a
`Model::supports_priority` method for the variants on
https://openai.com/api-priority-processing. Pro, nano, and legacy gpt-4
are excluded; Custom defaults to false.

This is gated to staff only for now (not in this diff, but the existing
fast mode feature), until we have the mechanism to require confirmation
before you enable fast mode.

Release Notes:

- Added support for Fast Mode (priority service tier) on the OpenAI API
provider.