open_ai: Support fast mode in BYOK via the Responses API service_tier #57412
Maps the existing `Speed::Fast` plumbing onto OpenAI's `service_tier: "priority"`. Relevant docs [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-service_tier). As with the existing Anthropic fast mode, we have a `Model::supports_priority` method for the variants listed on https://openai.com/api-priority-processing. Pro, nano, and legacy gpt-4 are excluded; Custom defaults to false. This is gated to staff only for now, until we have a mechanism to require confirmation before you enable fast mode.
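For readers skimming the thread, the mapping described above could be sketched roughly as follows. This is not the code from the diff: the `Model` variants, the shape of `Speed`, and the `service_tier` helper are hypothetical stand-ins chosen for illustration. Only the `Speed::Fast` and `Model::supports_priority` names, the `"priority"` value, and the exclusion rules come from the description.

```rust
/// Hypothetical stand-in for the speed setting the PR refers to as `Speed`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Speed {
    Standard,
    Fast,
}

/// Illustrative model variants. These names are assumptions, not the real enum.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Model {
    /// Placeholder for any variant listed as supporting priority processing.
    Flagship,
    Pro,
    Nano,
    LegacyGpt4,
    /// User-defined model; the description says this defaults to false.
    Custom { name: String },
}

impl Model {
    /// Whether fast mode may be offered for this model.
    pub fn supports_priority(&self) -> bool {
        match self {
            Model::Flagship => true,
            // Excluded per the PR description.
            Model::Pro | Model::Nano | Model::LegacyGpt4 => false,
            // Unknown models opt out by default.
            Model::Custom { .. } => false,
        }
    }
}

/// Hypothetical helper: the `service_tier` value to send, if any.
/// `None` means the field is omitted from the request.
pub fn service_tier(model: &Model, speed: Speed) -> Option<&'static str> {
    match speed {
        Speed::Fast if model.supports_priority() => Some("priority"),
        _ => None,
    }
}
```

Returning `None` for unsupported models means a stale fast-mode setting degrades to a normal request instead of sending a tier the model may not accept.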
I tested this manually. It works as it should.
@tomhoule hehe, I was just working on something similar to this. I wanted to implement support for Bedrock service tiers but then I saw that in addition to Bedrock supporting 4 service tiers ( I was looking at doing that in addition to the current fast mode, especially since for Anthropic that's an Anthropic-specific thing. This PR seems to merge fast mode with service tiers instead, which seems like it will be limiting?
We can still expose service tiers separately later, but like you noted, that's a much more general and involved feature. I feel that in this specific case it's justified, because Codex itself calls it fast mode.
I think that may be a side-effect of OpenAI being a bit
Valid approach!
I definitely don't want to discourage the work on priority tiers :) Once that's fleshed out and implemented, it will be more flexible than just fast mode and we can switch OpenAI models over to a priority-tiers-based UI. But I'm hoping to get fast mode out very soon, so I think it makes sense to start with the simpler switch (especially since we already have the button etc. implemented for Anthropic's fast mode).
No worries, you won't! It's been on my todo list for so long that now I am in "this is annoying me, i wanna get it done" territory 😅 I should have a nice PR opened up by Tuesday 🤞
Totally legit! I commented here because I thought "here is Fast mode in Zed" followed by a quick "and now there's no more Fast mode, here are service tiers in Zed" might confuse users, but y'all at Zed know the users and how to prioritize features way better than I do! (plus, you know, a service tiers implementation would require a bunch of testing and stabilization work so there would be some non-trivial delay between the two)
I think this statement was wrong? I am confused and doing more research only intensifies that confusion :lolsob:
We should probably add release notes, but looks good!
…zed-industries#57412)

Maps the existing `Speed::Fast` plumbing to OpenAI's `service_tier: "priority"`, which matches what "fast mode" in Codex does. Relevant docs [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-service_tier).

As with the existing Anthropic fast mode, we have a `Model::supports_priority` method for the variants listed on https://openai.com/api-priority-processing. Pro, nano, and legacy gpt-4 are excluded; Custom defaults to false.

This is gated to staff only for now (not in this diff, but the existing fast mode feature), until we have a mechanism to require confirmation before you enable fast mode.

Release Notes:

- Added support for Fast Mode (priority service tier) on the OpenAI API provider.