USD per 1M tokens
| Model | Window | Input | Cached | Output |
|---|---|---|---|---|
| | Priority | 0.45 | 0.20 | 3.00 |
| | ASAP | 1.00 | 0.20 | 4.00 |
| | Standard | 0.50 | 0.12 | 2.50 |
| | Priority | 0.70 | 0.18 | 3.00 |
| | Flex | 0.40 | 0.08 | 1.80 |
| | ASAP | 1.40 | 0.26 | 4.40 |
| | Priority | 0.04 | 0.02 | 0.30 |
| | ASAP | 0.06 | 0.03 | 0.40 |
| | Standard | 0.12 | 0.08 | 0.60 |
| | Flex | 0.06 | 0.02 | 0.30 |
| | ASAP | 0.40 | 0.20 | 0.60 |
| | Standard | 0.07 | 0.05 | 0.40 |
| | ASAP | 0.14 | 0.07 | 0.40 |
| | ASAP | 1.75 | 0.15 | 4.50 |
| | ASAP | 0.30 | 0.06 | 1.20 |

- Sail supports four completion windows: `standard`, `priority`, `flex`, and `asap`. See Completion Windows for details.
- Not all models support all windows. We regularly bring up new models and expand completion window support for existing ones based on demand. If you have a need that’s not represented above, get in touch.
- Prompt caching is implicit, based on prefix matching. Optionally, you may use `prompt_cache_key` as a routing hint to help maximize cache hit rates.
- See Models for capabilities and other details on supported models.
- To see what these rates add up to on a full agent workload, use the agent cost calculator.
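
For a quick estimate of a single request, the per-1M-token rates can be combined by hand. The sketch below is illustrative only: the `Rates` class and `request_cost` function are hypothetical helpers, not part of any Sail SDK, and it assumes that tokens served from the prompt cache are billed at the cached rate while the rest of the prompt is billed at the input rate. Check your actual usage records before relying on the numbers.

```python
from dataclasses import dataclass

TOKENS_PER_UNIT = 1_000_000  # rates in the table are USD per 1M tokens


@dataclass(frozen=True)
class Rates:
    """USD per 1M tokens, copied from one row of the table (hypothetical helper)."""
    input: float
    cached: float
    output: float


def request_cost(rates: Rates, input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumption: `cached_tokens` is the part of the prompt served from the
    prefix cache and is billed at the cached rate; the remaining
    `input_tokens - cached_tokens` are billed at the input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * rates.input
        + cached_tokens * rates.cached
        + output_tokens * rates.output
    )
    return total / TOKENS_PER_UNIT


# Rates taken from one Standard row above: input 0.50, cached 0.12, output 2.50.
standard = Rates(input=0.50, cached=0.12, output=2.50)
cost = request_cost(standard, input_tokens=200_000, cached_tokens=150_000, output_tokens=10_000)
print(f"${cost:.4f}")  # prints $0.0680
```

With the same token counts and no cache hits, the estimate rises to $0.1250, which shows how much a stable prompt prefix can matter on long agent runs.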