Launch bonusWe match 100% of your first credit purchase — up to $50 free

Open-source inference,20% below the rest.

The 9 most popular open-source models — DeepSeek V4 Flash & Pro, V3, R1, Qwen 3.7 Max + 3.6 Plus + 3.6 + 3.5, Kimi K2.6 — through an OpenAI-compatible API. Cheaper than every other reseller. Change one line of code.

Get API Key View Pricing

or try the models live on HuggingFace — no signup required.

No subscription
OpenAI compatible
Pay as you go
Text + images

Drop-in with

OpenAI SDK
Aider
Cursor
Cline
Continue.dev
LangChain
Vercel AI SDK

python

1# One line change. That's it.
2from openai import OpenAI
3 
4client = OpenAI(
5    base_url="https://api.quicksilverpro.io/v1",
6    api_key="your-api-key",
7)

Pricing

Cheapest open-source inference

Per 1M tokens for text models · per image or per audio minute where noted.

Model

Context

Input

Output

Savings

DeepSeek V4 FlashNew

deepseek-v4-flash

fast chat & coding, 1M context, thinking on by default

$0.08$0.10

$0.16$0.20

−20%

DeepSeek V4 ProNew

deepseek-v4-pro

premium reasoning, 1M context

$0.348$0.435

$0.696$0.87

−20%

DeepSeek V3

deepseek-v3

chat, coding, structured output

128K

$0.16$0.20

$0.616$0.77

−20%

DeepSeek R1Reasoning

deepseek-r1

math, multi-step reasoning, logic

128K

$0.56$0.70

$2.00$2.50

−20%

Qwen3.7 MaxNew

qwen3.7-max

Qwen 3.7 flagship, agent / coding, 1M context

$2.00$2.50

$6.00$7.50

−20%

Qwen3.6 PlusNew

qwen3.6-plus

thinks-by-default flagship, 1M context

$0.26$0.325

$1.56$1.95

−20%

Qwen3.6-35B-A3B

qwen3.6-35b

long-context RAG, drop-in 3.5 upgrade

262K

$0.12$0.15

$0.80$1.00

−20%

Qwen3.5-35B-A3B

qwen3.5-35b

long-context RAG, summarization

262K

$0.111$0.139

$0.80$1.00

−20%

Kimi K2.6

kimi-k2.6

Opus-class agentic / planning

256K

$0.584$0.73

$2.79$3.49

−20%

Claude Opus 4.8New

claude-opus-4-8

top-tier reasoning, coding, agentic

200K

$4.00$5.00

$20.00$25.00

−20%

Claude Opus 4.6New

claude-opus-4-6

deep reasoning and coding

200K

$4.00$5.00

$20.00$25.00

−20%

Claude Sonnet 4.6New

claude-sonnet-4-6

balanced mid-tier, fast & capable

200K

$2.40$3.00

$12.00$15.00

−20%

Claude Haiku 4.5New

claude-haiku-4-5

fast, low-cost, high-volume tasks

200K

$0.80$1.00

$4.00$5.00

−20%

Whisper Large V3 TurboNew

whisper-large-v3-turbo

fast audio transcription via /v1/audio/transcriptions

Audio

$0.0004/min$0.000667/min

—

−40%

FLUX.2 ProNew

flux.2-pro

flagship image generation

—

$0.027/img$0.031/img

−13%

FLUX.1 SchnellNew

flux.1-schnell

fast, high-volume image generation

—

$0.0025/img$0.003/img

−17%

Gemini 2.5 FlashNew

gemini-2.5-flash

multimodal chat, 1M context

$0.255$0.30

$2.125$2.50

−15%

Gemini 2.5 Flash ImageNew

gemini-2.5-flash-image

image generation

$0.255$0.30

$25.50$30.00

−15%

Gemini 2.5 Flash LiteNew

gemini-2.5-flash-lite

high-volume cheap tasks

$0.085$0.10

$0.34$0.40

−15%

Gemini 3 Flash PreviewNew

gemini-3-flash-preview

next-gen flash reasoning

$0.425$0.50

$2.55$3.00

−15%

Gemini 3 Pro Image PreviewNew

gemini-3-pro-image-preview

pro-grade image generation

$1.70$2.00

$102.00$120.00

−15%

Gemini 3.1 Pro PreviewNew

gemini-3.1-pro-preview

flagship reasoning, 1M context

$1.70$2.00

$10.20$12.00

−15%

Gemini 3.5 FlashNew

gemini-3.5-flash

next-gen Flash GA, 1M context

$1.275$1.50

$7.65$9.00

−15%

Gemini 2.5 ProNew

gemini-2.5-pro

pro mid-tier reasoning, 1M context

$1.0625$1.25

$8.50$10.00

−15%

Gemini 3.1 Flash LiteNew

gemini-3.1-flash-lite

newest low-cost workhorse, high volume

$0.2125$0.25

$1.275$1.50

−15%

Text-model comparisons use OpenRouter, Together AI, and Fireworks AI. Image and audio rows show QSP list pricing as of May 2026.

Side-by-side pricing vs every competitor

DeepSeek V3 for tool-calling agents →

Reasoning

DeepSeek R1 for math & algorithms →

Long context

Qwen3.5-35B-A3B for 262K RAG →

See all comparisons →

Calculator

How much would you save?

Plug in your monthly usage — see the cost on QSP vs every competitor.

Common totals (10:1 input/output):

Thinking model — output token counts include the reasoning trace, which is typically 3-10× the visible reply.

Input tokens / month1M

Output tokens / month300K

QuickSilver Pro

$0.13cheapest

OpenRouter

$0.16+25%

OpenAIclosed model analog

$0.33+2.6×

QSP saves 3¢/month vs OpenRouter (20% cheaper).

CLIqsp

Built for terminals and AI agents. --json output with stable exit codes — Claude Code, Cursor, Aider can call it without parsing HTML.

PyPI GitHub Quickstart →

FAQ

Common questions

What is QuickSilver Pro?

An OpenAI-compatible HTTP API for 9 top open-source LLMs — DeepSeek V4 Flash & Pro, V3, R1, Qwen 3.7 Max, 3.6 Plus, 3.6 + 3.5-35B-A3B, and Kimi K2.6. Point the official OpenAI SDK at our base URL and get the same chat-completions interface, 20% below competing resellers.

What's the difference between V3 and V4 Flash?

V4 Flash is DeepSeek's newest model (released April 2026): ~74% cheaper output than V3, 1M context vs 128K, and thinks by default (chain-of-thought reasoning) — so a one-token "Hi" can return ~175 reasoning tokens. For V3-style cheap chat without the thinking overhead, pass `reasoning: { enabled: false }` in the request body. Existing V3 keeps working unchanged.

How much cheaper than OpenRouter / OpenAI?

20% below the public per-token rates at OpenRouter, Together AI, Fireworks AI, and DeepInfra on the same open-source models. V4 Flash: $0.08 / $0.16. V4 Pro: $0.348 / $0.696. V3: $0.16 / $0.616. R1: $0.56 / $2.00. Qwen 3.7 Max: $2.00 / $6.00. Qwen 3.6 Plus: $0.26 / $1.56. Qwen 3.6: $0.12 / $0.80. Qwen 3.5: $0.111 / $0.80. Kimi K2.6: $0.584 / $2.79. We don't serve closed models (GPT-4, Claude).

Is it really a drop-in OpenAI replacement?

Yes. Change base_url to https://api.quicksilverpro.io/v1 in the official openai Python / Node / Swift SDKs. Streaming, tool calling, json_schema strict mode, and usage.cost accounting all work out of the box.

Is there a free tier?

Launch bonus: any first purchase between $5 and $50 doubles. Pay $5 and get $10. Pay $50 and get $100. One-time bonus on your first credit purchase. After that it's standard pay-as-you-go.

See all questions

Start saving on inference today

Create an account, buy credits, get your API key in 30 seconds.

Get API Key