Open-source inference,20% below the rest.
The 9 most popular open-source models — DeepSeek V4 Flash & Pro, V3, R1, Qwen 3.7 Max + 3.6 Plus + 3.6 + 3.5, Kimi K2.6 — through an OpenAI-compatible API. Cheaper than every other reseller. Change one line of code.
or try the models live on HuggingFace — no signup required.
- No subscription
- OpenAI compatible
- Pay as you go
- Text + images
- OpenAI SDK
- Aider
- Cursor
- Cline
- Continue.dev
- LangChain
- Vercel AI SDK
1# One line change. That's it.2from openai import OpenAI34client = OpenAI(5 base_url="https://api.quicksilverpro.io/v1",6 api_key="your-api-key",7)
Cheapest open-source inference
Per 1M tokens for text models · per image or per audio minute where noted.
claude-opus-4-8claude-opus-4-6claude-sonnet-4-6claude-haiku-4-5whisper-large-v3-turbogemini-2.5-flash-imagegemini-3-pro-image-previewText-model comparisons use OpenRouter, Together AI, and Fireworks AI. Image and audio rows show QSP list pricing as of May 2026.
Side-by-side pricing vs every competitor
How much would you save?
Plug in your monthly usage — see the cost on QSP vs every competitor.
qspBuilt for terminals and AI agents. --json output with stable exit codes — Claude Code, Cursor, Aider can call it without parsing HTML.
Common questions
An OpenAI-compatible HTTP API for 9 top open-source LLMs — DeepSeek V4 Flash & Pro, V3, R1, Qwen 3.7 Max, 3.6 Plus, 3.6 + 3.5-35B-A3B, and Kimi K2.6. Point the official OpenAI SDK at our base URL and get the same chat-completions interface, 20% below competing resellers.
V4 Flash is DeepSeek's newest model (released April 2026): ~74% cheaper output than V3, 1M context vs 128K, and thinks by default (chain-of-thought reasoning) — so a one-token "Hi" can return ~175 reasoning tokens. For V3-style cheap chat without the thinking overhead, pass `reasoning: { enabled: false }` in the request body. Existing V3 keeps working unchanged.
20% below the public per-token rates at OpenRouter, Together AI, Fireworks AI, and DeepInfra on the same open-source models. V4 Flash: $0.08 / $0.16. V4 Pro: $0.348 / $0.696. V3: $0.16 / $0.616. R1: $0.56 / $2.00. Qwen 3.7 Max: $2.00 / $6.00. Qwen 3.6 Plus: $0.26 / $1.56. Qwen 3.6: $0.12 / $0.80. Qwen 3.5: $0.111 / $0.80. Kimi K2.6: $0.584 / $2.79. We don't serve closed models (GPT-4, Claude).
Yes. Change base_url to https://api.quicksilverpro.io/v1 in the official openai Python / Node / Swift SDKs. Streaming, tool calling, json_schema strict mode, and usage.cost accounting all work out of the box.
Launch bonus: any first purchase between $5 and $50 doubles. Pay $5 and get $10. Pay $50 and get $100. One-time bonus on your first credit purchase. After that it's standard pay-as-you-go.
Start saving on inference today
Create an account, buy credits, get your API key in 30 seconds.
Get API Key