Badges — Powered by QuickSilver Pro

Terms — the short version

Badges

fast, low-cost, 200K context

Anthropic flagship, 200K context

balanced mid-tier, 200K context

reasoning, math, o1-equivalent

general chat, coding, tool calling

1M ctx, thinks by default, ~74% cheaper than V3

premium reasoning, 1M context

1M context, multimodal, thinking

1M context, image generation

cheapest Gemini, 1M context

pro mid-tier reasoning, 1M context

newest low-cost workhorse, 1M context

flagship reasoning, 1M context

next-gen Flash GA, 1M context

next-gen flash, 1M context

pro image generation

Opus-class reasoning, 256K

262K long-context, RAG

262K long-context, MoE upgrade

1T-MoE flagship, 1M context, thinks by default

Qwen 3.7 flagship, 1M context, thinks by default

Launch bonus: we match your first deposit 100%, up to $50 free. Drop in to the official OpenAI SDK and start saving.

When your workload could use open-source quality, QSP is 6×–30× cheaper than Azure's closed catalog. No resource provisioning, no AAD setup.

QuickSilver Pro vs Azure OpenAI

~9% cheaper on DeepSeek R1 output, and DeepSeek V4 / Qwen 3.6 / Kimi K2.6 aren't on Bedrock yet. Drop-in OpenAI SDK — no SigV4 or AWS plumbing.

QuickSilver Pro vs AWS Bedrock

Lower list price on DeepSeek V3; R1 output ~9% cheaper (input at parity). DeepInfra's cache discount may change the math for cache-heavy prompts.

QuickSilver Pro vs DeepInfra

~32% cheaper on V3, ~75% cheaper on R1 output. OpenAI-compatible surface, same tool-calling semantics.

QuickSilver Pro vs Fireworks AI

Different categories: managed token-priced LLM API vs serverless GPU. QSP for stock open-source chat; Modal for custom models or non-LLM workloads.

QuickSilver Pro vs Modal

Managed inference vs containers you deploy on H100s. QSP is faster to ship when you don't have a GPU fleet to fill or strict data-locality requirements.

QuickSilver Pro vs NVIDIA NIM

20% cheaper on DeepSeek V4 Flash & Pro, V3, R1, Qwen 3.6 & 3.5-35B-A3B, and Kimi K2.6 at the per-token level. Same OpenAI-compatible API, two-line migration.

QuickSilver Pro vs OpenRouter

~71% cheaper on DeepSeek R1 output. Largest gap among resellers for reasoning-heavy workloads.

QuickSilver Pro vs Together AI

DeepSeek V3 is ~7× cheaper than Gemini 2.0 Pro on output. Plain OpenAI SDK — no GCP project, service-account JSON, or quota request.

QuickSilver Pro vs Vertex AI

Head-to-head pricing comparisons for QuickSilver Pro vs the competing OpenAI-compatible inference providers, plus per-model use-case guides with quickstart code.

Compare & Use — QuickSilver Pro

Compare & Use

Math, algorithms, multi-step planning. Open-source o1 alternative at 30x less.

Code generation, refactoring, tool-calling agents. $0.16 / $0.616 per 1M tokens.

262K context, 3B active MoE. RAG and long-document summarization at $0.111 input.

One quick question — it helps us know which channels actually reach developers like you.

How did you find QuickSilver Pro?

Check your inbox

Sign in with your email and password.

Sign in

Set a new password for this account, then we'll sign you in automatically.

Choose a new password

Enter your email and we'll send a link to set a new dashboard password.

Reset your password

Beta invite accepted.

This invite has already been claimed.

We'll email you a verification link to finish sign-up. {amount} free credits included.

We'll email you a verification link to finish sign-up. Top up to start — your first credit purchase is doubled, up to $50 free.

Create your account

Reset your password below — we'll email you a fresh reset link.

Account already exists.

Links are valid for 30 minutes. Enter your email below to send a fresh one.

Link expired.

We couldn't verify that link. Enter your email below to receive a new one.

Invalid link.

Stripe payment method

Keep your balance above a floor and recharge automatically with a saved card.

Auto Recharge

Credits are added to your key automatically within seconds. No refunds once applied.

Use the same email you registered with.

Friend or partner gave you a code? Apply it here to add {amount} of credits to this account. One per account.

Have a referral code?

Credits never expire. Pay only for what you use.

Buy Credits

Permanently remove your account, all API keys, and any remaining credits. This cannot be undone.

If a key was exposed (leaked to git, shared in a screenshot, posted in logs), use this to immediately revoke all keys. A fresh replacement key will be generated automatically.

Danger zone

Name

Create API key

will immediately start getting 401 errors. This can't be undone.

Delete this key?

Delete your account?

Editing {alias}. This caps the spend that can be charged to this specific key within a 30-day window; other keys are unaffected.

Monthly spend limit

All existing keys will be permanently revoked and stop working immediately. Any app using them will get 401 errors. A fresh replacement key will be generated so you're not locked out.

Your old keys have been revoked. Copy this new replacement key now — you won't see it again after closing this dialog.

All keys revoked

Revoke all API keys?

Create named keys for each app or environment. All keys share your account balance.

API Keys

Full keys are shown only once at creation time. If a key leaks, delete it and create a new one - other keys keep working.

Keep your keys secret.

Top open-source models at 20% below market.

Models

Paste these three values wherever your agent / CLI / IDE asks for an "OpenAI-compatible endpoint". Keep your key secret.

Connect your agent

Your key is provisioned but balance is zero, so the demo call below will 402 until you load credits. Launch bonus: we match your first deposit 100%, up to $50 free. Pay $5, get $10.

Top up to enable your key.

Copy this command, paste into your terminal. Costs about

Make your first call

Friends signing up with your link get {amount} in credits on their first top-up. We add the same {amount} to your balance at the same time.

Invite friends — get {amount} each

Your balance is depleted ({spent} spent so far). Add credits to keep making API calls — same pricing, same models, pay as you go.

You're out of credits

Your account activity and usage at a glance.

Overview

Agent-friendly:

V4 Flash, V4 Pro, and Kimi K2.6 think by default - pass

in the request body for V3-style cheap chat. (Qwen 3.6 always thinks today; DeepSeek R1 is a dedicated reasoning model - don't send this flag to either.)

Available models:

Vercel AI SDK users:

Drop-in replacement for OpenAI. Change one line.

Quick Start

Spend and requests across models.

Usage

The page you're looking for doesn't exist or has been moved.

Everything you might want to know about QuickSilver Pro — the OpenAI-compatible inference API for DeepSeek V4 Flash & Pro, V3, R1, Qwen 3.6 & 3.5-35B-A3B, and Kimi K2.6.

FAQ — QuickSilver Pro

Frequently asked questions

Plug in your monthly usage — see the cost on QSP vs every competitor.

How much would you save?

output with stable exit codes — Claude Code, Cursor, Aider can call it without parsing HTML.

Create an account, buy credits, get your API key in 30 seconds.

Start saving on inference today

Common questions

The 9 most popular open-source models — DeepSeek V4 Flash & Pro, V3, R1, Qwen 3.7 Max + 3.6 Plus + 3.6 + 3.5, Kimi K2.6 — through an OpenAI-compatible API. Cheaper than every other reseller. Change one line of code.

20% below the rest.

Open-source inference,

Customers save 20% today on a curated catalog. The gap comes from a narrow operational surface and tight engineering, not margin shaving. Phase 2 widens the gap as more of the stack moves in-house.

Today: launched on a curated catalog

We're building a self-hosted serving layer on H100/H200 using SGLang + continuous batching, EAGLE-3 speculative decoding, FP8 quantization via DeepGEMM, and SageAttention / ThunderMLA custom kernels. Target: another 30-50% below current prices on DeepSeek V3.

Next: our own inference stack on dedicated GPUs

Weights are public — we can actually run and optimize them. Closed models (GPT-4, Claude) don't expose weights, so no amount of infra work makes them cheaper. That's why our catalog is 7 open models we can verify, route, and eventually host ourselves.

Why open-source is the only way there

Per 1M tokens for text models · per image or per audio minute where noted.

Cheapest open-source inference

DeepSeek V3 for tool-calling agents →

Qwen3.5-35B-A3B for 262K RAG →

DeepSeek R1 for math & algorithms →

vs DeepInfra

vs Fireworks

vs OpenAI

vs OpenRouter

vs Together AI

OpenAI-compatible API for top open-source LLMs — Qwen 3.7 Max & 3.6 Plus (new), DeepSeek V4 Flash & Pro, V3, R1, Kimi K2.6 — 20% cheaper than OpenRouter, Together AI, Fireworks. One-line drop-in. Launch bonus: match 100% of your first credit purchase, up to $50 free.

Qwen 3.7 Max, DeepSeek V4, R1, Kimi K2.6 API · 20% cheaper · QSP

The Controller determines the purposes and means of processing Personal Data submitted to the service. The Processor processes that Personal Data solely on documented instructions from the Controller (via API request parameters) and only to provide the service.

Data subjects: end-users of the Controller's application whose inputs are submitted to the API. Categories: content of prompts and completions (which may contain any Personal Data the Controller chooses to send), account metadata, and usage metadata. Duration: for the term of the service agreement.

Process Personal Data only on Controller's instructions, unless required by law (in which case the Processor will notify the Controller first where legally permitted).

Impose confidentiality obligations on personnel authorized to process Personal Data.

Implement the technical and organizational measures described in Section 6.

Not train machine-learning models on Controller Personal Data.

Assist the Controller with data subject requests and regulatory investigations, at the Controller's reasonable cost for non-routine requests.

The Controller authorizes the Processor to engage the sub-processors listed on our Privacy Policy. We will give at least 30 days' notice (by email or dashboard) before adding or replacing a sub-processor. If the Controller reasonably objects, the Controller may terminate and receive a pro-rata refund of prepaid unused fees.

Personal Data may be processed in the United States. For transfers originating in the EEA, UK, or Switzerland, the parties rely on the EU Standard Contractual Clauses (Module Two, Controller-to-Processor) and the UK International Data Transfer Addendum, which are incorporated by reference upon execution of the enterprise DPA.

TLS 1.2+ for data in transit; HSTS enforced.

API keys stored as SHA-256 hashes; shown in cleartext only once at creation.

Webhook signatures verified with HMAC-SHA256 and timestamp freshness window.

Prompt and completion content not persisted to our storage (held in memory only during request processing).

Access to production systems restricted to authorized personnel, with logs retained for investigation.

Hosted on Railway (SOC 2 Type II baseline) and Cloudflare edge.

We will notify the Controller without undue delay, and in any event within 72 hours of becoming aware, of any confirmed Personal Data breach affecting Controller Personal Data. Notifications will include the nature of the breach, categories and approximate number of data subjects, likely consequences, and measures taken.

On termination, we will delete all Personal Data within 30 days, subject to any retention required by law (for tax/accounting, up to 7 years in aggregated, non-personal form). The Controller may export account and usage metadata via the dashboard or by API at any time during the service term.

The Processor will make available to the Controller, upon request, its most recent third-party audit reports (once completed, SOC 2 Type II) or equivalent summaries. Where additional audit rights are required by applicable law, the Processor will reasonably cooperate at the Controller's expense.

In case of conflict between this DPA and the Terms of Service, this DPA controls as to processing of Personal Data.

Data Protection contact: hello@quicksilverpro.io — MachineFi Inc., 68 Willow Road, Menlo Park, CA 94205, USA.

Data Processing Addendum — QuickSilver Pro

Data Processing Addendum

Account data: email address you provide on sign-up.

API keys: we store only a cryptographic hash; the raw secret is shown once at creation and not recoverable.

Usage metadata: timestamps, model name, token counts (prompt and completion), request duration, and per-request cost — for billing and abuse detection.

Payment data: handled by Stripe. We see the last four digits of the card and the billing email; we do not see full card numbers.

We do not persist the contents of your API prompts or model completions. Requests are processed in memory during the call and discarded on completion. Infrastructure partners we use for compute may have their own retention policies — see Section 5.

We use collected data to: operate and bill the service, enforce usage limits, investigate abuse, send transactional emails (receipts, security notices), and comply with legal obligations. We do not sell your data to third parties, and we do not train machine-learning models on your prompts or completions.

Account metadata is kept while your account is active and for 12 months after closure (for tax and accounting records). Usage logs are kept for 90 days. You can request earlier deletion by emailing hello@quicksilverpro.io; we will honor verified requests within 30 days subject to legal retention requirements.

We share data only with the providers required to operate the service.

Data in transit is protected with TLS 1.2+. API keys are stored as SHA-256 hashes. Stripe webhook signatures are verified against an HMAC secret. Our backend runs on Railway (SOC 2 Type II baseline). We do not persist prompt content.

You can request access to, correction of, export of, or deletion of your personal data by emailing hello@quicksilverpro.io. California residents have additional rights under the CCPA, and EU/UK residents under the GDPR, including the right not to be subject to unlawful automated decision-making. We will verify your identity before acting on requests.

We use localStorage only, to remember your signed-in dashboard session across visits. We do not use tracking cookies, analytics cookies, or ad networks.

The service is not intended for use by anyone under 18.

We will announce material changes to this policy by email at least 30 days before taking effect.

MachineFi Inc., 68 Willow Road, Menlo Park, CA 94205, USA — hello@quicksilverpro.io

Privacy Policy — QuickSilver Pro

Privacy Policy

We provide an OpenAI-compatible HTTP API for serving popular open-source language models at predictable prices. Availability, model selection, and pricing are subject to change with reasonable notice.

You are responsible for keeping your API keys confidential and for all activity under your account. You may rotate or revoke keys at any time from the dashboard. Accounts are for a single legal entity; do not share accounts.

Service is prepaid in USD via Stripe. Credits never expire while your account remains active. Purchased credits are non-refundable once spent. Unused credits may be refunded within 7 days of purchase on written request to hello@quicksilverpro.io. Stripe charges appear on your statement as MACHINEFI INC.

You may not use the service to (a) generate content that is illegal, infringing, or otherwise violates our published usage policies; (b) attempt to reverse-engineer the service, extract model weights, or bypass rate limits; (c) send automated spam or facilitate abuse; or (d) resell the raw API without a separate written agreement.

We may throttle or temporarily block traffic that we reasonably believe is abusive or that threatens service stability for other customers. Our best-effort uptime target is 99.5% monthly, excluding scheduled maintenance and events outside our reasonable control. We do not offer a contractual SLA on standard plans.

The service is provided "as is" and "as available". We disclaim all warranties, express or implied, including fitness for a particular purpose and non-infringement. Model outputs may be inaccurate, offensive, or unsuitable for your use case; you are responsible for reviewing outputs before relying on them.

To the maximum extent permitted by law, our aggregate liability arising out of or related to your use of the service will not exceed the fees you paid to us in the 12 months preceding the claim. We are not liable for indirect, incidental, or consequential damages.

You may close your account at any time by emailing hello@quicksilverpro.io. We may suspend or terminate accounts that violate these Terms. Remaining unused credits will be refunded within 30 days unless termination is for breach of Section 4.

We may modify the service, pricing, or these Terms. Material changes will be announced by email or dashboard notice at least 30 days before taking effect. Continued use after the effective date constitutes acceptance.

These Terms are governed by the laws of the State of California. Any dispute will be resolved exclusively in the state or federal courts located in San Mateo County, California.

Terms of Service — QuickSilver Pro

Terms of Service

QuickSilver Pro

Stats — QuickSilver Pro

Honest stats only. Aggregate counters (tokens served, dollars saved) go live as the user base grows enough for the numbers to be credible — see the "Live counters" note below.

QuickSilver Pro by the numbers

Customers save 20% today on a curated catalog at low list prices. The narrow operational surface is what keeps the gap honest, and the same surface scales straight into Phase 2.

Now - launched on a curated catalog

Self-hosted serving on dedicated GPUs using SGLang + continuous batching, EAGLE-3 speculative decoding, FP8 quantization via DeepGEMM, and SageAttention / ThunderMLA custom kernels. At that point system_fingerprint becomes stable (it changes only when we rev the stack), and repeatable-seed workflows start working properly. Target: 30-50% below current prices on DeepSeek V3.

Q2 2026 - our own inference stack on H100/H200

Move from rented (Vast.ai) to self-owned or colocated racks. Partner with AI-datacenter operators where that makes sense. The goal is the cheapest reliable inference for open-source models on the planet - full stack, our engineering.

H2 2026 - colocated data center + AIDC partnerships

QuickSilver Pro system status

Email raullen@machinefi.com with rough monthly spend, model mix, and any compliance requirements. A founder replies — usually same business day.

Reserved per-key throughput so a noisy neighbor in the shared pool can't slow your prod traffic. Burst capacity scales with your contract. Optional region pinning for data-residency requirements.

Dedicated capacity

Same OpenAI-compatible API as the self-serve plan, same 7 models, same client SDKs. Start in dev on a self-serve key; once spend is predictable, flip to invoice and your existing integration keeps working — only the billing terms change.

Production inference for engineering teams. Invoice billing, monthly net-30 terms, volume discounts past $5K/mo, 99.9% uptime target, dedicated per-key capacity, and a real human you can email.

QuickSilver Pro for teams — invoice billing, SLA, volume pricing

Production inference for engineering teams running $1K–$50K/mo of traffic. Same 7-model OpenAI-compatible API. Invoice billing, volume pricing, SLA, dedicated capacity, and an email you can actually send.

QuickSilver Pro Docs

What is QuickSilver Pro?

Start here

Conventions used in these docs