Status & Roadmap - QuickSilver Pro

All systems operational

Services

90-day history · client-measured

Inference API

api.quicksilverpro.io

Dashboard backend

pay.quicksilverpro.io

Website

quicksilverpro.io

Model availability

Synthetic probe · updates every 3 min

DeepSeek V4 Flash

deepseek-v4-flash

1M context · cheap chat, thinks by default

DeepSeek V4 Pro

deepseek-v4-pro

1M context · premium reasoning

DeepSeek V3

deepseek-v3

128K context · general-purpose

DeepSeek R1

deepseek-r1

128K context · reasoning

Qwen3.7 Max

qwen3.7-max

1M context · Qwen 3.7 flagship · agent

Qwen3.6 Plus

qwen3.6-plus

1M context · 1T-MoE flagship · thinks by default

Qwen3.6-35B-A3B

qwen3.6-35b

3B active · 262K context · MoE upgrade

Qwen3.5-35B-A3B

qwen3.5-35b

3B active · 262K context · MoE

Kimi K2.6

kimi-k2.6

256K context · agentic / planning

Roadmap - how we become a real inference company

Now - launched on a curated catalog

Live

Customers save 20% today on a curated catalog with low list prices. Keeping the catalog narrow limits the operational surface, which is what keeps that price gap honest, and the same surface carries straight into Phase 2.

Q2 2026 - our own inference stack on H100/H200

Planned

Self-hosted serving on dedicated GPUs using SGLang + continuous batching, EAGLE-3 speculative decoding, FP8 quantization via DeepGEMM, and SageAttention / ThunderMLA custom kernels. At that point system_fingerprint becomes stable (it changes only when we rev the stack), and repeatable-seed workflows start working properly. Target: 30-50% below current prices on DeepSeek V3.
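As a rough illustration of why a stable system_fingerprint matters for seeded workflows: a client can only treat two same-seed runs as comparable when the backend reported the same fingerprint for both. The sketch below assumes an OpenAI-style response shape (a top-level system_fingerprint and the text under choices[0].message.content). This page does not document the response format, so the field layout and the helper name compare_seeded_runs are assumptions for illustration only.

```python
from typing import Any, Mapping


def compare_seeded_runs(first: Mapping[str, Any], second: Mapping[str, Any]) -> str:
    """Classify two responses produced with the same seed and prompt.

    Assumes an OpenAI-style response dict (not confirmed by this page):
    a top-level "system_fingerprint" and the generated text under
    choices[0]["message"]["content"].
    """
    fp_a = first.get("system_fingerprint")
    fp_b = second.get("system_fingerprint")

    # Without a fingerprint on both sides we cannot tell whether the
    # serving stack changed between the two calls.
    if fp_a is None or fp_b is None:
        return "unknown-stack"

    # A different fingerprint means the backend changed, so differing
    # output is expected even with an identical seed.
    if fp_a != fp_b:
        return "stack-changed"

    text_a = first["choices"][0]["message"]["content"]
    text_b = second["choices"][0]["message"]["content"]
    return "reproduced" if text_a == text_b else "diverged"


if __name__ == "__main__":
    # Hand-written sample responses, not real API output.
    run_1 = {"system_fingerprint": "fp_a", "choices": [{"message": {"content": "42"}}]}
    run_2 = {"system_fingerprint": "fp_b", "choices": [{"message": {"content": "41"}}]}
    print(compare_seeded_runs(run_1, run_1))
    print(compare_seeded_runs(run_1, run_2))
```

Only the "diverged" case (same fingerprint, same seed, different text) would point to a genuine reproducibility problem; "stack-changed" simply means the comparison is not meaningful.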

H2 2026 - colocated data center + AIDC partnerships

Future

Move from rented GPUs (Vast.ai) to self-owned or colocated racks. Partner with AI-datacenter operators where that makes sense. The goal is the cheapest reliable inference for open-source models on the planet - full stack, our engineering.

About this page

Service rows are measured by client-side probes that run in your browser. Model rows reflect a real 1-token probe sent from our backend every 3 minutes. Historical bars show recent probe results stored in this browser's localStorage, so the history does not carry over if you switch devices.
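The history bars described above boil down to keeping a rolling window of probe results and summarizing it. The page itself runs in the browser, so the following Python is only a sketch of the logic, not the page's actual code: the ProbeResult type and both function names are hypothetical, and the 90-day window is taken from the "90-day history" label on the Services section.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ProbeResult:
    at: datetime  # when the probe ran (timezone-aware)
    ok: bool      # True if the endpoint answered successfully


def trim_history(
    results: Iterable[ProbeResult], now: datetime, window_days: int = 90
) -> List[ProbeResult]:
    """Drop probes older than the window and return the rest, oldest first."""
    cutoff = now - timedelta(days=window_days)
    return sorted((r for r in results if r.at >= cutoff), key=lambda r: r.at)


def uptime_percent(results: Iterable[ProbeResult]) -> Optional[float]:
    """Share of successful probes, or None when there is no history yet."""
    items = list(results)
    if not items:
        return None
    return 100.0 * sum(1 for r in items if r.ok) / len(items)


if __name__ == "__main__":
    now = datetime(2026, 5, 1, tzinfo=timezone.utc)
    history = [
        ProbeResult(now - timedelta(days=100), ok=False),  # outside the window
        ProbeResult(now - timedelta(days=2), ok=True),
        ProbeResult(now - timedelta(days=1), ok=False),
        ProbeResult(now - timedelta(hours=6), ok=True),
        ProbeResult(now - timedelta(minutes=3), ok=True),
    ]
    recent = trim_history(history, now)
    print(len(recent), uptime_percent(recent))
```

Returning None for an empty history, rather than 0 or 100, keeps a fresh browser with no stored probes from showing a misleading figure.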

Public uptime tracking began 2026-04-16. For a contractual SLA and third-party-monitored history, contact us.
