🔥Price drop Qwen3‑Coder‑480B‑A35B‑Instruct:
• $0.40/M input tokens
• $1.60/M output tokens
One of the best open coding models — now more accessible via DeepInfra.
(Best prices, as always.)
DeepInfra
Fast ML inference. Run top AI models using a simple API.
- Just got some new GPUs, so we are lowering our prices for 7B models to $0.13 / 1M tokens. We also offer a lower price of $0.27 / 1M tokens for the latest mixtral-8x7b. DeepInfra will always provide the most cost-effective inference service.
- GLM-4.5 is here — latest drop from @Zai_org 🚀 Built for agentic workflows: reasoning, coding, tools. ✅ GLM-4.5 → 355B total / 32B active → $0.60 / $2.20 per Mtoken ✅ GLM-4.5-Air → 106B total / 12B active → $0.20 / $1.10 Smart models, smart prices. Cheapest at DeepInfra!
- 🚀OlmOCR on DeepInfra🚀 🔥 New LLM-based OCR model by @allen_ai 💸 Scrape 1000-page PDFs for just $0.15 📊 300x cheaper than competitors' prices
- Moonshot AI's Kimi K2 is now live on DeepInfra, as always at the best price of $0.55/$2.20, with full tool call and context support. Best open-source non-reasoning model available according to multiple benchmarks. Running on Nvidia Blackwell🇺🇸.
- Up to 100 tps Moonshot AI's Kimi K2, as always at the best price of $0.55/$2.20. Zero data retention, generous rate limits.
- Qwen3‑Coder now 200 TPS on DeepInfra at the best prices of $0.30/M input and $1.20/M output
- DeepSeek R1 is now live on the DeepInfra inference platform. 🌎 Hosted in the US with zero data retention. 💸 Always the best price: $0.85/$2.50 per 1M in/out tokens. Get started now!
- Claude 4 Opus & Sonnet now live on DeepInfra. Run them via our OpenAI-compatible API — fast, scalable, and infra-friendly.
- 🚀 We now have a Turbo version of Qwen3‑Coder at $0.30/M input tokens and $1.20/M output tokens. ⚡️Same accuracy (within 1% of original) ⚡️2× faster & cheaper One of the best open coding models - now faster & more affordable on DeepInfra 👇
- Gemini 2.5 Pro & Flash are now live on DeepInfra. OpenAI-compatible API. Full control over reasoning. ⚡ Flash: $0.105 / $2.45 🚀 Pro: $0.875 / $7.00 Cheapest on the market (prove us wrong).
- Qwen3-235B-A22B-Instruct-2507 is now live on DeepInfra. 🔧 Upgraded version of the original 235B “non-thinking” model 🧠 Better at reasoning, math, comprehension, tool use 💰 $0.13 / $0.60 per Mtoken (in/out) The scale is real. #Qwen3 #LLMs #InferenceInfra #DeepInfra
- We just broke 1000 TPS on our Llama 4 Maverick Turbo API endpoint! Hosted in the US on @nvidia Blackwell, delivering blazing-fast performance for your AI needs. Ready to scale?
- We have the best price for DeepSeek R1, the best AI model on the market: just $0.75/$2.40 per 1M tokens. 🎉 🔥More capacity 🚀Higher rate limits 🇺🇸Hosted on H200 in the US and EU 🇪🇺 Unleash the Deep AI Infrastructure at scale.
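
To make the per-million-token prices above concrete, here is a minimal sketch of how a request's cost could be estimated. The function name `estimate_cost` and the example token counts are hypothetical; the prices are the ones quoted in the posts, read as input/output price per 1M tokens (an assumption where a post does not label them).

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the USD cost of one request from per-1M-token prices.

    Hypothetical helper for illustration; not part of any DeepInfra SDK.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Prices quoted in the posts above, as (input, output) USD per 1M tokens.
# Treating each pair as input/output is an assumption for the unlabeled ones.
PRICES = {
    "Qwen3-Coder-480B-A35B-Instruct": (0.40, 1.60),
    "GLM-4.5": (0.60, 2.20),
    "GLM-4.5-Air": (0.20, 1.10),
}

if __name__ == "__main__":
    # Example workload (made-up numbers): 200k input tokens, 50k output tokens.
    for model, (in_price, out_price) in PRICES.items():
        cost = estimate_cost(200_000, 50_000, in_price, out_price)
        print(f"{model}: ${cost:.4f}")
```

Because input and output are billed at different rates, the ratio of prompt to completion tokens in a workload can matter as much as the headline price.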







