GPU-Bridge API Documentation
GPU-Bridge is an orchestration layer for AI workloads: one API for 30 services and 99 models across 6 GPU backends, with automatic failover, upfront pricing, and x402 payments for autonomous agents. Base URL: https://api.gpubridge.io
Use POST /run for the universal multi-modal contract, POST /inference for dedicated LLM requests, and /mcp if your client prefers MCP tools over raw HTTP.
Quick Start
First successful request in under 2 minutes. Choose the flow that matches how you build.
Developers: API key + credits
# 1. Register (free, instant)
curl -X POST https://api.gpubridge.io/account/register \
  -H "Content-Type: application/json" \
  -d '{"email":"you@example.com"}'
# Returns: {"api_key":"gpub_...", ...}
# 2. Add credits ($10 minimum)
curl -X POST https://api.gpubridge.io/account/topup \
  -H "Authorization: Bearer gpub_your_key" \
  -H "Content-Type: application/json" \
  -d '{"package":"credits_25"}'
# 3. Run any service
curl -X POST https://api.gpubridge.io/run \
  -H "Authorization: Bearer gpub_your_key" \
  -H "Content-Type: application/json" \
  -H "X-Priority: fast" \
  -H "Idempotency-Key: req_001" \
  -d '{"service":"llm-4090","input":{"prompt":"Hello world"}}'
# 4. Retrieve output (pass API key for full response)
curl https://api.gpubridge.io/status/{job_id} \
  -H "Authorization: Bearer gpub_your_key"
AI agents: x402 on Base
# 1. Call without auth to receive HTTP 402 payment details
curl -X POST https://api.gpubridge.io/run \
  -H "Content-Type: application/json" \
  -d '{"service":"ocr","input":{"image_url":"https://example.com/invoice.png"}}'
# 2. Send USDC on Base, then retry with X-Payment
curl -X POST https://api.gpubridge.io/run \
  -H "X-Payment: base64({\"txHash\":\"0x...\",\"from\":\"0xYourWallet\"})" \
  -H "Content-Type: application/json" \
  -d '{"service":"ocr","input":{"image_url":"https://example.com/invoice.png"}}'
TypeScript example
const res = await fetch("https://api.gpubridge.io/run", {
  method: "POST",
  headers: {
    "Authorization": "Bearer gpub_your_key",
    "Content-Type": "application/json",
    "X-Priority": "fast",
    "Idempotency-Key": "req_001",
  },
  body: JSON.stringify({
    service: "llm-4090",
    input: { prompt: "Summarize this changelog" },
  }),
});
const job = await res.json();
console.log(job);
Python example
import requests
res = requests.post(
    "https://api.gpubridge.io/run",
    headers={
        "Authorization": "Bearer gpub_your_key",
        "Content-Type": "application/json",
        "X-Priority": "fast",
        "Idempotency-Key": "req_001",
    },
    json={
        "service": "llm-4090",
        "input": {"prompt": "Summarize this changelog"},
    },
    timeout=30,
)
print(res.json())
Authentication & Payments
GPU-Bridge supports two request-time access patterns and one account funding option:
Option A: API key + credits (developers)
Register once, top up credits, then send Authorization: Bearer gpub_... on each request. Best for apps, backends, and teams that want account balance, job history, refunds, and spending limits.
Authorization: Bearer gpub_your_api_key
Option B: x402 on Base (AI agents)
No account or API key required. Call the endpoint, receive HTTP 402 payment details, send USDC on Base, then retry with X-Payment. GPU-Bridge pre-validates the target service before consuming the payment proof. See x402 Protocol for the full flow.
X-Payment: base64({"txHash":"0x...","from":"0xYourWallet"})
Option C: Crypto top-up (account funding)
If you want account-based usage but prefer crypto, top up your credit balance with USDC on Base via POST /account/topup-crypto. This is a funding method for API-key usage, not a separate per-request auth flow. Crypto top-ups carry a 0.5% fee, versus 2.9% for card payments.
When passing audio_url or image_url, use direct CDN links (e.g. soundhelix.com, imgbb.com). GitHub raw and Wikimedia URLs often return 403 from compute nodes.
MCP Server
GPU-Bridge exposes a remote MCP endpoint at /mcp for Smithery and other MCP-compatible clients. Use MCP when you want tool-native AI compute instead of hand-writing HTTP calls.
| Tool | Description |
|---|---|
| gpu_run | Run any GPU-Bridge service |
| gpu_catalog | Browse the live service catalog with pricing and model info |
| gpu_status | Check job status and retrieve results |
| gpu_balance | Check balance, daily spend, and volume discount tier |
| gpu_estimate | Estimate the cost of a service before running it |
POST /run
Universal orchestration endpoint. Use it when you want one contract across text, image, video, audio, vision, OCR, embeddings, reranking, and document parsing. GPU-Bridge selects the best available backend for the service and can reroute automatically when a backend degrades.
Request body
{
"service": "llm-4090",
"input": { "prompt": "Explain quantum computing", "max_tokens": 512 },
"webhook_url": "https://your-server.com/callback" // optional
}
Optional headers
| Header | Values | Description |
|---|---|---|
| X-Priority | fast or cheap | Routing hint. fast prefers the lowest-latency healthy backend. cheap prefers the lowest-cost healthy backend. |
| Idempotency-Key | Any unique string | Prevents duplicate jobs for API-key / credit-based requests. Reuse the same key when retrying after a network error. |
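The retry behaviour that Idempotency-Key enables for API-key requests can be sketched as follows. This is a minimal illustration, not the official client: the helper name submit_with_retry, the attempt count, and the backoff delays are assumptions, and the transport is passed in as a hypothetical send callable that is expected to raise the built-in ConnectionError on a network failure.

```python
import time
import uuid
from typing import Callable, Dict, Optional


def submit_with_retry(
    send: Callable[[Dict[str, str]], dict],
    api_key: str,
    max_attempts: int = 3,      # assumption: not a documented limit
    base_delay: float = 0.5,    # assumption: seconds before the first retry
    sleep: Callable[[float], None] = time.sleep,
    idempotency_key: Optional[str] = None,
) -> dict:
    """Call send(headers), reusing one Idempotency-Key on every attempt."""
    # Generate the key once so that every retry carries the same value.
    key = idempotency_key or f"req_{uuid.uuid4().hex}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "Idempotency-Key": key,
    }
    for attempt in range(max_attempts):
        try:
            return send(headers)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

In practice send would wrap the POST /run call shown in the Quick Start and translate the HTTP library's connection errors into ConnectionError.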
Idempotency-Key currently applies to API-key / credit-based requests. x402 clients should follow the x402 retry flow using the same request body and payment proof semantics.
Response (202)
{
"job_id": "a1b2c3d4-...",
"service": "llm-4090",
"status": "pending",
"status_url": "/status/a1b2c3d4-...",
"estimated_cost_usd": 0.003
}
POST /inference
Dedicated LLM route with a simpler body than POST /run. Use this when you only need text generation and want a prompt-centric schema. GPU-Bridge still routes across available LLM backends behind the scenes.
Request body
{
"model": "deepseek-ai/DeepSeek-V3.2",
"prompt": "Summarize this changelog in 5 bullets",
"system": "You are a concise release assistant.",
"max_tokens": 512,
"temperature": 0.2,
"gpu": "4090"
}
Use GET /catalog for the live list of supported model IDs. The gpu field selects the GPU tier, not the provider.
Response
{
"job_id": "a1b2c3d4-...",
"status": "completed",
"status_url": "/status/a1b2c3d4-...",
"output": { "text": "..." },
"estimated_cost_usd": 0.0032,
"execution_time_seconds": 0.42
}
Inline responses arrive with status: "completed". If GPU-Bridge routes to a non-inline path, you may receive pending and should poll GET /status/:job_id.
GET /status/:job_id
For x402 jobs, the job_id itself acts as the retrieval token; treat it like a secret. Without the matching credential/token, you receive status, timing, and hints only.
Poll until status is completed or failed. Some low-latency routes return inline in the initial response and may not require polling at all.
{
"id": "a1b2c3d4-...",
"status": "completed",
"output": { "text": "Quantum computing leverages..." },
"execution_time_seconds": 0.45
}
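The polling rule above (stop on completed or failed, skip polling when the first response is already inline) can be sketched like this. The names wait_for_job and fetch_status, the one-second interval, and the two-minute timeout are all assumptions for illustration; fetch_status stands in for an authenticated GET on the job's status_url.

```python
import time
from typing import Callable

# Terminal states named in the documentation above.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[str], dict],
    job: dict,
    interval: float = 1.0,     # assumption: seconds between polls
    timeout: float = 120.0,    # assumption: overall wait budget in seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Return the final job document, polling status_url only if needed."""
    # Inline routes may already be finished in the initial response.
    if job.get("status") in TERMINAL_STATES:
        return job
    deadline = clock() + timeout
    while True:
        current = fetch_status(job["status_url"])
        if current.get("status") in TERMINAL_STATES:
            return current
        if clock() >= deadline:
            raise TimeoutError(f"job did not finish within {timeout} seconds")
        sleep(interval)
```

Injecting fetch_status, sleep, and clock keeps the loop independent of any particular HTTP client and makes it easy to exercise without network access.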
For x402 jobs, anyone holding the job_id can retrieve the result. Do not log or expose x402 job IDs publicly.
GET /catalog
GET /catalog/estimate
{
"service": "llm-4090",
"estimated_seconds": 25,
"price_per_second": 0.0024,
"estimated_cost_usd": 0.06
}
Use this before submission when you want an upfront cost reference. For credit-based requests, final net cost may be lower after execution is reconciled. For x402, request validity is checked before payment is consumed.
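As a worked check of the sample estimate, the figures are consistent with estimated_cost_usd being estimated_seconds multiplied by price_per_second (25 × 0.0024 = 0.06). That relationship is inferred from the sample response rather than stated by the API, and the helper names below are hypothetical.

```python
from decimal import Decimal


def estimated_cost_usd(estimated_seconds, price_per_second) -> Decimal:
    """Multiply duration by per-second price using exact decimal arithmetic."""
    # Assumption: cost = seconds x price, inferred from the sample response.
    return Decimal(str(estimated_seconds)) * Decimal(str(price_per_second))


def within_budget(estimate: dict, budget_usd) -> bool:
    """True when the estimate's stated cost does not exceed the caller's budget."""
    return Decimal(str(estimate["estimated_cost_usd"])) <= Decimal(str(budget_usd))
```

Decimal avoids the floating-point rounding that would otherwise creep into sub-cent prices.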
GET /account/balance
curl https://api.gpubridge.io/account/balance \
  -H "Authorization: Bearer gpub_your_key"
{
"balance": 8.50,
"email": "you@example.com",
"daily_spend": 1.50,
"daily_limit": 50,
"volume_discount": { "tier": "Standard", "discount_percent": 0 }
}
POST /account/topup
| Package | Price | Credits | Bonus |
|---|---|---|---|
| credits_10 | $10 | $10.00 | — |
| credits_25 | $25 | $26.25 | +5% |
| credits_50 | $50 | $55.00 | +10% |
| credits_100 | $100 | $115.00 | +15% |
curl -X POST https://api.gpubridge.io/account/topup \
  -H "Authorization: Bearer gpub_your_key" \
  -H "Content-Type: application/json" \
  -d '{"package":"credits_25"}'
# Returns: {"checkout_url":"https://checkout.stripe.com/..."}
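To illustrate how the package table translates into a choice of package, here is a small sketch that picks the cheapest package covering a required credit amount. The prices and credit values are copied from the table on this page and may change; the function names are hypothetical and not part of the API.

```python
from decimal import Decimal
from typing import Optional

# (price in USD, credits granted), copied from the package table above.
PACKAGES = {
    "credits_10": (Decimal("10"), Decimal("10.00")),
    "credits_25": (Decimal("25"), Decimal("26.25")),
    "credits_50": (Decimal("50"), Decimal("55.00")),
    "credits_100": (Decimal("100"), Decimal("115.00")),
}


def smallest_package_for(needed_credits) -> Optional[str]:
    """Return the cheapest package whose credits cover the need, else None."""
    needed = Decimal(str(needed_credits))
    candidates = [
        (price, key)
        for key, (price, credits) in PACKAGES.items()
        if credits >= needed
    ]
    return min(candidates)[1] if candidates else None


def bonus_percent(package: str) -> Decimal:
    """Bonus credits expressed as a percentage of the price paid."""
    price, credits = PACKAGES[package]
    return (credits - price) / price * 100
```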
POST /account/topup-crypto
curl -X POST https://api.gpubridge.io/account/topup-crypto \
  -H "Authorization: Bearer gpub_your_key" \
  -H "Content-Type: application/json" \
  -d '{"package":"credits_25"}'
# Returns a payment address; send USDC and credits are added automatically
GET /account/jobs
POST /account/spending-limit
curl -X POST https://api.gpubridge.io/account/spending-limit \
  -H "Authorization: Bearer gpub_your_key" \
  -H "Content-Type: application/json" \
  -d '{"daily_limit":100}'
POST /account/auto-topup
'{"enabled":true,"threshold":1.00,"package":"credits_10"}'
API Key Management
Self-service key recovery is intentionally limited for security. If you still have a valid key, rotate it with POST /account/regenerate-key. If you lost your key, use POST /account/recover to get the support-assisted recovery flow.
Text & Intelligence
| Service | Key | Input | From |
|---|---|---|---|
| LLM Inference 33 models across 6 backends | llm-4090 | {"prompt":"...","model":"deepseek-ai/DeepSeek-V3.2","max_tokens":512} | $0.003 |
| Text Embeddings 1024-dim vectors | embedding-l4 | {"text":"..."} | $0.01 |
| Visual Q&A Moondream2 | llava-4090 | {"image_url":"...","prompt":"..."} | $0.05 |
| Image Captioning BLIP | caption | {"image_url":"..."} | $0.01 |
| CLIP Interrogator Image to text prompt | clip | {"image_url":"..."} | $0.02 |
| Document Reranking Jina, for RAG | rerank | {"query":"...","documents":["..."],"top_n":3} | $0.001 |
Image & Video
| Service | Key | Input | From |
|---|---|---|---|
| Image Generation 13 models across Together + Replicate | image-4090 | {"prompt":"...","model":"black-forest-labs/FLUX.2-dev"} | $0.003 |
| Video Generation | video | {"prompt":"...","image_url":"..."} | $0.30 |
| Inpainting | inpaint | {"image_url":"...","mask_url":"...","prompt":"..."} | $0.04 |
| ControlNet | controlnet | {"image_url":"...","prompt":"..."} | $0.05 |
| Image-to-Image | img2img | {"image_url":"...","prompt":"..."} | $0.04 |
| Image Variations | image-variation | {"image_url":"..."} | $0.04 |
| AI Portraits | photomaker | {"image_url":"...","prompt":"..."} | $0.05 |
| Sticker Maker | sticker | {"image_url":"...","prompt":"..."} | $0.02 |
| Product Ads | ad-inpaint | {"image_url":"...","prompt":"..."} | $0.05 |
| Image Animation | animate | {"image_url":"..."} | $0.10 |
| Video Enhancement Up to 4K | video-enhance | {"video_url":"...","resolution":"1080p","fps":60} | $0.50 |
Audio & Speech
| Service | Key | Input | From |
|---|---|---|---|
| Speech-to-Text Whisper, sub-second | whisper-l4 | {"audio_url":"..."} | $0.05 |
| Diarized STT Speaker separation | whisperx | {"audio_url":"..."} | $0.05 |
| Text-to-Speech 40+ voices | tts-l4 | {"text":"...","voice":"af_alloy"} | $0.02 |
| Expressive TTS XTTS v2, sound effects | bark | {"text":"Hello [laughter]"} | $0.03 |
| Music Generation | musicgen-l4 | {"prompt":"...","duration_seconds":10} | $0.05 |
| Voice Cloning | voice-clone | {"audio_url":"..."} | $0.10 |
Utilities & Document
| Service | Key | Input | From |
|---|---|---|---|
| Background Removal | rembg-l4 | {"image_url":"..."} | $0.01 |
| Image Upscale 2x or 4x | upscale-l4 | {"image_url":"...","scale":4} | $0.04 |
| Face Restoration CodeFormer | face-restore | {"image_url":"..."} | $0.02 |
| OCR Florence-2 | ocr | {"image_url":"..."} | $0.01 |
| Segmentation SAM-2 | segmentation | {"image_url":"..."} | $0.02 |
| PDF/Doc Parsing Marker | pdf-parse | {"file_url":"...","mode":"fast"} | $0.05 |
| NSFW Detection Content moderation | nsfw-detect | {"image_url":"..."} | $0.005 |
Pricing & Billing
- Pay per request. No monthly minimums and zero cost when idle.
- Estimate before you run: call GET /catalog/estimate or inspect GET /catalog for live pricing.
- Credit-based requests: some services are pre-charged from an estimate and reconciled after completion. Excess is returned automatically to your credit balance.
- Failure behavior: credit-based requests are refunded automatically on failure. x402 requests are pre-validated before the payment proof is consumed.
- Volume discounts: 5% at $100 spent, 10% at $500, 15% at $1,000+.
GPU-Bridge presents one pricing surface even when requests route across different backends behind the scenes.
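The volume discount tiers listed above can be expressed as a simple lookup. This sketch assumes the thresholds are inclusive and apply to cumulative spend, which the page does not spell out; the function names are hypothetical.

```python
from decimal import Decimal

# (minimum spend in USD, discount percent), highest tier first.
DISCOUNT_TIERS = [
    (Decimal("1000"), 15),
    (Decimal("500"), 10),
    (Decimal("100"), 5),
]


def discount_percent(total_spent_usd) -> int:
    """Return the discount percentage for a given amount spent."""
    spent = Decimal(str(total_spent_usd))
    for threshold, percent in DISCOUNT_TIERS:
        if spent >= threshold:  # assumption: thresholds are inclusive
            return percent
    return 0


def discounted_price(list_price_usd, total_spent_usd) -> Decimal:
    """Apply the tier discount to a list price."""
    percent = discount_percent(total_spent_usd)
    return Decimal(str(list_price_usd)) * (100 - percent) / 100
```

For the authoritative tier on an account, the volume_discount field returned by GET /account/balance is the better source.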
x402 Protocol
For AI agent developers building autonomous clients. Spec-compliant with x402.org.
| Parameter | Value |
|---|---|
| Network | Base (Chain ID 8453) |
| Asset | USDC (6 decimals) |
| Contract | 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913 |
| Recipient | 0xB0FdC6030B9f30652e8B221B8090d443Dd3C6381 |
| Payment window | 300 seconds (5 minutes) |
| Confirmations | 1 (≤$10), 5 (>$10) |
| Anti-replay | Each tx hash used once |
| OFAC screening | Chainalysis Sanctions Oracle |
Flow
- Call any endpoint without auth → receive HTTP 402
- Parse accepts[0].maxAmountRequired (USDC units) and accepts[0].payTo
- Send USDC on Base to payTo (amount ≥ maxAmountRequired)
- Base64-encode {"txHash":"0x...","from":"0x..."} as the X-Payment header
- Retry the same request → job executes
402 response format
{
"x402Version": 1,
"accepts": [{
"scheme": "exact",
"network": "base",
"maxAmountRequired": "10000", // varies per service
"payTo": "0xB0FdC6030B9f30652e8B221B8090d443Dd3C6381",
"asset": "0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913",
"maxTimeoutSeconds": 300
}]
}
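The parsing and header-building steps of the flow can be sketched without any network or wallet code. The function names are hypothetical, and two details are assumptions: that maxAmountRequired is expressed in the smallest USDC unit (6 decimals, per the parameter table), and that X-Payment uses standard base64 over compact JSON. Sending the USDC transfer itself is out of scope here.

```python
import base64
import json
from decimal import Decimal

USDC_DECIMALS = 6  # from the parameter table: USDC (6 decimals)


def parse_payment_requirements(body: dict) -> dict:
    """Extract the first payment option from a 402 response body."""
    offer = body["accepts"][0]
    amount_units = int(offer["maxAmountRequired"])
    return {
        "pay_to": offer["payTo"],
        "asset": offer["asset"],
        "amount_units": amount_units,
        # Assumption: units are 10^-6 USDC, so "10000" is 0.01 USDC.
        "amount_usdc": Decimal(amount_units) / Decimal(10 ** USDC_DECIMALS),
        "timeout_seconds": offer["maxTimeoutSeconds"],
    }


def build_x_payment(tx_hash: str, from_address: str) -> str:
    """Base64-encode the payment proof for the X-Payment header."""
    # Assumption: standard (not URL-safe) base64 over compact JSON.
    payload = json.dumps(
        {"txHash": tx_hash, "from": from_address}, separators=(",", ":")
    )
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")
```

A client would call parse_payment_requirements on the 402 body, transfer at least amount_units to pay_to within timeout_seconds, then retry the original request with the header value from build_x_payment.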
Error Format
GPU-Bridge does not yet use a single rigid error envelope for every endpoint. Treat error responses as a common base shape with optional helper fields.
{
"error": "Unknown service: foo",
"hint": "Use GET /catalog to see all available services.",
"details": [],
"available_services": ["llm-4090", "image-4090", "..."]
}
- error — always present, human-readable message
- hint — optional remediation text
- details — optional validation issues, common on 400 responses
- available_services — optional list when a service key is invalid
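One way to consume this shape on the client side is a small parser that requires error and defaults everything else. The ApiError class and parse_error function are hypothetical names used for illustration, not part of any GPU-Bridge SDK.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class ApiError:
    error: str                      # always present per the list above
    hint: Optional[str] = None      # optional remediation text
    details: List[Any] = field(default_factory=list)
    available_services: List[str] = field(default_factory=list)


def parse_error(body: dict) -> ApiError:
    """Build an ApiError, tolerating the absence of every optional field."""
    if "error" not in body:
        raise ValueError("response body has no 'error' field")
    return ApiError(
        error=str(body["error"]),
        hint=body.get("hint"),
        details=list(body.get("details") or []),
        available_services=list(body.get("available_services") or []),
    )
```

Unknown extra fields are ignored, which keeps the parser working if individual endpoints add their own helper fields.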
Clients should treat error as required and treat all other fields as optional.
Important Notes
- Media URLs must be publicly accessible. Use CDN links (soundhelix.com, imgbb.com). GitHub raw and Wikimedia URLs often return 403 from compute nodes.
- Webhooks are delivered via POST with 3 retries and exponential backoff. URL must be HTTPS and must not point to private networks.
- Job results are retained for 72 hours after completion. Older jobs keep billing metadata, but their input/output payloads are purged.
- Rate limits apply at both the API key and IP layers. Build retries with backoff and use Idempotency-Key for credit-based requests.
- Failover is automatic. GPU-Bridge can reroute across healthy backends and open circuit breakers when a backend degrades.
- x402 job IDs act as retrieval tokens for x402 results. Treat them like secrets and do not expose them publicly.
- Live availability changes over time. Use GET /catalog instead of hardcoding service or model lists from this page.
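The webhook constraints in the notes above (HTTPS only, no private networks) can be pre-checked on the client before a job is submitted. This is a rough sketch under stated assumptions: the function name is hypothetical, it only inspects literal IP addresses and the name localhost, and it does not resolve DNS, so it is not equivalent to whatever validation the server performs.

```python
import ipaddress
from urllib.parse import urlsplit


def is_acceptable_webhook_url(url: str) -> bool:
    """Client-side pre-check: HTTPS scheme and no obviously private host."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        return False
    host = parts.hostname  # urlsplit lowercases the hostname
    if host == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not a literal IP. Assumption: hostnames pass; DNS is not checked.
        return True
    return not (ip.is_private or ip.is_loopback or ip.is_link_local)
```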