
API Documentation

Lightning-native AI infrastructure for developers and agents

Quick Start

LightningProx provides pay-per-use access to AI models via Lightning Network micropayments. No accounts. No API keys. Payment authorizes each request.

How It Works

  1. Send a request to the API
  2. Receive a Lightning invoice
  3. Pay the invoice (any Lightning wallet)
  4. Retry the request with the payment proof

Example: Your First Request

Step 1: Get Invoice
curl -X POST https://lightningprox.com/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-5-20251101",
    "max_tokens": 100,
    "messages": [{"role": "user", "content": "What is Lightning Network?"}]
  }'

Response (Payment Required)

{
  "error": "payment_required",
  "message": "Please pay the Lightning invoice to proceed",
  "payment": {
    "charge_id": "abc-123-xyz",
    "payment_request": "lnbc...",
    "amount_sats": 5,
    "amount_usd": 0.005
  }
}
Step 2: Pay & Retry
curl -X POST https://lightningprox.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "X-Payment-Hash: abc-123-xyz" \
  -d '{
    "model": "claude-opus-4-5-20251101",
    "max_tokens": 100,
    "messages": [{"role": "user", "content": "What is Lightning Network?"}]
  }'

Spend Tokens

Instead of paying per request, you can prepay with a larger Lightning payment and get a reusable spend token. Use it for multiple requests without generating new invoices each time.

Best for Power Users & Agents
Spend tokens eliminate the invoice-pay-retry cycle. Pay once, then make requests instantly until your balance runs out.

Step 1: Top Up

Choose how many sats you want to load (10 - 100,000):

curl -X POST https://lightningprox.com/v1/topup \
  -H "Content-Type: application/json" \
  -d '{ "amount_sats": 500, "duration_hours": 72 }'

Response

{
  "charge_id": "abc-123-xyz",
  "payment_request": "lnbc...",
  "amount_sats": 500,
  "amount_usd": 0.50,
  "duration_hours": 72,
  "next_step": "/v1/tokens"
}

Step 2: Pay & Create Token

After paying the invoice, create a spend token:

curl -X POST https://lightningprox.com/v1/tokens \
  -H "Content-Type: application/json" \
  -d '{ "charge_id": "YOUR_CHARGE_ID", "duration_hours": 72 }'

Response

{
  "token": "lnpx_a1b2c3d4e5f6...",
  "balance_sats": 500,
  "expires_at": "2026-02-12T19:00:00Z",
  "status": "active"
}

Step 3: Use Your Token

Include X-Spend-Token in your requests — no more invoices:

curl -X POST https://lightningprox.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "X-Spend-Token: lnpx_a1b2c3d4e5f6..." \
  -d '{
    "model": "claude-opus-4-5-20251101",
    "max_tokens": 500,
    "messages": [{"role": "user", "content": "Explain Lightning Network"}]
  }'
# Instant response — cost deducted from token balance

Check Balance

curl https://lightningprox.com/v1/balance \
  -H "X-Spend-Token: lnpx_a1b2c3d4e5f6..."

Response

{
  "balance_sats": 485,
  "balance_usd": 0.485,
  "requests_left_estimate": 97,
  "expires_at": "2026-02-12T19:00:00Z",
  "status": "active"
}
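A balance check like this can drive automatic top-ups before a long-running job stalls. A minimal sketch (the needs_topup helper and the 100-sat threshold are our own illustration, not part of the API):

```python
# Decide when to top up, based on the parsed JSON from GET /v1/balance.
# The 100-sat threshold is an arbitrary example, not a service requirement.
def needs_topup(balance, min_sats=100):
    """Return True if the spend token is unusable or running low."""
    return balance["status"] != "active" or balance["balance_sats"] < min_sats

sample = {"balance_sats": 485, "status": "active",
          "expires_at": "2026-02-12T19:00:00Z"}
print(needs_topup(sample))                                     # False
print(needs_topup({"balance_sats": 40, "status": "active"}))   # True
```

When this returns True, hit /v1/topup and /v1/tokens again as shown in Steps 1 and 2.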

MCP Server

The LightningProx MCP Server lets AI agents in Claude Desktop, Claude Code, and Cursor query AI models and pay via Lightning — all through the Model Context Protocol.

Your AI Agent Gets a Lightning Wallet
Install the MCP server and your AI can list models, check pricing, generate invoices, and make AI requests — all with automatic Lightning payments.

Install

go install github.com/unixlamadev-spec/lightningprox-mcp/cmd/mcp-server@latest

Configure Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "lightningprox": {
      "command": "mcp-server",
      "args": []
    }
  }
}

Available Tools

Tool           Description
ask_ai         Query an AI model (pay with invoice or spend token)
get_invoice    Generate a Lightning invoice for a request
check_balance  Check spend token balance
list_models    List available models with pricing
get_pricing    Estimate cost before committing

GitHub: github.com/unixlamadev-spec/lightningprox-mcp

Python / LangChain

Use the official LangChain integration for seamless Python development.

Install
pip install langchain-lightningprox
Usage
from langchain_lightningprox import LightningProxLLM

llm = LightningProxLLM(spend_token="lnpx_your_token_here")
response = llm.invoke("Explain quantum computing in one sentence.")
print(response)

GitHub: github.com/unixlamadev-spec/langchain-lightningprox
PyPI: pypi.org/project/langchain-lightningprox

Spend Tokens
Get a spend token at lightningprox.com/topup — pay once with Lightning, use for multiple requests.

JavaScript / Node.js

For JavaScript agents and bots, use the reference implementation.

Clone the Example
git clone https://github.com/unixlamadev-spec/lightningprox-agent-example
cd lightningprox-agent-example
npm install

Configure

# .env
LIGHTNINGPROX_SPEND_TOKEN=lnpx_your_token_here
Run
node agent.js

GitHub: github.com/unixlamadev-spec/lightningprox-agent-example

OpenAI-Compatible SDK

Drop-in replacement for the OpenAI npm package. Change two lines — everything else stays identical.

Install
npm install lightningprox-openai
Migrate from OpenAI SDK
// Before:
import OpenAI from 'openai'
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

// After:
import OpenAI from 'lightningprox-openai'
const client = new OpenAI({ apiKey: process.env.LIGHTNINGPROX_SPEND_TOKEN })

// Everything else stays identical:
const response = await client.chat.completions.create({
  model: 'claude-opus-4-5-20251101',
  messages: [{ role: 'user', content: 'Hello' }]
})
console.log(response.choices[0].message.content)
Zero migration cost. All 19 models available. Errors surface cleanly: 402 = insufficient balance, 429 = rate limit. No provider API keys required.
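Since 429 means rate-limited (retryable) and 402 means the balance is exhausted (not retryable), a client can back off on the former and surface the latter. A minimal stdlib Python sketch (backoff_delays and post_with_retries are hypothetical helpers for illustration, not part of any SDK):

```python
import time
import urllib.error
import urllib.request

def backoff_delays(attempts, base=1.0):
    # Exponential backoff schedule for 429s: 1s, 2s, 4s, ...
    return [base * (2 ** i) for i in range(attempts)]

def post_with_retries(req, attempts=3):
    """req is a prepared urllib.request.Request for /v1/messages."""
    for delay in backoff_delays(attempts):
        try:
            return urllib.request.urlopen(req)
        except urllib.error.HTTPError as e:
            if e.code == 402:
                # Insufficient balance: retrying won't help, top up instead.
                raise RuntimeError("Spend token exhausted; top up via /v1/topup")
            if e.code == 429:
                time.sleep(delay)  # rate limited: back off, then retry
                continue
            raise
    raise RuntimeError("Still rate limited after retries")
```

The same 402/429 semantics apply whether you call the REST API directly or through the OpenAI-compatible SDK.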

npm: npmjs.com/package/lightningprox-openai

Supported Models

Model                                             | Provider    | Type         | Input Cost      | Output Cost
claude-opus-4-5-20251101                          | Anthropic   | Chat         | $3/1M tokens    | $15/1M tokens
gpt-4-turbo                                       | OpenAI      | Chat         | $10/1M tokens   | $30/1M tokens
meta-llama/Llama-3.3-70B-Instruct-Turbo           | Together.ai | Chat         | $0.88/1M tokens | $0.88/1M tokens
meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo      | Together.ai | Chat         | $0.88/1M tokens | $0.88/1M tokens
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | Together.ai | Chat         | $0.36/1M tokens | $0.36/1M tokens
mistralai/Mixtral-8x7B-Instruct-v0.1              | Together.ai | Chat         | $0.60/1M tokens | $0.60/1M tokens
deepseek-ai/DeepSeek-V3                           | Together.ai | Chat         | $1.25/1M tokens | $1.25/1M tokens
mistral-large-latest                              | Mistral     | Chat         | $2/1M tokens    | $6/1M tokens
mistral-medium-latest                             | Mistral     | Chat         | $0.40/1M tokens | $2/1M tokens
mistral-small-latest                              | Mistral     | Chat         | $0.10/1M tokens | $0.30/1M tokens
open-mistral-nemo                                 | Mistral     | Chat         | $0.15/1M tokens | $0.15/1M tokens
codestral-latest                                  | Mistral     | Code         | $0.30/1M tokens | $0.90/1M tokens
devstral-latest                                   | Mistral     | Agentic Code | $0.40/1M tokens | $1.20/1M tokens
pixtral-large-latest                              | Mistral     | Vision       | $2/1M tokens    | $6/1M tokens
magistral-medium-latest                           | Mistral     | Reasoning    | $2/1M tokens    | $5/1M tokens
gemini-2.5-flash                                  | Google      | Chat         | $0.15/1M tokens | $0.60/1M tokens
gemini-2.5-pro                                    | Google      | Chat         | $1.25/1M tokens | $10/1M tokens
gemini-3-flash-preview                            | Google      | Chat         | $0.20/1M tokens | $0.80/1M tokens
gemini-3-pro-preview                              | Google      | Chat         | $2/1M tokens    | $8/1M tokens

Note: gemini-2.5-pro requires thinking_config — this is handled automatically.

Prices include a 20% markup to cover infrastructure and Lightning fees. All models accessible via spend token — no provider API keys required.
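As a back-of-envelope example from the listed (markup-inclusive) prices: 200 input plus 1,000 output tokens on claude-opus-4-5 costs about $0.0156. A sketch (the $0.001/sat conversion is only the rate implied by the invoice examples earlier in this page; the real rate floats with the BTC price):

```python
# Estimate a request's cost from the listed per-million-token prices.
def estimate_usd(input_tokens, output_tokens, in_per_m, out_per_m):
    return input_tokens / 1e6 * in_per_m + output_tokens / 1e6 * out_per_m

# claude-opus-4-5: $3/1M input, $15/1M output
usd = estimate_usd(200, 1000, 3.0, 15.0)
print(round(usd, 6))        # 0.0156
# Convert to sats at the ~$0.001/sat rate from the invoice examples above.
print(round(usd / 0.001))   # 16
```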

For AI Agents

LightningProx is designed for autonomous agents that need to pay for services programmatically. No human intervention required.

Payment = Authentication
Unlike traditional APIs, there are no API keys to manage. The Lightning payment itself authorizes the request. This makes LightningProx ideal for autonomous systems.

Agent Integration Flow

  1. Agent sends request → receives invoice
  2. Agent pays invoice via Lightning wallet (LNbits, Strike, LND, etc.)
  3. Agent retries with X-Payment-Hash header
  4. Agent receives AI response

Service Discovery

Agents can discover LightningProx capabilities programmatically:

curl https://lightningprox.com/api/capabilities

Returns machine-readable JSON with models, pricing, and integration steps.
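An agent could, for example, use the capabilities document to pick the cheapest available model. The exact JSON shape of /api/capabilities is not specified here, so the layout below is hypothetical; adapt the field names to the real response:

```python
# Pick the cheapest model from a capabilities-style document.
# Assumed (hypothetical) shape: {"models": [{"id": ..., "output_usd_per_m": ...}]}
def cheapest(capabilities):
    return min(capabilities["models"], key=lambda m: m["output_usd_per_m"])["id"]

sample = {"models": [
    {"id": "claude-opus-4-5-20251101", "output_usd_per_m": 15.0},
    {"id": "mistral-small-latest", "output_usd_per_m": 0.30},
]}
print(cheapest(sample))   # mistral-small-latest
```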

Streaming Responses

LightningProx supports streaming responses for real-time token-by-token output. This provides a much better user experience for chat applications.

Why Streaming?
Streaming makes responses feel far faster. Instead of waiting several seconds for a complete response, users see text appearing immediately, word by word.

Enable Streaming

Add "stream": true to your request:

curl -X POST https://lightningprox.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "X-Payment-Hash: YOUR_CHARGE_ID" \
  -d '{
    "model": "claude-opus-4-5-20251101",
    "max_tokens": 500,
    "stream": true,
    "messages": [{"role": "user", "content": "Write a haiku about Bitcoin"}]
  }'

Streaming Response Format

Responses are sent as Server-Sent Events (SSE):

data: {"type":"content_block_delta","delta":{"text":"Digital"}}
data: {"type":"content_block_delta","delta":{"text":" gold"}}
data: {"type":"content_block_delta","delta":{"text":" flows"}}
data: {"type":"message_stop"}
data: [DONE]

JavaScript Example

const response = await fetch('https://lightningprox.com/v1/messages', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-Payment-Hash': chargeId
  },
  body: JSON.stringify({
    model: 'claude-opus-4-5-20251101',
    max_tokens: 500,
    stream: true,
    messages: [{ role: 'user', content: prompt }]
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // For simplicity this assumes each chunk contains whole SSE lines;
  // a production parser should buffer partial lines across chunks.
  const chunk = decoder.decode(value);
  const lines = chunk.split('\n');
  for (const line of lines) {
    if (line.startsWith('data: ') && !line.includes('[DONE]')) {
      const data = JSON.parse(line.slice(6));
      if (data.delta?.text) {
        process.stdout.write(data.delta.text);
      }
    }
  }
}

Python Example

import json

import requests

response = requests.post(
    'https://lightningprox.com/v1/messages',
    headers={
        'Content-Type': 'application/json',
        'X-Payment-Hash': charge_id
    },
    json={
        'model': 'claude-opus-4-5-20251101',
        'max_tokens': 500,
        'stream': True,
        'messages': [{'role': 'user', 'content': prompt}]
    },
    stream=True
)

for line in response.iter_lines():
    if line:
        line = line.decode('utf-8')
        if line.startswith('data: ') and '[DONE]' not in line:
            data = json.loads(line[6:])
            if 'delta' in data and 'text' in data['delta']:
                print(data['delta']['text'], end='', flush=True)
Note: Streaming requests cannot be cached. The 50% cache discount only applies to non-streaming requests with identical queries.


Endpoints

Endpoint           Method  Description
/v1/messages       POST    AI completion (via payment hash or spend token)
/v1/tokens         POST    Create a prepaid spend token from a paid invoice
/v1/topup          POST    Generate a Lightning invoice for any sats amount (10 - 100,000)
/v1/balance        GET     Check spend token balance (X-Spend-Token header)
/api/capabilities  GET     Service discovery for agents
/health            GET     Service health status

Support