
API Documentation

Lightning-native AI infrastructure for developers and agents

Quick Start

LightningProx provides pay-per-use access to AI models via Lightning Network micropayments. No accounts. No API keys. Payment authorizes each request.

How It Works

  1. Send a request to the API
  2. Receive a Lightning invoice
  3. Pay the invoice (any Lightning wallet)
  4. Retry the request with the payment proof

Example: Your First Request

Step 1: Get Invoice
curl -X POST https://lightningprox.com/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-5-20251101",
    "max_tokens": 100,
    "messages": [{"role": "user", "content": "What is Lightning Network?"}]
  }'

Response (Payment Required)

{
  "error": "payment_required",
  "message": "Please pay the Lightning invoice to proceed",
  "payment": {
    "charge_id": "abc-123-xyz",
    "payment_request": "lnbc...",
    "amount_sats": 5,
    "amount_usd": 0.005
  }
}
Step 2: Pay & Retry
curl -X POST https://lightningprox.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "X-Payment-Hash: abc-123-xyz" \
  -d '{
    "model": "claude-opus-4-5-20251101",
    "max_tokens": 100,
    "messages": [{"role": "user", "content": "What is Lightning Network?"}]
  }'

Spend Tokens

Instead of paying per request, you can prepay with a larger Lightning payment and get a reusable spend token. Use it for multiple requests without generating new invoices each time.

Best for Power Users & Agents
Spend tokens eliminate the invoice-pay-retry cycle. Pay once, then make requests instantly until your balance runs out.

Step 1: Top Up

Choose how many sats you want to load (10 - 100,000):

curl -X POST https://lightningprox.com/v1/topup \
  -H "Content-Type: application/json" \
  -d '{ "amount_sats": 500, "duration_hours": 72 }'

Response

{
  "charge_id": "abc-123-xyz",
  "payment_request": "lnbc...",
  "amount_sats": 500,
  "amount_usd": 0.50,
  "duration_hours": 72,
  "next_step": "/v1/tokens"
}

Step 2: Pay & Create Token

After paying the invoice, create a spend token:

curl -X POST https://lightningprox.com/v1/tokens \
  -H "Content-Type: application/json" \
  -d '{ "charge_id": "YOUR_CHARGE_ID", "duration_hours": 72 }'

Response

{
  "token": "lnpx_a1b2c3d4e5f6...",
  "balance_sats": 500,
  "expires_at": "2026-02-12T19:00:00Z",
  "status": "active"
}

Step 3: Use Your Token

Include X-Spend-Token in your requests — no more invoices:

curl -X POST https://lightningprox.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "X-Spend-Token: lnpx_a1b2c3d4e5f6..." \
  -d '{
    "model": "claude-opus-4-5-20251101",
    "max_tokens": 500,
    "messages": [{"role": "user", "content": "Explain Lightning Network"}]
  }'
# Instant response — cost deducted from token balance

Check Balance

curl https://lightningprox.com/v1/balance \
  -H "X-Spend-Token: lnpx_a1b2c3d4e5f6..."

Response

{
  "balance_sats": 485,
  "balance_usd": 0.485,
  "requests_left_estimate": 97,
  "expires_at": "2026-02-12T19:00:00Z",
  "status": "active"
}
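A balance check like this can drive automatic top-ups before a long-running job stalls. A minimal sketch (the needs_topup helper and the 100-sat threshold are our own illustration, not part of the API):

```python
# Decide when to top up, based on the parsed JSON from GET /v1/balance.
# The 100-sat threshold is an arbitrary example, not a service requirement.
def needs_topup(balance, min_sats=100):
    """Return True if the spend token is unusable or running low."""
    return balance["status"] != "active" or balance["balance_sats"] < min_sats

sample = {"balance_sats": 485, "status": "active",
          "expires_at": "2026-02-12T19:00:00Z"}
print(needs_topup(sample))                                     # False
print(needs_topup({"balance_sats": 40, "status": "active"}))   # True
```

When this returns True, hit /v1/topup and /v1/tokens again as shown in Steps 1 and 2.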

MCP Server

The LightningProx MCP Server lets AI agents in Claude Desktop, Claude Code, and Cursor query AI models and pay via Lightning — all through the Model Context Protocol.

Your AI Agent Gets a Lightning Wallet
Install the MCP server and your AI can list models, check pricing, generate invoices, and make AI requests — all with automatic Lightning payments.

Install

go install github.com/unixlamadev-spec/lightningprox-mcp/cmd/mcp-server@latest

Configure Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "lightningprox": {
      "command": "mcp-server",
      "args": []
    }
  }
}

Available Tools

Tool           Description
ask_ai         Query an AI model (pay with invoice or spend token)
get_invoice    Generate a Lightning invoice for a request
check_balance  Check spend token balance
list_models    List available models with pricing
get_pricing    Estimate cost before committing

GitHub: github.com/unixlamadev-spec/lightningprox-mcp

Python / LangChain

Use the official LangChain integration for seamless Python development.

Install
pip install langchain-lightningprox
Usage
from langchain_lightningprox import LightningProxLLM

llm = LightningProxLLM(spend_token="lnpx_your_token_here")
response = llm.invoke("Explain quantum computing in one sentence.")
print(response)

GitHub: github.com/unixlamadev-spec/langchain-lightningprox
PyPI: pypi.org/project/langchain-lightningprox

Spend Tokens
Get a spend token at lightningprox.com/topup — pay once with Lightning, use for multiple requests.

JavaScript / Node.js

For JavaScript agents and bots, use the reference implementation.

Clone the Example
git clone https://github.com/unixlamadev-spec/lightningprox-agent-example
cd lightningprox-agent-example
npm install

Configure

# .env
LIGHTNINGPROX_SPEND_TOKEN=lnpx_your_token_here
Run
node agent.js

GitHub: github.com/unixlamadev-spec/lightningprox-agent-example

OpenAI-Compatible SDK

Drop-in replacement for the OpenAI npm package. Change two lines — everything else stays identical.

Install
npm install lightningprox-openai
Migrate from OpenAI SDK
// Before:
import OpenAI from 'openai'
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

// After:
import OpenAI from 'lightningprox-openai'
const client = new OpenAI({ apiKey: process.env.LIGHTNINGPROX_SPEND_TOKEN })

// Everything else stays identical:
const response = await client.chat.completions.create({
  model: 'claude-opus-4-5-20251101',
  messages: [{ role: 'user', content: 'Hello' }]
})
console.log(response.choices[0].message.content)
Zero migration cost. All 19 models available. Errors surface cleanly: 402 = insufficient balance, 429 = rate limit. No provider API keys required.
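Since 429 means rate-limited (retryable) and 402 means the balance is exhausted (not retryable), a client can back off on the former and surface the latter. A minimal stdlib Python sketch (backoff_delays and post_with_retries are hypothetical helpers for illustration, not part of any SDK):

```python
import time
import urllib.error
import urllib.request

def backoff_delays(attempts, base=1.0):
    # Exponential backoff schedule for 429s: 1s, 2s, 4s, ...
    return [base * (2 ** i) for i in range(attempts)]

def post_with_retries(req, attempts=3):
    """req is a prepared urllib.request.Request for /v1/messages."""
    for delay in backoff_delays(attempts):
        try:
            return urllib.request.urlopen(req)
        except urllib.error.HTTPError as e:
            if e.code == 402:
                # Insufficient balance: retrying won't help, top up instead.
                raise RuntimeError("Spend token exhausted; top up via /v1/topup")
            if e.code == 429:
                time.sleep(delay)  # rate limited: back off, then retry
                continue
            raise
    raise RuntimeError("Still rate limited after retries")
```

The same 402/429 semantics apply whether you call the REST API directly or through the OpenAI-compatible SDK.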

npm: npmjs.com/package/lightningprox-openai

Supported Models

Model                                             | Provider    | Type         | Input Cost      | Output Cost
claude-opus-4-5-20251101                          | Anthropic   | Chat         | $3/1M tokens    | $15/1M tokens
gpt-4-turbo                                       | OpenAI      | Chat         | $10/1M tokens   | $30/1M tokens
meta-llama/Llama-3.3-70B-Instruct-Turbo           | Together.ai | Chat         | $0.88/1M tokens | $0.88/1M tokens
meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo      | Together.ai | Chat         | $0.88/1M tokens | $0.88/1M tokens
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | Together.ai | Chat         | $0.36/1M tokens | $0.36/1M tokens
mistralai/Mixtral-8x7B-Instruct-v0.1              | Together.ai | Chat         | $0.60/1M tokens | $0.60/1M tokens
deepseek-ai/DeepSeek-V3                           | Together.ai | Chat         | $1.25/1M tokens | $1.25/1M tokens
mistral-large-latest                              | Mistral     | Chat         | $2/1M tokens    | $6/1M tokens
mistral-medium-latest                             | Mistral     | Chat         | $0.40/1M tokens | $2/1M tokens
mistral-small-latest                              | Mistral     | Chat         | $0.10/1M tokens | $0.30/1M tokens
open-mistral-nemo                                 | Mistral     | Chat         | $0.15/1M tokens | $0.15/1M tokens
codestral-latest                                  | Mistral     | Code         | $0.30/1M tokens | $0.90/1M tokens
devstral-latest                                   | Mistral     | Agentic Code | $0.40/1M tokens | $1.20/1M tokens
pixtral-large-latest                              | Mistral     | Vision       | $2/1M tokens    | $6/1M tokens
magistral-medium-latest                           | Mistral     | Reasoning    | $2/1M tokens    | $5/1M tokens
gemini-2.5-flash                                  | Google      | Chat         | $0.15/1M tokens | $0.60/1M tokens
gemini-2.5-pro                                    | Google      | Chat         | $1.25/1M tokens | $10/1M tokens
gemini-3-flash-preview                            | Google      | Chat         | $0.20/1M tokens | $0.80/1M tokens
gemini-3-pro-preview                              | Google      | Chat         | $2/1M tokens    | $8/1M tokens

Note: gemini-2.5-pro requires thinking_config — this is handled automatically.

Prices include a 20% markup to cover infrastructure and Lightning fees. All models accessible via spend token — no provider API keys required.
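As a back-of-envelope example from the listed (markup-inclusive) prices: 200 input plus 1,000 output tokens on claude-opus-4-5 costs about $0.0156. A sketch (the $0.001/sat conversion is only the rate implied by the invoice examples earlier in this page; the real rate floats with the BTC price):

```python
# Estimate a request's cost from the listed per-million-token prices.
def estimate_usd(input_tokens, output_tokens, in_per_m, out_per_m):
    return input_tokens / 1e6 * in_per_m + output_tokens / 1e6 * out_per_m

# claude-opus-4-5: $3/1M input, $15/1M output
usd = estimate_usd(200, 1000, 3.0, 15.0)
print(round(usd, 6))        # 0.0156
# Convert to sats at the ~$0.001/sat rate from the invoice examples above.
print(round(usd / 0.001))   # 16
```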

For AI Agents

LightningProx is designed for autonomous agents that need to pay for services programmatically. No human intervention required.

Payment = Authentication
Unlike traditional APIs, there are no API keys to manage. The Lightning payment itself authorizes the request. This makes LightningProx ideal for autonomous systems.

Agent Integration Flow

  1. Agent sends request → receives invoice
  2. Agent pays invoice via Lightning wallet (LNbits, Strike, LND, etc.)
  3. Agent retries with X-Payment-Hash header
  4. Agent receives AI response

Service Discovery

Agents can discover LightningProx capabilities programmatically:

curl https://lightningprox.com/api/capabilities

Returns machine-readable JSON with models, pricing, and integration steps.
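An agent could, for example, use the capabilities document to pick the cheapest available model. The exact JSON shape of /api/capabilities is not specified here, so the layout below is hypothetical; adapt the field names to the real response:

```python
# Pick the cheapest model from a capabilities-style document.
# Assumed (hypothetical) shape: {"models": [{"id": ..., "output_usd_per_m": ...}]}
def cheapest(capabilities):
    return min(capabilities["models"], key=lambda m: m["output_usd_per_m"])["id"]

sample = {"models": [
    {"id": "claude-opus-4-5-20251101", "output_usd_per_m": 15.0},
    {"id": "mistral-small-latest", "output_usd_per_m": 0.30},
]}
print(cheapest(sample))   # mistral-small-latest
```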

Streaming Responses

LightningProx supports streaming responses for real-time token-by-token output. This provides a much better user experience for chat applications.

Why Streaming?
Streaming makes responses feel far faster. Instead of waiting several seconds for a complete response, users see text appearing immediately, word by word.

Enable Streaming

Add "stream": true to your request:

curl -X POST https://lightningprox.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "X-Payment-Hash: YOUR_CHARGE_ID" \
  -d '{
    "model": "claude-opus-4-5-20251101",
    "max_tokens": 500,
    "stream": true,
    "messages": [{"role": "user", "content": "Write a haiku about Bitcoin"}]
  }'

Streaming Response Format

Responses are sent as Server-Sent Events (SSE):

data: {"type":"content_block_delta","delta":{"text":"Digital"}}
data: {"type":"content_block_delta","delta":{"text":" gold"}}
data: {"type":"content_block_delta","delta":{"text":" flows"}}
data: {"type":"message_stop"}
data: [DONE]

JavaScript Example

const response = await fetch('https://lightningprox.com/v1/messages', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-Payment-Hash': chargeId
  },
  body: JSON.stringify({
    model: 'claude-opus-4-5-20251101',
    max_tokens: 500,
    stream: true,
    messages: [{ role: 'user', content: prompt }]
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // For simplicity this assumes each chunk contains whole SSE lines;
  // a production parser should buffer partial lines across chunks.
  const chunk = decoder.decode(value);
  const lines = chunk.split('\n');
  for (const line of lines) {
    if (line.startsWith('data: ') && !line.includes('[DONE]')) {
      const data = JSON.parse(line.slice(6));
      if (data.delta?.text) {
        process.stdout.write(data.delta.text);
      }
    }
  }
}

Python Example

import json

import requests

response = requests.post(
    'https://lightningprox.com/v1/messages',
    headers={
        'Content-Type': 'application/json',
        'X-Payment-Hash': charge_id
    },
    json={
        'model': 'claude-opus-4-5-20251101',
        'max_tokens': 500,
        'stream': True,
        'messages': [{'role': 'user', 'content': prompt}]
    },
    stream=True
)

for line in response.iter_lines():
    if line:
        line = line.decode('utf-8')
        if line.startswith('data: ') and '[DONE]' not in line:
            data = json.loads(line[6:])
            if 'delta' in data and 'text' in data['delta']:
                print(data['delta']['text'], end='', flush=True)
Note: Streaming requests cannot be cached. The 50% cache discount only applies to non-streaming requests with identical queries.


Endpoints

Endpoint           Method  Description
/v1/messages       POST    AI completion (via payment hash or spend token)
/v1/tokens         POST    Create a prepaid spend token from a paid invoice
/v1/topup          POST    Generate a Lightning invoice for any sats amount (10 - 100,000)
/v1/balance        GET     Check spend token balance (X-Spend-Token header)
/api/capabilities  GET     Service discovery for agents
/health            GET     Service health status

Support