
ESPAI Logo

Unified AI API Client for ESP32


🚀 Bring the power of GPT, Claude, Gemini and local LLMs to your ESP32 projects!

ESPAI is a lightweight, easy-to-use Arduino library that lets you integrate OpenAI, Anthropic, Google Gemini and Ollama APIs into your ESP32 projects. Supports ESP32, ESP32-S2, ESP32-S3, and ESP32-C3. Build smart IoT devices, voice assistants, and AI-powered gadgets with just a few lines of code.

```cpp
#include <ESPAI.h>
using namespace ESPAI;

OpenAIProvider ai("sk-your-key");  // or AnthropicProvider / GeminiProvider / OllamaProvider

std::vector<Message> messages = { Message(Role::User, "Hello from ESP32!") };

Response res = ai.chat(messages, ChatOptions());
Serial.println(res.content);
```

✨ Features

  • 🎯 Simple API β€” Clean, intuitive interface for chat completions
  • πŸ”’ Secure by Default β€” Embedded root CA certificates for proper TLS verification
  • 🌊 Streaming Support β€” Real-time token-by-token responses via SSE
  • πŸ› οΈ Tool Calling β€” Function calling for agentic workflows with unified schema across all providers
  • πŸ’¬ Conversation History β€” Built-in multi-turn context management with auto-pruning and serialization
  • πŸ”„ Multiple Providers β€” OpenAI, Anthropic (Claude), Google Gemini, Ollama and any OpenAI-compatible API through a single unified interface
  • 🏠 Local LLMs β€” Run models locally with Ollama, no API key or internet required
  • πŸ“¦ Lightweight β€” Minimal memory footprint, optimized for ESP32
  • ⚑ Async Support β€” Non-blocking FreeRTOS-based async chat and streaming
  • πŸ” Auto Retry β€” Built-in retry with exponential backoff for rate limits and server errors
  • πŸ§ͺ Well Tested β€” 451+ native unit tests, CI-ready
  • πŸ—οΈ Clean Architecture β€” Layered design with separated HTTP transport, providers, and conversation management

πŸ† Why ESPAI?

  • Secure by design β€” Embedded root CA certificates for proper TLS verification out of the box
  • Production-ready β€” 451+ unit tests running natively, so you can refactor and ship with confidence
  • Conversation memory β€” Built-in multi-turn history with automatic pruning and JSON serialization
  • Write once, run on any provider β€” Define tools once, unified schema works across OpenAI, Claude, Gemini, and Ollama
  • Your choice of tooling β€” First-class support for both PlatformIO and Arduino IDE
  • Modern C++17 β€” Namespaced, clean layered architecture that's easy to extend and debug

📦 Installation

PlatformIO (Recommended)

Add to your platformio.ini:

```ini
lib_deps =
    enkei0x/ESPAI@^0.8.0
```

Arduino IDE

  1. Download the latest release from GitHub Releases
  2. In Arduino IDE: Sketch → Include Library → Add .ZIP Library
  3. Select the downloaded ZIP file

Manual Installation

```shell
cd ~/Arduino/libraries  # or PlatformIO lib folder
git clone https://github.com/enkei0x/espai.git
```

🚀 Quick Start

```cpp
#include <WiFi.h>
#include <ESPAI.h>

using namespace ESPAI;

OpenAIProvider openai("sk-your-api-key");

void setup() {
    Serial.begin(115200);

    // Connect to WiFi
    WiFi.begin("your-ssid", "your-password");
    while (WiFi.status() != WL_CONNECTED) delay(500);

    // Send a message
    std::vector<Message> messages;
    messages.push_back(Message(Role::User, "Hello! What's 2+2?"));

    Response response = openai.chat(messages, ChatOptions());

    if (response.success) {
        Serial.println(response.content);
    }
}

void loop() {}
```

📖 Examples

| Example | Description |
|---------|-------------|
| BasicChat | Simple request/response |
| StreamingChat | Real-time streaming output |
| ConversationHistory | Multi-turn conversations |
| ToolCalling | Function calling workflow |
| CustomOptions | All configuration options |
| ErrorHandling | Retry logic and error handling |
| StreamingToolCalling | Tool calling with streaming responses |
| AnthropicChat | Using Anthropic Claude models |
| GeminiChat | Using Google Gemini models |
| AsyncChat | Non-blocking async requests (FreeRTOS) |

🌊 Streaming Responses

Get responses token-by-token for a better user experience:

```cpp
openai.chatStream(messages, options, [](const String& chunk, bool done) {
    Serial.print(chunk);  // Print each token as it arrives
    if (done) Serial.println("\n--- Done! ---");
});
```

πŸ› οΈ Tool Calling

Let the AI call functions in your code:

```cpp
// Define a tool
Tool tempTool;
tempTool.name = "get_temperature";
tempTool.description = "Get current temperature from sensor";
tempTool.parametersJson = R"({"type":"object","properties":{}})";

ai.addTool(tempTool);

// Send message - AI may request to call the tool
Response response = ai.chat(messages, options);

if (ai.hasToolCalls()) {
    // Add assistant message with tool calls to history
    messages.push_back(ai.getAssistantMessageWithToolCalls(response.content));

    for (const auto& call : ai.getLastToolCalls()) {
        // Execute your function
        String result = "{\"temperature\": 23.5}";
        messages.push_back(Message(Role::Tool, result, call.id));
    }
    // Get final response
    response = ai.chat(messages, options);
}
```
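The empty schema above declares a tool that takes no arguments. A tool with arguments describes them in `parametersJson` using standard JSON Schema, which is the format OpenAI, Anthropic, and Gemini all accept for function parameters. For example, a hypothetical `set_led` tool might use:

```json
{
  "type": "object",
  "properties": {
    "pin": { "type": "integer", "description": "GPIO pin number" },
    "state": { "type": "string", "enum": ["on", "off"] }
  },
  "required": ["pin", "state"]
}
```

The model then returns matching arguments (e.g. `{"pin": 2, "state": "on"}`) in the tool call for your code to parse and execute.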

💬 Conversation History

Maintain context across multiple turns:

```cpp
Conversation conv;
conv.setSystemPrompt("You are a helpful IoT assistant.");
conv.setMaxMessages(20);  // Auto-prune old messages

// User asks something
conv.addUserMessage("Turn on the lights");
Response resp = openai.chat(conv.getMessages(), options);
conv.addAssistantMessage(resp.content);

// Follow-up question (context preserved)
conv.addUserMessage("Make them brighter");
resp = openai.chat(conv.getMessages(), options);
```

⚡ Async Requests

Run AI requests in the background without blocking your main loop:

```cpp
AIClient client(Provider::OpenAI, "sk-your-api-key");

// Fire-and-forget with callback
client.chatAsync("Hello!", [](const Response& resp) {
    Serial.println(resp.content);
});

// Or poll manually
ChatRequest* req = client.chatAsync("Hello!");
while (!req->isComplete()) {
    req->poll();
    // ... do other work
}
Serial.println(req->getResult().content);
```

βš™οΈ Configuration

ChatOptions

Parameters are sent to the API only when explicitly set; any parameter you leave unset falls back to the provider's own default.

```cpp
ChatOptions options;
options.temperature = 0.7;            // Creativity (0.0 to 2.0)
options.maxTokens = 1024;             // Max response length
options.maxCompletionTokens = 4096;   // OpenAI reasoning models (o1, o3); takes priority over maxTokens
options.topP = 0.9;                   // Nucleus sampling
options.frequencyPenalty = 0.5;       // Reduce repetition (-2.0 to 2.0)
options.presencePenalty = 0.3;        // Encourage new topics (-2.0 to 2.0)
options.model = "gpt-4.1-mini";       // Model override
options.systemPrompt = "...";         // System instructions
```

Provider Setup

```cpp
// OpenAI
OpenAIProvider openai("sk-...");
openai.setModel("gpt-4o");
openai.setTimeout(30000);

// Anthropic (Claude)
AnthropicProvider claude("sk-ant-...");
claude.setModel("claude-sonnet-4-20250514");

// Google Gemini
GeminiProvider gemini("AIza...");
gemini.setModel("gemini-2.5-flash");

// Ollama (local, no API key needed)
OllamaProvider ollama;
ollama.setModel("llama3.2");

// Any OpenAI-compatible API (Groq, DeepSeek, Together AI, etc.)
OpenAICompatibleConfig config;
config.name = "Groq";
config.baseUrl = "https://api.groq.com/openai/v1/chat/completions";
config.apiKey = "gsk-...";
config.model = "llama-3.3-70b-versatile";
OpenAICompatibleProvider groq(config);
```

📊 Memory Usage

ESPAI is optimized for constrained environments:

| Component | RAM usage |
|-----------|-----------|
| Provider instance | ~200 bytes |
| Per message | ~50 bytes + content |
| SSL connection | ~40 KB (one-time) |

💡 Tip: Use streaming for long responses to reduce peak memory usage.

To save flash, disable unused providers:

```cpp
#define ESPAI_PROVIDER_ANTHROPIC 0
#define ESPAI_PROVIDER_GEMINI 0
#define ESPAI_PROVIDER_OLLAMA 0
#include <ESPAI.h>
```

🔧 Troubleshooting

Common Issues

"Connection failed"

  • Check WiFi connection
  • Verify API endpoint is reachable
  • Ensure HTTPS/SSL is working

"Authentication error"

  • Verify your API key is correct
  • Check API key has proper permissions

"Out of memory"

  • Reduce maxTokens
  • Use streaming instead of buffered responses
  • Clear conversation history periodically

"Timeout"

  • Increase timeout: provider.setTimeout(60000)
  • Check network stability

πŸ—ΊοΈ Roadmap

  • OpenAI provider
  • Anthropic (Claude) provider
  • Google Gemini provider
  • Ollama provider (local LLMs)
  • OpenAI-compatible base (Groq, DeepSeek, Together AI, LM Studio, OpenRouter, etc.)
  • Streaming support
  • Tool/function calling
  • Conversation history management
  • Plain HTTP transport (for local providers)
  • Vision support (image inputs)
  • Embeddings API

Have a feature request? Open an issue!


🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (`git checkout -b feature/amazing`)
  3. Commit your changes (`git commit -m 'Add amazing feature'`)
  4. Push to the branch (`git push origin feature/amazing`)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


πŸ™ Acknowledgments


Made with ❤️ for the ESP32 community
