FrankenONNX

Local AI inference in PHP — run ONNX models in-process via FrankenPHP + ONNX Runtime.

Models are loaded once at startup and shared across all requests. Inference is a function call, not a network request — sub-10 ms for text, ~15 ms for images. No APIs, no Python, no microservices.

Note: This is a companion repo for my FrankenPHP conference talks. It's meant as inspiration and a reference implementation, not a production framework. Feel free to explore, fork, and adapt the patterns for your own projects.

Talks

No talks yet — but here's the abstract if you want to present this:

Running AI Models Natively in PHP

What if PHP could run AI models directly — no APIs, no Python, no microservices? With ONNX Runtime and FrankenPHP, this is now possible. Sentiment analysis in 5 ms. Text embeddings in 3 ms. Object detection in 15 ms. Text-to-speech in 200 ms. All running in-process, using pre-trained models from HuggingFace, with a two-line PHP API. No network latency, no API keys, no external dependencies at runtime.

This talk shows how to bridge PHP to ONNX Runtime via a Go inference layer, why in-process inference changes the economics of AI features, and how any PHP developer can add ML capabilities to their application.

See talk.md for the full narrative, demo walkthrough, and architecture breakdown.

FrankenONNX Demo

Models

Four demo models, all open-source from HuggingFace:

Model              Task                 Input            Output                      Time
DistilBERT SST-2   Sentiment analysis   Text             Positive/Negative + score   ~5 ms
all-MiniLM-L6-v2   Text embeddings      Text             384-dim float vector        ~3 ms
YOLOv8n            Object detection     JPEG/PNG bytes   Bounding boxes + labels     ~15 ms
Piper lessac       Text to speech       Text             WAV audio (16-bit PCM)      ~200 ms

Models run on CPU — no GPU required. Files are downloaded once to ~/.frankenonnx/models/ (local builds) or baked into the Docker image.

How It Works

FrankenONNX combines three key pieces:

  • FrankenPHP — embeds PHP in a Go process, giving us a Go server that serves PHP pages
  • ONNX Runtime — Microsoft's high-performance inference engine for .onnx models, with hardware-specific optimizations (SIMD, CoreML, CUDA)
  • hugot — Go library for HuggingFace transformer pipelines (tokenization + inference)

Because FrankenPHP runs PHP inside Go, and Go has mature ONNX Runtime bindings, the Go process becomes the bridge — a thin C extension connects PHP to Go, and Go connects to ONNX Runtime.

                PHP                           Go Host                      ONNX Runtime
   ┌───────────────────────┐    ┌──────────────────────────────┐    ┌───────────────────────┐
   │                       │    │                              │    │                       │
   │  $m = ONNX::load(..)  │    │  ┌─────────┐  ┌───────────┐ │    │  ┌─────────────────┐  │
   │  $m->run($input)      ├───►│  │ C ext   ├─►│ Go onnx/  │ │    │  │ sentiment.onnx  │  │
   │                       │    │  │ (CGo)   │  │ registry  ├─┼───►│  │ embedding.onnx  │  │
   │  // => PHP array      │◄───┤  │         │◄─┤           │ │    │  │ yolov8n.onnx    │  │
   │                       │    │  └─────────┘  └─────┬─────┘ │◄───┤  │                 │  │
   │                       │    │               ┌─────┴─────┐ │    │  └─────────────────┘  │
   │                       │    │               │   hugot   │ │    │                       │
   │                       │    │               │ pipelines │ │    │  libonnxruntime.dylib │
   │                       │    │               └───────────┘ │    │  (C++ engine)         │
   └───────────────────────┘    └──────────────────────────────┘    └───────────────────────┘
        FrankenPHP                    Model Registry                    ONNX Runtime
     (PHP embedded in Go)          (lazy-loading singletons)        (inference engine)
  1. Models are .onnx files downloaded from HuggingFace with their tokenizer configs
  2. The registry (onnx/) lazy-loads models on first use, keeps them for the process lifetime
  3. PHP code calls model inference through the FrankenPHP\ONNX class
  4. Results are JSON-encoded in Go and decoded to PHP arrays by the C extension
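Steps 2 and 4 above can be sketched in Go. This is a hypothetical minimal version, not the actual onnx/ package (the names Model, Get, and Run are illustrative): sync.Once guarantees each model loads exactly once per process, and results are JSON-encoded for the C extension to decode into a PHP array.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// Model is a stand-in for a loaded ONNX session.
type Model struct {
	Name string
}

// entry pairs a lazily-created model with a sync.Once guard.
type entry struct {
	once  sync.Once
	model *Model
	err   error
}

var (
	mu      sync.Mutex
	entries = map[string]*entry{}
)

// Get returns the process-lifetime singleton for name, loading it on
// first use. Concurrent callers block until the first load finishes.
func Get(name string) (*Model, error) {
	mu.Lock()
	e, ok := entries[name]
	if !ok {
		e = &entry{}
		entries[name] = e
	}
	mu.Unlock()

	e.once.Do(func() {
		// Real code would create an ONNX Runtime session here.
		e.model = &Model{Name: name}
	})
	return e.model, e.err
}

// Run looks up the singleton, performs "inference", and JSON-encodes
// the result, mirroring step 4 above.
func Run(name, input string) (string, error) {
	m, err := Get(name)
	if err != nil {
		return "", err
	}
	b, err := json.Marshal(map[string]any{"model": m.Name, "input": input})
	return string(b), err
}

func main() {
	res, _ := Run("sentiment", "hello")
	fmt.Println(res)
}
```

Because models are global read-only resources, no per-request state or thread index is needed — the whole bridge surface reduces to these two calls.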

Key difference from FrankenWASM/FrankenAsync

Models are process-lifetime singletons — no per-request context needed. FrankenWASM needs per-request WASM plugin instances (sandbox isolation). FrankenAsync needs per-request task managers. FrankenONNX needs neither — models are global read-only resources. The CGo bridge is simpler: no thread index, no request context, just load(name) and run(name, input).

FrankenPHP Fork

FrankenONNX requires a fork of FrankenPHP that adds frankenphp.RegisterExtension() for registering C Zend extensions from Go — the same fork used by FrankenWASM and FrankenAsync.

Unlike those projects, FrankenONNX does not need frankenphp.Thread() or frankenphp_thread_index() since models are global singletons, not per-request state.

The fork is referenced via a replace directive in go.mod:

replace github.com/dunglas/frankenphp v1.11.2 => ../frankenphp

Quick Start

Docker (recommended)

docker build -t frankenonnx .
docker run -p 8082:8082 frankenonnx

The multi-stage Dockerfile handles everything — PHP build, ONNX Runtime + libtokenizers download, model download from HuggingFace (~415 MB), and the host binary. Models are baked into the image. Open http://localhost:8082 to see the demos.

The PHP build stage uses static-php-cli, which can download pre-built libraries from GitHub instead of compiling them from source. Providing a GitHub token avoids API rate limits:

GITHUB_TOKEN=$(gh auth token) docker build \
    --secret id=github_token,env=GITHUB_TOKEN \
    -t frankenonnx .

Without the token the build still works — it just compiles all libraries from source, which takes longer.

Local Build

Prerequisites

  • macOS Apple Silicon (ARM64)
  • Go 1.24+
  • The FrankenPHP fork cloned as a sibling directory (../frankenphp)

Build & Run

make php        # Build PHP 8.3 (ZTS, embed) via static-php-cli (one-time)
make env        # Generate env.yaml with CGO flags from the PHP build
make ort        # Download ONNX Runtime + libtokenizers to dist/
make models     # Download model files to ~/.frankenonnx/models/
make run        # Build the host binary + start the server on :8082

The PHP build is cached in build/.php/ — subsequent runs skip the build if libphp.a exists. To rebuild PHP from scratch:

make php-clean  # Remove cached downloads and build artifacts
make php        # Rebuild
make env        # Regenerate env.yaml

make ort downloads pre-built binaries from GitHub releases (~30 MB total):

  • libonnxruntime.dylib — ONNX Runtime shared library
  • libtokenizers.a — Rust-based tokenizer (used by hugot)

make models downloads model files from HuggingFace (~350 MB total):

  • sentiment/ — DistilBERT SST-2 (model + tokenizer)
  • embedding/ — all-MiniLM-L6-v2 (model + tokenizer)

Manual Setup

If you prefer to use your own PHP build, create an env.yaml manually:

HOME: "/Users/you"
GOPATH: "/Users/you/go"
GOFLAGS: "-tags=nowatcher,ORT"
CGO_ENABLED: "1"
CGO_CFLAGS: "-I/path/to/php/include ..."
CGO_CPPFLAGS: "-I/path/to/php/include ..."
CGO_LDFLAGS: "-L/path/to/php/lib -lphp -L/path/to/frankenonnx/dist -lonnxruntime ..."

The CGO flags must point to your PHP build's header and library directories. PHP must be built with ZTS (--enable-zts) and embed (--enable-embed). The ORT build tag (included in GOFLAGS above) is required by hugot's ONNX Runtime backend.

GoLand

Install the EnvFile plugin, then in your Run Configuration enable EnvFile and add env.yaml to load the CGO flags automatically.

Environment Variables

Variable                 Default                 Description
FRANKENONNX_DOC_ROOT     demo                    PHP document root directory
FRANKENONNX_ORT_LIB      system default          Path to libonnxruntime.dylib
FRANKENONNX_MODELS_DIR   ~/.frankenonnx/models   Directory containing model files

PHP API

use FrankenPHP\ONNX;

// Load a model (lazy-loads on first call, singleton for process lifetime)
$model = ONNX::load('sentiment');

// Run inference
$result = $model->run($input);

Sentiment Analysis

$model = ONNX::load('sentiment');
$result = $model->run('FrankenPHP is amazing!');
// => [['label' => 'POSITIVE', 'score' => 0.9999]]

$result = $model->run('This is terrible.');
// => [['label' => 'NEGATIVE', 'score' => 0.9995]]

Text Embeddings

$model = ONNX::load('embedding');
$vec = $model->run('Hello world');
// => [0.0623, -0.0418, 0.1201, ...] (384 floats)

// Cosine similarity between two texts
$vecA = $model->run('I love programming');
$vecB = $model->run('I enjoy coding');
// cosine_similarity($vecA, $vecB) => 0.89

Object Detection

$model = ONNX::load('yolov8n');
$result = $model->run(file_get_contents('photo.jpg'));
// => [['label' => 'person', 'confidence' => 0.95,
//      'x' => 0.1, 'y' => 0.2, 'w' => 0.3, 'h' => 0.4], ...]

Error Handling

  • ONNX::load() throws \RuntimeException if the model name is unknown or loading fails
  • ->run() throws \RuntimeException on inference failure
  • No false/null returns — always throws on failure

Project Structure

frankenonnx/
├── main.go                 # HTTP server + FrankenPHP init
├── Makefile                # Build orchestration
├── go.mod
├── env.yaml                # Generated CGO flags (via make env)
├── phpext/
│   ├── phpext.c            # Zend extension — module lifecycle
│   ├── phpext.h            # Module declarations
│   ├── phpext.go           # CGo exports (registered via init())
│   ├── phpext_cgo.h        # CGo header binding
│   ├── onnxmodel.c         # FrankenPHP\ONNX class — method implementations
│   └── onnxmodel.h         # Class declarations + arginfo
├── onnx/
│   ├── registry.go         # Model registry + lazy loading
│   ├── nlp.go              # hugot pipelines (sentiment, embedding)
│   ├── yolo.go             # YOLOv8n via onnxruntime_go
│   └── tts.go              # Piper VITS text-to-speech
├── scripts/
│   └── download-models.sh  # Download models from HuggingFace
├── dist/                   # Built binary + native libs
│   ├── frankenonnx         # The compiled binary
│   ├── libonnxruntime.dylib
│   └── libtokenizers.a
├── build/
│   └── php/
│       └── Makefile         # PHP build via static-php-cli (ZTS + embed)
└── demo/
    ├── index.php            # Landing page with card grid
    ├── style.php            # Shared CSS (dark theme support)
    ├── _header.php          # Shared header template
    ├── _footer.php          # Shared footer template
    ├── sentiment/
    │   └── index.php        # Sentiment analysis demo
    └── embedding/
        └── index.php        # Text embeddings + cosine similarity demo

Native Dependencies

FrankenONNX depends on two native libraries, downloaded to dist/ by make ort:

Library                Version   Source                  Purpose
libonnxruntime.dylib   1.22.0    microsoft/onnxruntime   ONNX model inference engine
libtokenizers.a        1.26.0    daulet/tokenizers       Rust-based HuggingFace tokenizer

License

Code is MIT — see LICENSE.md. Talk material is licensed under CC BY 4.0 — free to share and adapt with attribution.
