Local AI inference in PHP — run ONNX models in-process via FrankenPHP + ONNX Runtime.
Models are loaded once at startup and shared across all requests. Inference is a function call, not a network request — sub-10ms for text, ~15ms for images. No APIs, no Python, no microservices.
Note: This is a companion repo for my FrankenPHP conference talks. It's meant as inspiration and a reference implementation, not a production framework. Feel free to explore, fork, and adapt the patterns for your own projects.
No talks yet — but here's the abstract if you want to present this:
Running AI Models Natively in PHP
What if PHP could run AI models directly — no APIs, no Python, no microservices? With ONNX Runtime and FrankenPHP, this is now possible. Sentiment analysis in 5 ms. Text embeddings in 3 ms. Object detection in 15 ms. Text-to-speech in 200 ms. All running in-process, using pre-trained models from HuggingFace, with a two-line PHP API. No network latency, no API keys, no external dependencies at runtime.
This talk shows how to bridge PHP to ONNX Runtime via a Go inference layer, why in-process inference changes the economics of AI features, and how any PHP developer can add ML capabilities to their application.
See talk.md for the full narrative, demo walkthrough, and architecture breakdown.
Four demo models, all open-source from HuggingFace:
| Model | Task | Input | Output | Time |
|---|---|---|---|---|
| DistilBERT SST-2 | Sentiment analysis | Text | Positive/Negative + score | ~5 ms |
| all-MiniLM-L6-v2 | Text embeddings | Text | 384-dim float vector | ~3 ms |
| YOLOv8n | Object detection | JPEG/PNG bytes | Bounding boxes + labels | ~15 ms |
| Piper lessac | Text to speech | Text | WAV audio (16-bit PCM) | ~200 ms |
Models run on CPU — no GPU required. Files are downloaded once to `~/.frankenonnx/models/` (local) or baked into the Docker image.
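The usage examples later in this README cover sentiment, embeddings, and detection; a TTS call would follow the same two-line pattern. A sketch only — the registry name `'tts'` and the raw-WAV return value are assumptions, not confirmed by this README:

```php
<?php

use FrankenPHP\ONNX;

// 'tts' is a hypothetical registry name for the Piper lessac model.
$model = ONNX::load('tts');

// Per the table above, the output is WAV audio (16-bit PCM).
$wav = $model->run('Hello from FrankenPHP!');
file_put_contents('hello.wav', $wav);
```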
FrankenONNX combines three key pieces:
- FrankenPHP — embeds PHP in a Go process, giving us a Go server that serves PHP pages
- ONNX Runtime — Microsoft's high-performance inference engine for `.onnx` models, with hardware-specific optimizations (SIMD, CoreML, CUDA)
- hugot — Go library for HuggingFace transformer pipelines (tokenization + inference)
Because FrankenPHP runs PHP inside Go, and Go has mature ONNX Runtime bindings, the Go process becomes the bridge — a thin C extension connects PHP to Go, and Go connects to ONNX Runtime.
PHP Go Host ONNX Runtime
┌───────────────────────┐ ┌──────────────────────────────┐ ┌───────────────────────┐
│ │ │ │ │ │
│ $m = ONNX::load(..) │ │ ┌─────────┐ ┌───────────┐ │ │ ┌─────────────────┐ │
│ $m->run($input) ├───►│ │ C ext ├─►│ Go onnx/ │ │ │ │ sentiment.onnx │ │
│ │ │ │ (CGo) │ │ registry ├─┼───►│ │ embedding.onnx │ │
│ // => PHP array │◄───┤ │ │◄─┤ │ │ │ │ yolov8n.onnx │ │
│ │ │ └─────────┘ └─────┬─────┘ │◄───┤ │ │ │
│ │ │ ┌─────┴─────┐ │ │ └─────────────────┘ │
│ │ │ │ hugot │ │ │ │
│ │ │ │ pipelines │ │ │ libonnxruntime.dylib │
│ │ │ └───────────┘ │ │ (C++ engine) │
└───────────────────────┘ └──────────────────────────────┘ └───────────────────────┘
FrankenPHP Model Registry ONNX Runtime
(PHP embedded in Go) (lazy-loading singletons) (inference engine)
- Models are `.onnx` files downloaded from HuggingFace with their tokenizer configs
- The registry (`onnx/`) lazy-loads models on first use and keeps them for the process lifetime
- PHP code calls model inference through the `FrankenPHP\ONNX` class
- Results are JSON-encoded in Go and decoded to PHP arrays by the C extension
Models are process-lifetime singletons — no per-request context needed. FrankenWASM needs per-request WASM plugin instances (sandbox isolation). FrankenAsync needs per-request task managers. FrankenONNX needs neither — models are global read-only resources. The CGo bridge is simpler: no thread index, no request context, just load(name) and run(name, input).
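In PHP terms, the singleton behavior means repeated `load()` calls are cheap — a sketch assuming the two-method API shown later in this README:

```php
<?php

use FrankenPHP\ONNX;

// First call loads sentiment.onnx into the Go registry
// (slow, happens once per process).
$model = ONNX::load('sentiment');

// Any later call — in this request or any other handled by the same
// process — reuses the already-loaded model; only inference time is paid.
$same = ONNX::load('sentiment');
$result = $same->run('Warm start!');
```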
FrankenONNX requires a fork of FrankenPHP that adds frankenphp.RegisterExtension() for registering C Zend extensions from Go — the same fork used by FrankenWASM and FrankenAsync.
Unlike those projects, FrankenONNX does not need frankenphp.Thread() or frankenphp_thread_index() since models are global singletons, not per-request state.
The fork is referenced via a replace directive in go.mod:
```
replace github.com/dunglas/frankenphp v1.11.2 => ../frankenphp
```
```shell
docker build -t frankenonnx .
docker run -p 8082:8082 frankenonnx
```

The multi-stage Dockerfile handles everything — PHP build, ONNX Runtime + libtokenizers download, model download from HuggingFace (~415 MB), and the host binary. Models are baked into the image. Open http://localhost:8082 to see the demos.
The PHP build stage uses static-php-cli which can download pre-built libraries from GitHub instead of compiling from source. This requires a GitHub token to avoid API rate limits:
```shell
GITHUB_TOKEN=$(gh auth token) docker build \
  --secret id=github_token,env=GITHUB_TOKEN \
  -t frankenonnx .
```

Without the token the build still works — it just compiles all libraries from source, which takes longer.
- macOS Apple Silicon (ARM64)
- Go 1.24+
- The FrankenPHP fork cloned as a sibling directory (`../frankenphp`)
```shell
make php     # Build PHP 8.3 (ZTS, embed) via static-php-cli (one-time)
make env     # Generate env.yaml with CGO flags from the PHP build
make ort     # Download ONNX Runtime + libtokenizers to dist/
make models  # Download model files to ~/.frankenonnx/models/
make run     # Build the host binary + start the server on :8082
```

The PHP build is cached in `build/.php/` — subsequent runs skip the build if `libphp.a` exists. To rebuild PHP from scratch:

```shell
make php-clean  # Remove cached downloads and build artifacts
make php        # Rebuild
make env        # Regenerate env.yaml
```

`make ort` downloads pre-built binaries from GitHub releases (~30 MB total):
- `libonnxruntime.dylib` — ONNX Runtime shared library
- `libtokenizers.a` — Rust-based tokenizer (used by hugot)
`make models` downloads model files from HuggingFace (~350 MB total):

- `sentiment/` — DistilBERT SST-2 (model + tokenizer)
- `embedding/` — all-MiniLM-L6-v2 (model + tokenizer)
If you prefer to use your own PHP build, create an env.yaml manually:
```yaml
HOME: "/Users/you"
GOPATH: "/Users/you/go"
GOFLAGS: "-tags=nowatcher,ORT"
CGO_ENABLED: "1"
CGO_CFLAGS: "-I/path/to/php/include ..."
CGO_CPPFLAGS: "-I/path/to/php/include ..."
CGO_LDFLAGS: "-L/path/to/php/lib -lphp -L/path/to/frankenonnx/dist -lonnxruntime ..."
```

The CGO flags must point to your PHP build's include headers and libraries. PHP must be built with ZTS (`--enable-zts`) and embed (`--enable-embed`). The `-tags ORT` build tag is required by hugot's ONNX Runtime backend.
Install the EnvFile plugin, then in your Run Configuration enable EnvFile and add env.yaml to load the CGO flags automatically.
| Variable | Default | Description |
|---|---|---|
| `FRANKENONNX_DOC_ROOT` | `demo` | PHP document root directory |
| `FRANKENONNX_ORT_LIB` | system default | Path to `libonnxruntime.dylib` |
| `FRANKENONNX_MODELS_DIR` | `~/.frankenonnx/models` | Directory containing model files |
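For example, to serve a different document root with an explicitly chosen ONNX Runtime library — a sketch assuming you run the built binary from the repo root (the `public` directory is illustrative):

```shell
FRANKENONNX_DOC_ROOT=public \
FRANKENONNX_ORT_LIB=./dist/libonnxruntime.dylib \
./dist/frankenonnx
```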
Basic usage:

```php
use FrankenPHP\ONNX;

// Load a model (lazy-loads on first call, singleton for process lifetime)
$model = ONNX::load('sentiment');

// Run inference
$result = $model->run($input);
```

Sentiment analysis:

```php
$model = ONNX::load('sentiment');

$result = $model->run('FrankenPHP is amazing!');
// => [['label' => 'POSITIVE', 'score' => 0.9999]]

$result = $model->run('This is terrible.');
// => [['label' => 'NEGATIVE', 'score' => 0.9995]]
```

Text embeddings:

```php
$model = ONNX::load('embedding');

$vec = $model->run('Hello world');
// => [0.0623, -0.0418, 0.1201, ...] (384 floats)

// Cosine similarity between two texts
$vecA = $model->run('I love programming');
$vecB = $model->run('I enjoy coding');
// cosine_similarity($vecA, $vecB) => 0.89
```

Object detection:

```php
$model = ONNX::load('yolov8n');

$result = $model->run(file_get_contents('photo.jpg'));
// => [['label' => 'person', 'confidence' => 0.95,
//      'x' => 0.1, 'y' => 0.2, 'w' => 0.3, 'h' => 0.4], ...]
```

Error handling:

- `ONNX::load()` throws `\RuntimeException` if the model name is unknown or loading fails
- `->run()` throws `\RuntimeException` on inference failure
- No `false`/`null` returns — always throws on failure
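The `cosine_similarity()` helper referenced in the embedding example is not part of the extension API; a minimal PHP sketch for the 384-float vectors the embedding model returns:

```php
<?php

// Hypothetical helper — not provided by FrankenONNX.
// Computes the cosine similarity of two equal-length float vectors.
function cosine_similarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $x) {
        $dot   += $x * $b[$i];
        $normA += $x * $x;
        $normB += $b[$i] * $b[$i];
    }

    // 1.0 means identical direction, 0.0 means orthogonal.
    return $dot / (sqrt($normA) * sqrt($normB));
}
```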
frankenonnx/
├── main.go # HTTP server + FrankenPHP init
├── Makefile # Build orchestration
├── go.mod
├── env.yaml # Generated CGO flags (via make env)
├── phpext/
│ ├── phpext.c # Zend extension — module lifecycle
│ ├── phpext.h # Module declarations
│ ├── phpext.go # CGo exports (registered via init())
│ ├── phpext_cgo.h # CGo header binding
│ ├── onnxmodel.c # FrankenPHP\ONNX class — method implementations
│ └── onnxmodel.h # Class declarations + arginfo
├── onnx/
│ ├── registry.go # Model registry + lazy loading
│ ├── nlp.go # hugot pipelines (sentiment, embedding)
│ ├── yolo.go # YOLOv8n via onnxruntime_go
│ └── tts.go # Piper VITS text-to-speech
├── scripts/
│ └── download-models.sh # Download models from HuggingFace
├── dist/ # Built binary + native libs
│ ├── frankenonnx # The compiled binary
│ ├── libonnxruntime.dylib
│ └── libtokenizers.a
├── build/
│ └── php/
│ └── Makefile # PHP build via static-php-cli (ZTS + embed)
└── demo/
├── index.php # Landing page with card grid
├── style.php # Shared CSS (dark theme support)
├── _header.php # Shared header template
├── _footer.php # Shared footer template
├── sentiment/
│ └── index.php # Sentiment analysis demo
└── embedding/
└── index.php # Text embeddings + cosine similarity demo
FrankenONNX depends on two native libraries, downloaded to dist/ by make ort:
| Library | Version | Source | Purpose |
|---|---|---|---|
| `libonnxruntime.dylib` | 1.22.0 | microsoft/onnxruntime | ONNX model inference engine |
| `libtokenizers.a` | 1.26.0 | daulet/tokenizers | Rust-based HuggingFace tokenizer |
Code is MIT — see LICENSE.md. Talk material is licensed under CC BY 4.0 — free to share and adapt with attribution.
