DEPLOY ANYWHERE. ONE ARTIFACT, EVERY ENVIRONMENT.
The same 50 MB Rust binary runs in Docker, Kubernetes, bare metal, or an air-gapped subnet, with PostgreSQL as the only runtime dependency and a profile YAML as the only environment delta.
Single-Artifact Perimeter
An ops team running governed Claude in production usually inherits a stack of seven services. A web tier, a session cache, a rate-limit store, a job runner, a search index, a log shipper, an identity service. Each one is a deploy path, an upgrade cadence, and an on-call surface. A CISO who approves that stack for one environment then has to re-approve each moving part for staging, for an air-gapped VM, and for a developer laptop, because the variance between deployments is where governance drift hides.
systemprompt.io collapses that stack into one process. The HTTP server, the job scheduler, the template engine, the JWT middleware, the tiered rate limiter, analytics, and cost tracking all link into the same Rust binary. PostgreSQL is the only runtime dependency. Session state lives in the database. Rate-limit state lives in process memory. No Redis, no Kafka, no Elasticsearch sits in the dependency graph, so the CISO's approval covers one artifact rather than seven services.
The entry point is nine lines. Two function calls bring up routes, schemas, jobs, health checks, and the CLI. A linker pin keeps extension registrations from being stripped under link-time optimisation, then the CLI dispatcher runs. A staff engineer can read the full bootstrap in under a minute and map every deploy target back to the same two calls. The full source of the entry point is listed in the references below.
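As a rough sketch of that shape, assuming a Tokio runtime and using hypothetical bootstrap and dispatch names in place of the real calls (only __force_extension_link is named in the references below), the entry point reads something like this:

```rust
// Illustrative only: `bootstrap` and `cli::dispatch` are stand-in names for the
// two calls described above, not the actual systemprompt.io API.
#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Linker pin: referencing the exported symbol keeps inventory-registered
    // extensions from being stripped under link-time optimisation.
    systemprompt::__force_extension_link();

    // One call wires routes, schemas, jobs, and health checks; the second
    // hands control to the CLI dispatcher.
    let app = systemprompt::bootstrap().await?;
    systemprompt::cli::dispatch(app).await
}
```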
- One Artifact, Approved Once — A single compiled Rust binary plus PostgreSQL is the whole stack. No Redis cluster to approve, no Kafka broker to scan, no Elasticsearch index to review. The same artifact that ships to production ships to staging and to a laptop, so an air-gap sign-off does not repeat per environment.
- No Sidecars In The Deployment — No service mesh, no proxy container, no init container, no log forwarder. The job scheduler runs inside the binary on an async cron runtime, so no separate scheduler process needs monitoring and no second binary can drift out of sync with the main one during an incident.
- In-Process Rate Limiter — The in-process rate limiter holds a keyed token bucket per auth tier. State lives in process memory, so adding a replica does not mean operating a Redis cluster for coordination, and a runaway agent on one node cannot drain a shared counter for every other node. A sketch of that pattern follows the references below.
- main.rs (nine-line entry point) The full application entry point. Linker pin then CLI dispatcher.
- lib.rs (__force_extension_link) Linker pin that keeps inventory-registered extensions alive under link-time optimisation.
- rate_limit.rs (TieredRateLimiter) In-process rate limiter backed by the governor crate, one keyed bucket per tier.
- crates/entry/api/ HTTP server and API surface, linked into the same binary as the CLI.
- crates/app/scheduler/ Cron-style job runner on the same Tokio runtime as the HTTP server.
- crates/domain/templates/ Template engine linked in, no external rendering service.
- crates/domain/analytics/ Analytics and session tracking written directly to PostgreSQL.
- crates/infra/security/ Authentication, JWT validation, and scanner detection, all in-binary.
- crates/infra/database/ PostgreSQL connection pool, the only runtime dependency at startup.
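The limiter pattern itself is small enough to sketch against the governor crate named above. The struct shape, tier names, and quotas below are placeholders for illustration; only the governor API calls are real:

```rust
// Illustrative tiered limiter: one keyed token bucket per auth tier, state held
// in process memory. Tier names and quotas are placeholders, not the shipped config.
use std::collections::HashMap;
use std::num::NonZeroU32;

use governor::{clock::DefaultClock, state::keyed::DefaultKeyedStateStore, Quota, RateLimiter};

type KeyedLimiter = RateLimiter<String, DefaultKeyedStateStore<String>, DefaultClock>;

struct TieredLimiter {
    tiers: HashMap<&'static str, KeyedLimiter>,
}

impl TieredLimiter {
    fn new(base_per_minute: u32, multipliers: &[(&'static str, u32)]) -> Self {
        let tiers = multipliers
            .iter()
            .map(|(tier, mult)| {
                let rate = NonZeroU32::new(base_per_minute * mult).expect("non-zero rate");
                (*tier, RateLimiter::keyed(Quota::per_minute(rate)))
            })
            .collect();
        Self { tiers }
    }

    /// True if this caller still has budget in its tier on this replica.
    fn allow(&self, tier: &str, caller: &str) -> bool {
        self.tiers
            .get(tier)
            .map(|limiter| limiter.check_key(&caller.to_string()).is_ok())
            .unwrap_or(false)
    }
}

fn main() {
    // Hypothetical tiers; in the real system the base rate and multipliers come from the profile.
    let limiter = TieredLimiter::new(60, &[("anonymous", 1), ("user", 5), ("admin", 20)]);
    assert!(limiter.allow("user", "agent-42"));
}
```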
Linker-Section Extensions
A team adding a custom capability to a typical AI stack writes a separate service, a Dockerfile, a deploy pipeline, and an interface contract, and then carries the contract drift forever. Every upgrade has to re-certify the host plus every plugin against a new ABI, and a production incident can land on either side of the service boundary. The build-vs-buy question for a CTO is whether they own that coordination or push it down to the linker.
systemprompt.io pushes it down to the linker. A custom capability is a crate that implements the extension trait and registers itself with a registration macro. The macro emits a factory into a linker section at compile time. At startup the registry walks that section once and instantiates each extension. No classpath scan, no dynamic loader, no host-plugin ABI to version. A staff engineer verifies the mechanism by opening the trait and the macro in the references below. The wire between host and extension is a compiled symbol, not a process boundary.
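A self-contained sketch of that mechanism can be written against the inventory crate the registry's collector is built on. The trait, macro, and type names here are simplified stand-ins for the real Extension trait and register_extension! macro:

```rust
// Self-contained sketch of the linker-section pattern, built on the inventory
// crate. Trait, macro, and type names are simplified stand-ins, not the real API.
// [dependencies] inventory = "0.3"

/// Simplified stand-in for the extension trait.
trait Extension {
    fn name(&self) -> &'static str;
}

/// A registration is a factory function emitted into a linker section at compile time.
struct Registration {
    factory: fn() -> Box<dyn Extension>,
}

inventory::collect!(Registration);

/// Stand-in for register_extension!: submit a factory for this extension type.
macro_rules! register_extension {
    ($ty:ty) => {
        inventory::submit! {
            Registration { factory: || Box::new(<$ty>::default()) }
        }
    };
}

#[derive(Default)]
struct WebExtension;

impl Extension for WebExtension {
    fn name(&self) -> &'static str {
        "web"
    }
}

register_extension!(WebExtension);

fn main() {
    // Discovery: walk the linker-section registrations once and instantiate each one.
    // No classpath scan, no dynamic loader, no runtime plugin path.
    for registration in inventory::iter::<Registration> {
        let extension = (registration.factory)();
        println!("discovered extension: {}", extension.name());
    }
}
```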
The same pattern ships in the box. The template links three extensions into the binary (web, marketplace, email), and the host provides the infrastructure extensions for database, logging, analytics, files, users, AI, MCP, OAuth, content, agents, and the scheduler. A custom extension follows the same recipe. Implement the trait, call the macro, recompile, redeploy the same artifact. The CISO approving the rebuild is approving one binary, not a host plus a plugin catalogue.
- Compile-Time Discovery — The registration macro writes a factory into a linker section, and the registry walks that section at startup. No classpath scan, no reflection, no runtime plugin loader that a supply-chain compromise could use to slip a malicious crate in after the build.
- Typed Extension Trait — The extension trait defines the contribution surfaces for schemas, jobs, routes, providers, renderers, and dependencies. Each extension implements only the surfaces it contributes, and the rest fall through to default implementations, so a one-purpose extension does not pay the cost of an empty method stub for every unused surface.
- Template Links Web, Marketplace, Email — The systemprompt-web template links the web, marketplace, and email library extensions into the same binary as the host. A custom extension follows the same recipe (implement the trait, call the macro, recompile), and the deploy target stays the same artifact on the same host. That recipe is sketched after the references below.
- Extension trait Trait every extension implements. Schemas, jobs, routes, providers, renderers, most with defaults.
- register_extension! macro Registration macro. Writes a factory into a linker section at compile time.
- registry/mod.rs Registration type and inventory collector the registry walks at startup.
- ExtensionRegistry::discover Discovery function. Iterates linker-section registrations once and instantiates each extension.
- WebExtension Web library extension linked into the template binary.
- MarketplaceExtension Marketplace library extension linked into the template binary.
- SystempromptExtension MCP library extension linked into the template binary.
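To make the "implement only what you contribute" point concrete, here is a sketch of the default-surface idea. The method names, return types, and the example extension are illustrative; the real trait's surfaces differ:

```rust
// Sketch of the default-surface idea: every contribution surface has a default,
// so a one-purpose extension overrides only what it contributes.
trait Extension {
    fn name(&self) -> &'static str;

    // Contribution surfaces, each defaulting to "contributes nothing".
    fn schemas(&self) -> Vec<&'static str> {
        Vec::new()
    }
    fn routes(&self) -> Vec<&'static str> {
        Vec::new()
    }
    fn jobs(&self) -> Vec<&'static str> {
        Vec::new()
    }
    fn renderers(&self) -> Vec<&'static str> {
        Vec::new()
    }
    fn dependencies(&self) -> Vec<&'static str> {
        Vec::new()
    }
}

/// A hypothetical one-purpose extension: it contributes a schema and a job,
/// and every other surface falls through to the empty default.
struct EmailExtension;

impl Extension for EmailExtension {
    fn name(&self) -> &'static str {
        "email"
    }
    fn schemas(&self) -> Vec<&'static str> {
        vec!["email_outbox"]
    }
    fn jobs(&self) -> Vec<&'static str> {
        vec!["email_outbox_flush"]
    }
}

fn main() {
    let extension = EmailExtension;
    println!("{} contributes {} routes", extension.name(), extension.routes().len());
}
```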
Git-Resident Profiles
Environment drift is the slow killer of self-hosted infrastructure. Local config diverges from staging, staging diverges from production, and a config-only change ships an outage nobody spots in review because the diff lives in a config server, not in the repo. A CTO who asks "what is different between prod and the air-gapped cluster" should get the answer from git log, not from a screenshot.
systemprompt.io binds the binary to a profile through one environment variable. At startup, the profile loader reads the chosen profile name, loads the matching YAML file, and stores the result in a global one-time cell. Every subsystem reads from that single value for its configuration. Rate-limit tiers, JWT issuer and audience, log level, database connection, and storage paths all resolve from the same profile. Switching environments means changing the variable and restarting the process. The binary does not change, so an air-gap approval granted against one SHA is still valid against the staging run of the same SHA.
A profile is a directory checked into version control. The local profile disables rate limits and turns logging up for developer ergonomics. The production profile sets tiered rate limits, JSON logging, and a JWT configuration with issuer, audiences, and expiration. Per-region or per-tenant profiles follow the same shape. A CISO asking "what changed in the air-gapped profile last quarter" reads a git diff, not a change-ticket trail.
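A sketch of that load path, assuming a hypothetical SYSTEMPROMPT_PROFILE variable, a profiles/&lt;name&gt;/profile.yaml layout, and a trimmed-down field set (the real ProfileBootstrap and YAML shape live in the references below):

```rust
// Sketch of the profile load path: one environment variable picks the YAML,
// the parsed result lands in a one-time global, every subsystem reads that value.
// Variable name, file layout, and fields are assumptions for illustration.
// [dependencies] serde = { version = "1", features = ["derive"] }, serde_yaml = "0.9"
use std::sync::OnceLock;

use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Profile {
    environment: String,
    log_level: String,
    jwt_issuer: String,
    rate_limit_base_per_minute: u32,
}

static PROFILE: OnceLock<Profile> = OnceLock::new();

fn profile() -> &'static Profile {
    PROFILE.get_or_init(|| {
        // e.g. SYSTEMPROMPT_PROFILE=production -> profiles/production/profile.yaml
        let name = std::env::var("SYSTEMPROMPT_PROFILE").unwrap_or_else(|_| "local".into());
        let path = format!("profiles/{name}/profile.yaml");
        let raw = std::fs::read_to_string(&path).expect("profile file must exist");
        serde_yaml::from_str(&raw).expect("profile YAML must match the Profile shape")
    })
}

fn main() {
    // Rate limiter, JWT middleware, and logging all read from the same value.
    let p = profile();
    println!("{} profile, log level {}", p.environment, p.log_level);
}
```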
- One Variable Selects The Configuration — A single environment variable selects the profile, and the profile loader stores it in a one-time global at startup, so every subsystem reads from the same value. No per-service config server, no risk that the JWT middleware and the rate limiter disagree about which environment they are in.
- Profiles In Git, Diffs In Review — A profile is a YAML file carrying JWT issuer, audiences, expiration, per-tier rate limits, log level, and environment. Changes are visible at review time, not at incident time, so a security review pulls the profile diff from the same pull request as the code diff.
- Per-Tenant, Per-Region, Same Binary — Each profile carries its own tenant_id, database connection, and rate-limit tiers. The same binary can serve a directory of tenants or regions, so a white-label deployment is a profile directory plus a shared binary, not a branded fork of the codebase.
- ProfileBootstrap::init Profile loader. Reads the env variable and stores the YAML in a one-time global.
- profile/mod.rs Profile model. Defines the YAML shape a profile directory must satisfy.
- rate_limits.rs Per-tier rate-limit configuration. Tier multipliers are YAML values, not code.
- services/config/config.yaml Concrete configuration checked into the template repo, showing the canonical YAML shape.
CLI And Server Share A Binary
Most stacks ship one binary for the server, another for the CLI, and a third dashboard image to operate it. Three things to package, three to version, three to keep in sync on upgrade. A staff engineer running an incident at 3am has to remember which binary on which host has which subcommands. systemprompt.io is one artifact playing all three roles, so the command a developer runs on a laptop is the command an operator runs in production.
The same binary parses subcommands and dispatches them. systemprompt services start --foreground brings up the HTTP server. systemprompt admin agents list runs against the database. systemprompt infra logs view queries the log store. The CLI is not a wrapper around the server. It is the server invoked with a different subcommand, so an admin task in production and a debug run on a laptop hit the same code paths and the same profile loader.
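The dispatch shape is ordinary subcommand routing. Sketched here with clap and a trimmed subcommand tree that mirrors the commands above; the real cli/lib.rs carries more commands and routes into the actual server and admin code:

```rust
// Sketch of the one-binary dispatch shape using clap. The tree is trimmed and
// illustrative, not the shipped CLI.
use clap::{Parser, Subcommand};

#[derive(Parser)]
#[command(name = "systemprompt")]
struct Cli {
    #[command(subcommand)]
    command: Command,
}

#[derive(Subcommand)]
enum Command {
    /// Long-running services, including the HTTP server.
    Services {
        #[command(subcommand)]
        action: ServicesAction,
    },
    /// Administrative operations that run against the database.
    Admin {
        #[command(subcommand)]
        action: AdminAction,
    },
}

#[derive(Subcommand)]
enum ServicesAction {
    /// systemprompt services start --foreground
    Start {
        #[arg(long)]
        foreground: bool,
    },
}

#[derive(Subcommand)]
enum AdminAction {
    /// systemprompt admin agents ...
    Agents {
        #[command(subcommand)]
        action: AgentsAction,
    },
}

#[derive(Subcommand)]
enum AgentsAction {
    /// systemprompt admin agents list
    List,
}

fn main() {
    // Same binary, same profile loader: the subcommand decides whether this process
    // becomes the HTTP server or runs an admin task and exits.
    match Cli::parse().command {
        Command::Services { action: ServicesAction::Start { foreground } } => {
            println!("starting HTTP server (foreground: {foreground})");
        }
        Command::Admin { action: AdminAction::Agents { action: AgentsAction::List } } => {
            println!("listing agents from the database");
        }
    }
}
```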
The production container reflects that. The image starts from debian:bookworm-slim, installs two system libraries (libpq5 for the PostgreSQL client and libssl3 for TLS), copies the pre-built binary into /app/bin/, sets a healthcheck hitting the health endpoint, and runs the entrypoint. No Rust toolchain in the image, no multi-stage compile. A CISO doing a supply-chain review sees a binary and two library packages, not a compiler and a build cache, which keeps the attack surface of the production image small.
- CLI And Server Share A Binary — The CLI dispatcher parses the subcommand and routes into either the API server or an admin operation. Same binary, same code paths, same profile loader, so an operator tightening a rate limit from the CLI hits the same configuration surface as the running server.
- Slim Production Image — The container is debian:bookworm-slim with libpq5 and libssl3 and the pre-built binary copied in. No Rust compiler, no build cache, no multi-stage layers an attacker could squat on. A supply-chain review covers two system libraries and one binary.
- Readiness Signal For Kubernetes — The server flips an atomic readiness flag and broadcasts when it is accepting connections, so a Kubernetes readiness probe does not have to time-box its guess. A deploy rolls forward as soon as the binary is actually serving, not when a sleep timer says it should be.
- cli/lib.rs CLI dispatcher. Routes a subcommand into the server or an admin operation.
- api/lib.rs API server startup. Shares the profile loader and connection pool with the CLI.
- readiness.rs Readiness signal. Flips an atomic flag when the listener accepts connections.
- Dockerfile Production image. debian:bookworm-slim, libpq5, libssl3, pre-built binary, healthcheck.
- demo/00-preflight.sh Startup preflight script. Runs the same binary with the chosen profile selected.
Shared-Nothing Replicas
Scaling a typical AI gateway means scaling the cache and the session store with it. Add a replica, add capacity to Redis, watch for hot keys. A CTO signing off on an in-cluster deployment ends up approving three distributed systems instead of one, and every one of them is a separate on-call surface. systemprompt.io takes those moving parts off the table by keeping request handling stateless and per-process state local, so horizontal scaling is N binaries behind a load balancer.
JWT validation runs inside the request. The JWT service constructs a local decoding key from the profile secret once at startup, and every request verifies signature and expiry against that key without touching I/O. No session store sits in front of the binary and no external auth service is called per request, so any replica can serve any request and a lost pod does not strand a session on a remote cache.
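A sketch of that validation path with the jsonwebtoken crate, assuming HS256 against the profile secret and illustrative claim fields; the real JwtService carries more claims and configuration:

```rust
// Sketch of per-request, no-I/O token validation: the decoding key is built once
// from the profile secret, and each request checks signature and expiry locally.
use jsonwebtoken::{decode, Algorithm, DecodingKey, Validation};
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Claims {
    sub: String,
    exp: usize,
}

struct JwtValidator {
    key: DecodingKey,
    validation: Validation,
}

impl JwtValidator {
    /// Built once at startup from the profile's secret, issuer, and audience.
    fn new(secret: &[u8], issuer: &str, audience: &str) -> Self {
        let mut validation = Validation::new(Algorithm::HS256);
        validation.set_issuer(&[issuer]);
        validation.set_audience(&[audience]);
        Self { key: DecodingKey::from_secret(secret), validation }
    }

    /// Pure CPU work: signature and expiry checks, no database or network call.
    fn validate(&self, token: &str) -> Result<Claims, jsonwebtoken::errors::Error> {
        decode::<Claims>(token, &self.key, &self.validation).map(|data| data.claims)
    }
}
```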
Rate limiting is local too. Each replica meters its own share of traffic through the in-process tiered limiter, with separate token buckets per auth tier so a batch of MCP tool calls cannot crowd out an admin invocation at the same replica. A CTO who needs globally coordinated limits can still layer an upstream reverse proxy, but the binary does not require one to deploy safely.
Health and readiness ride on the same binary. The health endpoint verifies database connectivity for liveness probes. The readiness layer flips an atomic flag and broadcasts when the server starts accepting connections, and the database layer manages its own connection pool so no external pooler (PgBouncer, pgpool) has to be deployed and approved alongside.
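The readiness half of that is small enough to sketch outright: an atomic flag plus a broadcast channel, with names that are illustrative rather than the real readiness.rs (the liveness half is a database ping and is omitted here):

```rust
// Sketch of an in-binary readiness signal: an atomic flag plus a watch channel,
// flipped once the HTTP listener is actually accepting connections.
use std::sync::atomic::{AtomicBool, Ordering};
use tokio::sync::watch;

struct Readiness {
    ready: AtomicBool,
    tx: watch::Sender<bool>,
}

impl Readiness {
    fn new() -> Self {
        let (tx, _rx) = watch::channel(false);
        Self { ready: AtomicBool::new(false), tx }
    }

    /// Called once the HTTP listener is accepting connections.
    fn mark_ready(&self) {
        self.ready.store(true, Ordering::SeqCst);
        let _ = self.tx.send(true);
    }

    /// What a Kubernetes readiness probe handler reads.
    fn is_ready(&self) -> bool {
        self.ready.load(Ordering::SeqCst)
    }

    /// Other tasks can await readiness instead of polling.
    fn subscribe(&self) -> watch::Receiver<bool> {
        self.tx.subscribe()
    }
}

#[tokio::main]
async fn main() {
    let readiness = Readiness::new();
    let mut rx = readiness.subscribe();
    readiness.mark_ready();
    rx.wait_for(|ready| *ready).await.expect("sender alive");
    assert!(readiness.is_ready());
}
```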
- Stateless JWT Validation — The JWT service verifies signature and expiry against a local decoding key loaded once from the profile secret. No database lookup per request, no session store, no external auth call, so a request latency budget does not spend milliseconds talking to a token service before doing any work.
- Per-Tier Local Rate Limiting — The in-process tiered limiter holds one token bucket per auth tier, sized by the profile. Replicas meter independently, so a batch of MCP calls on one replica cannot drain an admin budget on another, and adding a replica does not also add pressure on a Redis cluster.
- Health And Readiness In-Binary — The liveness endpoint checks database connectivity. The readiness layer broadcasts when the HTTP listener is accepting connections. Probe wiring is one healthcheck line in the Dockerfile, not a sidecar that could be version-skewed against the server it is checking.
- JwtService::validate_token JWT validation against a local decoding key. Signature and expiry, no I/O.
- JwtService::generate_admin_token Self-issued admin tokens signed locally with HS256 so tokens verify in-process; no round-trip to an identity provider.
- TieredRateLimiter Per-tier limiter. One token bucket per auth tier, state local to the process.
- rate_limits.rs Tier rate limits from the profile. Base rates and multipliers as YAML values.
- health.rs Liveness endpoint. Verifies database connectivity so probe failure means real outage.
- readiness.rs Readiness layer. Atomic flag flipped when the listener accepts connections.
- postgres/mod.rs PostgreSQL connection pool managed in the binary. No external pooler.
Air-Gap As Configuration
A regulated team picking AI infrastructure asks one question early. Can this run somewhere we trust, including a network with no internet route? A vendor whose air-gapped build is a separate code branch forces the CISO to re-approve that branch against a different supply chain. systemprompt.io answers the question by being a binary, a database, and a profile, and nothing else. The same artifact runs in every environment, and air-gapping is a profile choice, not a fork.
The binary runs on any Linux x86_64 host with libpq5 and libssl3 available. The container image is the same binary on debian:bookworm-slim. No cloud-specific packaging, no environment-specific build path, so the CISO who approved the production SHA has approved the staging SHA and the air-gapped SHA as well.
Air-gapped deployment is a configuration, not a source-code fork. The binary is its own token issuer. Admin tokens are signed locally with HS256 using the profile secret, so tokens verify in-process without Auth0, Okta, or an external identity service in the loop. Logging writes to PostgreSQL. The only outbound network calls are to PostgreSQL and to whichever AI providers the profile explicitly configures, so an auditor can enumerate the perimeter of a deployment from one profile file, not from a network capture.
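A sketch of that self-issued path with the jsonwebtoken crate; the claim names, expiry, and helper shape are assumptions, not generate_admin_token itself:

```rust
// Sketch of self-issued admin tokens signed locally with HS256 from the profile
// secret, so nothing in the token path leaves the air-gapped network.
use jsonwebtoken::{encode, get_current_timestamp, EncodingKey, Header};
use serde::Serialize;

#[derive(Serialize)]
struct AdminClaims {
    sub: String,
    iss: String,
    exp: u64,
}

fn generate_admin_token(secret: &[u8], issuer: &str, admin: &str) -> jsonwebtoken::errors::Result<String> {
    let claims = AdminClaims {
        sub: admin.to_string(),
        iss: issuer.to_string(),
        // One-hour expiry here; the real expiration comes from the profile.
        exp: get_current_timestamp() + 3600,
    };
    // HS256 with the profile secret: the same binary that issues the token
    // verifies it in-process, with no identity provider in the loop.
    encode(&Header::default(), &claims, &EncodingKey::from_secret(secret))
}

fn main() {
    let token = generate_admin_token(b"profile-secret", "systemprompt", "ops-admin")
        .expect("signing with a valid HS256 secret");
    println!("{token}");
}
```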
- Docker And Kubernetes — The debian:bookworm-slim image ships with a healthcheck hitting the health endpoint. No init containers, no service mesh, no sidecar, so a Kubernetes deployment manifest is one container image and a PostgreSQL connection string.
- Bare Metal And VMs — Copy the binary to a Linux host with libpq5 and libssl3 installed, and run it in the foreground. No container runtime required, so a regulated environment that forbids Docker can run the same artifact the cloud deployment runs.
- Air-Gapped Networks — Admin tokens are signed locally with the profile secret, so no external identity provider is required. Logging goes to PostgreSQL, and outbound network calls are scoped to PostgreSQL plus whichever AI providers the profile explicitly configures. An auditor enumerates the perimeter from one file.
- Dockerfile Production image. debian:bookworm-slim, libpq5, libssl3, binary, healthcheck.
- JwtService::generate_admin_token Self-issued admin tokens signed locally. No external IdP required.
- postgres/mod.rs PostgreSQL connection entry point. Binary talks to the database directly.
- session.rs Session storage. Sessions are rows in PostgreSQL, not a cache.
- Cargo.toml Dependency manifest. Auditable list of every crate linked into the binary.
Founder-led. Self-service first.
No sales team. No demo theatre. The template is free to evaluate — if it solves your problem, we talk.
Who we are
One founder, one binary, full IP ownership. Every line of Rust, every governance rule, every MCP integration — written in-house. Two years of building AI governance infrastructure from first principles. No venture capital dictating roadmap. No advisory board approving features.
How to engage
Evaluate
Clone the template from GitHub. Run it locally with Docker or compile from source. Full governance pipeline.
Talk
Once you have seen the governance pipeline running, book a meeting to discuss your specific requirements — technical implementation, enterprise licensing, or custom integrations.
Deploy
The binary and extension code run on your infrastructure. Perpetual licence, source-available under BSL-1.1, with support and update agreements tailored to your compliance requirements.
One binary. One database. Your infrastructure.
Clone the template, link your extensions into the same binary, and deploy the same artifact to every environment.