Tenki Sandbox just landed on Crabbox alongside AWS, Azure, E2B, and Modal as a supported provider. If you're already using Crabbox, you can now spin up Tenki sandboxes directly.
Here's a quick test we ran: Opus 4.8 vs. Fable 5.
This is what Tenki Code Reviewer found:
Opus-written PR:
3 high-severity bugs across 22 files and 1,615 lines of code
Fable-written PR:
zero issues across 22 files and 1,134 lines of code
Same repository, same prompts.
Small
Teams treat their CI pipeline like a junk drawer.
Every quarter someone adds a step. Nobody ever removes one.
Then you wonder why builds take 20 minutes.
Building Jardinero: a control plane for autonomous engineering agents.
It's a TypeScript orchestrator living in a long-lived @TenkiCloud sandbox. Three workflows so far.
1) The Log Reviewer runs hourly and post-deploy. It sweeps staging and prod through the Grafana MCP,
Most code reviewer benchmarks publish a per-PR catch rate: "did the tool flag at least one bug in this PR?" Because of that, the top reviewers look almost identical.
It's also the wrong metric. Catch one bug in a nine-bug PR and you score the same as a reviewer that caught all
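To make the difference concrete, here is a minimal sketch comparing a per-PR catch rate with a per-bug recall. The `PRResult` type, the function names, and the two reviewers' numbers are all hypothetical and made up for illustration. They are not results from any benchmark.

```python
from dataclasses import dataclass


@dataclass
class PRResult:
    """Hypothetical record of one reviewed PR (illustrative only)."""
    known_bugs: int   # bugs known to exist in the PR
    caught_bugs: int  # how many of those the reviewer flagged


def per_pr_catch_rate(results):
    """Share of buggy PRs where the reviewer flagged at least one bug."""
    buggy = [r for r in results if r.known_bugs > 0]
    if not buggy:
        return 0.0
    hits = sum(1 for r in buggy if r.caught_bugs >= 1)
    return hits / len(buggy)


def per_bug_recall(results):
    """Share of all known bugs that the reviewer flagged."""
    total = sum(r.known_bugs for r in results)
    if total == 0:
        return 0.0
    return sum(r.caught_bugs for r in results) / total


# Made-up data: one nine-bug PR and one single-bug PR.
shallow = [PRResult(known_bugs=9, caught_bugs=1), PRResult(known_bugs=1, caught_bugs=1)]
thorough = [PRResult(known_bugs=9, caught_bugs=9), PRResult(known_bugs=1, caught_bugs=1)]

print(per_pr_catch_rate(shallow), per_pr_catch_rate(thorough))  # 1.0 1.0
print(per_bug_recall(shallow), per_bug_recall(thorough))        # 0.2 1.0
```

On this toy data both reviewers score 100% per PR, while per-bug recall separates them: 20% versus 100%.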
With self-hosted sandboxes, you can run agents in any environment you control: your own infrastructure, or managed providers like Cloudflare, Daytona, Modal, or Vercel.