Security through transparency
Why we open-sourced our AI sandbox
Most AI security advice right now is theory without tooling. You read whitepapers about alignment and governance, but when you need to train your team on preventing prompt-injection attacks, the ecosystem is surprisingly empty.
We had a specific problem: hundreds of employees—from Senior DevOps Engineers to Legal Counsel—were using LLMs daily. We needed to train them on actual risks.
The existing options were poor. We could buy an expensive vendor platform with rigid licensing, or run traditional security exercises that would intimidate a Marketing Manager who has never opened a terminal.
So we built our own solution: a stateless, containerized, bring-your-own-key sandbox that runs locally and generates its own challenges. It worked better than expected, so we’re open-sourcing it. You can find the code on GitHub.
The architecture: Stateless and simple
We had three engineering constraints:
Zero config: Run on a laptop with a simple Docker command. No database migrations, no complicated authentication.
Model agnostic: Work with OpenAI, Claude, Gemini, or local models like Ollama.
Key safety: Never store API keys in a back-end database.
We settled on a Sidecar Proxy pattern. Nginx bundles the front-end and back-end into a single exposed port, eliminating CORS headaches and simplifying deployment.
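To make the pattern concrete, here is a minimal sketch of the kind of Nginx config this implies. The port, upstream name, and paths are illustrative assumptions, not the project's actual configuration:

```nginx
# Illustrative sidecar config: serve the static front-end and proxy
# API calls to the back-end container through one exposed port.
server {
    listen 8080;

    # Static front-end assets
    location / {
        root /usr/share/nginx/html;
        try_files $uri /index.html;
    }

    # Same-origin proxy to the back-end, so the browser never makes
    # a cross-origin request and CORS never comes into play.
    location /api/ {
        proxy_pass http://backend:8000;
        proxy_set_header Host $host;
    }
}
```

Because the browser only ever talks to one origin, there is nothing to configure on the CORS side at all.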
The logic is simple:
User enters API key in browser
Browser sends key to back-end
Back-end holds it in memory for that session
Container stops, data vanishes
No persistence, no database, no security liability.
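The key-handling flow above can be sketched as a tiny in-memory store. The class and method names here are illustrative, not the project's actual API; the point is that keys live only in process RAM and die with the container:

```python
# Sketch of session-scoped key handling: keys are held in a plain
# dict in process memory and are gone when the container stops.
import secrets


class SessionKeyStore:
    """Holds API keys in memory only; nothing is ever persisted."""

    def __init__(self):
        self._keys = {}  # session_id -> api_key (RAM only)

    def start_session(self, api_key: str) -> str:
        # Issue an unguessable session id and remember the key for it.
        session_id = secrets.token_urlsafe(16)
        self._keys[session_id] = api_key
        return session_id

    def key_for(self, session_id: str):
        # Returns None once the session has ended (or never existed).
        return self._keys.get(session_id)

    def end_session(self, session_id: str) -> None:
        # Drop the key immediately; no audit trail, no database row.
        self._keys.pop(session_id, None)
```

Since there is no database write anywhere in the path, there is no stored secret to leak, rotate, or subpoena.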
Dynamic challenges that never run out
Most training platforms go stale once you have seen the content. You solve a specific scenario once, learn that specific answer, and you are done. You end up learning one solution instead of the underlying principle.
We didn’t want to manually write 50 scenarios. Instead, we wrote a Meta-Prompt to let the AI generate them. We built an endpoint where you specify your role and difficulty level. The back-end constructs a prompt instructing the LLM to act as a security educator and design a custom challenge.
This single function transformed a static game into an infinite training engine. Someone on the Legal team can generate ten different contract review scenarios—each with unique instructions and vulnerabilities—without us ever writing new code.
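A minimal sketch of the meta-prompt idea looks like this. The exact wording, field names, and function signature are assumptions for illustration; the real endpoint wraps something equivalent:

```python
# Sketch of a meta-prompt builder: instead of hand-writing scenarios,
# the back-end asks the LLM to act as a security educator and design one.
def build_challenge_prompt(role: str, difficulty: str) -> str:
    """Construct the instruction sent to the LLM to generate a challenge."""
    return (
        "You are a security educator designing a prompt-injection "
        "training challenge.\n"
        f"Target audience: a {role}.\n"
        f"Difficulty: {difficulty}.\n"
        "Produce a realistic scenario with a hidden system prompt, a "
        "secret the assistant must protect, and clear success criteria "
        "for the attacker."
    )
```

Every call with a new role or difficulty yields a fresh scenario, which is what turns a fixed set of puzzles into an open-ended generator.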
One interface for two audiences
Security training usually fails because it ignores user experience. By using natural language as the interface, we made this work for everyone.
For non-technical users: We removed the technical complexity. No terminal, no specialized hacking tools. The interface is a chat window. If they can use ChatGPT, they can use this. They learn that hacking AI systems isn't about writing code; it's about logic and language.
For security professionals: This functions as a rapid prototyping lab. The LLM abstraction layer means a Red Teamer can spin this up to benchmark different models. You can run the same attack against ChatGPT and Claude side-by-side to compare which system prompt holds up better. It’s a dedicated, isolated sandbox for testing.
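The side-by-side benchmarking that the abstraction layer enables can be sketched as follows. The `Provider` callable shape is an assumption for illustration; in practice each provider would wrap a real API client (OpenAI, Anthropic, a local Ollama endpoint, and so on):

```python
# Sketch of model-agnostic benchmarking: run one attack against several
# providers behind a common interface and compare the replies.
from typing import Callable, Dict

# A "provider" is just a callable: (system_prompt, attack) -> reply.
Provider = Callable[[str, str], str]


def benchmark(attack: str, system_prompt: str,
              providers: Dict[str, Provider]) -> Dict[str, str]:
    """Send the same attack to every model and collect each reply."""
    return {name: call(system_prompt, attack)
            for name, call in providers.items()}
```

With real clients plugged in, the same loop lets a Red Teamer see at a glance which system prompt holds up against which model.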
Why we’re open-sourcing it
Security through obscurity is a failing strategy, especially with AI. Bad actors already have environments for testing. We defenders need ours too.
By releasing this platform, we want to:
Lower training costs: High-quality security training shouldn’t require expensive vendor contracts. Small businesses and students deserve the same tools as giant corporations.
Crowdsource defense: We want the community to build and share better scenarios. When people share the toughest challenges they generate, everyone stays ahead of emerging jailbreak techniques.
Standardize the skillset: Just as people learned not to click suspicious links, not trusting unverified AI output needs to become a universal reflex.
AI is moving too fast for gatekeeping. Whether you’re writing code, contracts, or marketing copy, you’re now on the front lines. You deserve a place to practice and learn safely.
The repository is open. The sandbox is yours.
Christopher Hernandez is a Security AI and SOC Team Lead at Deriv.
Follow our official LinkedIn page for company updates and upcoming events.
Join our team to work on projects like this.