Secure Your Keys, Track Your Costs: any-llm Managed Platform Enters Open Beta
Secure, encrypted LLM API key management across OpenAI, Anthropic, Google, and other providers. Track costs, set budgets, and avoid vendor lock-in. Free beta access is open now.
The new plugin system transforms mcpd from a tool-server manager into an extensible enforcement and transformation layer—where authentication, validation, rate limiting, and custom logic live in one governed pipeline.
The year 2025 has been a busy one at Mozilla.ai. From hosting live demos and speaking at conferences to releasing our latest open-source tools, we have made a lot of progress and done plenty of exploration this year.
Leverage the JVM's polyglot capabilities to create a self-contained, enterprise-optimized server-side blueprint that combines the performance benefits of WebAssembly with the reliability and maturity of Java's ecosystem.
any-llm managed platform adds end-to-end encrypted API key storage and usage tracking to the any-llm ecosystem. Keys are encrypted client-side, never visible to us, while you monitor token usage, costs, and budgets in one place. Supports OpenAI, Anthropic, Google, and more.
Encoderfile compiles encoders into single-binary executables with no runtime dependencies, giving teams deterministic, auditable, and lightweight deployments. Built on ONNX and Rust, Encoderfile is designed for environments where latency, stability, and correctness matter most.
With mcpd-proxy, teams no longer juggle multiple MCP configs. Run all servers behind one proxy and give every developer the same zero-config access inside their IDE.
Gain visibility and control over your LLM usage. any-llm-gateway adds budgeting, analytics, and access management to any-llm, giving teams reliable oversight for every provider.
AI Agents extend large language models beyond text generation. They can call functions, access internal and external resources, perform deterministic operations, and even communicate with other agents. Yet, most existing guardrails weren’t built to protect these operations.
Run any model, from OpenAI and Claude to Mistral and llama.cpp, through one interface. any-llm v1.0 delivers production-ready stability, standardized reasoning output, and automatic provider detection for seamless use across cloud and local models.
Mozilla.ai is adopting llamafile to advance open, local, privacy-first AI—and we’re inviting the community to help shape its future.
Building AI agents is hard, not just due to LLMs, but also because of tool selection, orchestration frameworks, evaluation, safety, etc. At Mozilla.ai, we’re building tools to facilitate agent development, and we noticed that guardrails for filtering unsafe outputs also need a unified interface.
🤝 This is the first "guest post" on Mozilla.ai's blog (congratulations Baris!). His experiment, built upon the ideas of Mozilla.ai's WASM agents blueprint, extends the concept of in-browser agents with local inference, multi-language runtimes, and full browser-native execution. As we are exploring ways to…
mcpd is to agents what requirements.txt is to applications: a single config to declare, pin, and run the tools your agents need, consistently across local, CI, and production.
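To make the requirements.txt analogy concrete, here is a hypothetical sketch of what a declarative, pinned tool config could look like. The file name, keys, and package syntax below are illustrative assumptions, not mcpd's actual schema; see the mcpd documentation for the real format.

```toml
# Hypothetical agent tool manifest (illustrative only).
# Like requirements.txt, each entry declares and pins a tool
# so local, CI, and production environments resolve identically.

[[servers]]
name = "fetch"
package = "mcp-server-fetch"
version = "2025.1.14"   # pinned, not floating

[[servers]]
name = "filesystem"
package = "mcp-server-filesystem"
version = "0.6.2"
```

The design point is the same one requirements.txt makes for Python apps: tool versions live in one committed file rather than in each developer's shell history.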
After OpenAI's ChatGPT release, the OpenAI Completion API became the de facto standard for LLM communication. However, with the new "reasoning models," a critical piece of the output isn't covered by that OpenAI specification, leaving each provider to decide how to expose the new model capabilities.
When it comes to using LLMs, it’s not always a question of which model to use: it’s also a matter of choosing who provides the LLM and where it is deployed. Today, we announce the release of any-llm, a Python library that provides a simple unified interface to access the most popular providers.
One of the main barriers to a wider adoption of open-source agents is the dependency on extra tools and frameworks that need to be installed before the agents can be run. In this post, we show how to write agents as HTML files, which can just be opened and run in a browser.
Generative AI models are highly sensitive to input phrasing. Even small changes to a prompt, or switching between models, can lead to different results. Adding to the complexity, LLMs often act as black boxes, making it difficult to understand how specific prompts influence their behavior.
Imagine you could effortlessly navigate the universe of LLMs, always knowing which one is the perfect fit for your specific query. Today, this is a very difficult challenge. So, how do you efficiently manage and use LLMs for various tasks? This is where LLM Routing emerges as a crucial strategy.
We recently discussed the increasing need to test applications that make use of AI with tests that target problems specific to AI models. But an immediate follow-up question then arises: What specific problems? How is that testing different?
Since LLMs exploded into public awareness, we have witnessed their integration into a vast array of applications. However, this also introduces new complexities, especially in testing. At Mozilla.ai, we researched the need for formal end-to-end testing of these applications.
Since the launch of ChatGPT in 2022, generative AI and LLMs have rapidly entered everyday life. The viral adoption of these tools was unprecedented, and in some ways contentious. In order to grant greater capabilities to LLMs, they can be integrated into a framework that’s referred to as an “Agent”.