Run LLMs on-device with LiteRT-LM
A production-ready, open-source inference framework for high-performance, cross-platform LLM deployment on edge devices.
Why LiteRT-LM?
Cross-platform
Deploy LLMs across Android, iOS, Web, and Desktop platforms.
Hardware accelerated
Maximize performance with GPU and NPU acceleration.
Broad GenAI capabilities
Run popular LLMs with support for multimodality (vision, audio) and tool use.
Supported Models
Run the latest open models optimized for the edge, including Gemma-3n, Gemma-3, FunctionGemma, TranslateGemma, Qwen3, Phi-4, and more.
Join the Community
GitHub
Contribute to the source code, report issues, and explore example apps.
Hugging Face
Download pre-converted models and join the community discussion.