An open-source Android app showcasing on-device AI inference with Gemma 4 and LiteRT-LM. Chat, understand images, and control your phone — all running locally with zero internet after the initial model download. Entirely vibe coded with GitHub Copilot.
No cloud APIs. No subscriptions. No data leaving your device. This is private, portable AI running on your phone's hardware.
Keywords: Gemma 4, on-device LLM, Android AI, LiteRT-LM, offline AI assistant, on-device inference, Jetpack Compose, function calling, multimodal AI, object detection, OCR, image captioning, visual question answering, speech to text, phone automation, Material 3, Kotlin, open source
- Chat — Natural conversation with real-time token streaming and visible thinking/reasoning, powered by Gemma 4 E2B running entirely on-device
- See — Multimodal image understanding from camera or gallery: describe scenes, detect objects with bounding boxes, read text (OCR), answer visual questions
- Control your phone — 22 toggleable tools via LiteRT-LM's native ToolSet API: send SMS, make calls, set alarms, toggle flashlight, adjust volume/brightness, navigate, control media, and more
- Voice input — On-device speech recognition for hands-free interaction
- Persistent conversations — Chat history saved locally, multiple conversations supported
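To make the "22 toggleable tools" idea concrete, here is a minimal pure-Kotlin sketch of a toggleable tool registry. This is illustrative only — it is **not** LiteRT-LM's actual `ToolSet`/`@Tool` API, and the `PhoneTool`/`ToolRegistry` names are hypothetical; the real app wires tools through LiteRT-LM's native function-calling support.

```kotlin
// Hypothetical sketch — NOT LiteRT-LM's real ToolSet API.
// Each tool has a name, an enabled flag (user-toggleable), and a handler
// that receives string arguments from a model-issued function call.
data class PhoneTool(
    val name: String,
    var enabled: Boolean = true,
    val handler: (Map<String, String>) -> String,
)

class ToolRegistry {
    private val tools = mutableMapOf<String, PhoneTool>()

    fun register(tool: PhoneTool) { tools[tool.name] = tool }

    fun setEnabled(name: String, enabled: Boolean) {
        tools[name]?.enabled = enabled
    }

    // Dispatch a model-issued call. Disabled or unknown tools return an
    // error string back to the model instead of throwing, so the model
    // can recover and reply in natural language.
    fun dispatch(name: String, args: Map<String, String>): String {
        val tool = tools[name] ?: return "error: unknown tool '$name'"
        if (!tool.enabled) return "error: tool '$name' is disabled"
        return tool.handler(args)
    }
}
```

A flashlight toggle, for example, would register a handler under `"toggle_flashlight"` and the user's settings screen would flip its `enabled` flag.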
```
git clone https://github.com/ajay-sainy/GemOfGemma.git
cd GemOfGemma
./gradlew installDebug
```

Requirements: Android Studio, JDK 17+, Android device with 4GB+ RAM, ~3GB storage.
On first launch, the app downloads Gemma 4 E2B from HuggingFace (~2.5 GB, one-time). After that, it runs fully offline — no internet needed.
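The "download once, then fully offline" behavior boils down to checking whether the model file is already present and complete before starting a fetch. A hypothetical helper (not the app's actual code; the file name and size constant are assumptions based on the ~2.5 GB figure above):

```kotlin
import java.io.File

// ~2.5 GB, per the README. Assumed value for illustration only.
val EXPECTED_MODEL_BYTES = 2_500_000_000L

// Returns true when a (re)download is needed: the file is missing, or it
// is shorter than expected (an interrupted/partial download).
fun needsDownload(modelFile: File, expectedBytes: Long = EXPECTED_MODEL_BYTES): Boolean =
    !modelFile.exists() || modelFile.length() < expectedBytes
```

On every launch after the first successful download, this check passes and inference runs straight from local storage with no network access.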
The app uses LiteRT-LM to run Google's Gemma 4 model directly on Android hardware. Key technical highlights:
- Streaming inference via `Conversation.sendMessageAsync()` — tokens appear in real time
- Native function calling via LiteRT-LM's `ToolSet` API with `@Tool` annotations
- Thinking mode with `Channel("thinking")` — visible chain-of-thought reasoning
- Format-based response parsing — model outputs fenced `json` with `box_2d` for object detection (following Google's official approach)
- Multi-module architecture — `:app`, `:ui`, `:ai`, `:core`, `:actions`, `:camera`, `:voice`, `:accessibility`
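The format-based parsing step can be sketched in plain Kotlin. Google's published object-detection format has the model emit a JSON array of `{"box_2d": [ymin, xmin, ymax, xmax], "label": ...}` objects (coordinates normalized to 0–1000) inside a fenced `json` block; the regex-based parser below is illustrative, not the app's actual implementation.

```kotlin
// Illustrative parser (not the app's actual code) for a model reply that
// wraps detections in a fenced json block, e.g.:
//   [{"box_2d": [100, 200, 300, 400], "label": "dog"}, ...]
data class Detection(val label: String, val box: List<Int>)

fun parseDetections(reply: String): List<Detection> {
    // Pull the payload out of the ```json fence, falling back to raw text.
    val fenced = Regex("```json\\s*([\\s\\S]*?)```")
        .find(reply)?.groupValues?.get(1) ?: reply
    // Match each detection object: four integer coordinates plus a label.
    val obj = Regex(
        "\\{\\s*\"box_2d\"\\s*:\\s*\\[(\\d+),\\s*(\\d+),\\s*(\\d+),\\s*(\\d+)\\]" +
        "\\s*,\\s*\"label\"\\s*:\\s*\"([^\"]*)\""
    )
    return obj.findAll(fenced).map { m ->
        Detection(m.groupValues[5], m.groupValues.subList(1, 5).map(String::toInt))
    }.toList()
}
```

The 0–1000 normalized coordinates would then be scaled by the actual image width/height before drawing bounding boxes.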
The Gemma model is subject to the Gemma Terms of Use. This project's source code is Apache 2.0.
Contributions welcome — open an issue first to discuss, then submit a PR.
Google DeepMind (Gemma) · Google AI Edge (LiteRT-LM) · Jetpack Compose