An open-source Android app showcasing on-device AI inference with Gemma 4 and LiteRT-LM. Chat, understand images, and control your phone — all running locally with zero internet after the initial model download. Entirely vibe coded with GitHub Copilot.
No cloud APIs. No subscriptions. No data leaving your device. This is private, portable AI running on your phone's hardware.
Keywords: Gemma 4, on-device LLM, Android AI, LiteRT-LM, offline AI assistant, on-device inference, Jetpack Compose, function calling, multimodal AI, object detection, OCR, image captioning, visual question answering, speech to text, phone automation, Material 3, Kotlin, open source
- Chat — Natural conversation with real-time token streaming and visible thinking/reasoning, powered by Gemma 4 E2B running entirely on-device
- See — Multimodal image understanding from camera or gallery: describe scenes, detect objects with bounding boxes, read text (OCR), answer visual questions
- Control your phone — 22 toggleable tools via LiteRT-LM's native ToolSet API: send SMS, make calls, set alarms, toggle flashlight, adjust volume/brightness, navigate, control media, and more
- Voice input — On-device speech recognition for hands-free interaction
- Persistent conversations — Chat history saved locally, multiple conversations supported
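To make the "22 toggleable tools" idea concrete, here is a minimal pure-Kotlin sketch of a toggleable tool registry. This is illustrative only — it is **not** LiteRT-LM's actual `ToolSet`/`@Tool` API, and the `PhoneTool`/`ToolRegistry` names are hypothetical; the real app wires tools through LiteRT-LM's native function-calling support.

```kotlin
// Hypothetical sketch — NOT LiteRT-LM's real ToolSet API.
// Each tool has a name, an enabled flag (user-toggleable), and a handler
// that receives string arguments from a model-issued function call.
data class PhoneTool(
    val name: String,
    var enabled: Boolean = true,
    val handler: (Map<String, String>) -> String,
)

class ToolRegistry {
    private val tools = mutableMapOf<String, PhoneTool>()

    fun register(tool: PhoneTool) { tools[tool.name] = tool }

    fun setEnabled(name: String, enabled: Boolean) {
        tools[name]?.enabled = enabled
    }

    // Dispatch a model-issued call. Disabled or unknown tools return an
    // error string back to the model instead of throwing, so the model
    // can recover and reply in natural language.
    fun dispatch(name: String, args: Map<String, String>): String {
        val tool = tools[name] ?: return "error: unknown tool '$name'"
        if (!tool.enabled) return "error: tool '$name' is disabled"
        return tool.handler(args)
    }
}
```

A flashlight toggle, for example, would register a handler under `"toggle_flashlight"` and the user's settings screen would flip its `enabled` flag.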
```
git clone https://github.com/ajay-sainy/GemOfGemma.git
cd GemOfGemma
./gradlew installDebug
```

Requirements: Android Studio, JDK 17+, Android device with 4GB+ RAM, ~3GB storage.
On first launch, the app downloads Gemma 4 E2B from HuggingFace (~2.5 GB, one-time). After that, it runs fully offline — no internet needed.
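The "download once, then fully offline" behavior boils down to checking whether the model file is already present and complete before starting a fetch. A hypothetical helper (not the app's actual code; the file name and size constant are assumptions based on the ~2.5 GB figure above):

```kotlin
import java.io.File

// ~2.5 GB, per the README. Assumed value for illustration only.
val EXPECTED_MODEL_BYTES = 2_500_000_000L

// Returns true when a (re)download is needed: the file is missing, or it
// is shorter than expected (an interrupted/partial download).
fun needsDownload(modelFile: File, expectedBytes: Long = EXPECTED_MODEL_BYTES): Boolean =
    !modelFile.exists() || modelFile.length() < expectedBytes
```

On every launch after the first successful download, this check passes and inference runs straight from local storage with no network access.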
The app uses LiteRT-LM to run Google's Gemma 4 model directly on Android hardware. Key technical highlights:
- Streaming inference via `Conversation.sendMessageAsync()` — tokens appear in real time
- Native function calling via LiteRT-LM's `ToolSet` API with `@Tool` annotations
- Thinking mode with `Channel("thinking")` — visible chain-of-thought reasoning
- Format-based response parsing — model outputs fenced `json` with `box_2d` for object detection (following Google's official approach)
- Multi-module architecture — `:app`, `:ui`, `:ai`, `:core`, `:actions`, `:camera`, `:voice`, `:accessibility`
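The format-based parsing step can be sketched in plain Kotlin. Google's published object-detection format has the model emit a JSON array of `{"box_2d": [ymin, xmin, ymax, xmax], "label": ...}` objects (coordinates normalized to 0–1000) inside a fenced `json` block; the regex-based parser below is illustrative, not the app's actual implementation.

```kotlin
// Illustrative parser (not the app's actual code) for a model reply that
// wraps detections in a fenced json block, e.g.:
//   [{"box_2d": [100, 200, 300, 400], "label": "dog"}, ...]
data class Detection(val label: String, val box: List<Int>)

fun parseDetections(reply: String): List<Detection> {
    // Pull the payload out of the ```json fence, falling back to raw text.
    val fenced = Regex("```json\\s*([\\s\\S]*?)```")
        .find(reply)?.groupValues?.get(1) ?: reply
    // Match each detection object: four integer coordinates plus a label.
    val obj = Regex(
        "\\{\\s*\"box_2d\"\\s*:\\s*\\[(\\d+),\\s*(\\d+),\\s*(\\d+),\\s*(\\d+)\\]" +
        "\\s*,\\s*\"label\"\\s*:\\s*\"([^\"]*)\""
    )
    return obj.findAll(fenced).map { m ->
        Detection(m.groupValues[5], m.groupValues.subList(1, 5).map(String::toInt))
    }.toList()
}
```

The 0–1000 normalized coordinates would then be scaled by the actual image width/height before drawing bounding boxes.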
The Gemma model is subject to the Gemma Terms of Use. This project's source code is Apache 2.0.
Contributions welcome — open an issue first to discuss, then submit a PR.
Google DeepMind (Gemma) · Google AI Edge (LiteRT-LM) · Jetpack Compose