A mobile application that runs large language models (LLMs) directly on Android devices. Designed to showcase expertise in mobile machine learning optimization and deployment, the app loads quantized models (e.g., Qwen2.5-1.5B) and executes them entirely on-device—no internet connection required. It also features built-in performance monitoring to measure inference speed, memory usage, and response quality.
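The description mentions built-in performance monitoring for inference speed and memory usage. A minimal sketch of that idea in plain Java (the class and method names here are illustrative, not taken from the repository): it times a token-generation loop, derives tokens per second, and snapshots heap usage the way such a monitor might.

```java
import java.util.function.Supplier;

// Hypothetical sketch of an on-device inference performance monitor.
// InferenceMonitor, Metrics, and measure() are illustrative names only.
public class InferenceMonitor {
    public static final class Metrics {
        public final int tokens;
        public final long elapsedNanos;
        public final double tokensPerSecond;

        Metrics(int tokens, long elapsedNanos) {
            this.tokens = tokens;
            this.elapsedNanos = elapsedNanos;
            this.tokensPerSecond =
                elapsedNanos > 0 ? tokens * 1e9 / elapsedNanos : 0.0;
        }
    }

    // Times a token-generation loop; the Supplier returns null
    // when the model signals end-of-sequence.
    public static Metrics measure(Supplier<String> nextToken, int maxTokens) {
        long start = System.nanoTime();
        int count = 0;
        while (count < maxTokens && nextToken.get() != null) {
            count++;
        }
        return new Metrics(count, System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Stand-in "model": emits 10 fixed tokens, then stops.
        int[] i = {0};
        Metrics m = measure(() -> i[0] < 10 ? ("tok" + i[0]++) : null, 64);
        System.out.println("tokens=" + m.tokens
                + " tok/s=" + String.format("%.1f", m.tokensPerSecond));

        // Heap snapshot, the kind of figure a memory-usage readout would show.
        long used = Runtime.getRuntime().totalMemory()
                - Runtime.getRuntime().freeMemory();
        System.out.println("heapUsedBytes=" + used);
    }
}
```

The same timing pattern applies whatever backend produces the tokens; only the `Supplier` wiring would change.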
Repository: jwalith/Quantized-LLM