Our smartphones are essentially pocket computers, with capabilities that rival full-fledged PCs in many ways. From productivity to gaming, there's a lot you can do to tap into your phone's full potential. One such thing is running an LLM entirely on your smartphone, without an internet connection. The only downside is that you'll need one of the best phones to run one of these models.

If you're looking to run an AI model on a smartphone, the first thing to know is that executing just about any model requires a lot of RAM. The same principle is why you need plenty of VRAM when working with applications like Stable Diffusion, and it applies to text-based models, too. These models are typically loaded into RAM for the duration of the workload, which is far faster than executing them from storage.

RAM is faster for a couple of reasons, but the two most important are that it has lower latency, since it's closer to the CPU, and higher bandwidth. These properties make it necessary to load large language models (LLMs) into RAM, and the next question that typically follows is exactly how much RAM these models use. Vicuna-7B is one of the most popular models anyone can run: an LLM with 7 billion parameters that can be deployed on an Android smartphone via MLC LLM, a universal app that aids in LLM deployment. It takes about 6GB of RAM to interact with it on an Android smartphone, which isn't a high bar to clear these days.
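As a sanity check on that 6GB figure, here's some rough back-of-envelope arithmetic. Four bits per weight is typical for on-device quantization, but the flat overhead term for the KV cache, activations, and the runtime itself is an illustrative assumption, not a measured number.

```python
# Back-of-envelope RAM estimate for running a quantized LLM on-device.
# The 2.5 GiB overhead term (KV cache, activations, runtime) is an
# illustrative assumption, not a measured value.

def estimate_ram_gib(params_billion: float, bits_per_weight: float = 4.0,
                     overhead_gib: float = 2.5) -> float:
    """Quantized weight size plus a flat overhead budget, in GiB."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes / 2**30 + overhead_gib

# Vicuna-7B at 4 bits per weight: ~3.3 GiB of weights, which lands
# in the neighborhood of the ~6GB total once overhead is added.
print(f"{estimate_ram_gib(7):.1f} GiB")
```

The exact total depends on the quantization scheme and context length, but the arithmetic shows why a 7B model fits comfortably on a phone with 8GB of RAM while leaving room for the OS.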

If you're interested in running an LLM on your smartphone, read on, because you might be surprised just how easy it is.

How to use MLC to run LLMs on your smartphone

It's a very basic application

To download and run LLMs on your smartphone, you can install MLC LLM, an app that deploys and loads models for you. Those models then run inside the app, which handles loading them into RAM and executing them. In a sense, it's a bit like LM Studio, but for your smartphone instead.

Out of the box, MLC LLM supports installing Vicuna 7B and RedPajama 3B, but you can also give it a URL to load a model manually if you want to try another one. There are plenty of models of varying sizes on Hugging Face that might be worth a shot, but your best bet is to stay at or below 7 billion parameters. Anything larger may be too much for your phone to handle, as bigger models require more RAM.
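To see why 7B is a sensible ceiling, the same kind of rough arithmetic works for a few common model sizes, assuming 4-bit quantized weights. These figures cover weights only; the KV cache, activations, and the OS all need room on top, which is roughly why phones struggle beyond 7B models.

```python
# Weight-only memory for common model sizes at 4-bit quantization.
# Illustrative arithmetic only: real usage adds KV cache, activations,
# and runtime overhead on top of these numbers.

def weights_gib(params_billion: float, bits_per_weight: float = 4.0) -> float:
    """GiB needed just to hold the quantized weights in RAM."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

for size in (3, 7, 13, 33):
    print(f"{size}B parameters -> {weights_gib(size):.1f} GiB of weights")
```

A 13B model already needs over 6 GiB for its weights alone before any overhead, which is more than most phones can spare.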

To set up an LLM on your smartphone, do the following:

  1. Download and install MLC LLM
  2. Download one of the models it shows you, or add it manually from Hugging Face
  3. Open it, and wait for it to load

That's it! MLC LLM makes it incredibly simple to install and run an LLM on your smartphone. If you want to give local LLMs a try, this is by far the easiest way. MLC also has an iOS app, available on the App Store. There are hundreds of models out there on Hugging Face, so check them out and see what you can find!
