
Add import_models command to discover local MLX models.#6

Merged
simonw merged 4 commits into simonw:main from ivanfioravanti:main
Feb 17, 2025

Conversation

@ivanfioravanti
Contributor

Very basic import from the local Hugging Face cache folder for all models containing mlx-community.

Closes #5

llm_mlx.py Outdated

@mlx.command()
def import_models():
    cache_dir = Path(os.environ.get("HF_HOME", os.path.expanduser("~/.cache/huggingface")))
Owner

Using HF_HOME is neat.

Contributor Author

I use it when loading model files from an external disk.
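For reference, a minimal sketch of what this kind of cache scan might look like (the function name is hypothetical; it assumes the hub's standard models--{org}--{repo} directory naming):

```python
import os
from pathlib import Path


def find_mlx_models(cache_dir=None):
    # Honour HF_HOME, falling back to the default cache location
    cache_dir = Path(
        cache_dir or os.environ.get("HF_HOME", os.path.expanduser("~/.cache/huggingface"))
    )
    models = []
    # Hub cache entries are directories named models--{org}--{repo}
    for entry in sorted((cache_dir / "hub").glob("models--mlx-community--*")):
        if entry.is_dir():
            models.append(entry.name.removeprefix("models--").replace("--", "/"))
    return models
```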

@simonw
Owner

simonw commented Feb 16, 2025

I tested this like so:

mv "$(uv run llm mlx models-file)" /tmp/llm-mlx.json
uv run llm mlx models                               

This confirmed I now have no models.

Then with this PR:

uv run llm mlx import-models

Imported 82 models from Hugging Face cache

That sounds like too many!

uv run llm models -q mlx

Outputs:

MlxModel: mlx-community/whisper-small.en-mlx/snapshots/52a88bf6e98b114a210c21bb83e22d6e1505cb73
MlxModel: mlx-community/Llama-3.3-70B-Instruct-4bit
MlxModel: mlx-community/SmolLM-135M-Instruct-4bit/snapshots
MlxModel: mlx-community/whisper-small.en-mlx/blobs
MlxModel: mlx-community/whisper-tiny
MlxModel: mlx-community/nanoLLaVA-1.5-8bit/refs
MlxModel: mlx-community/whisper-large-v3-turbo
MlxModel: mlx-community/whisper-small.en-mlx/snapshots
MlxModel: mlx-community/Meta-Llama-3-8B-Instruct-4bit
MlxModel: mlx-community/DeepSeek-R1-Distill-Llama-8B/refs
MlxModel: mlx-community/Llama-3.3-70B-Instruct-4bit/refs
MlxModel: mlx-community/Llama-3.2-3B-Instruct-bf16
MlxModel: mlx-community/whisper-tiny/snapshots/54773fb11b9b7640b1a2ce4f8b55b2ce44239589
MlxModel: mlx-community/Mistral-7B-Instruct-v0.3-4bit/snapshots/a4b8f870474b0eb527f466a03fbc187830d271f5
MlxModel: mlx-community/whisper-large-v3-turbo/refs
MlxModel: mlx-community/Llama-3.2-3B-Instruct-4bit/snapshots/7a82cca14319d695658275cf5e3b98c012bb2f87
MlxModel: mlx-community/llava-phi-3-mini-4bit
MlxModel: mlx-community/distil-whisper-large-v3
MlxModel: mlx-community/whisper-small.en-mlx/refs
MlxModel: mlx-community/Mistral-7B-Instruct-v0.3-4bit/refs
MlxModel: mlx-community/Qwen2.5-Coder-32B-Instruct-8bit
MlxModel: mlx-community/idefics2-8b-4bit
MlxModel: mlx-community/OpenELM-3B-instruct-8bit
MlxModel: mlx-community/DeepSeek-R1-Distill-Llama-8B/snapshots
MlxModel: mlx-community/DeepSeek-R1-Distill-Llama-8B/blobs
MlxModel: mlx-community/whisper-tiny/blobs
MlxModel: mlx-community/SmolLM-135M-Instruct-4bit/blobs
MlxModel: mlx-community/Qwen2.5-0.5B-Instruct-4bit/snapshots
MlxModel: mlx-community/OpenELM-270M-Instruct
MlxModel: mlx-community/Llama-3.3-70B-Instruct-4bit/blobs
MlxModel: mlx-community/distil-whisper-large-v3/blobs
MlxModel: mlx-community/Qwen2.5-0.5B-Instruct-4bit
MlxModel: mlx-community/nanoLLaVA-1.5-8bit/blobs
MlxModel: mlx-community/pixtral-12b-8bit
MlxModel: mlx-community/SmolLM-135M-Instruct-4bit/snapshots/642e06afe3fab57fd6cc518637c471af0a569e1e
MlxModel: mlx-community/Mistral-7B-Instruct-v0.1-4bit-mlx
MlxModel: mlx-community/SmolVLM-Instruct-bf16
MlxModel: mlx-community/Qwen2.5-0.5B-Instruct-4bit/refs
MlxModel: mlx-community/whisper-large-v3-turbo/snapshots
MlxModel: mlx-community/Llama-3.2-3B-Instruct-4bit (aliases: ml, l32)
MlxModel: mlx-community/distil-whisper-large-v3/refs
MlxModel: mlx-community/Mistral-7B-Instruct-v0.3-4bit/blobs
MlxModel: mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit/blobs
MlxModel: mlx-community/OpenELM-270M-Instruct/blobs
MlxModel: mlx-community/Llama-3.2-3B-Instruct-4bit/refs
MlxModel: mlx-community/Qwen2.5-0.5B-Instruct-4bit/blobs
MlxModel: mlx-community/OpenELM-270M-Instruct/snapshots/7cb5ebd2e82067793db75003630ed2442a16a29d
MlxModel: mlx-community/SmolLM-135M-Instruct-4bit/refs
MlxModel: mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit/refs
MlxModel: mlx-community/DeepSeek-R1-Distill-Llama-8B/snapshots/7caba1ee941e9f0100394ebd5fe5a193a51304fb
MlxModel: mlx-community/Llama-3.2-3B-Instruct-4bit/snapshots
MlxModel: mlx-community/QVQ-72B-Preview-4bit
MlxModel: mlx-community/llava-1.5-7b-4bit
MlxModel: mlx-community/whisper-tiny/refs
MlxModel: mlx-community/whisper-small.en-mlx
MlxModel: mlx-community/Llama-3.2-3B-Instruct-4bit/blobs
MlxModel: mlx-community/whisper-large-v3-turbo/snapshots/beea265c324f07ba1e347f3c8a97aec454056a86
MlxModel: mlx-community/whisper-large-v2-mlx
MlxModel: mlx-community/nanoLLaVA-1.5-8bit/snapshots/547f734ef05c24b2fa73618a77f6e7fd76bf0f4d
MlxModel: mlx-community/distil-whisper-large-v3/snapshots/e1c3c155644be59f8b477c0186719442f7e3fbb0
MlxModel: mlx-community/Mistral-7B-Instruct-v0.3-4bit
MlxModel: mlx-community/whisper-large-v1-mlx
MlxModel: mlx-community/Qwen2.5-VL-7B-Instruct-8bit
MlxModel: mlx-community/Llama-3.3-70B-Instruct-4bit/snapshots/de2dfaf56839b7d0e834157d2401dee02726874d
MlxModel: mlx-community/Llama-3.3-70B-Instruct-4bit/snapshots
MlxModel: mlx-community/whisper-large-v3-turbo/blobs
MlxModel: mlx-community/Mistral-7B-Instruct-v0.3-4bit/snapshots
MlxModel: mlx-community/OpenELM-270M-Instruct/snapshots
MlxModel: mlx-community/SmolLM-135M-Instruct-4bit
MlxModel: mlx-community/distil-whisper-large-v3/snapshots
MlxModel: mlx-community/Qwen2.5-0.5B-Instruct-4bit/snapshots/a5339a4131f135d0fdc6a5c8b5bbed2753bbe0f3
MlxModel: mlx-community/OpenELM-270M-Instruct/refs
MlxModel: mlx-community/DeepSeek-R1-Distill-Llama-8B
MlxModel: mlx-community/Llama-3.2-11B-Vision-Instruct-4bit
MlxModel: mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit
MlxModel: mlx-community/nanoLLaVA-1.5-8bit/snapshots
MlxModel: mlx-community/whisper-tiny/snapshots
MlxModel: mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit/snapshots/f429cf7184400c416f1d61c4d9dd3f47912fccba
MlxModel: mlx-community/nanoLLaVA-1.5-8bit
MlxModel: mlx-community/phi-4-4bit
MlxModel: mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit/snapshots
MlxModel: mlx-community/Meta-Llama-3-8B-4bit

Some of those end with things like /snapshots which looks like a bug.

It's also picked up some non-LLM models like mlx-community/whisper-tiny.
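Those /snapshots, /blobs and /refs suffixes are the hub cache's internal layout leaking through: each models--… directory contains blobs/, refs/ and snapshots/ subdirectories, and the scan appears to be recursing into them. A sketch of resolving a cache entry to its current snapshot instead (function name hypothetical):

```python
from pathlib import Path
from typing import Optional


def current_snapshot(model_dir: Path) -> Optional[Path]:
    # refs/main records the commit hash of the currently checked-out snapshot
    ref = model_dir / "refs" / "main"
    if not ref.exists():
        return None
    commit = ref.read_text().strip()
    snapshot = model_dir / "snapshots" / commit
    return snapshot if snapshot.is_dir() else None
```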

@simonw
Owner

simonw commented Feb 16, 2025

I wonder if there's a good way to detect which of those models are compatible with LLM?

mlx-community/pixtral-12b-8bit and mlx-community/Llama-3.2-11B-Vision-Instruct-4bit are particularly tricky as those are vision models, which likely need https://github.com/Blaizzy/mlx-vlm

@ivanfioravanti
Contributor Author

I don't like hardcoding model_types in code, but I don't see any other possibility. What do you think?

@simonw
Owner

simonw commented Feb 16, 2025

Another option could be to make this command interactive - so it shows you a list of models but asks you to confirm before adding them, maybe even lets you key up and key down to toggle the ones you want.

@ivanfioravanti
Contributor Author

@simonw implemented, similar to huggingface-cli delete-cache. I excluded whisper and some VL models to reduce the list.
Models can be imported, or removed if already imported.

It seems OK now. WDYT?

Comment on lines +94 to +103
if model_type in [
    "whisper",
    "llava",
    "paligemma",
    "qwen2_vl",
    "qwen2_5_vl",
    "florence2",
    "florence",
]:
    continue
Owner

This is very smart.

Contributor Author

Thanks! I had 100+ models, so I had to cut the list down.
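The model_type being checked here presumably comes from each snapshot's config.json, which transformers-style repos include; a sketch of reading it (function name hypothetical):

```python
import json
from pathlib import Path
from typing import Optional


def read_model_type(snapshot_dir: Path) -> Optional[str]:
    # transformers-style repos carry a "model_type" field in config.json
    config_path = snapshot_dir / "config.json"
    if not config_path.exists():
        return None
    with open(config_path) as f:
        return json.load(f).get("model_type")
```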

window_size = os.get_terminal_size().lines - 5

while True:
    print("\033[H\033[J", end="")
Owner

Normally I'd suggest we use a cross-platform library like Rich here, but since this plugin only works on macOS there's no need for us to add another dependency.
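For reference, a minimal sketch of the raw-key handling such a picker needs on macOS, using only the stdlib termios and tty modules (illustrative, not the PR's actual implementation):

```python
import sys
import termios
import tty

# ANSI sequences: home the cursor and clear the screen (as in the PR's
# print call), plus the arrow-key escape sequences the picker recognises
CLEAR_SCREEN = "\033[H\033[J"
ARROW_KEYS = {"\x1b[A": "up", "\x1b[B": "down"}


def read_key() -> str:
    # Switch the terminal to raw mode for a single keypress (Unix/macOS only)
    fd = sys.stdin.fileno()
    old = termios.tcgetattr(fd)
    try:
        tty.setraw(fd)
        ch = sys.stdin.read(1)
        if ch == "\x1b":  # arrow keys arrive as a three-byte escape sequence
            ch += sys.stdin.read(2)
    finally:
        termios.tcsetattr(fd, termios.TCSADRAIN, old)
    return ARROW_KEYS.get(ch, ch)
```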

@simonw
Owner

simonw commented Feb 17, 2025

% llm mlx import-models
Available models (↑/↓ to navigate, SPACE to select, ENTER to confirm, Ctrl+C to quit):
  ○ (llama) mlx-community/DeepSeek-R1-Distill-Llama-8B (already imported)
  ○ (llama) mlx-community/Llama-3.2-3B-Instruct-4bit (already imported)
  ○ (llama) mlx-community/Llama-3.3-70B-Instruct-4bit (already imported)
  ○ (llama) mlx-community/SmolLM-135M-Instruct-4bit (already imported)
> ○ (llava-qwen2) mlx-community/nanoLLaVA-1.5-8bit (already imported)
  ○ (mistral) mlx-community/Mistral-7B-Instruct-v0.3-4bit (already imported)
  ○ (mistral) mlx-community/Mistral-Small-24B-Instruct-2501-4bit (already imported)
  ○ (openelm) mlx-community/OpenELM-270M-Instruct (already imported)
  ○ (qwen2) mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit (already imported)
  ○ (qwen2) mlx-community/Qwen2.5-0.5B-Instruct-4bit (already imported)

Since ALL of those are already-imported the tool should probably quit without asking me to make a decision.

@simonw
Owner

simonw commented Feb 17, 2025

llm -m mlx-community/OpenELM-270M-Instruct hi

Error: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating

I wonder if it's possible to detect that a model doesn't have chat templates setup and avoid suggesting it?
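One plausible check, assuming the template lives in tokenizer_config.json as transformers usually stores it (newer repos may ship the template in a separate file instead, so this is only a sketch):

```python
import json
from pathlib import Path


def has_chat_template(snapshot_dir: Path) -> bool:
    # transformers commonly stores the template in tokenizer_config.json
    config_path = snapshot_dir / "tokenizer_config.json"
    if not config_path.exists():
        return False
    with open(config_path) as f:
        return bool(json.load(f).get("chat_template"))
```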

@simonw
Owner

simonw commented Feb 17, 2025

Tests failed in Python 3.9:

cat llm_mlx.py | llm -m o3-mini 'rewrite this file to not use case/match so it works in Python 3.9'  

Response: https://gist.github.com/simonw/16deb5b5cc66973ff04b2a2f3eaf00d9
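The mechanical fix is replacing each match/case (a 3.10+ syntax feature, so it fails to parse on 3.9) with an equivalent if/elif chain; a hypothetical example of the pattern:

```python
# Python 3.10+ version, which 3.9 cannot parse:
#
#     match key:
#         case "up":
#             delta = -1
#         case "down":
#             delta = 1
#         case _:
#             delta = 0


def key_to_delta(key: str) -> int:
    # Equivalent if/elif chain that parses on Python 3.9
    if key == "up":
        return -1
    elif key == "down":
        return 1
    else:
        return 0
```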

@simonw
Owner

simonw commented Feb 17, 2025

I think we can land this. Things like "don't show options if everything is installed already" are purely nice-to-haves.

@simonw simonw merged commit 862c8fc into simonw:main Feb 17, 2025
5 checks passed
simonw added a commit that referenced this pull request Feb 17, 2025
@ivanfioravanti
Contributor Author

> I think we can land this. Things like "don't show options if everything is installed already" are purely nice-to-haves.

But you can use the tool to remove already-imported models too. Maybe it would have been better to call it manage-models.

@ivanfioravanti
Contributor Author

> llm -m mlx-community/OpenELM-270M-Instruct hi
>
> Error: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating
>
> I wonder if it's possible to detect that a model doesn't have chat templates setup and avoid suggesting it?

I bet it is. I will check this over the weekend.


Labels

enhancement New feature or request


Development

Successfully merging this pull request may close these issues.

Local MLX models already downloaded are not visible
