Replies: 5 comments 5 replies
- I will definitely try this out, but 1 token per second is slow.
- gemma3:27b
- Add options for running 671B models, or 100–200B or 405B models, with 72 GB of VRAM (3× RTX 3090).
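As a rough back-of-the-envelope check of whether those model sizes fit in 72 GB of VRAM (weights only, bytes ≈ parameters × bits ÷ 8; KV cache and activations need additional headroom, so this is an optimistic lower bound):

```python
def weight_vram_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GB: params * bits / 8, weights only."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

TOTAL_VRAM_GB = 72  # 3x RTX 3090 at 24 GB each

for size in (671, 405, 200, 100):
    for bits in (16, 8, 4):
        gb = weight_vram_gb(size, bits)
        print(f"{size}B @ {bits}-bit: {gb:.0f} GB, fits in 72 GB: {gb <= TOTAL_VRAM_GB}")
```

By this estimate only a ~100B model at 4-bit quantization (≈50 GB of weights) fits in 72 GB; 200B and larger would need offloading to CPU/disk even at 4-bit.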
- Support for quantized models.
- Being able to upgrade this project to support inference via API services and tool calling, combined with OpenClaw, would be absolutely unbeatable.
- Any thoughts are appreciated.