Replies: 5 comments 5 replies
- I will definitely try this out, but 1 token per second is slow.
- gemma3:27b
- Add options for running 671B models, or 100–200B or 405B models, with 72 GB of VRAM (3× RTX 3090).
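As a rough back-of-the-envelope check of whether those model sizes fit in 72 GB of VRAM (weights only, bytes ≈ parameters × bits ÷ 8; KV cache and activations need additional headroom, so this is an optimistic lower bound):

```python
def weight_vram_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GB: params * bits / 8, weights only."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

TOTAL_VRAM_GB = 72  # 3x RTX 3090 at 24 GB each

for size in (671, 405, 200, 100):
    for bits in (16, 8, 4):
        gb = weight_vram_gb(size, bits)
        print(f"{size}B @ {bits}-bit: {gb:.0f} GB, fits in 72 GB: {gb <= TOTAL_VRAM_GB}")
```

By this estimate only a ~100B model at 4-bit quantization (≈50 GB of weights) fits in 72 GB; 200B and larger would need offloading to CPU/disk even at 4-bit.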
- Support for quantized models.
- Being able to upgrade this project to support inference via API services and tool calling, combined with OpenClaw, would be absolutely unbeatable.
- Any thoughts are appreciated.