TheBigO

Results: 8 comments of TheBigO

I can fully confirm what @Vegolas said: I managed to create a local model runtime fully integrated into my software using the changes in the https://github.com/nomic-ai/gpt4all/tree/csharp-gguf branch.

@yhyu13 For example this one:

```
curl -X POST -H "Content-Type: application/json" -H "Authorization: Nothing to see here" -d "{\"model\": \"ggml-v3-13b-hermes-q5_1.bin\", \"prompt\": \"is this working\"}" http://localhost:4891/v1
```

The connection is established, but...
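The same request can be sketched in Python. This is a minimal, hedged example: it assumes the local GPT4All API server follows the OpenAI-compatible route layout, where completions are served under `/v1/completions` rather than the bare `/v1` root (which may explain a connection that opens but returns nothing useful). The network call itself is left commented out so the snippet is safe to run without a server.

```python
import json
from urllib import request

# Assumption: the local GPT4All server exposes an OpenAI-compatible
# completions route at /v1/completions (not the bare /v1 root).
BASE_URL = "http://localhost:4891/v1/completions"

payload = {
    "model": "ggml-v3-13b-hermes-q5_1.bin",
    "prompt": "is this working",
    "max_tokens": 50,  # hypothetical value; adjust as needed
}
body = json.dumps(payload).encode("utf-8")

req = request.Request(
    BASE_URL,
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment once the server is running locally:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```

If the server really does route completions under `/v1/completions`, pointing the original curl command at that path instead of `/v1` is the first thing to try.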

> gpt4all_api seems to run ok:
>
> ```
> gpt4all_api | Checking for script in /app/prestart.sh
> gpt4all_api | There is no script /app/prestart.sh
> gpt4all_api | INFO: Will...
> ```

Tested with the new version 2.5. Same behaviour.

> Is there currently any fix for this? I'm encountering the same issue and would like to know if there is a fix or workaround.

Hi, finally got it working....

> What is the error?

Hi, it is the same error as described in issue #887 (opened just after my report):

```
ggml_vulkan: Found 1 Vulkan devices:
Vulkan0: NVIDIA GeForce RTX...
```

By the way, the crash is independent of the context size in the ModelParameters. I tried different values; Vulkan always crashes, while CUDA works fine. ![image](https://github.com/user-attachments/assets/e1a8baf2-22cb-4f3d-9bb0-21ee4120d38a)

> Seems like this is a [llama.cpp issue](https://github.com/ggerganov/llama.cpp/issues/8828), not a LLamaSharp issue.
>
> Not sure why installing CUDA impacts it, though. Are you sure it was not a coincidence?...