Conversation
Can we provide a subpackage to support OpenCL? Is it named …?
Preferably a single package that enables multiple features.
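For what it's worth, here is a minimal sketch of what an OpenCL split could look like using termux-packages conventions; the subpackage name llama-cpp-opencl and the installed file list are hypothetical and do not describe this PR. The single-package option preferred above would simply drop this script and ship everything from one build.sh.

# llama-cpp-opencl.subpackage.sh (hypothetical sketch, not this PR's layout)
TERMUX_SUBPKG_DESCRIPTION="llama.cpp built with CLBlast/OpenCL acceleration"
TERMUX_SUBPKG_DEPENDS="ocl-icd"
TERMUX_SUBPKG_INCLUDE="bin/llama-opencl"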
Packages needed to enable OpenCL for llama.cpp are: ocl-icd, opencl-headers and opencl-clhpp. CLBlast is also required: build and install CLBlast from source. I dunno if you needed this information, but it would be nice if Termux simply handled the whole process. Thank you.
Edit: In case it's needed, build instructions for CPU and for GPU (OpenCL) are sketched below. It's notable that a model loaded from the ~/storage/downloads folder is significantly slower compared to loading it from the $HOME path.
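A hedged sketch of those manual steps on Termux, based on upstream CLBlast and llama.cpp (around build 854) documentation: the package names come from the list above, while the flags, install prefix and repository URLs are my assumptions and may not match what this package ends up doing.

# Prerequisites inside Termux
pkg install ocl-icd opencl-headers opencl-clhpp clang cmake git

# Build and install CLBlast into the Termux prefix
git clone https://github.com/CNugteren/CLBlast
cmake -S CLBlast -B CLBlast/build -DCMAKE_INSTALL_PREFIX=$PREFIX
cmake --build CLBlast/build
cmake --install CLBlast/build

# CPU-only build of llama.cpp
git clone https://github.com/ggerganov/llama.cpp
make -C llama.cpp

# GPU (OpenCL) build via CLBlast
make -C llama.cpp clean
make -C llama.cpp LLAMA_CLBLAST=1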
It is an expected behaviour, not a bug.
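One way to sidestep the slowdown is to copy the model out of Android shared storage before loading it; a minimal sketch, reusing the model file name and flags that appear in the log below:

# Copy the model out of shared storage into the app-private home,
# then point llama at the copy.
cp ~/storage/downloads/ggml-model-q4_0.bin ~/
llama -m ~/ggml-model-q4_0.bin -t $(nproc)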
Related issues: ggml-org/llama.cpp#2292
Compiled deb files: llama-cpp-opencl_0.0.0-r854-fff0e0e-0_aarch64.deb.zip
On my phone it does not work, I guess:
$ LD_LIBRARY_PATH="/system/vendor/lib64" clinfo -l
Platform #0: QUALCOMM Snapdragon(TM)
`-- Device #0: QUALCOMM Adreno(TM)
$ LD_LIBRARY_PATH="/system/vendor/lib64" llama -i -ins --color -t $(nproc) --prompt-cache $PREFIX/tmp/prompt-cache -c 2048 --numa -m ~/ggml-model-q4_0.bin -ngl 1
main: build = 854 (fff0e0e)
main: seed = 1690178858
ggml_opencl: selecting platform: 'QUALCOMM Snapdragon(TM)'
ggml_opencl: selecting device: 'QUALCOMM Adreno(TM)'
ggml_opencl: device FP16 support: true
llama.cpp: loading model from /data/data/com.termux/files/home/ggml-model-q4_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 49954
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: freq_base = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.08 MB
llama_model_load_internal: using OpenCL for GPU acceleration
llama_model_load_internal: mem required = 5258.03 MB (+ 1026.00 MB per state)
llama_model_load_internal: offloading 1 repeating layers to GPU
llama_model_load_internal: offloaded 1/33 layers to GPU
llama_model_load_internal: total VRAM used: 109 MB
llama_new_context_with_model: kv self size = 1024.00 MB
system_info: n_threads = 8 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |
main: attempting to load saved session from '/data/data/com.termux/files/usr/tmp/prompt-cache'
main: session file does not exist, will create
main: interactive mode on.
Reverse prompt: '### Instruction:
'
sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 2048, n_batch = 512, n_predict = -1, n_keep = 2
== Running in interactive mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to LLaMa.
- To return control without starting a new line, end your input with '/'.
- If you want to submit another line, end your input with '\'.
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -54
GGML_ASSERT: /data/data/com.termux/files/home/.termux-build/llama-cpp-opencl/src/ggml-opencl.cpp:1747: false
zsh: abort LD_LIBRARY_PATH="/system/vendor/lib64" llama -i -ins --color -t $(nproc) -c
This bug is similar to ggml-org/llama.cpp#2341 (OpenCL error -54 is CL_INVALID_WORK_GROUP_SIZE).
I have reported it here.
Hi @Freed-Wu, can you test whether this package works fine when you have time? Thanks!
I figured this must be merged before it's available in Termux. Is there some simple way to try this?
You can download the package from the CI run; the CI artifacts are packed into a zip archive.
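A rough sketch of trying one of those artifacts, assuming you have already downloaded the zip from the workflow run (the file name is the one attached earlier in this thread):

# Unpack the CI artifact and install the bundled deb inside Termux
unzip llama-cpp-opencl_0.0.0-r854-fff0e0e-0_aarch64.deb.zip
apt install ./llama-cpp-opencl_0.0.0-r854-fff0e0e-0_aarch64.deb

dpkg -i followed by apt-get -f install should work as well if you prefer dpkg directly.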
It's functioning for sure. If it were up to me, I'd suggest changing the way … The package builds llama.cpp with … It appears … Thank you.
termux-docker x86_64
Please rebase this PR to the latest version.

new package: llama-cpp
Closes #17453
Closes #17468