Conversation
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
This reverts commit 411adba.
This reverts commit 1d56301.
it doesn't make sense tbh
@taronaeo Will this also fix the Docker build?
Ooh, looks like it broke for some reason. But yes, this PR + the previous PR (#16664) should fix this. Let me double-check tomorrow when I have more time :)
* drop vxe feature
* add nnpa feature
@rishiraj20 Can you help test this PR on AQLINUX1 and 2?
$ cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DGGML_NATIVE=OFF -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON
$ cmake --build build --config Release -t llama-cli -j$(nproc)
$ ls -la build/bin | grep libggml-cpu
-rwxr-xr-x. 1 root root 1167608 Nov 1 19:04 libggml-cpu-z15.so
-rwxr-xr-x. 1 root root 1167608 Nov 1 19:04 libggml-cpu-z16.so
$ build/bin/llama-cli -m /opt/hf_models/granite-3.3-2b-instruct-be.Q4_K_M.gguf -no-cnv --seed 42 -n 50 -p "Write me a dog walking business idea 1. " 2>&1 | less
Please paste the first few outputs from the top. It should print something like this at the top, and it should run the prompt to completion without problems.
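Before running the model, a quick check along these lines can confirm the all-variants build produced both expected CPU backends (library names taken from the `ls` listing above; the `BUILD_DIR` path is an assumption, adjust as needed):

```shell
#!/bin/sh
# Sketch: verify both s390x CPU-variant backends were built.
# Names come from the `ls` output above; adjust BUILD_DIR if needed.
BUILD_DIR=build/bin
for v in z15 z16; do
    if [ -f "$BUILD_DIR/libggml-cpu-$v.so" ]; then
        echo "found libggml-cpu-$v.so"
    else
        echo "missing libggml-cpu-$v.so" >&2
    fi
done
```

If either library is missing, the cmake configure step above likely ran without `-DGGML_CPU_ALL_VARIANTS=ON`.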
None of the CI failures look related to this PR. Merging in a few hours unless they turn out to be related.
Indeed unrelated, go ahead.
ref: #16664 (comment)
This PR introduces CPU feature detection for the s390x platform and allows for dynamic backend loading when compiled with
-DGGML_NATIVE=OFF -DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON.
Tested release.yml and it seems to be working as intended as well: https://github.com/ggml-org/llama.cpp/actions/runs/18814223900/job/53680143680.
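For context on what the detection looks at: on s390x, the kernel exposes the relevant capabilities (e.g. `vx`, `vxe`, `nnpa`) as tokens on the `features` line of /proc/cpuinfo. The sketch below is a hypothetical illustration of that kind of check, not the PR's actual implementation (which does the detection in C inside ggml); the `has_feat` helper name is made up:

```shell
#!/bin/sh
# Hypothetical sketch: read s390x CPU feature tokens from /proc/cpuinfo.
# On non-s390x machines the "features" line is absent, so feats stays empty
# and every check reports "no".
feats=$(grep -m1 '^features' /proc/cpuinfo 2>/dev/null | cut -d: -f2)

# Return success if the space-separated token list contains "$1".
has_feat() {
    case " $feats " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

if has_feat vxe;  then echo "vxe: yes";  else echo "vxe: no";  fi
if has_feat nnpa; then echo "nnpa: yes"; else echo "nnpa: no"; fi
```

A check like this maps naturally onto the variant libraries above: `vxe` gates the z15 variant and `nnpa` the z16 one, per the commit messages in this PR.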