ci: Setup self-hosted CI for Intel Linux Vulkan backend #20154
0cc4m merged 3 commits into ggml-org:master from
Conversation
Update: I was able to receive a token and confirm the CI instance was added.
Thanks, the workflow appears to run successfully: https://github.com/ggml-org/llama.cpp/actions/runs/22884579987/job/66394392012?pr=20154 @0cc4m @jeffbolznv FYI, this is a coopmat runner.
FYI, this specific workflow ran on another instance that was already added by the OpenVINO team (LNL). Vulkan workloads will work on both instances, but I haven't tested whether OpenVINO workloads work on the instance I added today (PTL).
Thanks for pointing that out. If the OpenVINO workflows don't run on your runner, we will gate them with an extra tag. |
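A minimal sketch of how such gating might look in the workflow's `runs-on` labels. This is an illustration only: the label names `vulkan` and `openvino`, the job names, and the build steps are assumptions, not the repository's actual configuration.

```yaml
# Hypothetical sketch: route jobs to specific self-hosted runners via labels.
# Label names ("vulkan", "openvino") are illustrative assumptions; the actual
# labels used by the llama.cpp runners may differ.
jobs:
  vulkan-test:
    # Lands on any self-hosted Linux runner advertising the "vulkan" label.
    runs-on: [self-hosted, Linux, vulkan]
    steps:
      - uses: actions/checkout@v4
      - run: |
          cmake -B build -DGGML_VULKAN=ON
          cmake --build build

  openvino-test:
    # Gated with an extra "openvino" label so it only runs on instances
    # known to support the OpenVINO workload.
    runs-on: [self-hosted, Linux, openvino]
    steps:
      - uses: actions/checkout@v4
      - run: echo "OpenVINO build steps would go here"
```

With this scheme, a runner that registers only the `vulkan` label never picks up the OpenVINO job, while an instance registering both labels can serve both workloads.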
* 'master' of github.com:ggml-org/llama.cpp: (33 commits)
  convert : better mtp check and fix return [no ci] (ggml-org#20419)
  vulkan: fix SSM_CONV PP scaling with large ubatch sizes (ggml-org#20379)
  New conversations now auto-select the first loaded model (ggml-org#20403)
  ggml-virtgpu: Fix some build commands (ggml-org#20341)
  metal : avoid divisions in bin kernel (ggml-org#20426)
  ci: Setup self-hosted CI for Intel Linux Vulkan backend (ggml-org#20154)
  vulkan: fix l2_norm epsilon handling (ggml-org#20350)
  vulkan: fix OOB check in flash_attn_mask_opt (ggml-org#20296)
  vulkan: Fix ErrorOutOfHostMemory on Intel GPU when loading large models with --no-mmap (ggml-org#20059)
  opencl: use larger workgroup size for get_rows (ggml-org#20316)
  opencl: add cumsum op (ggml-org#18981)
  hip: compile debug builds with -O2 on hip to avoid a compiler bug (ggml-org#20392)
  common/parser: add GigaChatV3/3.1 models support (ggml-org#19931)
  model : add support for Phi4ForCausalLMV (ggml-org#20168)
  graph : add optional scale parameter to build_lora_mm [no ci] (ggml-org#20427)
  common : fix --n-cpu-moe, --cpu-moe for models with fused gate + up (ggml-org#20416)
  ggml-webgpu: Add supports for `GGML_OP_REPEAT` (ggml-org#20230)
  llama : enable chunked fused GDN path (ggml-org#20340)
  llama : whitespace cleanup (ggml-org#20422)
  ggml : add NVFP4 quantization type support (ggml-org#19769)
  ...
We would like to add a single self-hosted CI runner for the Intel Linux Vulkan backend, related to #19213.
This will focus on the Vulkan backend only for the moment, but once the OpenVINO backend is merged we will use it for both backends.
An example workflow execution on the branch can be viewed at the following: