
ci: Setup self-hosted CI for Intel Linux Vulkan backend#20154

Merged
0cc4m merged 3 commits into ggml-org:master from rillomas:setup-ci-for-intel-linux
Mar 12, 2026

Conversation

@rillomas (Contributor) commented Mar 6, 2026

We would like to add a single self-hosted CI runner for the Intel Linux Vulkan backend, related to #19213.
It will be focused on the Vulkan backend only for now, but once the OpenVINO backend is merged we will use it for both backends.
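For context, a self-hosted runner is targeted from a workflow via `runs-on` labels. A minimal sketch of such a job follows; the job name and the `intel`/`vulkan` labels are illustrative assumptions, not the actual runner configuration used here:

```yaml
# Hypothetical job definition; the label names below are assumptions,
# not the labels configured on the actual ggml-org runner.
jobs:
  ggml-ci-intel-linux-vulkan:
    runs-on: [self-hosted, Linux, intel, vulkan]
    steps:
      - uses: actions/checkout@v4
      - name: Build with the Vulkan backend
        run: |
          cmake -B build -DGGML_VULKAN=ON
          cmake --build build --config Release
      - name: Run backend ops tests
        run: ./build/bin/test-backend-ops
```

A self-hosted runner only picks up jobs whose `runs-on` labels are all present on that runner, which is what makes per-backend routing possible later.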

An example workflow execution on the branch can be viewed at the following:

@github-actions github-actions bot added the devops improvements to build systems and github actions label Mar 6, 2026
@rillomas rillomas changed the title from "ci: Setup self-hosted ci for Intel linux Vulkan backend" to "ci: Setup self-hosted CI for Intel Linux Vulkan backend" Mar 6, 2026
@rillomas rillomas marked this pull request as ready for review March 10, 2026 03:07
@rillomas rillomas requested a review from CISC as a code owner March 10, 2026 03:07
@rillomas (Contributor, Author) commented Mar 10, 2026

Hi @ggerganov, we are ready to add an instance for the Intel Vulkan backend. Could you please send me a runner token? I am located in Japan, so hopefully your time zone is close to mine (runner tokens only last for an hour).

Update: I received a token and confirmed the CI instance was added.

@rillomas rillomas marked this pull request as draft March 10, 2026 04:41
@rillomas rillomas marked this pull request as ready for review March 10, 2026 07:08
@ggerganov (Member) commented

Thanks, the workflow appears to run successfully: https://github.com/ggml-org/llama.cpp/actions/runs/22884579987/job/66394392012?pr=20154

@0cc4m @jeffbolznv FYI, this is a coopmat runner.

@ggerganov ggerganov requested a review from 0cc4m March 10, 2026 07:21
@rillomas (Contributor, Author) commented

Thanks, the workflow appears to run successfully: https://github.com/ggml-org/llama.cpp/actions/runs/22884579987/job/66394392012?pr=20154

FYI, this specific workflow ran on another instance that was already added by the OpenVINO team (LNL). The Vulkan workload will run on both instances, but I haven't tested whether the OpenVINO workload works on the instance I added today (PTL).

@ggerganov (Member) commented

Thanks for pointing that out. If the OpenVINO workflows don't run on your runner, we will gate them with an extra tag.
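Gating by runner label, as suggested above, could look roughly like the sketch below; the `openvino` label and job name are hypothetical examples, not the tags actually chosen:

```yaml
# Hypothetical: an extra "openvino" label keeps this job off runners
# (such as the PTL instance) that only advertise Vulkan support.
jobs:
  ggml-ci-intel-linux-openvino:
    runs-on: [self-hosted, Linux, intel, openvino]
```

Because a job is only dispatched to runners carrying all of its `runs-on` labels, adding the extra label to the LNL machine alone would route OpenVINO jobs there while leaving Vulkan jobs free to run on either instance.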

@0cc4m 0cc4m merged commit 4cc6eb1 into ggml-org:master Mar 12, 2026
72 of 74 checks passed
tekintian added a commit to tekintian/llama.cpp that referenced this pull request Mar 12, 2026
* 'master' of github.com:ggml-org/llama.cpp: (33 commits)
  convert : better mtp check and fix return [no ci] (ggml-org#20419)
  vulkan: fix SSM_CONV PP scaling with large ubatch sizes (ggml-org#20379)
  New conversations now auto-select the first loaded model (ggml-org#20403)
  ggml-virtgpu: Fix some build commands (ggml-org#20341)
  metal : avoid divisions in bin kernel (ggml-org#20426)
  ci: Setup self-hosted CI for Intel Linux Vulkan backend (ggml-org#20154)
  vulkan: fix l2_norm epsilon handling (ggml-org#20350)
  vulkan: fix OOB check in flash_attn_mask_opt (ggml-org#20296)
  vulkan: Fix ErrorOutOfHostMemory on Intel GPU when loading large models with --no-mmap (ggml-org#20059)
  opencl: use larger workgroup size for get_rows (ggml-org#20316)
  opencl: add cumsum op (ggml-org#18981)
  hip: compile debug builds with -O2 on hip to avoid a compiler bug (ggml-org#20392)
  common/parser: add GigaChatV3/3.1 models support (ggml-org#19931)
  model : add support for Phi4ForCausalLMV (ggml-org#20168)
  graph : add optional scale parameter to build_lora_mm [no ci] (ggml-org#20427)
  common : fix --n-cpu-moe, --cpu-moe for models with fused gate + up (ggml-org#20416)
  ggml-webgpu: Add supports for `GGML_OP_REPEAT` (ggml-org#20230)
  llama : enable chunked fused GDN path (ggml-org#20340)
  llama : whitespace cleanup (ggml-org#20422)
  ggml : add NVFP4 quantization type support (ggml-org#19769)
  ...