
ci : run ui publish on ubuntu-slim #23818

Merged: ggerganov merged 2 commits into master from cisc/ci-ui-publish-self-hosted-fast on May 28, 2026

CISC commented May 28, 2026

Member

Overview

Let's try not to stall Release. :)

Requirements

CISC requested a review from a team as a code owner May 28, 2026 13:17
@ggerganov

Member

not sure which runners are under these tags?

Can you access this page: https://github.com/organizations/ggml-org/settings/actions/runners

github-actions bot added the devops label (improvements to build systems and github actions) May 28, 2026

CISC commented May 28, 2026

Member Author

> not sure which runners are under these tags?
>
> Can you access this page: https://github.com/organizations/ggml-org/settings/actions/runners

Nope.

@ggerganov

Member

I'll think about this change - the jobs that produce any user artifacts are security sensitive and likely should not run on self-hosted or 3rd-party-hosted runners without the proper disclaimers.

Plus I am not sure yet that the change will help overall. I was also watching how the release was stalled for half an hour because of this tiny last job, but the fact that it was not being picked up means that the runners are plenty busy with other stuff already. If the job was unblocked, it would just add more jobs to the pool which is already too big.

For the development process, I think the PR jobs are higher priority compared to the release jobs because the maintainers would wait less when they work on something. The releases are a bit "secondary" with the new concurrency model of the CI.

@ggerganov

Member

However, this is problematic: https://github.com/ggml-org/llama.cpp/actions/runs/26560048666/job/78302518452#step:4:50

I think the cache for CUDA 13.3 got evicted. The Vulkan ccaches seem excessive:

[image]

@ggerganov

Member

I think we are good now. 15 releases incoming in quick succession 😄

[image]

CISC commented May 28, 2026

Member Author

> I think we are good now. 15 releases incoming in quick succession 😄

Alright!

A possible alternative in the current PR is moving to ubuntu-slim.

@ggerganov

Member

Yes, just make sure the slim can handle it.

CISC commented May 28, 2026

Member Author

> Yes, just make sure the slim can handle it.

Should do, the job just takes a few seconds, and nothing special in it.

CISC changed the title from "ci : run ui publish on self-hosted fast" to "ci : run ui publish on ubuntu-slim" May 28, 2026
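
For readers unfamiliar with the mechanics, the retitled change amounts to switching the `runs-on` value of the UI publish job. The fragment below is only a rough sketch of that kind of edit, not the actual llama.cpp workflow: the workflow name, job name, trigger, steps, and the commented-out self-hosted labels are all hypothetical placeholders. Only the `ubuntu-slim` label itself comes from this PR.

```yaml
# Hypothetical sketch; names, trigger, and steps are placeholders and do not
# reproduce the real workflow file touched by this PR.
name: ui-publish-sketch

on:
  workflow_dispatch:

jobs:
  ui-publish:
    # First commit of the PR targeted self-hosted runners, roughly like:
    # runs-on: [self-hosted, fast]   # assumed labels, inferred from the old title
    # Second commit moved the job to a lightweight GitHub-hosted runner:
    runs-on: ubuntu-slim
    steps:
      - uses: actions/checkout@v4
      - name: Publish UI (placeholder step)
        run: echo "publish step goes here"
```

Keeping the job on a GitHub-hosted label is in line with the concern raised above about producing user artifacts on self-hosted runners, while a small runner suits a job that takes only a few seconds.
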
ggerganov merged commit 3ef2369 into master May 28, 2026
3 checks passed
ggerganov deleted the cisc/ci-ui-publish-self-hosted-fast branch May 28, 2026 17:58

CISC commented May 28, 2026

Member Author

> > Yes, just make sure the slim can handle it.
>
> Should do, the job just takes a few seconds, and nothing special in it.

Successful test: https://github.com/CISC/llama.cpp/actions/runs/26592454747/job/78354538399

gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request May 28, 2026
* origin/master: (32 commits)
hexagon: basic/generic op fusion support and RMS_NORM+MUL fusion (ggml-org#23835)
mtmd-debug: add color and rainbow mode (ggml-org#23829)
mtmd: fix gemma 4 projector pre_norm (ggml-org#23822)
opencl: move backend info printing into its own function (ggml-org#23702)
ci : run ui publish on ubuntu-slim (ggml-org#23818)
ui: fix audio and video modality detection (ggml-org#23756)
ci : releases use Github-hosted builds for the UI (ggml-org#23823)
app : improve help output (ggml-org#23805)
mtmd: n_head_kv defaults to n_head (ggml-org#23782)
mtmd: fix gemma 4 audio rms norm eps (ggml-org#23815)
ci : change Vulkan builds to Release to reduce ccache (ggml-org#23820)
arg: Add LLAMA_ARG_API_KEY_FILE environment variable for --api-key-file (ggml-org#23167)
test-llama-archs: fix table format [no release] (ggml-org#23810)
ggml: auto apply iGPU flag CUDA/HIP if integrated device (ggml-org#23007)
mmvq Optim: add MMVQ_PARAMETERS_TURING(mmvq_parameter_table_id) for … (ggml-org#23729)
CUDA: route batch>=4 quantized matmul to MMQ on AMD MFMA hardware (ggml-org#23227)
server: minor tweaks to use more cpp features (ggml-org#23785)
hexagon: minor refresh for HMX FA and MM (ggml-org#23796)
vulkan: fast path for walsh-hadamard transform (ggml-org#23687)
chat : add Granite 4.1 chat template (ggml-org#23518)
...
fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026
* run ui publish on self-hosted fast

* run on ubuntu-slim
turbo-tan pushed a commit to turbo-tan/llama.cpp-tq3 that referenced this pull request Jun 2, 2026
* run ui publish on self-hosted fast

* run on ubuntu-slim