ci : add sanitizer runs for server #19291

Merged
ggerganov merged 1 commit into master from gg/ci-server-sanitize
Feb 3, 2026

@ggerganov
Member

Re-enable the server sanitizer builds and runs. The thread sanitizer is quite slow, so it remains disabled for now.

https://github.com/ggerganov/tmp2/actions/runs/21629674042

@ggerganov ggerganov requested a review from CISC as a code owner February 3, 2026 12:51
@github-actions github-actions bot added the build (Compilation issues) and devops (improvements to build systems and github actions) labels Feb 3, 2026
@ggerganov ggerganov merged commit 6a9bf2f into master Feb 3, 2026
73 of 78 checks passed
@ggerganov ggerganov deleted the gg/ci-server-sanitize branch February 3, 2026 20:41
@CISC
Member

CISC commented Feb 3, 2026

Oh, interesting:
https://github.com/ggml-org/llama.cpp/actions/runs/21630912707/job/62343121265

2026-02-03T17:10:51.0689076Z The following tests FAILED:
2026-02-03T17:10:51.0689668Z 	 32 - test-thread-safety (Failed)                       main
2026-02-03T17:10:51.0690260Z 	 41 - test-barrier (Failed)                             main
2026-02-03T17:10:51.0690796Z Errors while running CTest

Contributor

I got false-positive ASAN errors after this change (triggered by httplib.h). I'm wondering whether the sanitizer flag is still propagated correctly.

Member

Something strange is definitely going on.

@CISC
Member

CISC commented Feb 6, 2026

I wonder why these runners get randomly terminated. Searching for it seems to suggest that this is caused by either high CPU or high RAM usage, though I'm not sure why usage would be higher here than on other runners.

@ngxson
Contributor

ngxson commented Feb 7, 2026

If it's terminated due to high RAM, I think it will have a specific exit code (the OOM kill code, usually 137).

I don't think it's terminated due to high CPU, though. I suspect the compilation step can consistently use 100% CPU, but the same can't be said of the server tests. Since we stop and start the server instance between tests, in theory there will be gaps in CPU usage.
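
As a rough illustration of the exit-code check mentioned above: shells such as bash conventionally report a process killed by signal N with exit status 128 + N, so 137 corresponds to signal 9 (SIGKILL), which is what the Linux OOM killer sends. The sketch below decodes such a status. The helper name `describe_exit_code` is hypothetical and not part of llama.cpp or its CI, and a SIGKILL alone does not prove an out-of-memory kill, since other things can send the same signal.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a shell-style exit status (hypothetical helper for illustration)."""
    if code == 0:
        return "success"
    if code > 128:
        # Shell convention: status 128 + N means "terminated by signal N".
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number is not defined on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with error status {code}"


if __name__ == "__main__":
    # 137 = 128 + 9, i.e. terminated by signal 9.
    print(describe_exit_code(137))
```

If a terminated job reports a different status (or none at all, because the runner itself went away), this kind of check would not apply and the cause would have to be found elsewhere.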

@CISC
Member

CISC commented Feb 7, 2026

> I don't think it's terminated due to high CPU, though. I suspect the compilation step can consistently use 100% CPU, but the same can't be said of the server tests. Since we stop and start the server instance between tests, in theory there will be gaps in CPU usage.

It happens during the compilation step, so #19411 is most likely the correct solution.

liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026
Labels

build (Compilation issues), devops (improvements to build systems and github actions)
