Eval bug: Tool crashes when built with SYCL #21747

@MrDrMcCoy

Name and Version

llama-cli --version crashes, see below. Built from git:

git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean
root@a30e40de2e02:/opt/llama/build/bin# git log
commit 3fc65063d9c356510b86fc2f15ca8aea711bfc47 (grafted, HEAD -> master, origin/master, origin/HEAD)

Operating systems

Linux

GGML backends

SYCL

Hardware

I have 2x Intel B70 (32GB).

Models

None. llama-cli crashes when asked for its version or the list of available devices, so I can't get far enough to load a model.

Problem description & steps to reproduce

I am building a container to serve my 2x Intel B70 cards over RPC so that I can spread the load across them and my other ROCm hosts. To this end, I simplified the SYCL Dockerfile and added RPC support to the llama.cpp build. The build succeeds and sycl-ls shows both devices, but llama-cli crashes during initialization.

Dockerfile
ARG ONEAPI_VERSION=2025.3.3-0-devel-ubuntu24.04

FROM docker.io/intel/deep-learning-essentials:${ONEAPI_VERSION} as base

ARG intel_arch=bmg_g21
ARG IGC_VERSION=v2.30.1
ARG IGC_VERSION_FULL=2_2.30.1+20950
ARG COMPUTE_RUNTIME_VERSION=26.09.37435.1
ARG COMPUTE_RUNTIME_VERSION_FULL=26.09.37435.1-0
ARG IGDGMM_VERSION=22.9.0
RUN --mount=type=cache,destination=/tmp/neo \
  cd /tmp/neo && wget -c \
  https://github.com/intel/intel-graphics-compiler/releases/download/${IGC_VERSION}/intel-igc-core-${IGC_VERSION_FULL}_amd64.deb \
  https://github.com/intel/intel-graphics-compiler/releases/download/${IGC_VERSION}/intel-igc-opencl-${IGC_VERSION_FULL}_amd64.deb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/intel-ocloc-dbgsym_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.ddeb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/intel-ocloc_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.deb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/intel-opencl-icd-dbgsym_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.ddeb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/intel-opencl-icd_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.deb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/libigdgmm12_${IGDGMM_VERSION}_amd64.deb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/libze-intel-gpu1-dbgsym_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.ddeb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/libze-intel-gpu1_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.deb \
  && dpkg --install *.deb

FROM base as build

RUN --mount=type=cache,destination=/var/lib/apt \
  --mount=type=cache,destination=/var/cache/apt \
  apt-get update \
  && apt-get dist-upgrade -y \
  && apt-get install -y \
    ccache \
    git \
    libgomp1 \
    libssl-dev \
    ninja-build

ARG CCACHE_DIR=/var/cache/ccache
ARG CFLAGS="${CFLAGS} -O3"
ARG CXXFLAGS="${CFLAGS} -O3"

ARG rebuild=''
ARG branch=master
RUN git clone --depth=1 --recurse-submodules --branch=${branch:-master} \
  https://github.com/ggml-org/llama.cpp /opt/llama
WORKDIR /opt/llama

RUN --mount=type=cache,destination=${CCACHE_DIR} \
  bash -c "source /opt/intel/oneapi/setvars.sh --force && \
    cmake -B build -G Ninja \
      -DCMAKE_BUILD_TYPE=Release \
      -DGGML_RPC=ON \
      -DGGML_SYCL=ON \
      -DGGML_SYCL_DEVICE_ARCH=${intel_arch} \
      -DCMAKE_C_COMPILER=icx \
      -DCMAKE_CXX_COMPILER=icpx \
    && cmake --build build --target rpc-server -j$(nproc) \
    && mkdir -vp /app \
    && cp -vrL build/bin/* /app/"

FROM base as app

COPY --from=build /app /app

WORKDIR /app
VOLUME /var/cache/llama
ENV ZES_ENABLE_SYSMAN=1
ENV UR_L0_ENABLE_RELAXED_ALLOCATION_LIMITS=1
ENV GGML_RPC_DEBUG=1
ENV LLAMA_CACHE=/var/cache/llama
ENV ONEAPI_DEVICE_SELECTOR="level_zero:0"

ENTRYPOINT ["/app/rpc-server"]
CMD ["--host", "0.0.0.0", "--cache"]
EXPOSE 50052

First Bad Commit

Unknown.

Relevant log output

Logs

sycl-ls output:

INFO: Output filtered by ONEAPI_DEVICE_SELECTOR environment variable, which is set to level_zero:*.
To see device ids, use the --ignore-device-selectors CLI option.

[level_zero:gpu] Intel(R) oneAPI Unified Runtime over Level-Zero V2, Intel(R) Graphics [0xe223] 20.2.0 [1.14.37435+1]
[level_zero:gpu] Intel(R) oneAPI Unified Runtime over Level-Zero V2, Intel(R) Graphics [0xe223] 20.2.0 [1.14.37435+1]

llama-cli --list-devices output:

/opt/llama/build/bin/libggml-base.so.0(+0x15ae8)[0x7fc30ac16ae8]
/opt/llama/build/bin/libggml-base.so.0(ggml_print_backtrace+0x285)[0x7fc30ac16ac5]
/opt/llama/build/bin/libggml-base.so.0(+0x2eee6)[0x7fc30ac2fee6]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xbb0da)[0x7fc30aa3e0da]
/lib/x86_64-linux-gnu/libstdc++.so.6(_ZSt10unexpectedv+0x0)[0x7fc30aa28a55]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xbb391)[0x7fc30aa3e391]
/opt/intel/oneapi/compiler/2025.3/lib/libsycl.so.8(+0x102557)[0x7fc302887557]
/opt/intel/oneapi/compiler/2025.3/lib/libsycl.so.8(+0x1f0e8b)[0x7fc302975e8b]
/opt/intel/oneapi/compiler/2025.3/lib/libsycl.so.8(+0x32f783)[0x7fc302ab4783]
/opt/intel/oneapi/compiler/2025.3/lib/libsycl.so.8(_ZN4sycl3_V17contextC2ERKSt6vectorINS0_6deviceESaIS3_EESt8functionIFvNS0_14exception_listEEERKNS0_13property_listE+0x5df)[0x7fc302ab37cf]
/opt/intel/oneapi/compiler/2025.3/lib/libsycl.so.8(_ZN4sycl3_V17contextC2ERKSt6vectorINS0_6deviceESaIS3_EERKNS0_13property_listE+0x44)[0x7fc302ab3164]
/opt/intel/oneapi/compiler/2025.3/lib/libsycl.so.8(+0x26a508)[0x7fc3029ef508]
/opt/intel/oneapi/compiler/2025.3/lib/libsycl.so.8(+0x21b25d)[0x7fc3029a025d]
/opt/intel/oneapi/compiler/2025.3/lib/libsycl.so.8(+0x392b73)[0x7fc302b17b73]
/opt/intel/oneapi/compiler/2025.3/lib/libsycl.so.8(_ZN4sycl3_V15queueC1ERKNS0_6deviceERKSt8functionIFvNS0_14exception_listEEERKNS0_13property_listE+0x3d)[0x7fc302b130dd]
/opt/llama/build/bin/libggml-sycl.so.0(_ZN4sycl3_V15queueC2ERKNS0_13property_listE+0x8b)[0x7fc30ae62d5b]
/opt/llama/build/bin/libggml-sycl.so.0(_ZSt11make_sharedIN4dpct10device_extEJRN4sycl3_V16deviceEEESt10shared_ptrINSt9enable_ifIXntsr8is_arrayIT_EE5valueES8_E4typeEEDpOT0_+0x98)[0x7fc30ae61208]
/opt/llama/build/bin/libggml-sycl.so.0(_ZN4dpct7dev_mgrC2Ev+0x10d)[0x7fc30ae5f08d]
/opt/llama/build/bin/libggml-sycl.so.0(+0x12b560)[0x7fc30ae2c560]
/opt/llama/build/bin/libggml-sycl.so.0(ggml_backend_sycl_reg+0x543)[0x7fc30ae2ef53]
/opt/llama/build/bin/libggml.so.0(_ZN21ggml_backend_registryC2Ev+0x1d)[0x7fc30d82588d]
/opt/llama/build/bin/libggml.so.0(+0x5ae2)[0x7fc30d823ae2]
/opt/llama/build/bin/libggml.so.0(ggml_backend_load_all_from_path+0x6d)[0x7fc30d82228d]
llama-cli[0x52baa4]
llama-cli[0x543795]
llama-cli[0x42df78]
/lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x7fc30a6821ca]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x7fc30a68228b]
llama-cli[0x42de45]
terminate called after throwing an instance of 'sycl::_V1::exception'
  what():  level_zero backend failed with error: 2147483646 (UR_RESULT_ERROR_UNKNOWN)
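
The mangled frames in the backtrace above are easier to follow once demangled. The following is a minimal sketch, assuming c++filt (from GNU binutils, not part of llama.cpp or oneAPI) is available on PATH; the helper name demangle_frames is hypothetical, and the symbols are copied verbatim from the log:

```shell
#!/bin/sh
# Demangle a few C++ symbols copied verbatim from the backtrace.
# Assumption: c++filt (GNU binutils) is installed; demangle_frames is a
# hypothetical helper name used only for this illustration.
demangle_frames() {
  c++filt <<'EOF'
_ZN21ggml_backend_registryC2Ev
_ZN4dpct7dev_mgrC2Ev
_ZN4sycl3_V15queueC2ERKNS0_13property_listE
_ZN4sycl3_V17contextC2ERKSt6vectorINS0_6deviceESaIS3_EERKNS0_13property_listE
EOF
}

if command -v c++filt >/dev/null 2>&1; then
  demangle_frames
else
  echo "c++filt not found; install binutils to demangle the frames" >&2
fi
```

Read from the outermost frame inward, the demangled names suggest the chain ggml_backend_registry constructor, ggml_backend_sycl_reg, dpct::dev_mgr constructor, sycl::queue constructor, sycl::context constructor, with the exception surfacing while the SYCL context is being created for a device. The symbol names printed by the backtrace are nearest-symbol matches, so the exact function at each address may differ.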
