GPU CI #138

@quasiben

We've been chatting with folks from the ops teams within RAPIDS about getting access to the gpuCI infrastructure. gpuCI is the GPU-based CI platform used for testing throughout the RAPIDS ecosystem. We've been asking for access for a couple of reasons:

  1. We currently test only the GPU portions of Distributed, and that testing occurs out-of-band. That is, we test the GPU and UCX bits of Distributed in ucx-py and dask-cuda. This is better than no testing, but it covers only Distributed and runs only when developers push changes to dask-cuda/ucx-py.
  2. The lack of GPU testing infrastructure for Dask has resulted in breakages and can do so again. Additionally, without GPU CI, developers are unaware that something is broken until a user raises an issue. This occurred somewhat recently within Dask: Failing CuPy tests dask#7324 and Add numpy functions tri, triu_indices, triu_indices_from, tril_indices, tril_indices_from dask#6997. The issues are currently being fixed, but we'd like to improve this cycle moving forward.

Gaining access to gpuCI resolves both of these problems and will allow us to test incoming PRs to Dask, ensuring that GPU support is maintained without breakages or undue burden on maintainers.

While talking with the ops folks, we've suggested that the testing matrix be a single row:

  • latest OS (ubuntu)
  • latest cudatoolkit (11.0/11.2)
  • latest stable CuPy in RAPIDS
  • latest stable cuDF in RAPIDS
  • latest NumPy (need NEP-35)
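
As a rough illustration of how tests could be gated on this matrix, here is a minimal sketch that skips GPU tests when CuPy and cuDF are not installed, so the same test file stays harmless on CPU-only CI. The helper `has_module`, the flag `HAS_GPU_STACK`, and the test class are hypothetical names, not anything that exists in Dask or gpuCI today. The check only detects that the packages are installed; it does not confirm a working GPU.

```python
import importlib.util
import unittest


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


# Hypothetical flag: True only when both GPU libraries from the matrix are installed.
HAS_GPU_STACK = has_module("cupy") and has_module("cudf")


@unittest.skipUnless(HAS_GPU_STACK, "requires cupy and cudf")
class TestGpuStack(unittest.TestCase):
    def test_cupy_sum(self):
        # Imported lazily so CPU-only environments never try to load CuPy.
        import cupy

        self.assertEqual(int(cupy.arange(4).sum()), 6)
```

On a machine without the GPU libraries the test class is reported as skipped rather than failed.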

This service will start off as something maintainers can ping if they think a PR might need GPU testing. This might include changes to array/dataframe functions or new functionality. While this is not the ideal solution, it is a step towards better GPU testing for Dask without much effort on the part of the maintainers.

For this to work, a bot from gpuCI, gputester, will need to have at least “triage” rights to monitor comments and respond with pass/fail notifications on the PR in question.
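
To make the comment-driven flow concrete, here is a minimal sketch of the decision such a bot might make. This is not gpuCI's actual logic: the trigger phrase, the function names, and the choice of trusted roles are all assumptions for illustration. The role strings follow the `author_association` values that GitHub attaches to comments.

```python
# Hypothetical trigger phrase; the real bot may use a different command.
TRIGGER_PHRASE = "run gpu tests"

# Assumed set of trusted roles, using GitHub's author_association values.
ALLOWED_ROLES = {"OWNER", "MEMBER", "COLLABORATOR"}


def should_run_gpu_tests(comment_body: str, author_association: str) -> bool:
    """Return True if a PR comment from a trusted user asks for a GPU run."""
    requested = TRIGGER_PHRASE in comment_body.lower()
    trusted = author_association.upper() in ALLOWED_ROLES
    return requested and trusted


def format_result(passed: bool) -> str:
    """Build the pass/fail message the bot would post back to the PR."""
    return "GPU tests passed" if passed else "GPU tests failed"
```

Restricting the trigger to trusted roles keeps arbitrary commenters from consuming GPU resources, which matches the idea of maintainers pinging the service on demand.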

cc @pentschev @jrbourbeau
