Deploying Anyscale on a Managed Service for Kubernetes® cluster

Anyscale is a platform that helps scale AI workloads. Built on Ray, it adds observability, data governance, developer tools and optimization. You can run Anyscale in Nebius AI Cloud, deploying it on a Managed Service for Kubernetes cluster. To store application data and artifacts, you can set up a Network File System (NFS) server on a Compute virtual machine (VM), and an Object Storage bucket.

Costs

This tutorial includes the following chargeable resources:

NFS server:
- Compute VM (default: Non-GPU AMD EPYC Genoa, 4vcpu-16gb)
- Compute disk (default: Network SSD IO M3, 93 GiB)
Artifacts storage: Object Storage bucket
Anyscale deployment:
- Compute VMs as Managed Kubernetes nodes (default: one NVIDIA® H100 NVLink with Intel Sapphire Rapids, 1gpu-16vcpu-200gb VM; one Non-GPU AMD EPYC Genoa, 4vcpu-16gb VM)
- Compute disks for the VMs (default: two Network SSD disks, 1023 GiB and 128 GiB)

Prerequisites

Create an Anyscale account.
Install and configure the following tools:
- Nebius AI Cloud CLI
- Python and Anyscale CLI
- Terraform
- kubectl and Helm

Steps

Prepare the environment

Clone the GitHub repository and go to the anyscale directory:

git clone https://github.com/nebius/nebius-solutions-library.git
cd nebius-solutions-library/anyscale

Edit the environment.sh file to add the IDs of your tenant, project and the region of the project to the environment variables at the top of the file.
Execute environment.sh to export environment variables from it, so they persist for the current shell session:
```
source ./environment.sh
```
Make a copy of the configuration file:
```
cp default.yaml.tpl default.yaml
```
Create an SSH key for the NFS server and Anyscale nodes:
```
ssh-keygen -t ed25519
```

Create storage for Anyscale

The Terraform configuration in the prepare directory creates an Object Storage bucket to store workload artifacts and an NFS server to store Anyscale workspace data (user code, configuration files, etc.).

Edit default.yaml:
- In .ssh_public_key, paste the contents of the public SSH key (~/.ssh/id_ed25519.pub).
- In .nfs_server.nfs_size, set the size of the disk in GiB. It is a Network SSD IO M3 disk, its size must be a multiple of 93 GiB (e.g. 1023 GiB).

Apply the Terraform configuration from the prepare directory:

terraform -chdir=prepare init
terraform -chdir=prepare apply

Deploy Anyscale

The Terraform configuration in the deploy directory creates a Managed Kubernetes cluster and node groups, and deploys the Anyscale application on the cluster.

Register the Anyscale cloud:
```
./register.sh <cloud_name>
```
Replace <cloud_name> with the name for your Anyscale cloud deployment that will be shown in Anyscale console. The output contains a cloud deployment ID that starts with cldrsrc_. Save it for the next step.
Create an Anyscale API key in Anyscale console.
Edit default.yaml:
- In .anyscale.cloud_deployment_id, paste the cloud deployment ID that starts with cldrsrc_ which you obtained from the previous step.
- In .anyscale.anyscale_cli_token, paste the Anyscale API key.
- In k8s_cluster, configure the cluster. It should have at least one GPU node group with one node, and at least one non-GPU node. The default configuration creates an NVIDIA H100 node group with one single-GPU node and a non-GPU node group with one node. For details about the parameters, see the following articles and resources:
  - {cpu,gpu}_nodes_{platform,preset}: Types of virtual machines and GPUs in Nebius AI Cloud
  - enable_gpu_cluster, infiniband_fabric: Interconnecting GPUs in Managed Service for Kubernetes® clusters using InfiniBand™
    Single-GPU nodes do not support GPU clusters and InfiniBand™ interconnect. If your preset for GPU nodes is single-GPU, set enable_gpu_cluster to false and infiniband_fabric to an empty string.
  - gpu_nodes_driverfull_image: GPU drivers and other components
  - enable_{prometheus,loki}: section about Kubernetes observability in the Nebius AI Cloud solution library

Apply the Terraform configuration from the deploy directory:

terraform -chdir=deploy init
terraform -chdir=deploy apply

(Optional) Configure Anyscale

You can configure how Anyscale head node and worker nodes are selected. To do this, log in to Anyscale console, go to your workspace and then follow instructions in the next sections.

Force a non-GPU head node

Anyscale head node is a Kubernetes Pod that does not use GPUs. However, by default, it can be scheduled on a GPU node or a non-GPU node. You can force the head node to run on non-GPU nodes to save you the costs of provisioning GPU nodes. To do that, perform the following steps in your workspace in Anyscale console:

On the Compute resources panel, under Head node, click .
In the window that opens, expand Advanced config.
Under Instance config, paste the node selector specification:
```
{
  "spec": {
    "nodeSelector": {
      "node.kubernetes.io/instance-type": "cpu-d3"
    }
  }
}
```
The value of the node.kubernetes.io/instance-type annotation must match the platform specified in the .k8s_cluster.cpu_nodes_platform field of the anyscale/default.yaml file.
Click Save.

Configure the selection of worker nodes

Anyscale allows automatic and manual modes of selecting worker nodes for your workspaces. To choose between these modes in Anyscale console, on the Compute resources panel, under Worker nodes, select or clear the Auto-select worker nodes checkbox:

When the checkbox is selected, Anyscale tries to provision worker nodes automatically. This works well when you run Anyscale and other workloads at the same time in the Managed Kubernetes cluster, because Anyscale workloads only reserve as many GPUs as they require. However, since the number of worker nodes is scaled on demand, provisioning new worker nodes can take some time.
When the checkbox is not selected, you select worker nodes manually. This is recommended for workloads that need to scale up fast, or if you need granular control over GPU usage.

Test the deployment

For details on testing the Anyscale deployment, see Anyscale resources:

How to delete the created resources

Some of the created resources are chargeable. If you do not need them, delete these resources so Nebius AI Cloud does not charge for them:

In Anyscale console, delete all the workloads that use the deployment.
Delete all objects from the Anyscale bucket. Its name starts with anyscale-.
Delete the Managed Kubernetes cluster, NFS server and bucket by running the following commands in the anyscale directory of the cloned repository:
```
terraform -chdir=deploy destroy
terraform -chdir=prepare destroy
```

Serverless AI

Managed Service for MLflow

Applications in Nebius AI Cloud

Tutorials

Third-party integrations

Deploying Anyscale on a Managed Service for Kubernetes® cluster

Costs

Prerequisites

Steps

Prepare the environment

Create storage for Anyscale

Deploy Anyscale

(Optional) Configure Anyscale

Force a non-GPU head node

Configure the selection of worker nodes

Test the deployment

How to delete the created resources

Serverless AI

Managed Service for MLflow

Applications in Nebius AI Cloud

Tutorials

Third-party integrations

​Costs

​Prerequisites

​Steps

​Prepare the environment

​Create storage for Anyscale

​Deploy Anyscale

​(Optional) Configure Anyscale

​Force a non-GPU head node

​Configure the selection of worker nodes

​Test the deployment

​How to delete the created resources

Costs

Prerequisites

Steps

Prepare the environment

Create storage for Anyscale

Deploy Anyscale

(Optional) Configure Anyscale

Force a non-GPU head node

Configure the selection of worker nodes

Test the deployment

How to delete the created resources