
[autoscaler][kubernetes] Example Cluster Config Returns KeyError #13667

@jacobhjkim

What is the problem?

Python version: 3.8
Kubernetes version: client v1.18.2, server v1.18.9-eks-d1db3c

The ray-operator pod fails with a KeyError when deploying the example Ray cluster.

Traceback (most recent call last):
  File "/home/ray/anaconda3/bin/ray-operator", line 8, in <module>
    sys.exit(main())
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/operator/operator.py", line 123, in main
    cluster_config = operator_utils.cr_to_config(cluster_cr)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/operator/operator_utils.py", line 62, in cr_to_config
    config["available_node_types"] = get_node_types(cluster_resource)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/operator/operator_utils.py", line 76, in get_node_types
    pod_type_copy, dictionary=NODE_TYPE_FIELDS)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/operator/operator_utils.py", line 98, in translate
    return {dictionary[field]: configuration[field] for field in dictionary}
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/operator/operator_utils.py", line 98, in <dictcomp>
    return {dictionary[field]: configuration[field] for field in dictionary}
KeyError: 'minWorkers'
stream closed
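
The failure mode in the traceback can be reproduced in isolation. The following is a minimal sketch, not the operator's actual code: the contents of NODE_TYPE_FIELDS and the pod type dict are assumptions made for illustration, and only the dict comprehension is copied from the traceback.

```python
# Hypothetical stand-in for NODE_TYPE_FIELDS; the real mapping is not shown
# in this issue. Keys are CR field names, values are autoscaler config keys.
NODE_TYPE_FIELDS = {
    "minWorkers": "min_workers",
    "maxWorkers": "max_workers",
    "rayResources": "resources",
    "setupCommands": "worker_setup_commands",
}


def translate(configuration, dictionary):
    # Same expression as operator_utils.py line 98 in the traceback:
    # every key of `dictionary` must be present in `configuration`.
    return {dictionary[field]: configuration[field] for field in dictionary}


# Hypothetical pod type that omits the fields above.
pod_type = {"name": "worker-nodes"}

try:
    translate(pod_type, NODE_TYPE_FIELDS)
    missing_key = None
except KeyError as err:
    # The first mapped field that is absent from the pod type is reported.
    missing_key = err.args[0]

print(missing_key)
```

Because the comprehension indexes `configuration` directly, any field that is listed in the mapping but absent from the pod type raises KeyError.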

Reproduction (REQUIRED)

$ kubectl apply -f ray/python/ray/autoscaler/kubernetes/operator_configs/cluster_crd.yaml
$ kubectl create namespace ray
$ kubectl -n ray apply -f ray/python/ray/autoscaler/kubernetes/operator_configs/operator.yaml
$ kubectl -n ray apply -f ray/python/ray/autoscaler/kubernetes/operator_configs/example_cluster.yaml

After the last command, the Ray operator pod fails with the KeyError shown above.

As a workaround, add these four fields to example_cluster.yaml:

    minWorkers: 1
    maxWorkers: 1
    rayResources: {}
    setupCommands: []

With these fields present, the example Ray cluster deploys successfully on Kubernetes with the operator.
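
One possible way to make the translation step tolerate missing fields is to merge defaults before the lookup. This is only a sketch under stated assumptions, not the fix adopted by the Ray maintainers: `translate_with_defaults`, `FIELD_DEFAULTS`, and the contents of `NODE_TYPE_FIELDS` are hypothetical, and the default values are simply the ones used in the manual fix reported in this issue.

```python
import copy

# Hypothetical stand-in for NODE_TYPE_FIELDS; the real mapping is not shown
# in this issue.
NODE_TYPE_FIELDS = {
    "minWorkers": "min_workers",
    "maxWorkers": "max_workers",
    "rayResources": "resources",
    "setupCommands": "worker_setup_commands",
}

# Hypothetical defaults, taken from the values reported in this issue.
FIELD_DEFAULTS = {
    "minWorkers": 1,
    "maxWorkers": 1,
    "rayResources": {},
    "setupCommands": [],
}


def translate_with_defaults(configuration, dictionary, defaults):
    # Deep-copy the defaults so mutable values ({} and []) are never shared
    # between pod types, then let explicit settings override them.
    merged = {**copy.deepcopy(defaults), **configuration}
    return {dictionary[field]: merged[field] for field in dictionary}


# A pod type that sets only one of the four fields.
result = translate_with_defaults(
    {"maxWorkers": 3}, NODE_TYPE_FIELDS, FIELD_DEFAULTS
)
print(result)
```

Whether silent defaults are preferable to a clear validation error for a missing field is a design choice; the sketch only shows the defaulting variant.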

  • I have verified my script runs in a clean environment and reproduces the issue.
  • I have verified the issue also occurs with the latest wheels.


Labels: bug, triage
