Function called from remote function is not defined #1607

@ludwigschmidt

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): ami-3b6bce43 # Amazon Deep Learning AMI (Ubuntu)
  • Ray installed from (source or binary): pip3 install ray
  • Ray version: 0.3.1
  • Python version: 3.6
  • Exact command to reproduce: see below

Describe the problem

I first bring up a Ray cluster with ray create_or_update and then start a remote Jupyter notebook via

ssh -L 8899:localhost:8899 -i /Users/ludwig/.ssh/ray-autoscaler_us-west-2.pem ubuntu@34.213.245.91 jupyter notebook --port=8899

After that, I execute the following code blocks in the Jupyter notebook (which runs a kernel in the Python 3 environment):

import numpy as np
import ray
ray.init(redis_address="172.31.15.101:6379")

@ray.remote
def f(_):
    return my_square(np.random.randint(0, 10))

def my_square(x):
    return x * x

ray.get([f.remote(x) for x in range(1)])

This yields the following error:

Remote function __main__.f failed with:

Traceback (most recent call last):
  File "<ipython-input-3-31e8a6f51ae7>", line 3, in f
NameError: name 'my_square' is not defined


  You can inspect errors by running

      ray.error_info()

  If this driver is hanging, start a new one with

      ray.init(redis_address="172.31.15.101:6379")
  
---------------------------------------------------------------------------
RayGetError                               Traceback (most recent call last)
<ipython-input-4-693bc96baeb6> in <module>()
----> 1 ray.get([f.remote(x) for x in range(1)])

~/anaconda3/lib/python3.6/site-packages/ray/worker.py in get(object_ids, worker)
   2243             for i, value in enumerate(values):
   2244                 if isinstance(value, RayTaskError):
-> 2245                     raise RayGetError(object_ids[i], value)
   2246             return values
   2247         else:

RayGetError: Could not get objectid ObjectID(2c2e76ba4dd326dd7b73540ab7933c73420b6bbf). It was created by remote function __main__.f which failed with:

Remote function __main__.f failed with:

Traceback (most recent call last):
  File "<ipython-input-3-31e8a6f51ae7>", line 3, in f
NameError: name 'my_square' is not defined

After re-running the second code block (the one containing the function definitions), everything is fine.

I don't know the Ray internals, so I can only speculate about the reason. Could it be that the my_square function is not properly packaged with the remote function f because it is defined later?
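To make that guess concrete, here is a minimal pure-Python sketch that does not use Ray at all. It assumes (and this is only an assumption, not something I have verified in the Ray source) that the remote function is captured together with the globals it references at the moment it is decorated. The helper snapshot is hypothetical and exists only for this illustration.

```python
import builtins
import types


def snapshot(func):
    """Hypothetical stand-in for "packaging" a function: rebuild it with
    only the globals it references that exist at the time of the call."""
    captured = {
        name: func.__globals__[name]
        for name in func.__code__.co_names
        if name in func.__globals__
    }
    captured["__builtins__"] = builtins
    return types.FunctionType(func.__code__, captured, func.__name__)


def f(x):
    return my_square(x)


# Packaged before my_square exists, as in the notebook cell above.
early = snapshot(f)


def my_square(x):
    return x * x


# Packaged again after my_square has been defined.
late = snapshot(f)

try:
    early(3)
    early_result = "ok"
except NameError as exc:
    early_result = str(exc)

late_result = late(3)

print(early_result)
print(late_result)
```

In this sketch the early copy fails with the same NameError as above, while the late copy works. If Ray behaves similarly, that would explain why re-running the cell helps, and defining my_square before f in the cell might avoid the error in the first place.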

Source code / logs

The source code and error output are above. For completeness, here is the Ray config file:

# A unique identifier for the head node and workers of this cluster.
cluster_name: ludwig_test_1

# The minimum number of worker nodes to launch in addition to the head
# node. This number should be >= 0.
min_workers: 4

# The maximum number of worker nodes to launch in addition to the head
# node. This takes precedence over min_workers.
max_workers: 4

# The autoscaler will scale up the cluster to this target fraction of resource
# usage. For example, if a cluster of 10 nodes is 100% busy and
# target_utilization is 0.8, it would resize the cluster to 13. This fraction
# can be decreased to increase the aggressiveness of upscaling.
target_utilization_fraction: 0.8

# If a node is idle for this many minutes, it will be removed.
idle_timeout_minutes: 5

# Cloud-provider specific configuration.
provider:
    type: aws
    region: us-west-2
    availability_zone: us-west-2c

# How Ray will authenticate with newly launched nodes.
auth:
    ssh_user: ubuntu
# By default Ray creates a new private keypair, but you can also use your own.
# If you do so, make sure to also set "KeyName" in the head and worker node
# configurations below.
#    ssh_private_key: /path/to/your/key.pem

# Provider-specific config for the head node, e.g. instance type. By default
# Ray will auto-configure unspecified fields such as SubnetId and KeyName.
# For more documentation on available fields, see:
# http://boto3.readthedocs.io/en/latest/reference/services/ec2.html#EC2.ServiceResource.create_instances
head_node:
    InstanceType: m5.large
    ImageId: ami-3b6bce43  # Amazon Deep Learning AMI (Ubuntu)

    # You can provision additional disk space with a conf as follows
    # BlockDeviceMappings:
    #     - DeviceName: /dev/sda1
    #       Ebs:
    #           VolumeSize: 100

    # Additional options in the boto docs.

# Provider-specific config for worker nodes, e.g. instance type. By default
# Ray will auto-configure unspecified fields such as SubnetId and KeyName.
# For more documentation on available fields, see:
# http://boto3.readthedocs.io/en/latest/reference/services/ec2.html#EC2.ServiceResource.create_instances
worker_nodes:
    InstanceType: m5.large
    ImageId: ami-3b6bce43  # Amazon Deep Learning AMI (Ubuntu)

    # Run workers on spot by default. Comment this out to use on-demand.
    InstanceMarketOptions:
        MarketType: spot
        # Additional options can be found in the boto docs, e.g.
        #   SpotOptions:
        #       MaxPrice: MAX_HOURLY_PRICE

    # Additional options in the boto docs.

# Files or directories to copy to the head and worker nodes. The format is a
# dictionary from REMOTE_PATH: LOCAL_PATH, e.g.
file_mounts: {
#    "/path1/on/remote/machine": "/path1/on/local/machine",
#    "/path2/on/remote/machine": "/path2/on/local/machine",
}

# List of shell commands to run to set up nodes.
setup_commands:
    # Note: if you're developing Ray, you probably want to create an AMI that
    # has your Ray repo pre-cloned. Then, you can replace the pip installs
    # below with a git checkout <your_sha> (and possibly a recompile).
    - pip install -U ray==0.3.1

# Custom commands that will be run on the head node after common setup.
head_setup_commands:
    - pip install boto3==1.4.8  # 1.4.8 adds InstanceMarketOptions

# Custom commands that will be run on worker nodes after common setup.
worker_setup_commands: []

# Command to start ray on the head node. You don't need to change this.
head_start_ray_commands:
    - ray stop
    - ray start --head --redis-port=6379 --autoscaling-config=~/ray_bootstrap_config.yaml

# Command to start ray on worker nodes. You don't need to change this.
worker_start_ray_commands:
    - ray stop
    - ray start --redis-address=$RAY_HEAD_IP:6379
