[tune] AssertionError: Resource invalid #5648

@bask0

System information

Describe the problem

I run 5 trials with ray.tune. On every run, exactly one of the trials fails at the end of training with: AssertionError: Resource invalid: Resources(cpu=3, gpu=0.33, memory=0, object_store_memory=0, extra_cpu=0, extra_gpu=0, extra_memory=0, extra_object_store_memory=0, custom_resources={}, extra_custom_resources={}).

Tracing the error back leads to the following function in ray/tune/resources.py:

def is_nonnegative(self):
    all_values = [self.cpu, self.gpu, self.extra_cpu, self.extra_gpu]
    all_values += list(self.custom_resources.values())
    all_values += list(self.extra_custom_resources.values())
    return all(v >= 0 for v in all_values)

It seems custom_resources and extra_custom_resources are not set (both appear as {} in the message). Strangely, the error occurs in only one of the trials. Is this a bug, or are there any suggestions on how to fix it?
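As a quick sanity check, the sketch below re-runs the logic of is_nonnegative on the values printed in the error message. The namedtuple is a hypothetical stand-in for the real Resources class (only the fields the function reads are modelled), so this is an illustration under that assumption, not Ray's implementation. It suggests the empty dicts are harmless: they add no values to the list, and the printed resources pass the check, so the object that failed it is presumably a different one.

```python
from collections import namedtuple

# Hypothetical stand-in for ray.tune's Resources; only the fields that
# is_nonnegative reads are included here.
Resources = namedtuple(
    "Resources",
    ["cpu", "gpu", "extra_cpu", "extra_gpu",
     "custom_resources", "extra_custom_resources"],
)


def is_nonnegative(res):
    # Same logic as the function quoted above, written as a free function.
    all_values = [res.cpu, res.gpu, res.extra_cpu, res.extra_gpu]
    all_values += list(res.custom_resources.values())
    all_values += list(res.extra_custom_resources.values())
    return all(v >= 0 for v in all_values)


# The values exactly as they appear in the error message.
reported = Resources(cpu=3, gpu=0.33, extra_cpu=0, extra_gpu=0,
                     custom_resources={}, extra_custom_resources={})
reported_ok = is_nonnegative(reported)

# A tiny negative GPU value is enough to make the check fail.
slightly_negative_ok = is_nonnegative(reported._replace(gpu=-1e-16))

print(reported_ok, slightly_negative_ok)
```
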

Source code / logs

This is how I call tune.run:

tune.run(
    ModelTrainerMT,
    resources_per_trial={
        'cpu': config['ncpu'],
        'gpu': config['ngpu'],
    },
    num_samples=1,
    config=best_config,
    local_dir=store,
    raise_on_failed_trial=True,
    verbose=1,
    with_server=False,
    ray_auto_init=False,
    scheduler=early_stopping_scheduler,
    loggers=[JsonLogger, CSVLogger],
    checkpoint_at_end=True,
    reuse_actors=True,
    stop={'epoch': 2 if args.test else config['max_t']}
)

Traceback

2019-09-06 09:56:45,526 ERROR trial_runner.py:557 -- Error processing event.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 552, in _process_trial
    self.trial_executor.stop_trial(trial)
  File "/opt/conda/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 246, in stop_trial
    self._return_resources(trial.resources)
  File "/opt/conda/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 388, in _return_resources
    "Resource invalid: {}".format(resources))
AssertionError: Resource invalid: Resources(cpu=3, gpu=0.33, memory=0, object_store_memory=0, extra_cpu=0, extra_gpu=0, extra_memory=0, extra_object_store_memory=0, custom_resources={}, extra_custom_resources={})
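One possible explanation, not confirmed anywhere in this report, is floating-point rounding in the resource bookkeeping. The traceback ends in _return_resources, and the resources it prints are all non-negative, so the failing check may be on a running total rather than on the trial's own request. The sketch below assumes (this is an assumption, and run_bookkeeping is a hypothetical helper, not a Ray function) that the executor adds gpu=0.33 to a total when a trial starts and subtracts it when the trial stops. With IEEE 754 doubles, three additions followed by three subtractions leave a total a hair below zero, while exact fractions or a rounded total do not.

```python
from fractions import Fraction


def run_bookkeeping(per_trial, n_trials, zero):
    """Commit per_trial n_trials times, then return it n_trials times."""
    committed = zero
    for _ in range(n_trials):
        committed = committed + per_trial   # trial starts
    for _ in range(n_trials):
        committed = committed - per_trial   # trial stops
    return committed


# Plain floats: the total does not come back to exactly zero.
float_total = run_bookkeeping(0.33, 3, 0.0)

# Exact rational arithmetic: the total is exactly zero again.
exact_total = run_bookkeeping(Fraction(33, 100), 3, Fraction(0))

# Rounding the float total before comparing hides the tiny residue.
rounded_total = round(float_total, 6)

print("float total:  ", float_total, "->", float_total >= 0)
print("exact total:  ", exact_total, "->", exact_total >= 0)
print("rounded total:", rounded_total, "->", rounded_total >= 0)
```

If this is what happens, a GPU fraction that is exactly representable in binary (0.25 or 0.5, for example) would not show the residue in this sketch, at the cost of a different number of concurrent trials per GPU.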


Labels: tune (Tune-related issues)
