DeepHyper: A Python Package for Massively Parallel Hyperparameter Optimization in Machine Learning#
DeepHyper is first and foremost a hyperparameter optimization (HPO) library. By leveraging this core HPO functionnality, DeepHyper also provides neural architecture search, multi-fidelity and ensemble capabilities. With DeepHyper, users can easily perform these tasks on a single machine or distributed across multiple machines, making it ideal for use in a variety of environments. Whether you’re a beginner looking to optimize your machine learning models or an experienced data scientist looking to streamline your workflow, DeepHyper has something to offer. So why wait? Start using DeepHyper today and take your machine learning skills to the next level!
The package is organized around the following modules:
deephyper.analysis: To analyse your results.deephyper.ensemble: To build ensembles of predictive models possibly with disentangled uncertainty quantification.deephyper.evaluator: To distribute the evaluation of tasks (e.g., training or inference).deephyper.hpo: To perform hyperparameter optimization (HPO) and neural architecture search (NAS).deephyper.predictor: To wrap predictive models from different libraries.deephyper.stopper: To apply multi-fidelity or early discarding strategies for hyperparameter optimization (HPO) and neural architecture search (NAS).
Quick Start#
The pip installation is recommended. It requires Python >= 3.10.
pip install deephyper
If you would like to install DeepHyper with its core set of machine learning features (Pytorch and Learning Curve Extrapolation) use the following command:
pip install "deephyper[core]"
More details about installation can be found on our Installation page.
We then present a simple example of how to use DeepHyper to optimize a black-box function with three hyperparameters: a real-valued parameter, a discrete parameter, and a categorical parameter.
To try this example, you can copy/paste the script and run it.
def run(job):
# The suggested parameters are accessible in job.parameters (dict)
x = job.parameters["x"]
b = job.parameters["b"]
if job.parameters["function"] == "linear":
y = x + b
elif job.parameters["function"] == "cubic":
y = x**3 + b
# Maximization!
return y
def test_quickstart(tmp_path):
from deephyper.hpo import HpProblem, CBO
from deephyper.evaluator import Evaluator
# define the variable you want to optimize
problem = HpProblem()
problem.add_hyperparameter((-10.0, 10.0), "x") # real parameter
problem.add_hyperparameter((0, 10), "b") # discrete parameter
problem.add_hyperparameter(["linear", "cubic"], "function") # categorical parameter
# define the evaluator to distribute the computation
evaluator = Evaluator.create(
run,
method="process",
method_kwargs={
"num_workers": 2,
},
)
# define your search and execute it
search = CBO(problem, log_dir=tmp_path, random_state=42)
results = search.search(evaluator, max_evals=100)
print(results)
if __name__ == "__main__":
test_quickstart(".")
Running the example will output the results shown below. The best parameters are for the cubic function with x = 9.99 and b = 10 which produces y = 1009.87.
p:b p:function p:x objective job_id job_status m:timestamp_submit m:timestamp_gather
0 7 cubic -1.103350 5.656803 0 DONE 0.011795 0.905777
1 3 cubic 8.374450 590.312101 1 DONE 0.011875 0.906027
2 6 cubic 4.680560 108.540056 2 DONE 0.917542 0.918856
3 9 linear 8.787395 17.787395 3 DONE 0.917645 0.929052
4 6 cubic 9.109560 761.948419 4 DONE 0.928757 0.938856
.. ... ... ... ... ... ... ... ...
96 9 cubic 9.998937 1008.681250 96 DONE 33.905465 34.311504
97 10 cubic 9.999485 1009.845416 97 DONE 34.311124 34.777270
98 10 cubic 9.996385 1008.915774 98 DONE 34.776732 35.236710
99 10 cubic 9.997400 1009.220073 99 DONE 35.236190 35.687774
100 10 cubic 9.999833 1009.949983 100 DONE 35.687380 36.111318
[101 rows x 12 columns]
Optimum values
function: cubic
x: 9.99958232225758
b: 10
y: 1009.8747019108424
Let us now provide step-by-step details about this example.
The black-box function named run (it could be named anything but by convention we call it the run-function) is defined by taking an input job that contains the different hyperparameters to optimize under job.parameters. In our case, the function takes three hyperparameters: x, b, and function. The run-function returns a value y that is computed based on the values of the hyperparameters. The value of y is the objective value that we want to maximize (by convention we do maximization, to do minimization simply return the negative of your objective). The run-function can be any computationally expensive function that you want to optimize. For example, it can be a simple Python execution, opening subprocesses, submitting a SLURM job, perfoming an HTTP request, etc. The search algorithms will learn to optimize the function just based on observed input hyperparameters and output values.
def run(job):
x = job.parameters["x"]
b = job.parameters["b"]
function = job.parameters["function"]
if function == "linear":
y = x + b
elif function == "cubic":
y = x**3 + b
return y
Then, we have the def optimize() function that defines the creation and execution of the search.
problem = HpProblem()
problem.add_hyperparameter((-10.0, 10.0), "x")
problem.add_hyperparameter((0, 10), "b")
problem.add_hyperparameter(["linear", "cubic"], "function")
We start by defining the hyperparameter names, types and allowed ranges.
For this we create a deephyper.hpo.HpProblem object.
We add to this problem three hyperparameters: "x", "b" and "function".
The "x" hyperparameter is defined by a continuous range between [-10, 10].
The float type of the bounds is important to infer the continuous type of the hyperparameter.
The "b" hyperparameter is defined by an integer range between [0, 10].
Similarly, the int type of the bounds is important to infer the discrete type of the hyperparameter.
Finally, the "function" hyperparameter is defined by a list of string values and it is therefore a categorical nominal hyperparameter (i.e., without order relation between its values).
The problem can be interactively printed print(problem) to review its definition:
Configuration space object:
Hyperparameters:
b, Type: UniformInteger, Range: [0, 10], Default: 5
function, Type: Categorical, Choices: {linear, cubic}, Default: linear
x, Type: UniformFloat, Range: [-10.0, 10.0], Default: 0.0
After the problem, we create a deephyper.evaluator.Evaluator object.
The Evaluator is in charge of asynchronously distributing the computation of multiple calls to the run-function.
It provides a simple interface Evaluator.submit(tasks) and tasks_done = Evaluator.gather() to perform asynchronous
calls to the run-function.
evaluator = Evaluator.create(
run,
method="process",
method_kwargs={
"num_workers": 2,
},
)
The method="process" let us choose among available parallel backends.
In this example, we picked the "process" method.
The method_kwargs let us configure the Evaluator a bit more.
The keys of method_kwargs directly match the possible argument of the corresponding Evaluator subclass.
Some of these arguments are common to all Evaluator methods and some are specific.
The deephyper.evaluator API reference can be used to review the available arguments of each subclass.
In our case, method="process" corresponds to the deephyper.evaluator.ProcessPoolEvaluator.
The only argument set is "num_workers": 2 to define two process-based workers for our Evaluator allowing 2 parallel calls to run-function on a CPU with at least two hardware threads.
Finally, comes the last piece of the puzzle.
We create a deephyper.search.CBO object for a Centralized Bayesian optimization with the problem and evaluator created previously.
All search methods are sublcasses of deephyper.hpo.Search.
For reproducibility of this example we also set the random_state=42.
Then, we execute the search by using the max_evals termination criterion to stop the search when max_evals results have been gathered.
search = CBO(problem, evaluator, random_state=42)
results = search.search(max_evals=100)
The returned results is a Pandas DataFrame object that is also checkpointed locally in the current directory under results.csv (default value of the log_dir="." argument of Search subclasses).
This DataFrame contains 1 row per run-function evaluation:
the columns that start with
p:are the hyperparameters.the
objectiveis the returned values of therun-function.the
job_idis theEvaluatorjob id of the evaluation (an integer incremented by order of job creation).the
job_statusis theEvaluatorjob status of the evaluation.the columns that start with
m:are metadata of each evaluations. Some are added by DeepHyper but they can also be returned by the user as part of therun-function returned value.
p:b p:function p:x objective job_id job_status m:timestamp_submit m:timestamp_gather
0 7 cubic -1.103350 5.656803 0 DONE 0.011795 0.905777
1 3 cubic 8.374450 590.312101 1 DONE 0.011875 0.906027
2 6 cubic 4.680560 108.540056 2 DONE 0.917542 0.918856
3 9 linear 8.787395 17.787395 3 DONE 0.917645 0.929052
4 6 cubic 9.109560 761.948419 4 DONE 0.928757 0.938856
.. ... ... ... ... ... ... ... ...
96 9 cubic 9.998937 1008.681250 96 DONE 33.905465 34.311504
97 10 cubic 9.999485 1009.845416 97 DONE 34.311124 34.777270
98 10 cubic 9.996385 1008.915774 98 DONE 34.776732 35.236710
99 10 cubic 9.997400 1009.220073 99 DONE 35.236190 35.687774
100 10 cubic 9.999833 1009.949983 100 DONE 35.687380 36.111318
Warning
By convention in DeepHyper, all search algorithms are MAXIMIZING the objective function. If you want to MINIMIZE the objective function, you can simply return the negative of your objective value.
The next steps to learn more about DeepHyper is to follow our Examples.
Table of Contents#
Get Started