Hyperband bracket assignment using hash in parallelization #3083
Description
I think there is a problem when the Hyperband pruner is used with parallel (multi-process) optimization.
In Hyperband, each trial is assigned to a bracket by computing hash("{}_{}".format(study.study_name, trial.number)).
However, Python's built-in hash() returns a different value for the same string in different processes, for example when the script is launched from different terminals.
Trial n, for example, could be assigned to bracket 1 in process 1 and to bracket 2 in process 2, which defeats the purpose of using the Hyperband pruner instead of the SuccessiveHalving pruner.
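The effect can be reproduced outside Optuna. The following is a minimal sketch, not Optuna's code: the study name, trial number, bracket count, and the helper `str_hash_in_fresh_process` are all hypothetical, and the uniform `% N_BRACKETS` step is a simplification of how a bracket might be derived from the hash.

```python
import os
import subprocess
import sys

# Hypothetical stand-ins for a study name, a trial number and a bracket count.
STUDY_NAME = "example-study"
TRIAL_NUMBER = 5
N_BRACKETS = 4


def str_hash_in_fresh_process(text, hash_seed="random"):
    """Return hash(text) as computed by a newly started Python interpreter.

    hash_seed is passed through the PYTHONHASHSEED environment variable;
    "random" is the default behaviour of Python 3.3 and later.
    """
    code = "print(hash({!r}))".format(text)
    env = dict(os.environ, PYTHONHASHSEED=hash_seed)
    out = subprocess.check_output(
        [sys.executable, "-c", code], env=env, universal_newlines=True
    )
    return int(out)


if __name__ == "__main__":
    key = "{}_{}".format(STUDY_NAME, TRIAL_NUMBER)
    # Each iteration simulates one worker process started independently.
    for worker in range(3):
        value = str_hash_in_fresh_process(key)
        print("worker", worker, "hash", value, "bracket", value % N_BRACKETS)
```

Running this typically prints a different hash for each simulated worker, so the derived bracket index is not guaranteed to agree between workers.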
Expected behavior
Each trial should be assigned to the same bracket in every process.
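One way to get this behavior would be to replace the built-in hash() with a checksum that does not depend on the interpreter's hash seed. The sketch below uses `binascii.crc32` from the standard library. The function name `stable_bracket_id` and the uniform modulo over `n_brackets` are assumptions made for illustration; the real pruner may weight brackets differently.

```python
import binascii


def stable_bracket_id(study_name, trial_number, n_brackets):
    """Map (study_name, trial_number) to a bracket index in [0, n_brackets).

    crc32 operates on bytes and is not affected by PYTHONHASHSEED, so every
    process computes the same index for the same trial.
    """
    key = "{}_{}".format(study_name, trial_number).encode()
    return binascii.crc32(key) % n_brackets


if __name__ == "__main__":
    # Hypothetical study name; any process running this prints the same indices.
    for number in range(5):
        print(number, stable_bracket_id("example-study", number, 4))
```

Any hash that is a pure function of its input bytes (for example one from hashlib) would serve the same purpose; crc32 is simply cheap and available everywhere.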
Environment
- Optuna version: 2.10.0
- Python version: 3.6.9
- OS: Linux
Error messages, stack traces, or logs
I attached a custom-made log that I added to the SuccessiveHalving code, which is part of Hyperband.
Trials 1, 2, 3, and 4 were included in bracket 1 in both process 1 and process 2.
However, trials 5, 7, and 9 were included in that bracket only in process 2.
Trials for bracket 1 at process 1

Trials for the same bracket 1 at process 2

Steps to reproduce
- Add the following code at the beginning of _get_competing_values() in _successive_halving.py:
    for t in trials:
        if rung_key in t.system_attrs:
            compet = [t.system_attrs[rung_key]]
            print(' Prune SH Get_competing_values: ' + rung_key, ' trial ', t.number, 'competing ', compet)
- Check whether the set of trials that make up a bracket differs between processes.
Additional context (optional)
The cause is hash randomization, which has been enabled by default since Python 3.3.
Of course, the randomization can be turned off by setting PYTHONHASHSEED=0 when running the Python script.
Since I did not see any warning or suggestion about this problem, I would like to ask whether my understanding is correct.
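For reference, the PYTHONHASHSEED workaround mentioned above can be applied when launching worker processes from Python. This is only a sketch: `run_with_fixed_hash_seed` is a hypothetical helper, and the inline `-c` command stands in for a real optimization script.

```python
import os
import subprocess
import sys


def run_with_fixed_hash_seed(args, seed="0"):
    """Run the current Python interpreter with PYTHONHASHSEED set to seed.

    args are the interpreter arguments, e.g. ["worker.py"] or ["-c", "..."].
    Returns the text the child process wrote to stdout.
    """
    env = dict(os.environ, PYTHONHASHSEED=seed)
    return subprocess.check_output(
        [sys.executable] + list(args), env=env, universal_newlines=True
    )


if __name__ == "__main__":
    # Placeholder command; a real worker would run the optimization script.
    command = ["-c", "print(hash('example-study_5'))"]
    first = run_with_fixed_hash_seed(command)
    second = run_with_fixed_hash_seed(command)
    print("hashes match across processes:", first == second)
```

The variable has to be set before the interpreter starts; changing os.environ inside a process that is already running does not alter its hash seed.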