fix: replace json with pickle for storing lgbm params #190
pmandiola wants to merge 1 commit into optuna:main
Conversation
|
This pull request has not seen any recent activity. |
|
Thank you for your PR! Let me leave some comments:
For example, what about the following:

```python
serializable_lgbm_params = {}
for k, v in lgbm_params.items():
    try:
        json.dumps([v])
        serializable_lgbm_params[k] = v
    except TypeError:
        # We store only the name of an unserializable object.
        serializable_lgbm_params[k] = v.__name__
``` |
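For reference, a minimal runnable sketch of how the suggested filtering behaves on a params dict that mixes JSON-serializable values with a callable (the dict contents and the `custom_objective` stand-in here are illustrative, not from the PR):

```python
import json

def custom_objective(y_true, y_pred):
    """Stand-in for a user-supplied custom objective function."""
    return y_true, y_pred

# Hypothetical params dict: one serializable value, one callable.
lgbm_params = {"learning_rate": 0.1, "objective": custom_objective}

serializable_lgbm_params = {}
for k, v in lgbm_params.items():
    try:
        json.dumps([v])  # raises TypeError for non-serializable values
        serializable_lgbm_params[k] = v
    except TypeError:
        # We store only the name of an unserializable object.
        serializable_lgbm_params[k] = v.__name__

# The callable is replaced by the string "custom_objective".
```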
|
Thanks @nabenabe0928 for reviewing the PR. My first fix attempt was exactly what you suggested, but it didn't work. When optimizing, I think the LightGBMTuner restores the parameters of the best trial after finishing all the trials for a specific step. So when running the tuner, the first 7 trials that search for … Here is the error trace: |
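To illustrate the restore problem described above (a sketch, not the tuner's actual code): once the callable has been replaced by its name and round-tripped through JSON, what comes back is a plain string, so it can no longer be used as an objective function. The `custom_objective` name below is a hypothetical stand-in.

```python
import json

def custom_objective(y_true, y_pred):
    """Stand-in for a user-supplied custom objective function."""
    return y_true, y_pred

# Store only the function's name, since the function itself
# is not JSON-serializable.
stored = json.dumps(
    {"objective": custom_objective.__name__, "learning_rate": 0.1}
)

# Restoring yields a string, not the original function, so any code
# that tries to call params["objective"] will fail.
restored = json.loads(stored)
assert isinstance(restored["objective"], str)
assert not callable(restored["objective"])
```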
|
@pmandiola |
|
@nabenabe0928 Could you review this PR? |
|
Sure, the code I tested is: … One alternative solution could be to just store the optimized parameters from the current trial instead of the full lgbm_params. I tried it by just changing line 271 and it seems to work (the tuning is running correctly), but I'm not sure if something else could be broken: |
|
@pmandiola |
|
This is what I did (skipping some previous details): |
Verification Code

```python
from __future__ import annotations

import optuna.integration.lightgbm as lgb
from lightgbm import early_stopping
from lightgbm import log_evaluation
import numpy as np
import sklearn.datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold


def custom_binary_objective(
    y_true: np.ndarray, y_pred: lgb.Dataset
) -> tuple[np.ndarray, np.ndarray]:
    preds = y_pred.get_label()
    ps = 1.0 / (1.0 + np.exp(-preds))
    res = y_true - ps
    grad = -res / (ps * (1 - ps))
    hess = -ps * (1 - ps) * (1 - 2 * y_true) / ((ps * (1 - ps)) ** 2)
    return grad, hess


def custom_accuracy(
    y_true: np.ndarray, y_pred: lgb.Dataset
) -> tuple[str, float, bool]:
    preds = y_pred.get_label()
    ps = np.round(1.0 / (1.0 + np.exp(-preds)))
    return "custom_accuracy", accuracy_score(y_true, ps), True


if __name__ == "__main__":
    data, target = sklearn.datasets.load_breast_cancer(return_X_y=True)
    dtrain = lgb.Dataset(data, label=target)
    params = {
        "objective": custom_binary_objective,
        "metric": "custom_accuracy",
        "verbosity": -1,
        "boosting_type": "gbdt",
    }
    tuner = lgb.LightGBMTunerCV(
        params,
        dtrain,
        callbacks=[early_stopping(10), log_evaluation(10)],
        feval=custom_accuracy,
    )
    tuner.run()
``` |
|
Another approach for the bug fix: … We need to check whether this change becomes a breaking change or not. |
|
@pmandiola |
|
Sure, happy to help! |
|
Don't stale. |
|
@pmandiola But many thanks for your contribution as well!!
Motivation
Fixes #188, allowing the use of custom objective functions.
Description of the changes
Replaces `json.dumps` and `json.loads` with `pickle` to store and retrieve the trials' `lightgbm_params` dictionary.
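A minimal sketch of the difference (the `custom_objective` stand-in is illustrative): `json` cannot serialize a callable at all, while `pickle` stores a module-level function by reference, so the round trip returns the original function object.

```python
import json
import pickle

def custom_objective(y_true, y_pred):
    """Stand-in for a user-supplied custom objective function."""
    return y_true, y_pred

params = {"objective": custom_objective, "learning_rate": 0.1}

# json fails outright on the callable.
try:
    json.dumps(params)
    raise AssertionError("unreachable: json.dumps should have failed")
except TypeError:
    pass

# pickle serializes module-level functions by reference, so the
# round trip preserves the actual function, not just its name.
restored = pickle.loads(pickle.dumps(params))
assert restored["objective"] is custom_objective
assert callable(restored["objective"])
```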