First of all: I apologize for not using the bug/feature request templates, as this is more of a general question.
I was wondering if there is any way in which the MLflowCallback can be used in conjunction with a cross-validation training procedure?
In each split, I initialize a new model and new training and test datasets, as well as a new Trainer with its own TrainingArguments. Ideally I would use a parent run and log each split's run as a nested child run. Here is the relevant code snippet:
import mlflow
import numpy as np
from sklearn.metrics import f1_score
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

mlflow.set_tracking_uri(logging_uri_mlflow)
mlflow.set_experiment('NLP Baseline')

# Start a parent run so that all CV splits are tracked as nested runs
mlflow.start_run(run_name='Parent Run')

f1_scores = []
for indices in train_test_splits:
    # Initialize a fresh tokenizer and model for this split
    tokenizer = AutoTokenizer.from_pretrained(model_type)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_type, num_labels=len(unique_tags)
    )

    # Encode all text evidences, pad and truncate to max_seq_len
    train_evidences = tokenizer(evidences_text[indices["train_idx"]].tolist(), truncation=True, padding=True)
    test_evidences = tokenizer(evidences_text[indices["test_idx"]].tolist(), truncation=True, padding=True)
    train_labels = labels[indices["train_idx"]].tolist()
    test_labels = labels[indices["test_idx"]].tolist()
    train_dataset = CustomDataset(encodings=train_evidences, labels=train_labels)
    test_dataset = CustomDataset(encodings=test_evidences, labels=test_labels)

    # Note that due to the randomization in the batches, the training/evaluation
    # is slightly different every time
    training_args = TrainingArguments(
        # label_names
        output_dir=output_dir,
        num_train_epochs=epochs,  # total number of training epochs
        logging_steps=100,
        report_to=["mlflow"],  # log via mlflow
        do_train=True,
        do_predict=True,
    )

    # Initialize Trainer based on the training dataset
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
    )

    # Train
    trainer.train()

    # Make predictions for the test dataset
    predictions = trainer.predict(test_dataset=test_dataset).predictions
    predicted_labels = np.argmax(predictions, axis=1)

    # Use macro average for now
    f1_scores.append(f1_score(test_labels, predicted_labels, average="macro"))

logger.info(f'Mean f1-score: {np.mean(f1_scores)}')
logger.info(f'Std f1-score: {np.std(f1_scores)}')

# End parent run
mlflow.end_run()
However, this results in the following exception:
Exception: Run with UUID d2bf3cf7cc7b4e359f4c4db098604350 is already active. To start a new run, first end the current run with mlflow.end_run(). To start a nested run, call start_run with nested=True
I assume the nested=True parameter is required in the self._ml_flow.start_run() call inside MLflowCallback's setup() method? I tried removing the MLflowCallback from the Trainer and adding a custom callback class that subclasses TrainerCallback the same way MLflowCallback does, except that it calls self._ml_flow.start_run(nested=True). Still, this results in separate individual runs being logged rather than a parent run with nested child runs.
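One approach that may be worth comparing against: subclass MLflowCallback itself rather than TrainerCallback, and override only setup() to start the run as a child of the externally started parent. The exact body of setup() differs across transformers versions, so treat this as a sketch to adapt, not a drop-in replacement:

```python
# Hedged sketch: make each Trainer open its MLflow run as a nested child
# of a parent run started outside the CV loop with mlflow.start_run().
from transformers.integrations import MLflowCallback

class NestedMLflowCallback(MLflowCallback):
    def setup(self, args, state, model):
        if not self._initialized and state.is_world_process_zero:
            # The parent run is assumed to be active already, so open
            # this Trainer's run as its child.
            self._ml_flow.start_run(nested=True)
            # Log hyperparameters roughly as the stock callback does
            # (MLflow truncates/rejects overly long parameter values).
            self._ml_flow.log_params(
                {k: str(v)[:250] for k, v in args.to_dict().items()}
            )
            self._initialized = True
```

For this to take effect, the stock callback must not also run: set report_to=[] in TrainingArguments and pass callbacks=[NestedMLflowCallback()] to the Trainer (or call trainer.remove_callback(MLflowCallback) after construction). Otherwise the default MLflowCallback still calls start_run() without nested=True and produces the separate runs you are seeing.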
Are there any best practices for using huggingface models with mlflow logging in a cross-validation procedure? Thanks a lot in advance for any advice or useful comments on that! 😄