Nested MLflow logging with cross-validation #11115

@helena-balabin

Description

First of all, I apologize for not using the bug/feature request templates; I believe this is more of a general question.

I was wondering whether there is any way the MLflowCallback can be used in conjunction with a cross-validation training procedure.

In each split, I initialize a new model, new training and test datasets, and a new Trainer with its own TrainingArguments. Ideally, I would open a parent run and log each split as a nested child run. I have attached the relevant code snippet:

import mlflow
import numpy as np
from sklearn.metrics import f1_score
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

mlflow.set_tracking_uri(logging_uri_mlflow)
mlflow.set_experiment('NLP Baseline')

# Start a parent run so that all CV splits are tracked as nested runs
mlflow.start_run(run_name='Parent Run')

# Collect the macro-averaged f1-score of each split
f1_scores = []

for indices in train_test_splits:
    # Initialize tokenizer and model
    tokenizer = AutoTokenizer.from_pretrained(model_type)
    model = AutoModelForSequenceClassification.from_pretrained(model_type, num_labels=len(unique_tags))

    # Encode all text evidences, pad and truncate to max_seq_len
    train_evidences = tokenizer(evidences_text[indices["train_idx"]].tolist(), truncation=True, padding=True)
    test_evidences = tokenizer(evidences_text[indices["test_idx"]].tolist(), truncation=True, padding=True)
    train_labels = labels[indices["train_idx"]].tolist()
    test_labels = labels[indices["test_idx"]].tolist()
    train_dataset = CustomDataset(encodings=train_evidences, labels=train_labels)
    test_dataset = CustomDataset(encodings=test_evidences, labels=test_labels)

    # Note that due to the randomization in the batches, the training/evaluation is slightly
    # different every time
    training_args = TrainingArguments(
        # label_names
        output_dir=output_dir,
        num_train_epochs=epochs,  # total number of training epochs
        logging_steps=100,
        report_to=["mlflow"],  # log via mlflow
        do_train=True,
        do_predict=True,
    )

    # Initialize Trainer based on the training dataset
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
    )
    # Train
    trainer.train()

    # Make predictions for the test dataset
    predictions = trainer.predict(test_dataset=test_dataset).predictions
    predicted_labels = np.argmax(predictions, axis=1)
    # Use macro average for now
    f1_scores.append(f1_score(test_labels, predicted_labels, average="macro"))

logger.info(f'Mean f1-score: {np.mean(f1_scores)}')
logger.info(f'Std f1-score: {np.std(f1_scores)}')

# End parent run
mlflow.end_run()

However, this results in the following exception:
Exception: Run with UUID d2bf3cf7cc7b4e359f4c4db098604350 is already active. To start a new run, first end the current run with mlflow.end_run(). To start a nested run, call start_run with nested=True

I assume that the nested=True parameter is required in the self._ml_flow.start_run() call in the setup() function of MLflowCallback? I tried removing the MLflowCallback from the Trainer and adding a custom callback class that overrides the default TrainerCallback the same way MLflowCallback does, except that it uses self._ml_flow.start_run(nested=True). Still, that logs separate individual runs rather than a parent run with nested child runs.

Are there any best practices for using Hugging Face models with MLflow logging in a cross-validation procedure? Thanks a lot in advance for any advice or useful comments! 😄
