This project implements and compares baseline, LoRA, and LoRA + Dynamic Adapter models for sequence-to-sequence learning tasks using the CNN/DailyMail dataset. Below are the instructions to preprocess data, train models, evaluate performance, analyze errors, and launch the Streamlit app for results visualization.
Ensure you have the following installed:
- Python 3.7 or higher
- Required Python packages (install via requirements.txt, if available):
pip install -r requirements.txt
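As a quick sanity check for the Python requirement above, a short script along these lines can confirm the interpreter version before any packages are installed. This is only a sketch: the name `python_is_supported` and the constant `MIN_PYTHON` are illustrative and not part of the project.

```python
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in the prerequisites


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    found = sys.version.split()[0]
    if not python_is_supported():
        sys.exit("Python 3.7 or higher is required; found " + found)
    print("Python version OK:", found)
```

Running it with the interpreter you plan to use for training shows whether that interpreter meets the requirement.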
- Data Preprocessing
Preprocess the dataset to prepare it for model training:
python data_preprocessing.py --dataset_name cnn_dailymail --tokenizer_name t5-base
- Model Training
Train the models using the preprocessed dataset:
Baseline Model:
python model_training.py --model_name t5-small --tokenizer_name t5-small
LoRA Model:
python model_training.py --model_name t5-small --tokenizer_name t5-small --use_lora
LoRA + Dynamic Adapter Model:
python model_training.py --model_name t5-small --tokenizer_name t5-small --use_lora --use_adapter
- Evaluation
Evaluate each model on the validation dataset:
Baseline Model:
python evaluation.py --model_dir ./saved_model_baseline --tokenizer_name t5-small --output_file evaluation_baseline.txt
LoRA Model:
python evaluation.py --model_dir ./saved_model_lora --tokenizer_name t5-small --output_file evaluation_lora.txt
LoRA + Dynamic Adapter Model:
python evaluation.py --model_dir ./saved_model_lora_adapter --tokenizer_name t5-small --output_file evaluation_lora_adapter.txt
- Error Analysis
Perform error analysis on the LoRA + Dynamic Adapter model:
python error_analysis.py --model_dir ./saved_model_lora_adapter --tokenizer_name t5-small --output_file error_analysis_results.txt
- Launch Streamlit App
Visualize and compare results using the Streamlit app:
streamlit run app.py
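The training and evaluation steps above use the same two scripts for all three variants and differ only in their flags and directory names, so the commands can be generated from a small table instead of being typed by hand. The sketch below is illustrative and not part of the project: the `VARIANTS` table and the helper names are hypothetical. It also assumes that model_training.py saves each variant to the directory that the matching evaluation command reads from, which this README implies but does not state.

```python
# Variant name -> (extra training flags, saved model directory).
# Flags and directories are copied from the commands in this README;
# the table and the helper functions are hypothetical conveniences.
VARIANTS = {
    "baseline": ([], "./saved_model_baseline"),
    "lora": (["--use_lora"], "./saved_model_lora"),
    "lora_adapter": (["--use_lora", "--use_adapter"], "./saved_model_lora_adapter"),
}


def training_command(variant, model_name="t5-small"):
    """Build the argument list for training one variant."""
    extra_flags, _ = VARIANTS[variant]
    return ["python", "model_training.py",
            "--model_name", model_name,
            "--tokenizer_name", model_name] + extra_flags


def evaluation_command(variant, tokenizer_name="t5-small"):
    """Build the argument list for evaluating one variant."""
    _, model_dir = VARIANTS[variant]
    return ["python", "evaluation.py",
            "--model_dir", model_dir,
            "--tokenizer_name", tokenizer_name,
            "--output_file", "evaluation_{}.txt".format(variant)]


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be reviewed first.
    for variant in VARIANTS:
        print(" ".join(training_command(variant)))
        print(" ".join(evaluation_command(variant)))
```

Each returned list can be passed to `subprocess.run(..., check=True)` to execute the step. Printing first makes it easy to confirm that the generated commands match the ones listed in this README.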