Topic: Recommender Systems
Task: Predict what hotel a user is most likely to book
Description: The dataset contains information about each hotel search query made by a user, the properties of the hotels returned for that query and, for the training set, whether the user clicked on a hotel and whether they booked it.
Clone the repository
Install the required Python packages:
pip install -r requirements.txt
├── README.md
├── data
│   ├── preprocessed
│   ├── raw
│   └── submit
├── figures
├── notebooks
│   ├── eda.ipynb
│   ├── feature_engineering.ipynb
│   ├── feature_importance.ipynb
│   ├── models
│   ├── restructuring.ipynb
│   └── xgboost.ipynb
└── scripts
    ├── evaluate.py
    ├── models
    ├── train.py
    └── tune_hyperparams.py

Navigate to the notebooks folder and open eda.ipynb to start the exploratory data analysis.
- Navigate to the notebooks folder.
- Open feature_engineering.ipynb.
- Update the file paths to the raw datasets for both training and testing data under the data/raw/ directory.
- Execute the notebook separately for the training and testing datasets by updating the filepath for each set.
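Rather than editing the path by hand for each run, the two passes could be driven from one helper. The following is a minimal sketch, not the notebook's actual code: the file names train.csv and test.csv and the helper names `output_path_for` and `planned_runs` are hypothetical placeholders; only the data/raw and data/preprocessed directories come from the project layout.

```python
from pathlib import Path

RAW_DIR = Path("data/raw")
PREPROCESSED_DIR = Path("data/preprocessed")


def output_path_for(raw_path: Path) -> Path:
    """Map a file in data/raw/ to the same file name in data/preprocessed/."""
    return PREPROCESSED_DIR / raw_path.name


def planned_runs(filenames=("train.csv", "test.csv")):
    """Return one (input, output) path pair per dataset.

    The default file names are assumptions; substitute the actual
    names of the raw training and testing files.
    """
    return [(RAW_DIR / name, output_path_for(RAW_DIR / name)) for name in filenames]


if __name__ == "__main__":
    # One notebook execution per pair: read from src, write to dst.
    for src, dst in planned_runs():
        print(f"{src} -> {dst}")
```

Keeping the input and output names paired this way makes it harder to overwrite the preprocessed training set with the testing set by mistake.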
To tune the hyperparameters, run the following command from the project's root directory:
python scripts/tune_hyperparams.py
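This README does not describe how tune_hyperparams.py searches the parameter space. As an illustration only, a random search over a small grid might look like the sketch below. The parameter names, their candidate values, and the stand-in `validation_score` function are all assumptions; a real run would train and validate the model in its place.

```python
import random

# Hypothetical search space; the names mirror common gradient-boosting
# parameters but are not taken from the project's script.
SEARCH_SPACE = {
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
    "max_depth": [3, 5, 7, 9],
    "n_estimators": [100, 300, 500],
}


def validation_score(params: dict) -> float:
    """Stand-in objective (higher is better).

    A real version would fit a model with `params` and score it on
    held-out searches; this one only makes the sketch runnable.
    """
    return -abs(params["learning_rate"] - 0.1) - abs(params["max_depth"] - 5) / 10


def random_search(space: dict, n_trials: int, seed: int = 0):
    """Sample `n_trials` random configurations and keep the best one."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one candidate value for every parameter.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = validation_score(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

Calling `random_search(SEARCH_SPACE, n_trials=50)` returns the best sampled configuration together with its score.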
To train the model, run the following command from the project's root directory:
python scripts/train.py
After training the model, run evaluate.py from the project's root directory to load the trained model and generate submission.csv in the data/submissions/ folder:
python scripts/evaluate.py
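A minimal sketch of the kind of file this step might produce, assuming the submission lists the hotels of each search ordered from most to least likely to be booked. The column names srch_id and prop_id and the helpers `rank_hotels` and `write_submission` are illustrative assumptions, not taken from evaluate.py.

```python
import csv
from collections import defaultdict


def rank_hotels(predictions):
    """Group (search_id, hotel_id, score) triples by search and order
    each group by descending score."""
    by_search = defaultdict(list)
    for search_id, hotel_id, score in predictions:
        by_search[search_id].append((score, hotel_id))
    ranked = []
    for search_id in sorted(by_search):
        # Highest score first; the sort is stable, so ties keep input order.
        ordered = sorted(by_search[search_id], key=lambda pair: pair[0], reverse=True)
        ranked.extend((search_id, hotel_id) for _, hotel_id in ordered)
    return ranked


def write_submission(predictions, path):
    """Write the ranked (search, hotel) pairs to a CSV file at `path`."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["srch_id", "prop_id"])  # assumed column names
        writer.writerows(rank_hotels(predictions))
```

Here `predictions` would hold the model's score for every search and hotel pair in the testing set, and `path` would point at submission.csv.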
Nabila Siregar, Amir Sahrani, Sophie Engels