RACF (Recency Aware Collaborative Filtering) is an implementation of the paper "[Recency Aware Collaborative Filtering for Next Basket Recommendation](https://dl.acm.org/doi/abs/10.1145/3340631.3394850)".
RACF is a set of collaborative filtering approaches for the Next Basket Recommendation problem, that is, predicting the items a user will put in their next basket.
You can try this implementation either by downloading the provided Jupyter Notebook "RecencyAwareCF.ipynb" or from the command line after cloning this repository.
After cloning, the repository is structured as follows:
- NextBasketRecomSys
- |-- src
- |--|-- main.py
- |--|-- NextBasketRecFramework.py
- |--|-- util.py
- |-- data
- |--|-- instacart
- |--|-- dunnhumby
Requirements:
- Python 3.8+ (older versions have not been tested)
- NumPy 1.19+ (older versions have not been tested)
- scikit-learn 0.23+ (older versions have not been tested)
- Pandas 1.0+ (older versions have not been tested)
- Similaripy (install it with pip)
- SciPy 1.5 (older versions have not been tested)
A requirements file is available.
Please first download the complete Instacart dataset and extract the CSV files into the NextBasketRecomSys/data/instacart folder.
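Before running anything, it can help to check that the CSV files landed in the right folder. The snippet below is a minimal sketch, not part of this repository: the helper name `missing_instacart_files` is hypothetical, and the file names are those of the public Instacart release, which this implementation may only partly need.

```python
from pathlib import Path

# File names from the public Instacart release (an assumption; the code in
# this repository may read only some of them).
EXPECTED_FILES = (
    "orders.csv",
    "order_products__prior.csv",
    "order_products__train.csv",
    "products.csv",
)


def missing_instacart_files(data_path="../data/instacart"):
    """Return the expected CSV files that are not present in data_path."""
    folder = Path(data_path)
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_instacart_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected CSV files were found.")
```

An empty list means every expected file is in place.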
You are now ready to run main.py, specifying some of the following arguments:
- Dataset path:
  - --data_path : path to the dataset folder (default: ../data/instacart)
- Data preprocessing args:
  - --item_threshold : remove all items that appear in fewer than this number of baskets (default: 10)
  - --basket_threshold : remove all users with fewer than this number of baskets (default: 2)
  - --subdata : select a small subset of the dataset, as a fraction, to test the algorithms quickly (default: 0.1)
  - --verbose : show data preprocessing progress; takes True or False (default: True)
- Methods args:
  - --method_name : which method of the framework implementation to use:
    - UWPop : User-Wise Popularity method
    - UPCF : User Popularity Collaborative Filtering method
    - IPCF : Item Popularity Collaborative Filtering method
- Method's parameters:
  - --recency : default 0; the paper uses the values {1, 5, 25, 100, inf}
  - --asymmetry : default 0; the paper uses the values {0, 0.25, 0.5, 0.75, 1}
  - --locality : default 1; the paper uses the values {1, 5, 10, 50, 100, 1000}
  - --top_k : rank-aware parameter, the number of items to recommend
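The arguments listed above could be declared with `argparse` roughly as follows. This is a sketch under stated assumptions, not the actual contents of main.py: `build_parser` and `str2bool` are hypothetical names, and because no default is documented for --method_name or --top_k, both are marked as required here.

```python
import argparse


def str2bool(value):
    """Parse the True/False strings accepted by --verbose."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser(description="Recency Aware Collaborative Filtering")
    # Dataset path
    parser.add_argument("--data_path", default="../data/instacart")
    # Data preprocessing args (defaults taken from the list above)
    parser.add_argument("--item_threshold", type=int, default=10)
    parser.add_argument("--basket_threshold", type=int, default=2)
    parser.add_argument("--subdata", type=float, default=0.1)
    parser.add_argument("--verbose", type=str2bool, default=True)
    # Method selection; "required" is an assumption of this sketch
    parser.add_argument("--method_name", choices=["UWPop", "UPCF", "IPCF"], required=True)
    # Method parameters; float lets --recency accept "inf"
    parser.add_argument("--recency", type=float, default=0)
    parser.add_argument("--asymmetry", type=float, default=0)
    parser.add_argument("--locality", type=float, default=1)
    parser.add_argument("--top_k", type=int, required=True)  # no documented default
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Because every option is a named flag, `argparse` accepts them in any order on the command line.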
python3 main.py --method_name IPCF --recency 5 --asymmetry 1 --locality 5 --top_k 10
First, this randomly samples the transactions of 10% of the users and filters out products with fewer than 10 transactions and users with fewer than 2 baskets. Once the preprocessing is done, it uses the Item Popularity CF (IPCF@r) method to predict the top 10 items to recommend for the next basket, with the parameters r=5, alpha=1, q=5.
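The preprocessing steps described above can be sketched in plain Python. This is an illustration only, assuming transactions are given as `(user_id, basket_id, item_id)` tuples and that an item's count is the number of distinct baskets containing it. The function name `preprocess`, the `seed` parameter, and the single-pass filtering order are assumptions, not the repository's actual code.

```python
import random
from collections import defaultdict


def preprocess(rows, item_threshold=10, basket_threshold=2, subdata=0.1, seed=0):
    """Sample a fraction of users, then drop rare items and low-activity users.

    rows: list of (user_id, basket_id, item_id) tuples (hypothetical layout).
    """
    # 1. Randomly keep a fraction `subdata` of the users.
    users = sorted({user for user, _, _ in rows})
    n_keep = min(len(users), max(1, round(len(users) * subdata)))
    sampled = set(random.Random(seed).sample(users, n_keep))
    rows = [row for row in rows if row[0] in sampled]

    # 2. Drop items that appear in fewer than `item_threshold` baskets.
    baskets_per_item = defaultdict(set)
    for _, basket, item in rows:
        baskets_per_item[item].add(basket)
    rows = [row for row in rows if len(baskets_per_item[row[2]]) >= item_threshold]

    # 3. Drop users left with fewer than `basket_threshold` baskets.
    baskets_per_user = defaultdict(set)
    for user, basket, _ in rows:
        baskets_per_user[user].add(basket)
    return [row for row in rows if len(baskets_per_user[row[0]]) >= basket_threshold]
```

The filters run once, in the order listed, so removing users in the last step can push some items back under the item threshold. Whether the real pipeline repeats the filtering is not stated here.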
Finally, it reports the evaluation of the model on both the train and test sets.
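The metrics reported are not listed here. As an illustration of how a top-k recommendation list is typically scored against the true next basket, the sketch below computes recall@k, a metric commonly used for this task. The function name `recall_at_k` is hypothetical, and this is not necessarily the metric or the code used by this repository.

```python
def recall_at_k(recommended, actual, k=10):
    """Fraction of the items in the true next basket found in the top-k list.

    recommended: ranked list of item ids, best first.
    actual: iterable of item ids in the true next basket.
    """
    actual = set(actual)
    if not actual:
        return 0.0  # convention chosen for this sketch: an empty basket scores 0
    hits = len(set(recommended[:k]) & actual)
    return hits / len(actual)
```

For example, `recall_at_k(["a", "b", "c"], ["a", "d"], k=2)` evaluates to 0.5, since one of the two basket items appears among the top two recommendations.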