PS3 code base for the ECML-PKDD paper
Requirements:
python = 3.6
scikit-learn
nltk
pandas
numpy
For simplicity, we use a Jupyter notebook and the Founta dataset, as it is the most skewed dataset.
We run the experiment 10 times and report the average in the paper.
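The averaging over 10 runs can be sketched roughly as follows. This is only an illustration, not the notebook's code: run_experiment is a hypothetical placeholder that returns a synthetic score, and in practice it would be replaced by one full active-learning run that returns the metric of interest.

```python
import numpy as np


def run_experiment(seed):
    """Hypothetical stand-in for one full run; returns a single score.

    The value here is synthetic. Replace the body with the actual
    experiment from the notebook, seeded with `seed`.
    """
    rng = np.random.RandomState(seed)
    return 0.8 + 0.01 * rng.randn()


def average_over_runs(run_fn, n_runs=10):
    """Call run_fn once per seed and return the mean and standard deviation."""
    scores = np.array([run_fn(seed) for seed in range(n_runs)])
    return scores.mean(), scores.std()


mean_score, std_score = average_over_runs(run_experiment, n_runs=10)
print("mean=%.4f std=%.4f" % (mean_score, std_score))
```

Using a different seed per run keeps the runs independent while making each one reproducible.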
For experiments such as the one in Figure 3 of the paper, please modify the dataset so that it has a ratio similar to the one in the paper.
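One possible way to change the ratio is to downsample one class with pandas, as in the minimal sketch below. It assumes that "ratio" means the share of the minority class. The function name subsample_to_ratio, the column names, and the toy labels are all hypothetical and are not taken from the notebook or the paper.

```python
import pandas as pd


def subsample_to_ratio(df, label_col, minority_label, minority_ratio, seed=0):
    """Downsample the minority class so it is about `minority_ratio` of the result.

    All majority rows are kept. `minority_ratio` must lie strictly between 0 and 1.
    """
    minority = df[df[label_col] == minority_label]
    majority = df[df[label_col] != minority_label]
    # Solve n_min / (n_min + n_maj) = minority_ratio for n_min.
    n_min = int(round(minority_ratio * len(majority) / (1.0 - minority_ratio)))
    # Cannot draw more minority rows than exist.
    n_min = min(n_min, len(minority))
    sampled = minority.sample(n=n_min, random_state=seed)
    combined = pd.concat([majority, sampled])
    # Shuffle so the classes are not grouped together.
    return combined.sample(frac=1.0, random_state=seed).reset_index(drop=True)


# Toy data with hypothetical column names and labels.
toy = pd.DataFrame({
    "text": ["t%d" % i for i in range(100)],
    "label": ["abusive"] * 40 + ["normal"] * 60,
})
skewed = subsample_to_ratio(toy, "label", "abusive", minority_ratio=0.1)
print(skewed["label"].value_counts())
```

The target ratio itself should be read from the paper; the value 0.1 above is only a placeholder.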
If you use our implementation, please cite the paper as:
@inproceedings{Fajri2020PS3,
title={PS3: Partition-based Skew-Specialized Sampling for Batch Mode Active Learning in Imbalanced Text Data},
author={Ricky Maulana Fajri and Samaneh Khoshrou and Robert Peharz and Mykola Pechenizkiy},
booktitle={ECML-PKDD},
year={2020}
}