jl908069/interpretable-hate-speech

Towards Interpretable Detection of Hate Speech in Twitter

This project trains logistic regression models for hate speech detection on OLID (Offensive Language Identification Dataset), the dataset of SemEval 2019 Task 6. Task A is to determine whether a tweet is offensive. Task B is to determine whether an offensive tweet is targeted. The data and links to the task information and paper are available here.

util.py

Implements utility functions for loading the data for Tasks A and B.

Example usage:

python util.py olid-training-v1.tsv
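
The loading step might look roughly like the sketch below. It is not the code in util.py: the function name load_olid is hypothetical, and the column names (tweet, subtask_a, subtask_b) and the NULL placeholder for tweets without a Task B label are assumptions based on the public OLID release.

```python
import csv
import io


def load_olid(handle, task="a"):
    """Return (tweets, labels) for one subtask from an OLID-style TSV.

    Assumes a header row with 'tweet' and 'subtask_<task>' columns.
    """
    column = "subtask_" + task
    # QUOTE_NONE keeps stray quote characters in tweets from confusing the parser.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    tweets, labels = [], []
    for row in reader:
        label = row[column]
        if label == "NULL":  # assumed marker: no label for this subtask
            continue
        tweets.append(row["tweet"])
        labels.append(label)
    return tweets, labels


# Tiny in-memory sample standing in for olid-training-v1.tsv.
sample = (
    "id\ttweet\tsubtask_a\tsubtask_b\tsubtask_c\n"
    "1\t@USER you are awful\tOFF\tTIN\tIND\n"
    "2\t@USER have a nice day\tNOT\tNULL\tNULL\n"
)
tweets_a, labels_a = load_olid(io.StringIO(sample), task="a")
tweets_b, labels_b = load_olid(io.StringIO(sample), task="b")
```

With a real file, the same function could be called on open("olid-training-v1.tsv", encoding="utf-8", newline="").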

logreg.py

Trains a logistic regression model on tf-idf vectors for Tasks A and B. Produces the following results:

  • classification report (using sklearn)
  • misclassified examples
  • confusion matrix
  • explainable results using shap package

Example usage:

python logreg.py --train_file olid-training-v1.tsv
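
As a rough illustration of the tf-idf plus logistic regression setup, here is a minimal scikit-learn sketch. It is not the code in logreg.py: the toy tweets, the pipeline layout, and max_iter=1000 are assumptions made for the example, and the shap explanation step is left out.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.pipeline import make_pipeline

# Toy stand-ins for Task A data (OFF = offensive, NOT = not offensive).
train_tweets = [
    "you are a stupid idiot",
    "what a stupid awful person",
    "have a lovely day",
    "thanks for the kind words",
]
train_labels = ["OFF", "OFF", "NOT", "NOT"]
test_tweets = ["such an idiot", "lovely words, thanks"]
test_labels = ["OFF", "NOT"]

# tf-idf vectorizer followed by a logistic regression classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_tweets, train_labels)
predicted = model.predict(test_tweets)

# Classification report and confusion matrix (rows: gold, columns: predicted).
report = classification_report(test_labels, predicted, zero_division=0)
matrix = confusion_matrix(test_labels, predicted, labels=["NOT", "OFF"])

# Misclassified examples as (tweet, gold label, predicted label).
misclassified = [
    (tweet, gold, pred)
    for tweet, gold, pred in zip(test_tweets, test_labels, predicted)
    if gold != pred
]

print(report)
print(matrix)
```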

feature_combination.py

Defines a FeatureVectorizer class that adds sentiment, subjectivity, profanity, and user-name features to the base tf-idf features. It then trains a logistic regression model and evaluates it on each of the following feature combinations:

  • base_tfidf + sentiment feature (vaderSentiment package)
  • base_tfidf + subjectivity feature (textblob package)
  • base_tfidf + profanity feature (profanity-check package)
  • base_tfidf + @USER feature (percentage of @USER in a tweet)

Example usage:

python feature_combination.py --train_file olid-training-v1.tsv
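
One way to combine tf-idf with an extra per-tweet feature is to append a column to the sparse tf-idf matrix. The sketch below does this for the @USER feature only. It is not the FeatureVectorizer class from feature_combination.py: the helper user_mention_ratio is hypothetical, and reading "percentage of @USER" as the share of whitespace-separated tokens equal to @USER is an assumption.

```python
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer


def user_mention_ratio(tweet):
    """Share of whitespace-separated tokens that are exactly '@USER'."""
    tokens = tweet.split()
    if not tokens:
        return 0.0
    return sum(token == "@USER" for token in tokens) / len(tokens)


tweets = ["@USER @USER stop it", "nice weather today"]

# Base tf-idf features, one row per tweet.
tfidf = TfidfVectorizer().fit_transform(tweets)

# Extra feature as a single sparse column, then appended to the tf-idf columns.
extra = csr_matrix([[user_mention_ratio(tweet)] for tweet in tweets])
features = hstack([tfidf, extra]).tocsr()
```

The combined matrix can be passed to a classifier in the same way as plain tf-idf features. The other features in the list could be appended as further columns in the same manner.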

Results and Discussion

See Final_Report.pdf

About

Course project for ANLY 521 Computational Linguistics with Advanced Python
