Inspiration

According to Pew Research, roughly four-in-ten Americans have personally experienced online harassment, and 62% consider it a major problem. Many want technology firms to do more, but they are divided on how to balance free speech against safety online.

Social media platforms are an especially fertile ground for online harassment, but these behaviors occur in a wide range of online venues. Frequently these behaviors target a personal or physical characteristic: 14% of Americans say they have been harassed online specifically because of their politics, while roughly one-in-ten have been targeted due to their physical appearance (9%), race or ethnicity (8%) or gender (8%). And although most people believe harassment is often facilitated by the anonymity that the internet provides, these experiences can involve acquaintances, friends or even family members.

What it does

Social media platforms have become a very common part of our lives: we spend many hours each day emailing, reading articles, sharing and liking posts by our loved ones and acquaintances, and exploring posts by total strangers as well.

Our application first takes a large dataset (in our case from Kaggle) and then, using sentiment analysis and a convolutional neural network (CNN), measures the frequency of the words used in harassment and non-harassment messages. We also analyze whether one social media platform experiences more harassment incidents than others. With the statistics produced by our machine learning algorithms, we draw graphs in our front end and present the analysis to our users; this makes them aware of the safest online environments and lets them decide whether they would like to spend time there.
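As a rough illustration of this analysis step, here is a minimal sketch in Python; the file name and the text/label/platform columns are placeholders, not the actual Kaggle dataset schema:

```python
from collections import Counter

import pandas as pd

# Placeholder file and column names; the real dataset schema will differ.
df = pd.read_csv("harassment_data.csv")  # columns: text, label (0/1), platform

def word_frequencies(texts):
    """Count how often each word appears across a set of messages."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts

harassment_words = word_frequencies(df.loc[df["label"] == 1, "text"])
benign_words = word_frequencies(df.loc[df["label"] == 0, "text"])

# Share of messages labeled as harassment, broken down by platform --
# the per-platform comparison that feeds the front-end graphs.
per_platform = df.groupby("platform")["label"].mean().sort_values(ascending=False)

print(harassment_words.most_common(20))
print(per_platform)
```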

Our next step was to design a well-thought-out, unbiased architecture for an extension that our users could install to stop seeing those messages. Our machine learning model is trained as a CNN, and we categorize harassment messages with tags such as racist, sexist, homophobic, and violent.
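One possible shape for such a multi-label classifier is sketched below in Keras; the vocabulary size and layer sizes are arbitrary assumptions, not our actual training configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers

TAGS = ["racist", "sexist", "homophobic", "violent"]
VOCAB_SIZE, EMBED_DIM = 20000, 64

# A small 1-D convolutional text classifier with one sigmoid output per
# tag, so a single message can carry several harassment labels at once.
model = tf.keras.Sequential([
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    layers.Conv1D(128, 5, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(len(TAGS), activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# padded_sequences: integer-encoded messages; tag_matrix: one 0/1 column per tag.
# model.fit(padded_sequences, tag_matrix, epochs=5, validation_split=0.1)
```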

We tried to create new conversational experiences, e.g. to help human moderators choose what to review, to help people reading comments choose what they read, or to help authors get another perspective on what they are writing.

How we built it

We used a machine learning technique called sentiment analysis to analyze the data, and JSON and JavaScript to build the Chrome extension. Our UI is designed in HTML and CSS with JavaScript.
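To give a flavor of the sentiment-scoring step, here is a sketch using the off-the-shelf VADER analyzer; our own pipeline may score messages differently, and the -0.5 threshold is an arbitrary assumption:

```python
# pip install vaderSentiment
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
messages = [
    "You did a great job on this post!",
    "Nobody wants you here, just leave.",
]

for message in messages:
    scores = analyzer.polarity_scores(message)
    # "compound" ranges from -1 (most negative) to +1 (most positive);
    # strongly negative messages become candidates for the harassment check.
    verdict = "review" if scores["compound"] <= -0.5 else "ok"
    print(f"{verdict}: {message!r} -> compound={scores['compound']:.2f}")
```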

Challenges we ran into

First, we had trouble figuring out how to get started, but after some discussion with our mentors we settled on one way to move ahead with the project. We decided to focus on two main points: analyzing the dataset we trained our machine learning model on, and proposing a system design that can make the online user experience much more pleasant.

Although we had an idea of how to build our extension, we could not deploy it online because we did not have access to the social media platforms' API keys.

The algorithm we used is limited both by the comments it has learned from and by the structure of the underlying models. With millions of example comments to train on, we can at least start on the problem, but even models that have seen tens of millions of comments have a long way to go.

Our model can still be improved a great deal: imperfect models will misclassify a certain fraction of innocent comments as toxic and will miss some forms of personal attack.
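One simple way to quantify that trade-off on a held-out labeled set is sketched below; the labels and scores are toy values for illustration only:

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

def report(y_true, y_prob, threshold=0.5):
    """Summarize classifier errors at a given decision threshold."""
    y_pred = [int(p >= threshold) for p in y_prob]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print(f"threshold={threshold}")
    print(f"  false-positive rate: {fp / (fp + tn):.0%} (innocent comments flagged)")
    print(f"  recall: {recall_score(y_true, y_pred):.0%} (attacks caught)")
    print(f"  precision: {precision_score(y_true, y_pred):.0%}")

# Raising the threshold flags fewer innocent posts but misses more attacks.
y_true = [0, 1, 0, 1, 1, 0]
y_prob = [0.2, 0.9, 0.7, 0.6, 0.8, 0.1]
report(y_true, y_prob, threshold=0.5)   # FPR 33%, recall 100%
report(y_true, y_prob, threshold=0.75)  # FPR 0%,  recall 67%
```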

We improve models by working with communities to audit their output, report examples, and retrain the models on the corrections. In future posts on false positives, we'll explore individual errors and mitigation strategies in detail, and share tools that can help others improve similar initiatives.

The key to leveraging early-stage machine learning is not to use it as a standalone solution, but as an assistant that can help people work more efficiently in their efforts to expand and improve community discussions.

Accomplishments that we're proud of

Even though this was a very complex problem to solve, because it required us to research in depth how to make the online user experience safer, the real challenge was doing so without compromising data privacy. With our solution, however, data privacy is less of a worry, since we intend to remove only public messages that fall under harassment.

What we learned

We learned a lot about Chrome extensions, AI bias, and API keys, along with privacy concerns. We noticed that solving one problem can easily create problems in other domains if we are not careful; to avoid that, we need to design our system carefully before implementing it and train our ML model as thoroughly as possible.

What's next for HarassBlocker

We would love to actually implement the extension that solves this problem the right way. There are many possible solutions, but choosing one that is viable in the long run was essential, and we hope we managed to do that. In the future, we are going to implement the solution and expand the application domain.

We will also add more awareness-related content, so that any person or developer who wants to know how to deal with online harassment will be able to figure it out.
