Fake-o-Meter

Inspiration

What it does

Fake-o-Meter is a Google Chrome extension that tells you how reliable the news story you are reading is, with a score from 0 (completely fake) to 10 (completely reliable). To achieve this, we considered three main factors: the overall style and consistency of the article, the credibility of its source, and whether other sources have discussed the same topic.

How we built it

To judge the consistency of the article and how well the body matches the title, we trained a machine learning model on a dataset of 20,000 fake and credible news articles. For each article, we extract the tf-idf vector of the title, the tf-idf vector of the body, and the cosine similarity between the two as features for a linear regression. Intuitively, this tackles two different aspects.

Tf-idf vectors let us judge which words carry the most weight in a document relative to the collection of documents that contains it.

First, we notice that a lot of "fake" news exists to attract clicks. If the title and the body discuss vastly different topics, the article is likely unreliable. Second, we observe that "fake" news articles often share a writing style. By comparing the tf-idf vector of the article with the tf-idf vectors of the articles in our dataset, we use that writing style to judge the credibility of the article.
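The title/body check described above can be sketched in a few lines of plain Python. This is a minimal illustration, not our production code: the function names, the tokenizer and the smoothed idf formula are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return "".join(c if c.isalpha() else " " for c in text.lower()).split()


def tfidf_vectors(documents):
    """Return one sparse tf-idf vector (term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Smoothed idf (an assumption): rare terms weigh more than common ones.
            term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def title_body_similarity(title, body, corpus):
    """Weight terms against the corpus, then compare the title with the body."""
    vectors = tfidf_vectors(list(corpus) + [title, body])
    return cosine_similarity(vectors[-2], vectors[-1])
```

A title that shares no terms with its body scores 0.0, while a title that summarises the body scores closer to 1. In the model, this value is one feature alongside the two tf-idf vectors.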

To incorporate the credibility of the source, we built a whitelist of trusted news sources, whose articles get a bonus to their score. Articles from sources with a good reputation are more likely to be trustworthy.
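A whitelist bonus of this kind could look roughly like the sketch below. The domains, the size of the bonus and the function names are hypothetical placeholders chosen for the example, not the values we used.

```python
from urllib.parse import urlparse

# Hypothetical whitelist and bonus, for illustration only.
TRUSTED_SOURCES = {"trusted-daily.example", "reliable-wire.example"}
SOURCE_BONUS = 1.0


def source_domain(url):
    """Extract the host name from a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_trusted(url, whitelist=TRUSTED_SOURCES):
    """True if the URL's domain is a whitelisted entry or one of its subdomains."""
    domain = source_domain(url)
    return any(domain == t or domain.endswith("." + t) for t in whitelist)


def apply_source_bonus(score, url, bonus=SOURCE_BONUS, max_score=10.0):
    """Add the bonus for trusted sources, capping the result at max_score."""
    return min(max_score, score + bonus) if is_trusted(url) else score
```

Matching on the domain suffix rather than the full host name means regional subdomains of a trusted outlet also receive the bonus, while look-alike domains do not.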

To compare the article with similar ones from other sources, we use the Google News API. We extract tf-idf vectors from the articles it returns and compare them with the original article using cosine similarity. If other news outlets have reported similar stories, the story is more likely to be real.
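As a rough sketch of the corroboration step, assume the related articles have already been fetched and are available as plain strings (the API call itself is left out). Taking the highest similarity as the corroboration signal is an assumption made for this example, as are the function names.

```python
import math
import re
from collections import Counter


def _tfidf(texts):
    """Sparse tf-idf vectors (term -> weight) using raw counts and a smoothed idf."""
    docs = [Counter(re.findall(r"[a-z]+", text.lower())) for text in texts]
    n = len(docs)
    # Each Counter yields a term once, so this counts documents per term.
    df = Counter(term for doc in docs for term in doc)
    return [
        {term: tf * (math.log((1 + n) / (1 + df[term])) + 1) for term, tf in doc.items()}
        for doc in docs
    ]


def _cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def corroboration_score(article, related_articles):
    """Highest similarity between the article and any related one (0.0 if none)."""
    if not related_articles:
        return 0.0
    vectors = _tfidf([article] + list(related_articles))
    return max(_cosine(vectors[0], other) for other in vectors[1:])
```

A story that no other outlet covers in similar terms scores near 0, while one that is widely reported scores closer to 1.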

Challenges we ran into

For a dataset of our size, tf-idf produced feature vectors that were too large to convey meaningful information. To resolve this, we used dimensionality reduction to condense the feature vectors.
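One simple way to condense a large sparse vector is feature hashing, sketched below. This is an illustrative stand-in and not necessarily the reduction technique we used; the function name and the bucket count are assumptions.

```python
import hashlib


def hash_features(sparse_vector, n_buckets=256):
    """Project a sparse term -> weight vector into a fixed-length dense list."""
    dense = [0.0] * n_buckets
    for term, weight in sparse_vector.items():
        digest = hashlib.md5(term.encode("utf-8")).digest()
        # The first four bytes of the hash pick the bucket for this term.
        index = int.from_bytes(digest[:4], "big") % n_buckets
        # A hash-derived sign keeps colliding terms from always adding up.
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        dense[index] += sign * weight
    return dense
```

The output length depends only on `n_buckets`, not on the vocabulary size, so every article maps to a vector of the same manageable dimension.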

Accomplishments that we're proud of

Estimating the veracity of an argument is not an objective task, even for a human, and solving it is a fundamental problem in AI. Rather than attempting to solve it directly, we shifted our focus to the societal interpretation of "fake" news, looking at factors such as recurring patterns in how fake news is written, the credibility of the source, and the fact that a true story is likely to be corroborated by other sources.

What we learned

What's next for Fake-o-Meter

Contributors

Andrej Ivanov, Carlos Gomes, Hendrik Molder, Mina Sewaha

Built With

Submitted to

HackUPC 2018

Created by

I worked on the backend of the product, including developing the logistic regression model, sourcing similar articles with the Google News API, and integrating the backend with the frontend through a REST API.

CarlosGomes98 Gomes
I worked on the Google Chrome plugin and UI/UX. The plugin connects to the backend and displays data about the likelihood of the article being fake.

Hendrik Mölder
Andrej Ivanov
Mina Sewaha

Updates

CarlosGomes98 Gomes started this project — Oct 21, 2018 02:26 AM EDT
