Inspiration

Fake news is a growing problem. What can we do about it?

What it does

It's a simple Firefox add-on in the form of a floating button at the top right of the page. Whenever you want to check the reliability of what you're reading, you click the button and the add-on takes care of everything.

How we built it

Given an article (some text plus a picture), how do we assess its reliability? We proceed in two steps. First, we look for other articles on the web that share pictures with this one, using reverse image search. Second, once we have this list of articles, we do some feature engineering to extract meaningful features:

  • Similarity with the other articles (topic extraction)
  • Spelling errors in the article (poor editorial quality is a bad smell)
  • "Freshness" of the pictures: if the article is about something Trump said yesterday, why would the journalist use a five-year-old photo?

Given this feature matrix, we could try several models. Lacking training data, we built a regression tree "by hand". To our surprise, similarity with the other articles turned out to be a very good indicator.
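To make the idea of a regression tree written "by hand" concrete, here is a minimal sketch in Python. It is not the project's actual tree: the `ArticleFeatures` fields, the thresholds, and the returned scores are all hypothetical and chosen only for illustration. The one thing taken from the write-up is that similarity with the other articles is checked first, since it was the strongest indicator.

```python
from dataclasses import dataclass


@dataclass
class ArticleFeatures:
    # All three fields are illustrative assumptions about how the features might be encoded.
    similarity: float           # 0..1, topic similarity with articles sharing the same pictures
    spelling_error_rate: float  # fraction of words flagged as misspelled
    picture_age_days: float     # how long before the article the picture first appeared


def reliability_score(f: ArticleFeatures) -> float:
    """Hand-written regression tree; thresholds and leaf values are made up."""
    if f.similarity >= 0.6:
        # Many independent articles tell a similar story.
        if f.spelling_error_rate <= 0.02:
            return 0.9
        return 0.7
    if f.similarity >= 0.3:
        # Partial agreement: let picture freshness decide.
        if f.picture_age_days <= 30:
            return 0.6
        return 0.4
    # Little agreement with other sources.
    if f.spelling_error_rate > 0.05:
        return 0.1
    return 0.25
```

With labelled data, the same three features could be fed to a fitted model in place of these hard-coded branches.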

Challenges we ran into

Making all those retrieval services work together, or fail gracefully. We have three different localhost services plus one add-on! Finding the right parameters and coming up with cool features was also a challenge.
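One way to get the "fail gracefully" behaviour is to wrap each service call so that a failure yields a missing feature and not a crash. The sketch below is an assumption about how this could be done, not a description of the project's code; `safe_feature` and the stand-in fetchers are hypothetical names.

```python
from typing import Callable, Optional


def safe_feature(fetch: Callable[[], float],
                 default: Optional[float] = None) -> Optional[float]:
    """Call one feature service; return `default` if the call raises."""
    try:
        return fetch()
    except Exception:
        # A sketch-level catch-all: any failure becomes a missing feature.
        return default


def broken_service() -> float:
    # Stand-in for a local service that is down.
    raise ConnectionError("service unavailable")


features = {
    "similarity": safe_feature(lambda: 0.7),   # stand-in for a working service
    "spelling": safe_feature(broken_service),  # failure becomes None
}
```

The scoring step can then skip or down-weight any feature that came back as `None`.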

Accomplishments that we're proud of

The service works quite well. Even without domain whitelisting or blacklisting, it predicts that articles on theonion.com are fake and that articles from the BBC or The Times are usually safe. We were also able to show that some articles circulating in conspiracy groups (e.g. reptilians) were fake. All of this behind the apparent simplicity of a floating button. Even better, the whole algorithm is language-agnostic: it doesn't care whether you're reading English, Polish or French, because it filters the articles it compares against by language. Much wow!

What we learned

A lot about content extraction and data mining, and about wrangling JavaScript, Python, etc. into something that works together.

What's next for GoodNews

Get some labelled data and try some actual model fitting. Who knows, we might be surprised!
