Inspiration

The project emulates how a person might research a claim found online: summarise what we see, pick out keywords, enter them into a search engine, read through the results, then judge what the information in the results means and how it compares to the original topic.

What it does

Summarise the chosen text --> Find relevant information on Wikipedia --> Summarise the Wikipedia articles (to a similar length and writing style) --> Conclude whether the original information is valid.
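The keyword-picking step between summarising and searching can be approximated with a simple stopword-filtered frequency count. This is a minimal stand-in sketch, not the project's exact extractor, and the stopword list below is a small hand-picked assumption:

```python
import re
from collections import Counter

# Small hand-picked stopword list; a fuller list (e.g. NLTK's) would
# normally be used instead.
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "in", "on", "to", "is", "was",
    "are", "were", "for", "with", "that", "this", "it", "as", "by", "at",
}

def extract_keywords(summary: str, top_k: int = 5) -> list[str]:
    """Return the most frequent non-stopword words as search terms."""
    words = re.findall(r"[a-z']+", summary.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_k)]
```

The resulting terms can then be joined into a single Wikipedia search query.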

How we built it

We use the prebuilt BART encoder-decoder from the Transformers library to summarise the input text, then extract keywords from the summary and search Wikipedia for relevant articles. We summarise the Wikipedia articles with the same BART model and use a BERT-style model trained for natural language inference to classify the relationship as entailment, contradiction, or neutral.
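The summarise-then-verify core can be sketched with the Hugging Face `pipeline` API. The checkpoints below (`facebook/bart-large-cnn`, `roberta-large-mnli`) are common public models standing in for whichever ones the project actually used:

```python
from transformers import pipeline

# BART fine-tuned on CNN/DailyMail for summarisation.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
# RoBERTa fine-tuned on MultiNLI for entailment classification.
nli = pipeline("text-classification", model="roberta-large-mnli")

def check_claim(claim: str, wiki_text: str) -> str:
    """Summarise the Wikipedia text, then classify whether it entails,
    contradicts, or is neutral toward the claim."""
    evidence = summarizer(
        wiki_text, max_length=60, min_length=10, do_sample=False
    )[0]["summary_text"]
    result = nli({"text": evidence, "text_pair": claim})
    # Some pipeline versions return a list for a single input.
    if isinstance(result, list):
        result = result[0]
    return result["label"]  # "ENTAILMENT", "NEUTRAL", or "CONTRADICTION"
```

The label (plus its score) is what the extension would display next to the highlighted text.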

Challenges we ran into

We first tried to build our own encoder-decoder, using bidirectional LSTM layers and attention layers. However, training it proved too computationally expensive to finish within the competition time. We also considered fine-tuning a pretrained BERT model for text summarisation on the CNN/DailyMail dataset, but there is little documentation for that process, so we were unable to execute the idea.
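For reference, the bidirectional-LSTM-plus-attention encoder-decoder we attempted can be sketched in tf.keras roughly as follows. The layer sizes are hypothetical small values, and the teacher-forcing training loop and inference-time decoding are omitted:

```python
import tensorflow as tf

VOCAB, EMB, UNITS = 1000, 64, 128  # hypothetical sizes for illustration

# Encoder: bidirectional LSTM over the source token ids.
enc_in = tf.keras.Input(shape=(None,), dtype="int32")
enc_emb = tf.keras.layers.Embedding(VOCAB, EMB)(enc_in)
enc_out = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(UNITS, return_sequences=True)
)(enc_emb)  # shape: (batch, src_len, 2 * UNITS)

# Decoder: LSTM whose states attend over the encoder outputs
# via dot-product attention.
dec_in = tf.keras.Input(shape=(None,), dtype="int32")
dec_emb = tf.keras.layers.Embedding(VOCAB, EMB)(dec_in)
dec_out = tf.keras.layers.LSTM(2 * UNITS, return_sequences=True)(dec_emb)
context = tf.keras.layers.Attention()([dec_out, enc_out])  # [query, value]
merged = tf.keras.layers.Concatenate()([dec_out, context])
logits = tf.keras.layers.Dense(VOCAB)(merged)  # per-step vocab scores

model = tf.keras.Model([enc_in, dec_in], logits)
```

Even at this scale, training a model like this from scratch on a summarisation corpus is what turned out to be infeasible within the hackathon's time budget.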

Accomplishments that we're proud of

We got our LSTM network to work, although its results aren't good enough yet. We are also proud that we now understand more about the architecture behind cutting-edge NLP models.

What we learned

  • Bidirectional LSTM layers
  • Attention layers
  • Analysing and choosing a suitable dataset
  • Building a browser extension and its interactions with a server

What's next for WC - WikiChek

We ran out of time putting all the pieces together. Let's hope we can pull it off just before the presentation.

Built With

  • html/css
  • python
  • tensorflow
  • transformers