This repository contains the code for the AI models used in the LogosDB project.
- Extractive Summary: extracts the highest-importance portions of the text using the TF-IDF and TextRank algorithms.
- Abstractive Summary: fine-tunes the Google T5 model on the CNN/DailyMail dataset.
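
To illustrate how TF-IDF and TextRank combine in the extractive approach, here is a minimal pure-Python sketch. It is not the repository's Cython implementation: the function names (`split_sentences`, `tfidf_vectors`, `textrank`, `extractive_summary`), the naive sentence splitter, and the damping and iteration defaults are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tfidf_vectors(sentences):
    # Treat each sentence as a "document" and build sparse TF-IDF vectors.
    tokenized = [re.findall(r"[a-z0-9']+", s.lower()) for s in sentences]
    n = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        # Smoothed IDF so that terms present in every sentence keep a positive weight.
        vectors.append({
            term: count * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def textrank(sim, damping=0.85, iterations=50):
    # Weighted PageRank by power iteration over the sentence similarity graph.
    # Sentences with no similar neighbours only receive the (1 - damping) / n term.
    n = len(sim)
    scores = [1.0 / n] * n
    row_sums = [sum(row) for row in sim]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                sim[j][i] / row_sums[j] * scores[j]
                for j in range(n) if row_sums[j] > 0
            )
            for i in range(n)
        ]
    return scores


def extractive_summary(text, num_sentences=2):
    sentences = split_sentences(text)
    if len(sentences) <= num_sentences:
        return sentences
    vectors = tfidf_vectors(sentences)
    n = len(sentences)
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    scores = textrank(sim)
    # Keep the top-scoring sentences, restored to their original order.
    top = sorted(sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences])
    return [sentences[i] for i in top]
```

Calling `extractive_summary(text, num_sentences=3)` returns the three highest-ranked sentences in document order. A sentence that shares no vocabulary with the rest of the text has no edges in the graph and ends up with the lowest score.
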
1. Clone the repository.
2. Install the required packages. Different methods have different requirements; see below.

2.1 Keyword Summary:

    pip install keybert

2.2 Extractive Summary: For extractive summary requirements, install the following:

    pip install cython nltk networkx numpy scikit-learn

Compile the Cython code:

    cd extractive_sum/cython
    python3 setup.py build_ext --inplace

To use the compiled Cython code, import the compiled module into your C++ code:

    #include "extractive_sum/cython/summarizer.h"

2.3 Abstractive Summary: Install requirements:

    pip install transformers torch peft datasets pandas
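
Once the abstractive requirements are installed, a quick way to check the setup is to run a T5 summarization pass with `transformers`. The sketch below is an assumption-heavy illustration, not the project's own script: the function names are hypothetical, and it loads the public `t5-small` checkpoint because the location of the fine-tuned model is not given here. The `summarize: ` prefix is the task prefix T5 uses for summarization.

```python
def build_t5_input(article: str) -> str:
    # T5 is a text-to-text model; the task is selected with a text prefix.
    # Whitespace is collapsed so multi-line articles become a single line.
    return "summarize: " + " ".join(article.split())


def summarize(article: str, model_name: str = "t5-small", max_new_tokens: int = 64) -> str:
    # Imported lazily so build_t5_input works without the heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # model_name is an assumption: replace it with the fine-tuned checkpoint if available.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(
        build_t5_input(article),
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` downloads the checkpoint on first use and returns the generated summary as a string.
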