Welcome to the Article Summarizer App! ✨
This project leverages Large Language Models (LLMs) to generate concise summaries of financial and news articles, with tools for evaluation, visualization, and interactive exploration.
Users can fetch articles from a preloaded database or in real time through NewsAPI, then summarize them with Transformer-based models.
- 🔗 Fetch real-world articles using NewsAPI
- 🏗️ Store articles in a database-backed storage
- 🤖 Summarize articles using Transformer-based models (e.g., BART, Pegasus, BigBird)
- 🧪 Evaluate summaries with ROUGE, BLEU, METEOR, BERTScore, Factual Consistency
- 📈 Visualize and compare model performance across datasets
- 📝 Explore and annotate datasets
- 📚 Educational materials on evaluation metrics and hallucinations
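To give a feel for what an overlap metric such as ROUGE measures, here is a minimal sketch of ROUGE-1 F1 in plain Python. It is an illustration only: the function name `rouge1_f1` is hypothetical, tokenization is a simple lowercase whitespace split, and it is not necessarily how the app computes its scores (standard ROUGE implementations typically add stemming and more careful tokenization).

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    if not ref_counts or not cand_counts:
        return 0.0
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


reference = "the central bank raised interest rates by half a point"
candidate = "the central bank raised rates"
score = rouge1_f1(reference, candidate)
print(f"ROUGE-1 F1: {score:.3f}")  # prints "ROUGE-1 F1: 0.667"
```

In this example every candidate word appears in the reference (precision 1.0), but the candidate covers only 5 of the 10 reference words (recall 0.5), which gives an F1 of about 0.667.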
git clone https://github.com/GregB712/LLM_Finance_Summaries.git
cd LLM_Finance_Summaries
✅ Important Setup Steps:
- Create a folder for models:
mkdir models
👉 Place your NLP / LLM models inside the models/ folder.
- Create a .env file in the project root with your NewsAPI key:
NEWS_API_KEY=your_newsapi_key_here
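For reference, the sketch below shows one way the key could be read and used from Python with only the standard library. The helper names (`read_env_file`, `build_newsapi_request`, `fetch_articles`) are hypothetical and are not taken from this repository, which may load the key differently (for example through Docker's `--env-file`). The endpoint and the `X-Api-Key` header follow NewsAPI's public v2 API.

```python
import json
import os
import urllib.parse
import urllib.request
from pathlib import Path


def read_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def build_newsapi_request(query: str, api_key: str, page_size: int = 5) -> urllib.request.Request:
    """Build (but do not send) a request to NewsAPI's /v2/everything endpoint."""
    params = urllib.parse.urlencode({"q": query, "pageSize": page_size, "language": "en"})
    url = f"https://newsapi.org/v2/everything?{params}"
    return urllib.request.Request(url, headers={"X-Api-Key": api_key})


def fetch_articles(query: str, env_path: str = ".env") -> list:
    """Send the request and return the list of article dicts from the response."""
    # Prefer a real environment variable, then fall back to the .env file.
    api_key = os.environ.get("NEWS_API_KEY") or read_env_file(env_path).get("NEWS_API_KEY")
    if not api_key:
        raise RuntimeError("NEWS_API_KEY is not set; add it to your .env file")
    with urllib.request.urlopen(build_newsapi_request(query, api_key), timeout=10) as response:
        return json.load(response).get("articles", [])
```

Calling `fetch_articles("inflation")` from the project root would then return a list of article dictionaries, provided the key is valid and the network is reachable.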
If not using Docker:
pip install -r requirements.txt
⚠️ Activate your virtual environment if you're using one.
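Once the dependencies are installed, a quick sanity check of the models/ folder could look like the sketch below. It assumes that the `transformers` package is available in your environment with its `summarization` pipeline, and that each model lives in its own subfolder containing a `config.json`. The function names are hypothetical and are not part of this repository's `utils` package.

```python
from pathlib import Path


def find_local_models(models_dir: str = "models") -> list:
    """Return names of subfolders that look like saved Hugging Face models."""
    root = Path(models_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if (p / "config.json").is_file())


def summarize(text: str, model_name: str, models_dir: str = "models") -> str:
    """Summarize text with a locally stored model (requires transformers)."""
    # Imported lazily so that listing models works without the heavy dependency.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=str(Path(models_dir) / model_name))
    result = summarizer(text, max_length=120, min_length=30, truncation=True)
    return result[0]["summary_text"]
```

For example, `find_local_models()` lists the folders that can be loaded, and `summarize(article_text, "facebook-bart-large-cnn")` would run the BART model if that folder exists under models/.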
- Build the Docker image:
docker build -t summarizer-app .
- Run the Docker container:
docker run -p 8501:8501 --env-file .env summarizer-app
✅ Access the app at http://localhost:8501
LLM_Finance_Summaries
├── Dockerfile
├── LICENSE
├── README.md
├── REPORTS
│ ├── REPORT.html
│ ├── REPORT.md
│ └── REPORT.pdf
├── app
│ ├── About.py
│ ├── Database.py
│ ├── FinetunedModelsResults.py
│ ├── Home.py
│ ├── NewsAPI.py
│ ├── Setup_and_Installation.py
│ ├── Summarize.py
│ ├── ViewFinetunedMetrics.py
│ ├── ViewSummarizedArticles.py
│ └── __init__.py
├── app.py
├── assets
│ ├── 1_Article_Summarizer_App.png
│ ├── 2_Fetch_Articles_From_NewsAPI.png
│ ├── 3_Article_Summarizer.png
│ ├── 4_View_Summarized_Articles.png
│ ├── 5_Fine-tuned_Model_Metrics.png
│ ├── 6_Finetuned_Models_Results.png
│ ├── 7_Project_Setup_Installation.png
│ ├── 8_About_This_Project.png
│ └── Summaries_Comparison.png
├── data
│ ├── articles.json
│ ├── bart-large-cnn
│ │ ├── summarized_articles_bart-large-cnn.json
│ │ ├── summary_metrics_bart-large-cnn.csv
│ │ ├── test_bart-large-cnn.jsonl
│ │ └── train_bart-large-cnn.jsonl
│ ├── pegasus-cnn_dailymail
│ │ ├── summarized_articles_pegasus-cnn_dailymail.json
│ │ ├── summary_metrics_pegasus-cnn_dailymail.csv
│ │ ├── test_pegasus-cnn_dailymail.jsonl
│ │ └── train_pegasus-cnn_dailymail.jsonl
│ ├── pegasus-multi_news
│ │ ├── summarized_articles_pegasus-multi_news.json
│ │ ├── summary_metrics_pegasus-multi_news.csv
│ │ ├── test_pegasus-multi_news.jsonl
│ │ └── train_pegasus-multi_news.jsonl
│ ├── pegasus-xsum
│ │ ├── summarized_articles_pegasus-xsum.json
│ │ ├── summary_metrics_pegasus-xsum.csv
│ │ ├── test_pegasus-xsum.jsonl
│ │ └── train_pegasus-xsum.jsonl
│ ├── test_summarized_articles.jsonl
│ └── train_summarized_articles.jsonl
├── main.py
├── models <-- You need to create this manually and add your models
│ ├── facebook-bart-large-cnn
│ │ ├── config.json
│ │ ├── generation_config.json
│ │ ├── merges.txt
│ │ ├── model.safetensors
│ │ ├── special_tokens_map.json
│ │ ├── tokenizer.json
│ │ ├── tokenizer_config.json
│ │ └── vocab.json
│ └── google-pegasus-cnn_dailymail
│   ├── config.json
│   ├── generation_config.json
│   ├── model.safetensors
│   ├── special_tokens_map.json
│   ├── spiece.model
│   ├── tokenizer.json
│   └── tokenizer_config.json
├── notebooks
│ ├── LLM_Finance_Summaries_Dataset.ipynb
│ ├── LLM_Finance_Summaries_Metrics.ipynb
│ └── LLM_Finance_Summaries_Train.ipynb
├── requirements.txt
├── utils
│ ├── __init__.py
│ ├── article_store.py
│ ├── newsapi.py
│ ├── preprocessor.py
│ └── summarizer.py
└── .env <-- You need to create this manually with your API key
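The per-model `summary_metrics_*.csv` files under data/ can be compared outside the app as well. The sketch below averages every numeric column per model folder. It makes no assumption about the column names in those files (they are not documented here), and the function names are hypothetical.

```python
import csv
from pathlib import Path


def mean_metrics(csv_path) -> dict:
    """Average every column whose values all parse as floats."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    means = {}
    if not rows:
        return means
    for column in rows[0]:
        try:
            values = [float(row[column]) for row in rows]
        except (TypeError, ValueError):
            continue  # skip non-numeric columns such as ids or titles
        means[column] = sum(values) / len(values)
    return means


def collect_metrics(data_dir: str = "data") -> dict:
    """Map each model folder name to the column means of its metrics CSV."""
    results = {}
    for path in sorted(Path(data_dir).glob("*/summary_metrics_*.csv")):
        results[path.parent.name] = mean_metrics(path)
    return results
```

Running `collect_metrics()` from the project root would return one entry per model folder, such as `bart-large-cnn` or `pegasus-xsum`, each holding the mean of its numeric metric columns.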
- Start the app (either via Docker or streamlit run main.py)
- Go to the NewsAPI page in the app to fetch new articles
- Use the Summarize page to preprocess and summarize articles from the database
- View summarized articles or evaluation metrics for completed experiments
- Explore finetuned model results and annotated datasets
Gregory Barbas
📧 Email: gregorybarbas@gmail.com
💼 LinkedIn
🖥️ GitHub
For questions or contributions, feel free to reach out!
- You must provide a valid NewsAPI key in the .env file to use the live news feature.
- Some models may take longer to load on first use. Models are preloaded for performance.
This project is licensed under the MIT License.