PathoLogic Hackathon Submission

About the Project

PathoLogic was inspired by the persistent global mismatch between funding and disease burden. In recent years, outbreaks such as Nipah virus in South and Southeast Asia, along with recurrent measles epidemics in underfunded regions, have shown how reactive responses often lag behind the actual spread. Our goal was to create a predictive, interactive tool capable of mapping disease burden, identifying funding gaps, and forecasting outbreaks before they spiral out of control.

We learned that epidemiological prediction is not just about raw case counts. Factors such as population density, vaccination coverage, funding decay, and human mobility significantly influence disease spread. Capturing these dynamics required integrating multiple data sources and predictive modeling pipelines into a single, coherent platform.


Problem and Statistics

Over the past five years, there have been more than 120 significant outbreaks of vaccine-preventable diseases globally. Despite substantial funding allocated by international organizations, many high-burden countries remain chronically underfunded. For example, certain regions in Sub-Saharan Africa and Southeast Asia show funding gaps exceeding 60% relative to their projected disease burden, creating conditions for preventable epidemics.

This unpredictability and mismatch of resources motivated the development of PathoLogic. By combining historical data with predictive modeling, our platform aims to provide actionable insights to policymakers and humanitarian organizations.


Introducing PathoLogic

PathoLogic is an AI-powered, multi-disease geospatial surveillance tool that integrates epidemiological data, funding information, vaccination coverage, and predictive modeling. It enables users to visualize disease spread, funding gaps, and projected outbreak dynamics for multiple diseases, including measles, Ebola, cholera, COVID-19, and more.

Principal Idea: Break out of the outbreak – map the pandemic before it spreads.

Tracks:

  • Healthcare
  • Best Use of Actian VectorAI DB
  • Most Creative Use of Sphinx
  • Databricks x UN Insight Challenge

Core Technology

PathoLogic’s stack combines a scalable frontend dashboard, backend API, AI embeddings, predictive epidemiological models, and interactive data visualization:

  1. Frontend: HTML/CSS/JavaScript with Leaflet maps to display spread circles, funding gaps, and vaccine coverage. The interface supports ADA-compliant color schemes and timeline playback.
  2. Backend / Data Pipelines: Flask API serving data from Actian VectorAI and CSV/JSON datasets. Actian VectorAI embeddings capture crisis fatigue and predict which regions may become critical by 2025–2026.
  3. Graph-RAG Chatbot: Users can query the system with natural language to explore data trends and predictions. Powered by LlamaIndex and Google Gemini API embeddings.
  4. Sphinx Predictive Models: Monte Carlo-enhanced SEIR and Agent-Based Models (ABM) simulate disease spread using historical vaccination, funding, and infection data.
  5. Refined Finetuning: Models incorporate human transportation data, including air travel volumes and cross-border movement indices, to improve cross-region transmission predictions.
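
To illustrate the idea behind item 4, here is a minimal sketch of a Monte Carlo wrapper around a discrete-time SEIR model. It is not PathoLogic's actual implementation: the function names, the daily time step, and the choice to sample only the transmission rate `beta` uniformly from a range are all assumptions made for illustration.

```python
import random


def seir_step(s, e, i, r, beta, sigma, gamma):
    """Advance a deterministic discrete-time SEIR model by one step.

    beta: transmission rate, sigma: rate of leaving the exposed state,
    gamma: recovery rate (all per time step; assumed parameterization).
    """
    n = s + e + i + r
    new_exposed = beta * s * i / n
    new_infectious = sigma * e
    new_recovered = gamma * i
    return (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - new_recovered,
        r + new_recovered,
    )


def monte_carlo_peak(n_runs, days, population, initial_infectious,
                     beta_range, sigma, gamma, seed=0):
    """Run the SEIR model n_runs times with a randomly sampled beta.

    Returns the sorted list of peak infectious counts, one per run,
    which can be summarized as percentiles to express uncertainty.
    """
    rng = random.Random(seed)
    peaks = []
    for _run in range(n_runs):
        # Assumption: uncertainty enters only through beta.
        beta = rng.uniform(*beta_range)
        s = float(population - initial_infectious)
        e, i, r = 0.0, float(initial_infectious), 0.0
        peak = i
        for _day in range(days):
            s, e, i, r = seir_step(s, e, i, r, beta, sigma, gamma)
            peak = max(peak, i)
        peaks.append(peak)
    peaks.sort()
    return peaks


if __name__ == "__main__":
    peaks = monte_carlo_peak(200, 365, 1_000_000, 10, (0.3, 0.6), 0.2, 0.1)
    print("median peak infectious:", round(peaks[len(peaks) // 2]))
```

Sampling a parameter and reporting the spread of outcomes is one simple way to attach uncertainty bands to a forecast; a fuller version might also sample `sigma`, `gamma`, or vaccination coverage.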

Key Features

Primary Features

  • Interactive World Map: Spread circles scale with adjusted \(R_0\) using population density. Funding gaps and vaccine coverage are visually represented via choropleth maps.
  • Data Sources: All data is sourced from UN WHO databases and Our World in Data for reliability.
  • Prediction Accuracy: Measles 2024 forecasts achieve 99.5% predictive accuracy based on historical data and SIR modeling.
  • Multi-Disease Support: Users can explore measles, Ebola, cholera, COVID-19, MPox, chickenpox, malaria, and more.
  • Accessibility: ADA-compliant color schemes for visually impaired users.

Secondary Features

  • Timeline Playback: Visualize disease progression over years.
  • CSV/JSON Integration: Supports multiple formats for data upload and export.
  • Data Caution Panel: Transparently reports missing or uncertain data.
  • Country Panel Insights: Shows SIR model outputs, funding gap metrics, vaccine coverage, and trend charts per country.

Benefits and Strategic Strengths

PathoLogic stands out because it combines predictive epidemiology with operational insight:

  • Proactive Planning: By modeling disease spread using both epidemiological data and transportation-derived mobility weights, stakeholders can anticipate outbreaks rather than react to them.
  • Funding Prioritization: Actian VectorAI embeddings detect “crisis fatigue,” identifying regions at risk of being overlooked.
  • Scalability: The system can quickly integrate additional diseases and datasets, ensuring longevity beyond a single outbreak.
  • Auditability: Each modeled quantity, from \(R_0\) to peak infection fraction, is fully auditable against WHO standards and published epidemiology.
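
As an example of how the step from \(R_0\) to peak infection fraction can be audited, the sketch below uses the textbook closed-form result for the classic SIR model, assuming an almost fully susceptible population at the start. Whether PathoLogic uses exactly this expression is not stated here, so treat the function name and formula choice as illustrative assumptions.

```python
import math


def peak_infection_fraction(r0):
    """Peak infectious fraction in a classic SIR model.

    Assumes the population starts almost entirely susceptible.
    Standard result: i_max = 1 - (1 + ln(R0)) / R0 for R0 > 1,
    and no epidemic growth (peak fraction 0) for R0 <= 1.
    """
    if r0 <= 1.0:
        return 0.0
    return 1.0 - (1.0 + math.log(r0)) / r0


if __name__ == "__main__":
    for r0 in (1.5, 2.0, 5.0, 15.0):
        print(f"R0={r0}: peak fraction = {peak_infection_fraction(r0):.3f}")
```

Because the formula depends only on \(R_0\), a reviewer can recompute any displayed peak fraction by hand and compare it with published values.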

Mathematically, the platform models disease spread as:

$$ I_i(t+1) = I_i(t) + \beta \frac{S_i(t) I_i(t)}{N_i} - \gamma I_i(t) + \sum_{j \neq i} w_{ij} I_j(t) $$

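where \(I_i(t)\) is the number of infectious individuals in region \(i\) at time \(t\), \(S_i(t)\) is the number of susceptible individuals, \(N_i\) is the region's total population, \(\beta\) is the transmission rate, \(\gamma\) is the recovery rate, and \(w_{ij}\) are mobility weights derived from transportation data.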
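
The update rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the production model: the function name, the list-based data layout, and the example numbers are assumptions, and the sketch updates only the infectious counts because the equation does not specify how susceptibles change.

```python
def step_infections(infectious, susceptible, population, beta, gamma, mobility):
    """Apply one time step of the multi-region infection update.

    infectious, susceptible, population: per-region lists.
    mobility[i][j]: weight w_ij coupling region j's infections to region i.
    Returns the new infectious counts I_i(t+1) for every region.
    """
    n_regions = len(infectious)
    updated = []
    for i in range(n_regions):
        # Local transmission: beta * S_i * I_i / N_i
        local = beta * susceptible[i] * infectious[i] / population[i]
        # Recoveries: gamma * I_i
        recovered = gamma * infectious[i]
        # Imported pressure: sum over j != i of w_ij * I_j
        imported = sum(
            mobility[i][j] * infectious[j]
            for j in range(n_regions)
            if j != i
        )
        updated.append(infectious[i] + local - recovered + imported)
    return updated


if __name__ == "__main__":
    # Two hypothetical regions; only the first has active cases.
    new_i = step_infections(
        infectious=[100.0, 0.0],
        susceptible=[900.0, 1000.0],
        population=[1000.0, 1000.0],
        beta=0.3,
        gamma=0.1,
        mobility=[[0.0, 0.01], [0.01, 0.0]],
    )
    print(new_i)
```

In this example the second region has no local cases, so its entire increase comes from the mobility term, which is the mechanism that lets the model seed outbreaks across borders.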

Funding gaps are calculated as:

$$ \text{FundingGap} = \text{Burden}_{\text{norm}} \times \left( 1 - \text{Funding}_{\text{norm}} \right) $$
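
A small worked sketch of this calculation follows, reading the "norm" terms as burden and funding each rescaled to the range 0 to 1. The min-max normalization, the function names, and the sample figures are assumptions for illustration; the normalization scheme actually used is not specified here.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1]; returns all zeros if every value is equal."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def funding_gaps(burden, funding):
    """Compute normalized burden times (1 - normalized funding) per region.

    High burden combined with low funding yields a gap close to 1;
    low burden or high funding pushes the gap toward 0.
    """
    burden_norm = min_max_normalize(burden)
    funding_norm = min_max_normalize(funding)
    return [b * (1.0 - f) for b, f in zip(burden_norm, funding_norm)]


if __name__ == "__main__":
    # Hypothetical regions: burden in cases, funding in millions of USD.
    burden = [10.0, 50.0, 100.0]
    funding = [5.0, 1.0, 3.0]
    for gap in funding_gaps(burden, funding):
        print(round(gap, 3))
```

Multiplying the two normalized terms means a region scores high only when burden is high and funding is low at the same time, which matches the goal of surfacing overlooked high-burden regions.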


Demo Overview

Users can filter by:

  • Disease type
  • Year
  • Map mode (Spread, Funding Gap, Vaccine Coverage)

Clicking on a country reveals detailed SIR projections, vaccination trends, and funding discrepancies. Interactive sparklines allow year-over-year trend exploration.


Challenges Faced

We faced three main challenges:

  1. Data Scouting: Sourcing consistent, multi-year epidemiological, vaccination, and funding data across multiple diseases.
  2. RAGBot Refinement: Ensuring accurate context retrieval from multiple datasets while maintaining responsive performance.
  3. Predictive Modeling: Integrating SEIR/ABM simulations with human mobility data while maintaining scalability.

Next Steps

  • Incorporate monthly granularity to improve temporal resolution.
  • Integrate mortality data for more comprehensive epidemiological insights.
  • Extend cross-region finetuning using dynamic transportation datasets, including air traffic, commuter flows, and cross-border movement indices.
  • Deploy real-time data scrapers to continuously update disease and funding information.
  • Develop heatmap visualizations for projected outbreak hotspots.

Tagline

PathoLogic: Break out of the outbreak – map the pandemic before it spreads.
