Inspiration

I’ve always been passionate about global justice, data for good, and understanding how climate change and poor infrastructure collide to create crisis hotspots. With the rise of natural disasters, I wondered: what if we could turn risk scores and historical data into something useful, even predictive?

That curiosity led me to this project: merging the INFORM Risk Index with the EM-DAT disaster database to not only validate and visualize historical patterns, but also to forecast future disasters, analyze policy readiness, and recommend smarter resource allocation.

What it does

Data_Doomdays_Angelina is a full-stack, end-to-end data science project that:

  • Identifies top high-risk countries based on INFORM scores
  • Correlates those scores with real disaster frequency and population impact from EM-DAT
  • Predicts disaster likelihood for 2025 using moving averages and regression
  • Highlights under-supported countries, where aid received is low relative to disaster impact
  • Uses clustering and heatmaps to visualize regional risk patterns
  • Applies feature importance techniques to reveal which risk indicators matter most
  • Offers policy insight on governance, infrastructure, and underreporting
  • Builds creative storytelling tools (like a retro “mission briefing”) to engage users
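The forecasting step in the list above (moving averages and regression) can be illustrated with a small sketch. It is not the project's actual code: the yearly counts are made-up values for one hypothetical country, and the function names `moving_average_forecast` and `linear_trend_forecast` are assumptions chosen for illustration.

```python
import numpy as np

# Illustrative yearly disaster counts for one hypothetical country.
# These are made-up values, not EM-DAT data.
years = np.array([2018, 2019, 2020, 2021, 2022, 2023, 2024], dtype=float)
counts = np.array([3, 4, 4, 6, 5, 7, 8], dtype=float)


def moving_average_forecast(values, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    return float(np.mean(values[-window:]))


def linear_trend_forecast(xs, ys, target_x):
    """Fit a straight line by least squares and evaluate it at target_x.

    The x values are centered before fitting to keep the fit numerically stable.
    """
    center = xs.mean()
    slope, intercept = np.polyfit(xs - center, ys, deg=1)
    return float(slope * (target_x - center) + intercept)


ma_2025 = moving_average_forecast(counts, window=3)
trend_2025 = linear_trend_forecast(years, counts, 2025)

print(f"Moving-average forecast for 2025: {ma_2025:.2f}")
print(f"Linear-trend forecast for 2025:   {trend_2025:.2f}")
```

The two numbers differ because the moving average only looks at recent years, while the regression line extrapolates the overall trend. Comparing them is a quick sanity check before trusting either one.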

How we built it

Solo-built using:

  • Python for data processing, regression modeling, clustering, and feature selection
  • Pandas, NumPy, Scikit-learn, Seaborn, Matplotlib for analysis and visualization
  • Jupyter Notebook for EDA and reporting
  • INFORM Risk Index and EM-DAT datasets for ground truth comparison
  • Markdown + Storytelling for accessible reporting
  • Retro-style visual themes for user engagement (in progress)
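As a rough illustration of the feature selection step with Scikit-learn, the sketch below ranks risk indicators by random forest importance. The write-up does not say which importance technique was used, so the random forest is an assumption, and the data is synthetic. The column names follow the three INFORM dimensions but are hypothetical identifiers.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic data only: 200 hypothetical country-year rows with scores in [0, 10].
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "hazard_exposure": rng.uniform(0, 10, n),
    "vulnerability": rng.uniform(0, 10, n),
    "lack_of_coping_capacity": rng.uniform(0, 10, n),
})

# Synthetic target: driven mostly by hazard_exposure, weakly by vulnerability,
# and not at all by lack_of_coping_capacity (an assumption made for the demo).
y = 3.0 * X["hazard_exposure"] + 0.5 * X["vulnerability"] + rng.normal(0, 1, n)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances sum to 1; higher means the feature was used more
# to reduce prediction error across the trees.
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)
```

Impurity-based importances can be biased toward features with many distinct values, so on real data it would be worth cross-checking with permutation importance.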

Challenges we ran into

  • Merging messy datasets: INFORM and EM-DAT had different country names, time frames, and missing values that required intensive cleaning.
  • Underreporting issues: Some countries had high risk but no disaster records, making it hard to draw accurate correlations.
  • Balancing complexity vs. clarity: Sophisticated models performed better, but were harder to interpret for non-technical users.
  • Being a one-person team: Every step—data cleaning, analysis, design, storytelling—had to be done solo, which was both empowering and overwhelming.
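The first two challenges (mismatched country names and high-risk countries with no disaster records) can be sketched with pandas. This is a minimal example under stated assumptions: the tables, scores, event counts, and the `NAME_FIXES` mapping are all made up for illustration, and a real pipeline would probably join on ISO3 country codes where both sources provide them.

```python
import pandas as pd

# Hypothetical name mapping; the real list of mismatches would be much longer.
NAME_FIXES = {"Viet Nam": "Vietnam", "Türkiye": "Turkey"}

# Made-up rows standing in for the two datasets.
inform = pd.DataFrame({
    "country": ["Viet Nam", "Chad ", "Türkiye"],
    "inform_risk": [3.6, 7.8, 5.0],
})
emdat = pd.DataFrame({
    "country": ["Vietnam", "Turkey", "Turkey"],
    "year": [2020, 2020, 2023],
    "events": [5, 2, 3],
})


def harmonize(names: pd.Series) -> pd.Series:
    """Trim whitespace and map known spelling variants to one canonical name."""
    return names.str.strip().replace(NAME_FIXES)


inform["country"] = harmonize(inform["country"])
emdat["country"] = harmonize(emdat["country"])

# Collapse the event table to one row per country before joining.
events_per_country = emdat.groupby("country", as_index=False)["events"].sum()

# A left join keeps every INFORM country; `indicator=True` adds a `_merge`
# column that shows which rows found no match in the disaster records.
merged = inform.merge(events_per_country, on="country", how="left", indicator=True)
no_records = merged.loc[merged["_merge"] == "left_only", "country"].tolist()

print(merged)
print("Risk score but no disaster records:", no_records)
```

Rows flagged `left_only` are the ones worth inspecting by hand, since a missing match can mean either a naming mismatch that still needs fixing or genuine underreporting.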

Accomplishments that we're proud of

  • Successfully built a pipeline that translates global risk indicators into meaningful forecasts
  • Identified 5 countries with high risk scores but few historical disaster records, raising important questions about underreporting or resilience
  • Built a visually engaging, story-driven interface to make data feel human
  • Created a fully independent end-to-end project from idea to prediction to storytelling

What we learned

  • Not all risk indicators are created equal—some truly drive outcomes, others don’t
  • Combining datasets amplifies insight, but also multiplies complexity
  • Climate-related risks are increasing sharply, especially in regions like Sub-Saharan Africa and South Asia
  • Solo-building is tough, but incredibly rewarding when the vision comes together

What's next for Data_Doomdays_Angelina

  • Finish building the retro game-style visualization tool that explains disaster risk like a mission scenario
  • Deploy the tool publicly so anyone—from policymakers to students—can explore global disaster risks interactively
  • Add a real-time update feature that pulls fresh disaster data every quarter
  • Write a blog post or mini case study to showcase the work to nonprofits or disaster response organizations
  • Explore integrating with platforms like ReliefWeb, UNDRR, or Mapbox for expanded functionality
