Inspiration
I’ve always been passionate about global justice, data for good, and understanding how climate change and poor infrastructure collide to create crisis hotspots. With the rise of natural disasters, I wondered: what if we could turn risk scores and historical data into something useful, even predictive?
That curiosity led me to this project: merging the INFORM Risk Index with the EM-DAT disaster database, not only to validate and visualize historical patterns but also to forecast future disasters, analyze policy readiness, and recommend smarter resource allocation.
What it does
Data_Doomdays_Angelina is a full-stack, end-to-end data science project that:
- Identifies top high-risk countries based on INFORM scores
- Correlates those scores with real disaster frequency and population impact from EM-DAT
- Predicts disaster likelihood for 2025 using moving averages and regression
- Highlights under-supported countries in terms of aid vs. impact
- Uses clustering and heatmaps to visualize regional risk patterns
- Applies feature importance techniques to reveal which risk indicators matter most
- Offers policy insight on governance, infrastructure, and underreporting
- Builds creative storytelling tools (like a retro “mission briefing”) to engage users
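As a rough illustration of the forecasting step (moving averages and regression), here is a minimal sketch. It assumes one yearly disaster count per country. The function name `forecast_next_year`, the 3-year window, and the sample counts are hypothetical and are not taken from the project notebook or from EM-DAT.

```python
import numpy as np

def forecast_next_year(years, counts, window=3):
    """Return (moving_average, linear_trend) forecasts for the year after the last one."""
    years = np.asarray(years, dtype=float)
    counts = np.asarray(counts, dtype=float)

    # Moving average: mean of the most recent `window` yearly counts.
    moving_avg = counts[-window:].mean()

    # Linear trend: least-squares line through (year offset, count),
    # extrapolated one step past the last observed year.
    x = years - years[0]  # offset the years to keep the fit well conditioned
    slope, intercept = np.polyfit(x, counts, deg=1)
    trend = slope * (x[-1] + 1) + intercept

    # A count cannot be negative, so clip the trend forecast at zero.
    return float(moving_avg), float(max(trend, 0.0))

# Illustrative yearly counts for a single country (made-up values).
years = [2019, 2020, 2021, 2022, 2023, 2024]
counts = [2, 3, 4, 5, 6, 7]
ma_forecast, trend_forecast = forecast_next_year(years, counts)
print(f"2025 moving-average forecast: {ma_forecast:.1f}")
print(f"2025 linear-trend forecast: {trend_forecast:.1f}")
```

The two numbers answer different questions: the moving average smooths recent noise, while the trend line extrapolates a direction. Reporting both side by side is one simple way to show how sensitive a forecast is to the method.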
How we built it
Solo-built using:
- Python for data processing, regression modeling, clustering, and feature selection
- Pandas, NumPy, Scikit-learn, Seaborn, Matplotlib for analysis and visualization
- Jupyter Notebook for EDA and reporting
- INFORM Risk Index and EM-DAT datasets for ground truth comparison
- Markdown + Storytelling for accessible reporting
- Retro-style visual themes for user engagement (in progress)
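To make the clustering and feature-importance steps concrete, the sketch below uses scikit-learn on synthetic data. Everything in it is an assumption for illustration: the column names mirror the three INFORM dimensions, the values are randomly generated, and the choice of `KMeans` and `RandomForestRegressor` is one plausible option, not necessarily what the project used.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical indicator columns, one row per country (synthetic values).
feature_names = ["hazard_exposure", "vulnerability", "lack_of_coping_capacity"]
X = rng.uniform(0, 10, size=(90, 3))

# Synthetic target (e.g. disaster impact), driven mostly by the first indicator.
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, size=90)

# Clustering: standardize first so no single indicator dominates the distances.
X_scaled = StandardScaler().fit_transform(X)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)

# Feature importance: impurity-based importances from a random forest.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
importances = dict(zip(feature_names, forest.feature_importances_))
top_indicator = max(importances, key=importances.get)

print("Cluster sizes:", np.bincount(labels))
print("Most influential indicator:", top_indicator)
```

Impurity-based importances are quick to compute but can be biased toward high-cardinality features, so a permutation-based check is a reasonable follow-up before drawing policy conclusions.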
Challenges we ran into
- Merging messy datasets: INFORM and EM-DAT had different country names, time frames, and missing values that required intensive cleaning.
- Underreporting issues: Some countries had high risk but no disaster records, making it hard to draw accurate correlations.
- Balancing complexity vs. clarity: Sophisticated models performed better, but were harder to interpret for non-technical users.
- Being a one-person team: Every step—data cleaning, analysis, design, storytelling—had to be done solo, which was both empowering and overwhelming.
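The dataset-merging and underreporting challenges above can be sketched with pandas. This is a minimal example under stated assumptions: the alias table `NAME_FIXES`, the column names, the 6.0 risk threshold, and all sample rows are hypothetical and do not reflect the real INFORM or EM-DAT schemas or values.

```python
import pandas as pd

# Hypothetical alias table mapping one source's spellings to a common name.
NAME_FIXES = {"Viet Nam": "Vietnam", "Russian Federation": "Russia"}

def harmonize(df, column="country"):
    """Return a copy with whitespace trimmed and known aliases unified."""
    out = df.copy()
    out[column] = out[column].str.strip().replace(NAME_FIXES)
    return out

# Made-up sample rows standing in for the two datasets.
inform = pd.DataFrame({
    "country": ["Viet Nam", "Chad", "Russian Federation "],
    "risk_score": [3.6, 7.9, 5.0],
})
emdat = pd.DataFrame({
    "country": ["Vietnam", "Russia", "Russia"],
    "year": [2020, 2021, 2023],
})

# Count recorded events per country, then left-join onto the risk table
# so countries with no disaster records are kept rather than dropped.
event_counts = (
    harmonize(emdat).groupby("country").size().rename("n_events").reset_index()
)
merged = harmonize(inform).merge(event_counts, on="country", how="left")
merged["n_events"] = merged["n_events"].fillna(0).astype(int)

# Flag high-risk countries with no recorded events (possible underreporting).
flagged = merged[(merged["risk_score"] >= 6.0) & (merged["n_events"] == 0)]
print(flagged[["country", "risk_score", "n_events"]])
```

A left join is the important design choice here: an inner join would silently discard exactly the high-risk, no-record countries that raise the underreporting question.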
Accomplishments that we're proud of
- Successfully built a pipeline that translates global risk indicators into meaningful forecasts
- Identified 5 countries with high risk but low historical data, raising important questions about underreporting or resilience
- Built a visually engaging, story-driven interface to make data feel human
- Created a fully independent end-to-end project from idea to prediction to storytelling
What we learned
- Not all risk indicators are created equal—some truly drive outcomes, others don’t
- Combining datasets amplifies insight, but also multiplies complexity
- Climate-related risks are increasing sharply, especially in regions like Sub-Saharan Africa and South Asia
- Solo-building is tough, but incredibly rewarding when the vision comes together
What's next for Data_Doomdays_Angelina
- Finish building the retro game-style visualization tool that explains disaster risk like a mission scenario
- Deploy the tool publicly so anyone—from policymakers to students—can explore global disaster risks interactively
- Add a real-time update feature that pulls fresh disaster data every quarter
- Write a blog post or mini case study to showcase the work to nonprofits or disaster response organizations
- Explore integrating with platforms like ReliefWeb, UNDRR, or Mapbox for expanded functionality
Built With
- google-colab
- markdown
- matplotlib
- numpy
- open-source-datasets
- pandas
- python
- scikit-learn
- seaborn