Inspiration
Power outages and natural disasters generate an overwhelming stream of text: incident notes, status updates, weather bulletins, and operator reports. We were inspired by a simple question:
Can we turn that noisy text into a reliable, real-time signal that helps people make better decisions under pressure?
We wanted to build something practical for grid operations, where minutes matter and clear prioritization is critical.
What it does
Aegis is an applied NLP decision-support pipeline for outage and disaster response.
- Ingests live outage telemetry and disaster/event text feeds
- Parses incident text into structured fields (event_type, cause, severity, region, confidence)
- Converts those signals into a normalized risk score
- Applies explicit decision logic to produce alert actions
- Generates AI response suggestions for operators (assessment + actionable steps)
In short: text in -> risk signal -> decision out.
How we built it
We built Aegis as a full-stack pipeline.
Backend
- FastAPI with modular services for parsing, scoring, alerts, replay consumption, and AI suggestions
NLP Parser
Rule-based extraction of event type, cause, severity, and region, with confidence scoring.
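As a rough illustration of rule-based extraction with negation handling, here is a minimal sketch. Every pattern, label, the three-word negation window, and the confidence heuristic (share of fields found) are assumptions made for the example, not the rules Aegis actually ships.

```python
import re

# All patterns and labels below are illustrative assumptions.
EVENT_PATTERNS = {
    "outage": r"\b(?:outage|blackout|power loss)\b",
    "storm": r"\b(?:storm|hurricane|tornado)\b",
    "wildfire": r"\b(?:wildfire|brush fire)\b",
}
CAUSE_PATTERNS = {
    "equipment_failure": r"\b(?:failure|fault|tripped)\b",
    "weather": r"\b(?:lightning|wind|ice|flooding)\b",
}
SEVERITY_PATTERNS = {
    "high": r"\b(?:major|severe|critical|widespread)\b",
    "low": r"\b(?:minor|isolated|brief)\b",
}
REGION_PATTERN = re.compile(r"\bin ([A-Z][a-z]+(?: [A-Z][a-z]+)*)")
NEGATIONS = {"no", "not", "without", "never"}


def first_affirmed(text, patterns, window=3):
    """Return the first label whose match is not preceded by a negation word."""
    lowered = text.lower()
    for label, pattern in patterns.items():
        for match in re.finditer(pattern, lowered):
            # Look at the few words right before the match for a negation.
            preceding = re.findall(r"[a-z']+", lowered[: match.start()])[-window:]
            if not NEGATIONS.intersection(preceding):
                return label
    return None


def parse_incident(text):
    """Extract structured fields from one incident note."""
    region = REGION_PATTERN.search(text)
    fields = {
        "event_type": first_affirmed(text, EVENT_PATTERNS),
        "cause": first_affirmed(text, CAUSE_PATTERNS),
        "severity": first_affirmed(text, SEVERITY_PATTERNS),
        "region": region.group(1) if region else None,
    }
    # Assumed confidence heuristic: share of fields that were extracted.
    found = sum(value is not None for value in fields.values())
    fields["confidence"] = found / len(fields)
    return fields


print(parse_incident("Major outage in North County, no equipment failure reported."))
```

In this sketch the phrase "no equipment failure" leaves `cause` empty instead of reporting a failure, which lowers the confidence value for the note.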
Risk Engine
A weighted scoring model combining:
- severity
- event type
- cause
- confidence
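One way such a weighted model could look is sketched below. The lookup tables, the weights, the neutral default for unknown labels, and the choice to treat confidence as a fourth additive term are all assumptions for illustration; the write-up does not give the actual values.

```python
# Hypothetical per-label scores in [0, 1].
SEVERITY = {"low": 0.2, "medium": 0.5, "high": 0.8, "critical": 1.0}
EVENT_TYPE = {"maintenance": 0.2, "outage": 0.7, "storm": 0.8, "wildfire": 1.0}
CAUSE = {"planned": 0.1, "unknown": 0.5, "equipment_failure": 0.7, "weather": 0.8}

# Hypothetical weights; they sum to 1 so the result stays within 0..100.
WEIGHTS = {"severity": 0.4, "event_type": 0.25, "cause": 0.15, "confidence": 0.2}


def risk_score(event_type, cause, severity, confidence):
    """Combine the four parsed signals into a 0..100 score."""
    confidence = max(0.0, min(1.0, confidence))
    combined = (
        WEIGHTS["severity"] * SEVERITY.get(severity, 0.5)
        + WEIGHTS["event_type"] * EVENT_TYPE.get(event_type, 0.5)
        + WEIGHTS["cause"] * CAUSE.get(cause, 0.5)
        + WEIGHTS["confidence"] * confidence
    )
    return round(100 * combined, 1)


print(risk_score("wildfire", "weather", "critical", 1.0))
print(risk_score("maintenance", "planned", "low", 0.5))
```

Keeping the weights in one dictionary makes each factor's contribution easy to show next to the score, which fits the dashboard's explanation factors.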
Decision Layer
Threshold-based alert policy producing:
- send_alert
- priority
- channel
- audience
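A threshold policy that fills the four decision fields (send_alert, priority, channel, audience) could be sketched as follows. The thresholds and the field values are hypothetical; only the field names come from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertDecision:
    send_alert: bool
    priority: str
    channel: str
    audience: str


# Hypothetical thresholds, checked from highest to lowest.
POLICY = [
    (80, AlertDecision(True, "critical", "sms", "operators_and_field_crews")),
    (50, AlertDecision(True, "high", "dashboard", "operators")),
]
NO_ALERT = AlertDecision(False, "none", "none", "none")


def decide(score):
    """Return the decision for the first threshold the score reaches."""
    for threshold, decision in POLICY:
        if score >= threshold:
            return decision
    return NO_ALERT


print(decide(85))
print(decide(30))
```

Because the policy is a plain ordered table, every alert can be traced back to the exact threshold that triggered it.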
AI Suggestions
Gemini-powered tactical recommendations with:
- schema-constrained outputs
- deterministic fallback behavior
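The combination of schema checking and a deterministic fallback might be structured like the sketch below. The `generate` callable is a hypothetical stand-in for the Gemini call, and the `assessment`/`actions` schema and the canned fallback actions are assumptions for the example.

```python
import json
from typing import Callable, Optional

# Hypothetical canned actions used when the model output is unusable.
FALLBACK_ACTIONS = {
    "high": ["Notify the on-call supervisor", "Dispatch a field crew to assess damage"],
    "low": ["Continue monitoring the incident feed"],
}


def fallback_suggestion(incident: dict) -> dict:
    """Deterministic, rule-based suggestion keyed on parsed severity."""
    severity = incident.get("severity")
    actions = FALLBACK_ACTIONS.get(severity, FALLBACK_ACTIONS["low"])
    return {
        "assessment": f"Rule-based assessment for severity '{severity}'.",
        "actions": list(actions),
        "source": "fallback",
    }


def validate_suggestion(raw: str) -> Optional[dict]:
    """Accept only a JSON object with a string assessment and a list of string actions."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        return None
    if not isinstance(data, dict):
        return None
    assessment = data.get("assessment")
    actions = data.get("actions")
    if not isinstance(assessment, str) or not isinstance(actions, list):
        return None
    if not actions or not all(isinstance(a, str) for a in actions):
        return None
    return {"assessment": assessment, "actions": actions, "source": "model"}


def suggest(incident: dict, generate: Callable[[dict], str]) -> dict:
    """Use the model output when it validates; otherwise fall back."""
    try:
        raw = generate(incident)
    except Exception:
        return fallback_suggestion(incident)
    return validate_suggestion(raw) or fallback_suggestion(incident)


print(suggest({"severity": "high"}, lambda incident: "not valid json"))
```

With this shape, the operator always receives a well-formed suggestion, and the `source` field records whether it came from the model or from the rules.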
Replay Support
Timestamp-aware feed consumption with cursor reset to simulate information arriving over time.
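A cursor-based replay of this kind can be sketched in a few lines. The `ReplayFeed` class, its method names, and the event dictionary shape are hypothetical; the sketch only shows the idea of releasing events up to a simulated clock and rewinding on reset.

```python
from datetime import datetime


class ReplayFeed:
    """Releases recorded events in timestamp order as simulated time advances."""

    def __init__(self, events):
        self._events = sorted(events, key=lambda event: event["timestamp"])
        self._cursor = 0

    def consume_until(self, now):
        """Return events not yet consumed whose timestamp is at or before `now`."""
        start = self._cursor
        while (
            self._cursor < len(self._events)
            and self._events[self._cursor]["timestamp"] <= now
        ):
            self._cursor += 1
        return self._events[start : self._cursor]

    def reset(self):
        """Rewind the cursor so the same feed can be replayed from the start."""
        self._cursor = 0


feed = ReplayFeed([
    {"timestamp": datetime(2024, 1, 1, 12, 5), "text": "Crews dispatched"},
    {"timestamp": datetime(2024, 1, 1, 12, 0), "text": "Outage reported"},
])
print([e["text"] for e in feed.consume_until(datetime(2024, 1, 1, 12, 1))])
print([e["text"] for e in feed.consume_until(datetime(2024, 1, 1, 12, 10))])
```

Because the cursor never hands out an event stamped later than the simulated clock, downstream scoring only sees information that would have been available at that moment.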
Frontend
React dashboard with:
- risk score cards
- explanation factors
- region risk
- incident/disaster feeds
- replay controls
- inline AI recommendations
Risk Score Dynamics
To avoid pathological saturation, we map outage size to a score with a logarithmic transform:
$$ S(m) = \min\left(100,\ \mathrm{round}\left(12 \cdot \log_{10}(1 + m)\right)\right) $$
where m is the aggregated affected-customer count.
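The transform is straightforward to express in code. The function name below is made up for the example, and Python's built-in `round` is assumed to be an acceptable reading of the rounding in the formula.

```python
import math


def outage_score(affected_customers):
    """S(m) = min(100, round(12 * log10(1 + m)))."""
    if affected_customers < 0:
        raise ValueError("affected_customers must be non-negative")
    return min(100, round(12 * math.log10(1 + affected_customers)))


for m in (0, 9, 999, 99_999, 9_999_999):
    print(m, outage_score(m))
# 0 0
# 9 12
# 999 36
# 99999 60
# 9999999 84
```

Each tenfold increase in affected customers adds about 12 points, so the cap of 100 is only reached above roughly 200 million customers.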
Temporal Score Updates
To smooth volatility and keep a single spike from pinning the risk score, we blend each newly computed score with the previous one using exponential smoothing:
$$ S_t = (1 - \alpha) S_{t-1} + \alpha S_{\text{new}} $$
where:
- \(S_t\) = current risk score
- \(S_{t-1}\) = previous score
- \(S_{\text{new}}\) = newly computed score
- \(\alpha\) = smoothing factor between 0 and 1 controlling responsiveness (larger values react faster to new scores)
This produces a stable but responsive risk signal.
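A short sketch of the update rule shows how a one-off spike decays. The value alpha = 0.5 and the sample scores are assumptions chosen for easy arithmetic; the project's actual smoothing factor is not stated.

```python
def smooth(previous, new, alpha):
    """S_t = (1 - alpha) * S_{t-1} + alpha * S_new."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return (1 - alpha) * previous + alpha * new


score = 20.0
history = []
for s_new in (90, 20, 20, 20):  # one spike, then back to baseline
    score = smooth(score, s_new, alpha=0.5)
    history.append(score)
print(history)  # [55.0, 37.5, 28.75, 24.375]
```

The spike to 90 lifts the smoothed score only to 55, and the score then drifts back toward the baseline of 20 instead of staying pinned.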
Challenges we ran into
- Calibrating score behavior so it is responsive but not noisy
- Preventing score saturation at 100 for large outage bursts
- Handling negation in language (e.g., “no failure”)
- Resolving async UI state races between live feed updates and NLP updates
- Making recommendations structured, robust, and explainable
Accomplishments that we're proud of
- Delivered a complete end-to-end decision pipeline, not just a classifier
- Built explicit, auditable alert logic with clear operator-facing outputs
- Added replay-driven evaluation workflow for realistic time-order testing
- Integrated AI recommendations while preserving deterministic fallback behavior
- Designed a clear dashboard that ties evidence to decisions
What we learned
- The hard part of applied NLP is decision reliability, not extraction alone
- Deterministic fallbacks and transparent rules are crucial for operator trust
- Time-ordered evaluation prevents leakage and improves realism
- UX clarity (signal + rationale + action) is as important as model quality
What's next for Aegis
- Baseline-vs-model evaluation metrics: precision, recall, false alarms, decision utility
- Confidence calibration and uncertainty reporting
- Drift detection as incident language evolves
- Policy simulation for threshold tuning under different risk appetites
- Production-grade controls: audit trails, security hardening, governance
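For the planned baseline-vs-model evaluation, the alert-level metrics could be computed along these lines. This is a sketch of the standard definitions; the function name and sample labels are invented, and decision utility is left out because the write-up does not define it.

```python
def alert_metrics(predicted, actual):
    """Precision, recall, and false-alarm count for boolean alert decisions."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return {"precision": precision, "recall": recall, "false_alarms": false_pos}


# Invented labels: whether an alert was sent vs. whether one was warranted.
sent = [True, True, False, False, True]
warranted = [True, False, True, False, True]
print(alert_metrics(sent, warranted))
```

Running the same function on a baseline policy and on the full pipeline over one replayed feed would give a like-for-like comparison.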