Inspiration

Power outages and natural disasters generate an overwhelming stream of text: incident notes, status updates, weather bulletins, and operator reports. We were inspired by a simple question:

Can we turn that noisy text into a reliable, real-time signal that helps people make better decisions under pressure?

We wanted to build something practical for grid operations, where minutes matter and clear prioritization is critical.


What it does

Aegis is an applied NLP decision-support pipeline for outage and disaster response.

  • Ingests live outage telemetry and disaster/event text feeds
  • Parses incident text into structured fields (event_type, cause, severity, region, confidence)
  • Converts those signals into a normalized risk score
  • Applies explicit decision logic to produce alert actions
  • Generates AI response suggestions for operators (assessment + actionable steps)

In short: text in -> risk signal -> decision out.

How we built it

We built Aegis as a full-stack pipeline.

Backend

  • FastAPI with modular services for:
    • parsing
    • scoring
    • alerts
    • replay consumption
    • AI suggestions

NLP Parser

Rule-based extraction, with confidence scoring, of:

  • event
  • cause
  • severity
  • region
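
A minimal sketch of what rule-based extraction with a confidence score could look like. The keyword tables, the negation pattern, the region pattern, and the "fraction of fields found" confidence are all illustrative assumptions, not the actual Aegis rule set.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword tables; the real rules are not shown in this write-up.
EVENT_KEYWORDS = {"outage": "outage", "blackout": "outage", "flood": "flood"}
CAUSE_KEYWORDS = {"storm": "weather", "wind": "weather", "failure": "equipment"}
SEVERITY_KEYWORDS = {"minor": "low", "major": "high", "critical": "high"}

# Assumed negation rule: a keyword directly after "no", "not" or "without" is ignored.
NEGATION = re.compile(r"\b(no|not|without)\s+(\w+)")
# Assumed region rule: capitalized words following "in".
REGION = re.compile(r"\bin\s+([A-Z][a-z]+(?:\s[A-Z][a-z]+)*)")


@dataclass
class ParsedIncident:
    event_type: str
    cause: str
    severity: str
    region: str
    confidence: float


def _lookup(tokens, table, negated):
    """Return the label of the first keyword that is present and not negated."""
    for tok in tokens:
        if tok in table and tok not in negated:
            return table[tok]
    return "unknown"


def parse_incident(text: str) -> ParsedIncident:
    lowered = text.lower()
    negated = {m.group(2) for m in NEGATION.finditer(lowered)}
    tokens = re.findall(r"[a-z]+", lowered)

    event = _lookup(tokens, EVENT_KEYWORDS, negated)
    cause = _lookup(tokens, CAUSE_KEYWORDS, negated)
    severity = _lookup(tokens, SEVERITY_KEYWORDS, negated)
    match = REGION.search(text)
    region = match.group(1) if match else "unknown"

    # Assumed confidence: share of the four fields that were actually found.
    fields = [event, cause, severity, region]
    confidence = sum(f != "unknown" for f in fields) / len(fields)
    return ParsedIncident(event, cause, severity, region, confidence)
```

With these assumed rules, "no failure" does not set the cause to equipment, so a report that only rules something out yields a low-confidence parse instead of a false signal.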

Risk Engine

A weighted scoring model combining:

  • severity
  • event type
  • cause
  • confidence
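
As a rough illustration, a weighted model over these four inputs might be sketched as below. The point tables, the weights, and the choice to apply confidence as a multiplier are assumptions made for the example; the write-up does not specify how Aegis combines them.

```python
# Hypothetical per-field points on a 0-100 scale.
SEVERITY_POINTS = {"low": 20, "medium": 50, "high": 90}
EVENT_POINTS = {"outage": 80, "flood": 70, "wildfire": 90}
CAUSE_POINTS = {"weather": 70, "equipment": 50}

# Hypothetical weights in percent; they sum to 100.
WEIGHTS = {"severity": 50, "event_type": 30, "cause": 20}


def text_risk_score(severity: str, event_type: str, cause: str,
                    confidence: float) -> int:
    """Weighted sum of field points, scaled by parser confidence (assumed)."""
    base = (
        WEIGHTS["severity"] * SEVERITY_POINTS.get(severity, 0)
        + WEIGHTS["event_type"] * EVENT_POINTS.get(event_type, 0)
        + WEIGHTS["cause"] * CAUSE_POINTS.get(cause, 0)
    ) / 100
    # Unknown fields contribute 0; low confidence shrinks the whole score.
    return round(max(0.0, min(100.0, base * confidence)))
```

Because every term is a named weight times a named lookup, each contribution can be shown to the operator as an explanation factor.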

Decision Layer

Threshold-based alert policy producing:

  • send_alert
  • priority
  • channel
  • audience
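
A threshold policy producing these four outputs could look roughly like this. The cut-offs (80 and 50) and the priority, channel, and audience labels are hypothetical placeholders, not the values Aegis uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertDecision:
    send_alert: bool
    priority: str
    channel: str
    audience: str


def decide(score: float) -> AlertDecision:
    """Map a 0-100 risk score to an alert action (thresholds are assumed)."""
    if score >= 80:
        return AlertDecision(True, "critical", "sms", "on_call_operators")
    if score >= 50:
        return AlertDecision(True, "high", "email", "regional_team")
    # Below the lowest threshold nothing is sent; the score stays on the dashboard.
    return AlertDecision(False, "low", "dashboard", "none")
```

Keeping the policy as a few explicit comparisons is what makes it auditable: any alert can be traced back to one score and one threshold.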

AI Suggestions

Gemini-powered tactical recommendations with:

  • schema-constrained outputs
  • deterministic fallback behavior
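
One way to combine schema checks with a deterministic fallback is sketched below. The `generate` callable is a hypothetical stand-in for the model call (no Gemini API is shown), and the two-field schema and fallback texts are assumptions for illustration.

```python
import json
from typing import Callable

# Hypothetical canned responses, keyed by alert priority.
FALLBACKS = {
    "critical": {
        "assessment": "High-risk event; model output unavailable.",
        "actions": ["Notify on-call operators", "Verify affected feeders"],
    },
}
DEFAULT_FALLBACK = {
    "assessment": "Model output unavailable.",
    "actions": ["Review incident manually"],
}


def is_valid(payload: object) -> bool:
    """Assumed schema: a string 'assessment' and a non-empty list of string 'actions'."""
    if not isinstance(payload, dict):
        return False
    if not isinstance(payload.get("assessment"), str):
        return False
    actions = payload.get("actions")
    return (
        isinstance(actions, list)
        and len(actions) > 0
        and all(isinstance(a, str) for a in actions)
    )


def suggest(prompt: str, priority: str, generate: Callable[[str], str]) -> dict:
    """Return validated model output, or the same canned answer every time it fails."""
    fallback = FALLBACKS.get(priority, DEFAULT_FALLBACK)
    try:
        payload = json.loads(generate(prompt))
    except Exception:
        # Covers timeouts, transport errors, and malformed JSON alike.
        return fallback
    return payload if is_valid(payload) else fallback
```

The operator always receives a response of the same shape, whether it came from the model or from the fallback table.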

Replay Support

Timestamp-aware feed consumption with cursor reset to simulate information arriving over time.
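
The cursor idea can be sketched as follows. The `ReplayFeed` class and its method names are hypothetical; the sketch only assumes that each event carries a timestamp and that a replay releases events in time order.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ReplayFeed:
    events: list  # list of (timestamp, text) tuples
    cursor: int = 0

    def __post_init__(self):
        # Sort once so the cursor always moves forward in time.
        self.events = sorted(self.events, key=lambda e: e[0])

    def consume_until(self, now: datetime) -> list:
        """Return events with timestamp <= now that have not been consumed yet."""
        start = self.cursor
        while self.cursor < len(self.events) and self.events[self.cursor][0] <= now:
            self.cursor += 1
        return self.events[start:self.cursor]

    def reset(self) -> None:
        """Rewind so the same scenario can be replayed from the beginning."""
        self.cursor = 0
```

Calling `consume_until` with an advancing clock means no event is seen before its timestamp, and `reset` restarts the scenario.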

Frontend

React dashboard with:

  • risk score cards
  • explanation factors
  • region risk
  • incident/disaster feeds
  • replay controls
  • inline AI recommendations

Risk Score Dynamics

To avoid pathological saturation, our outage-to-score transform grows logarithmically with outage size:

$$ S(m) = \min\left(100,\ \mathrm{round}\left(12 \cdot \log_{10}(1 + m)\right)\right) $$

where m is the aggregated affected-customer count.
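
A direct transcription of the formula, included to make its behavior concrete. The function name `outage_score` is ours, and Python's built-in `round` is assumed as the rounding rule. For large m, each ten-fold increase in affected customers adds about 12 points, so the cap of 100 is only reached above roughly 200 million customers.

```python
import math


def outage_score(m: int) -> int:
    """S(m) = min(100, round(12 * log10(1 + m))) for an affected-customer count m."""
    if m < 0:
        raise ValueError("affected-customer count must be non-negative")
    return min(100, round(12 * math.log10(1 + m)))


# Worked values: 1 + m is a power of ten in each case.
print(outage_score(0))       # 12 * log10(1)    = 0
print(outage_score(999))     # 12 * log10(10^3) = 36
print(outage_score(99_999))  # 12 * log10(10^5) = 60
```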


Temporal Score Updates

To smooth volatility and prevent single spikes from permanently pinning risk, we update the score with exponential smoothing:

$$ S_t = (1 - \alpha) S_{t-1} + \alpha S_{\text{new}} $$

where:

  • \(S_t\) = current risk score
  • \(S_{t-1}\) = previous score
  • \(S_{\text{new}}\) = newly computed score
  • \(\alpha\) = smoothing factor controlling responsiveness

This produces a stable but responsive risk signal.
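
The update rule is a one-line function. The default `alpha=0.3` and the restriction to 0 < alpha <= 1 are assumptions for this sketch; the write-up does not state the value used.

```python
def smooth(previous: float, new: float, alpha: float = 0.3) -> float:
    """S_t = (1 - alpha) * S_{t-1} + alpha * S_new."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    return (1 - alpha) * previous + alpha * new


# A single spike to 100 followed by quiet readings, with alpha = 0.5:
score = 0.0
history = []
for reading in [100, 0, 0, 0]:
    score = smooth(score, reading, alpha=0.5)
    history.append(score)
print(history)  # [50.0, 25.0, 12.5, 6.25]
```

The spike is halved on entry and then decays at every step, so one extreme reading neither dominates the signal nor lingers indefinitely.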


Challenges we ran into

  • Calibrating score behavior so it is responsive but not noisy
  • Preventing score saturation at 100 for large outage bursts
  • Handling negation in language (e.g., “no failure”)
  • Resolving async UI state races between live feed updates and NLP updates
  • Making recommendations structured, robust, and explainable

Accomplishments that we're proud of

  • Delivered a complete end-to-end decision pipeline, not just a classifier
  • Built explicit, auditable alert logic with clear operator-facing outputs
  • Added replay-driven evaluation workflow for realistic time-order testing
  • Integrated AI recommendations while preserving deterministic fallback behavior
  • Designed a clear dashboard that ties evidence to decisions

What we learned

  • The hard part of applied NLP is decision reliability, not extraction alone
  • Deterministic fallbacks and transparent rules are crucial for operator trust
  • Time-ordered evaluation prevents future information from leaking into earlier decisions and makes tests more realistic
  • UX clarity (signal + rationale + action) is as important as model quality

What's next for Aegis

  • Baseline-vs-model evaluation metrics:
    • precision
    • recall
    • false alarms
    • decision utility
  • Confidence calibration and uncertainty reporting
  • Drift detection as incident language evolves
  • Policy simulation for threshold tuning under different risk appetites
  • Production-grade controls:
    • audit trails
    • security hardening
    • governance
