Pluvial - Overall Winner 🏆

Automated Water-Reuse Prospecting Engine - HackSMU 2026

Built for Grundfos. Identifies commercial and industrial buildings across the continental US that are strong candidates for rainwater harvesting systems, ranked by opportunity score and visualized on an interactive national map.

The Problem

Finding the right buildings for water-reuse systems is a manual, expensive prospecting problem. Sales teams rely on gut instinct and cold outreach across millions of potential sites. RainUSE Nexus automates that search — scoring every large commercial rooftop in 37 states against physical, financial, regulatory, and ESG criteria, then surfacing the best opportunities in a sales-ready interface.

What It Does

Discovers large buildings — roof catchment area 100,000–1,000,000 sqft from Microsoft's 129M-building open dataset
Filters false positives — 5-stage geometry filter removes farms, parking canopies, lakes, and irregular shapes
Flags cooling towers — cross-referenced against EPA FRS facility registry by NAICS industry code
Attaches rainfall data — NOAA 30-year county precipitation normals per building
Enriches addresses — street address and building type via Overture Maps
Scores every building — multi-dimensional viability and urgency scoring
Surfaces opportunities — interactive national choropleth + building-level drill-down with AI sales briefs
Calculates financial ROI — three-scenario engine (conservative / base / upside) with CV-confidence-adjusted returns
Delivers pitch-ready briefs — Gemini RAG brief rendered as a structured 7-section investment deliverable, emailable as a PDF

Output: 11,577 pre-scored candidate buildings across 37 states, ready for sales prospecting.

Architecture

Pipeline (Python / GeoPandas)

Microsoft Building Footprints (GeoJSON per state)
         ↓
s01  Download + geometry filter    → large_buildings_{STATE}.gpkg
s01b Re-filter in place (QA pass)  → large_buildings_{STATE}.gpkg (overwrite)
s02  EPA FRS facility download     → industrial_facilities_{STATE}.csv
s03  Spatial join                  → buildings_enriched_{STATE}.gpkg  (cooling tower flag)
s04  Rainfall join                 → buildings_rainfall_{STATE}.gpkg  (annual_rainfall_in)
s05  Overture address enrichment   → buildings_addressed_{STATE}.gpkg (address, city)
s06  Scoring                       → output/buildings.json
                                   → output/state_scores.json
s07  CV layer (optional)           → CLIP confidence scores per building

Backend (FastAPI / Python)

Loads buildings.json once at startup into memory. All endpoints are sub-millisecond reads except /brief and /email-brief.

Endpoint	Description
`GET /buildings`	Full building list with satellite imagery URLs injected per building
`GET /buildings/{id}`	Single building lookup by ID
`POST /roi`	Three-scenario ROI engine (conservative / base / upside) — harvestable gallons, CAPEX, NPV, payback, CV-confidence-adjusted ROI
`POST /brief`	Gemini RAG brief — structured 7-section investment deliverable, template fallback + disk cache
`POST /email-brief`	Renders brief as PDF via WeasyPrint, delivers to recipient via Resend

Frontend (Next.js / MapLibre GL)

National view — state choropleth colored by opportunity tier (Prime → Low)
State view — zooms to state, renders individual building markers
Building view — satellite imagery, roof polygon overlay, cooling tower markers
Right panel — building profile, ROI preview, AI brief generation

Scoring

Building Viability Score (0–100)

Dimension	Weight	What it measures
Physical	25%	Roof area tier + cooling tower presence
Financial	25%	Annual water + sewer savings vs local utility rates
Regulatory	20%	State incentive environment, stormwater fees
ESG	20%	Drought risk, SBTi commitment, LEED certification
Climate Risk	10%	Long-term water scarcity exposure

Urgency Score (1–10)

Composite of drought risk index, water rate pressure, stormwater fee activity, and ESG commitments. Drives the "why now" angle in AI briefs.

State Opportunity Score (0–100, choropleth)

Weighted from pipeline sub-scores that actually vary by state:

Sub-score	Weight
Water cost pressure	38%
Rainfall capture potential	31%
Regulatory / incentive tailwind	31%
Candidate count density bonus	+10 pts max

Tiers: Prime (75+) · Strong (60–75) · Moderate (48–60) · Emerging (38–48) · Low (<38)

Data Sources

Data	Source	Used for
Building footprints	Microsoft USBuildingFootprints (129M buildings)	Roof geometry, lat/lng
Geometry classification	OpenStreetMap (embedded in MS footprints)	Building type tags
Cooling tower proxy	EPA Facility Registry Service (FRS)	NAICS-based facility matching
Rainfall normals	NOAA Climate Normals 1991–2020	Annual precipitation per building
Address enrichment	Overture Maps	Street address, building use
Water & sewer rates	State-level utility averages	ROI calculation
Satellite imagery	Google Maps Static API (zoom 18)	Visual rooftop confirmation
State boundaries	US Census Bureau GeoJSON (5m)	Choropleth map layer

ROI Model

harvestable_gal = roof_area_sqft × rainfall_in × 0.623 × efficiency

annual_savings  = (harvestable / 1000) × (water_rate + sewer_rate × 0.70)
                + stormwater_fee_avoidance
                + incentive_value / 10

npv_10yr        = Σ annual_savings / (1.05)^year  −  capex
payback_yrs     = capex / annual_savings

Three scenarios (conservative / base / upside) adjust rainfall, efficiency, and capex multipliers.

AI Brief Generation

Each building can generate a structured sales brief via Google Gemini:

Prospect summary and physical suitability
Financial snapshot (savings, payback, NPV)
Recommended sales angle (cost_savings / resilience / compliance / esg_credibility)
Confidence caveats and next validation steps

Includes a template fallback (no API call) so demos never fail on quota limits, plus in-memory + disk caching for instant re-loads.

Current Coverage

37 states  ·  11,577 scored buildings
AL AR AZ CA CO GA IA ID IN KY LA ME MI MN MO
MS MT NC ND NE NM NV NY OH OK OR PA SC SD TN
TX UT VA WA WI WV WY

Running Locally

Prerequisites

Python 3.11+, Node 18+
Google Maps API key (satellite imagery)
Google Gemini API key (AI briefs, optional)

Backend

cd backend
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# Create backend/.env
echo "GOOGLE_MAPS_API_KEY=your_key" > .env
echo "GOOGLE_API_KEY=your_gemini_key" >> .env

uvicorn main:app --reload --port 8000

Frontend

cd frontend
npm install
npm run dev        # → http://localhost:3000

Pipeline (re-run scoring only)

cd pipeline
source venv/bin/activate
python -m scripts.s06_score          # regenerate buildings.json from existing processed files
cp output/buildings.json ../backend/data/
cp output/buildings.json ../frontend/public/data/
cp output/state_scores.json ../backend/data/
cp output/state_scores.json ../frontend/public/data/

Project Structure

saving_water/
├── pipeline/
│   ├── config.py                     ← paths, constants, state lists
│   ├── scripts/
│   │   ├── s01_download_footprints.py  ← download + 5-stage geometry filter
│   │   ├── s01b_refilter_existing.py   ← re-apply filter without re-downloading
│   │   ├── s02_download_epa_frs.py
│   │   ├── s03_spatial_join.py
│   │   ├── s04_rainfall_join.py
│   │   ├── s05_overture_enrich.py
│   │   ├── s06_score.py                ← scoring → buildings.json + state_scores.json
│   │   └── s07_cv_layer.py             ← OpenAI CLIP visual confidence
│   ├── data/
│   │   ├── raw/                        ← input GeoJSON per state
│   │   └── processed/                  ← intermediate .gpkg files
│   └── output/
│       ├── buildings.json
│       └── state_scores.json
│
├── backend/
│   ├── main.py                         ← FastAPI app, satellite URL injection
│   ├── models.py                       ← Pydantic schemas
│   ├── roi_engine.py                   ← ROI / NPV calculation
│   ├── brief_generator.py              ← Gemini brief + template fallback
│   └── data/
│       ├── buildings.json              ← copy from pipeline output
│       └── state_scores.json
│
└── frontend/
    ├── src/
    │   ├── app/                        ← Next.js pages (map, roi, brief)
    │   ├── components/
    │   │   ├── map/MainMap.tsx         ← MapLibre choropleth + building markers
    │   │   ├── dashboard/              ← prospecting dashboard layout
    │   │   ├── panels/                 ← building profile, ROI, brief panels
    │   │   └── brief/SatellitePanel.tsx
    │   └── lib/
    │       ├── stateOpportunity.ts     ← opportunity scoring + color scale
    │       └── api.ts                  ← data loading helpers
    └── public/data/                    ← static JSON served to browser

Key Design Decisions

Roof area — Microsoft's dataset provides ML-derived building polygons for all 50 states. Roof area comes directly from polygon geometry — zero inference cost, near-perfect coverage.

EPA FRS over satellite for cooling tower detection — NAICS codes reliably flag cooling tower presence for power plants, refineries, hospitals, data centers, and hotels via a spatial join. Faster, cheaper, and no false positives from puddles or HVAC units.

Geometry filter before any enrichment — A 5-stage shape filter (area bounds, compactness, bbox fill ratio, aspect ratio, building class) eliminates farms, lakes, parking canopies, and highway overpasses early, keeping only credible commercial rooftops in the pipeline.

Template brief fallback — Gemini quota limits are real during demos. The fallback generates a deterministic, data-accurate brief from real pipeline numbers — no API call, no failure.

Name		Name	Last commit message	Last commit date
Latest commit History 42 Commits
backend		backend
frontend		frontend
pipeline		pipeline
readme_assets		readme_assets
.env.example		.env.example
.gitignore		.gitignore
README.md		README.md
package-lock.json		package-lock.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Pluvial - Overall Winner 🏆

Automated Water-Reuse Prospecting Engine - HackSMU 2026

The Problem

What It Does

Architecture

Pipeline (Python / GeoPandas)

Backend (FastAPI / Python)

Frontend (Next.js / MapLibre GL)

Scoring

Building Viability Score (0–100)

Urgency Score (1–10)

State Opportunity Score (0–100, choropleth)

Data Sources

ROI Model

AI Brief Generation

Current Coverage

Running Locally

Prerequisites

Backend

Frontend

Pipeline (re-run scoring only)

Project Structure

Key Design Decisions

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Pluvial - Overall Winner 🏆

Automated Water-Reuse Prospecting Engine - HackSMU 2026

The Problem

What It Does

Architecture

Pipeline (Python / GeoPandas)

Backend (FastAPI / Python)

Frontend (Next.js / MapLibre GL)

Scoring

Building Viability Score (0–100)

Urgency Score (1–10)

State Opportunity Score (0–100, choropleth)

Data Sources

ROI Model

AI Brief Generation

Current Coverage

Running Locally

Prerequisites

Backend

Frontend

Pipeline (re-run scoring only)

Project Structure

Key Design Decisions

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages