Inspiration

Atlanta has committed to cutting greenhouse-gas emissions 59% by 2030 and reaching net-zero by 2050. Buildings are roughly 40% of those emissions, and most of them were built before modern energy codes — they'll still be standing in 2050. There's no path to the 2030 goal that doesn't run through retrofitting existing buildings.

But the city can't retrofit them all at once. The real question is which buildings get the dollars first. Meanwhile Georgia Power won approval to spend $16.5 billion on new grid infrastructure for the data-center boom — while walking back its commitment to expand energy-efficiency programs. The cheaper relief is hiding in the buildings already sitting under those stressed feeders. Nobody had ranked it. We built Substrate to do exactly that.

What it does

Substrate is a grid-relief console for metro Atlanta. It scores every building by how much electrical capacity a retrofit frees at the exact hour its local feeder is most stressed, then ranks them on one metric: kW recovered per $1,000 invested. Cheapest megawatts first.

That freed capacity is worth real money to utilities and to large customers stuck in the interconnection queue, which makes a retrofit pencil out now. Every megawatt of peak load eliminated is fossil generation that doesn't run — so carbon reduction is the byproduct, not the pitch. Carbon follows the money.

The console shows:

  • A priority queue ranking 96 real Atlanta buildings cheapest-first across 4 constrained feeders
  • A 3D map where building height = relief per dollar (old, inefficient buildings shoot up; new efficient ones stay flat)
  • An avoided-infrastructure view: retrofit 137 buildings in a Midtown load pocket to defer a $48M substation upgrade for $4.2M — ~11× cheaper, online years sooner
  • A scenario simulator: drop a +500 MW data center or +30,000 EVs on the grid and watch the priority list re-rank in real time

How we built it

A React + Vite + TypeScript front end renders a custom 3D Mapbox GL JS canvas. A Node.js pipeline ingests real public data once and commits a static JSON snapshot, so the app runs fully offline and deterministically.

Sources: the City of Atlanta CBEEO benchmarking database (site EUI, ENERGY STAR, property type, floor area, year built), Fulton and DeKalb county assessor records via ArcGIS REST, Atlanta Regional Commission footprints, EIA-930 Southern Company load shapes for peak-coincidence weighting, and the NREL Alternative Fuels Data Center API for the EV scenario.

Scoring model. A deterministic, multi-factor model outputs the relief score:

$$ \text{recoverableKW} = (\text{peakKW} \times \text{flexPct}) + \text{batteryKW} + \text{drKW} $$

$$ \text{peakCoincKW} = \text{recoverableKW} \times \text{coincidence} $$

$$ \text{reliefScore} = \frac{\text{peakCoincKW}}{\text{costToFix} / 1000} $$

Inputs include vintage decay (year-built factor), an EUI waste index versus property-type CBECS medians, peak-demand modeling from annual kWh, flexibility/battery/demand-response potential by property type, EIA-930 coincidence weighting, and a multi-factor cost-to-fix estimate.
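The three formulas above can be sketched in TypeScript roughly as follows. This is a minimal illustration, not the project's actual code: the `BuildingInputs` interface, its field names, the zero-cost guard, and the two sample buildings are assumptions made for the example. Only the arithmetic comes from the formulas.

```typescript
// Hypothetical input shape; field names mirror the symbols in the formulas.
interface BuildingInputs {
  id: string;
  peakKW: number;      // modeled peak demand, in kW
  flexPct: number;     // share of peak that is flexible, 0..1
  batteryKW: number;   // battery potential, in kW
  drKW: number;        // demand-response potential, in kW
  coincidence: number; // peak-coincidence weight, 0..1
  costToFix: number;   // estimated retrofit cost, in dollars
}

function recoverableKW(b: BuildingInputs): number {
  return b.peakKW * b.flexPct + b.batteryKW + b.drKW;
}

function peakCoincKW(b: BuildingInputs): number {
  return recoverableKW(b) * b.coincidence;
}

// kW recovered at the feeder peak per $1,000 invested.
function reliefScore(b: BuildingInputs): number {
  if (b.costToFix <= 0) return 0; // assumption: guard against division by zero
  return peakCoincKW(b) / (b.costToFix / 1000);
}

// Cheapest megawatts first: highest relief per dollar at the top.
function rankByRelief(buildings: BuildingInputs[]): BuildingInputs[] {
  return [...buildings].sort((x, y) => reliefScore(y) - reliefScore(x));
}

// Made-up sample values, for illustration only.
const buildingA: BuildingInputs = {
  id: "A", peakKW: 1000, flexPct: 0.2, batteryKW: 50, drKW: 50,
  coincidence: 0.8, costToFix: 400_000,
};
const buildingB: BuildingInputs = {
  id: "B", peakKW: 500, flexPct: 0.3, batteryKW: 0, drKW: 50,
  coincidence: 0.9, costToFix: 150_000,
};

const rankedIds = rankByRelief([buildingA, buildingB]).map((b) => b.id);
console.log(rankedIds); // B ranks ahead of A
```

With these sample numbers, building A scores about 0.6 kW per $1,000 and building B about 1.2, so the smaller building with the cheaper fix ranks first.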

Provenance is a first-class concept. Every value is labeled measured, modeled, or synthetic: building energy data is measured, scoring outputs are modeled, and the 4 feeder zones are synthetic — because Georgia Power's Hosting Capacity Tool is registration-gated. Buildings covered by the ordinance that never reported energy render as a labeled unmeasured set. Most tools hide their assumptions; we surface them.
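One way to carry these labels through the data is to wrap each value in a small tagged type. The sketch below is an assumption about how that could look, not Substrate's real schema: `Tagged`, `tag`, `BuildingRecord`, `formatLabel`, and every sample value are hypothetical.

```typescript
// The three provenance labels named in the text.
type Provenance = "measured" | "modeled" | "synthetic";

// Hypothetical wrapper: a value plus where it came from.
interface Tagged<T> {
  value: T;
  provenance: Provenance;
  source: string;
}

function tag<T>(value: T, provenance: Provenance, source: string): Tagged<T> {
  return { value, provenance, source };
}

// Hypothetical record shape with one field per provenance class.
interface BuildingRecord {
  siteEUI: Tagged<number>;
  reliefScore: Tagged<number>;
  feederZone: Tagged<string>;
}

// Illustrative values only.
const exampleRecord: BuildingRecord = {
  siteEUI: tag(92.4, "measured", "city benchmarking report"),
  reliefScore: tag(1.2, "modeled", "scoring model"),
  feederZone: tag("zone-1", "synthetic", "placeholder feeder zone"),
};

// Render a field with its provenance label attached, as a UI might.
function formatLabel(name: string, field: Tagged<number | string>): string {
  return `${name}: ${field.value} [${field.provenance}]`;
}

console.log(formatLabel("Site EUI", exampleRecord.siteEUI));
console.log(formatLabel("Feeder", exampleRecord.feederZone));
```

Putting the label in the type means a field cannot reach the UI without one, which is the point of treating provenance as first-class.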

Challenges we ran into

The feeder data gap. Distribution-feeder constraints — which feeders are nearing capacity, by how much, by when — are the single most important data layer for non-wires alternative targeting, and they aren't public. Georgia Power's Hosting Capacity Tool is registration-gated; utilities consider this data competitively sensitive. We solved it by being honest: four synthetic feeder zones placed over real Atlanta districts, clearly labeled in the UI, with methodology cited. The pitch to a real customer is that their own interconnection system-impact study becomes the first legitimate constraint dataset — the design partner has every right to data the utility won't share.

Joining messy address data to footprint geometry. CBEEO addresses are inconsistently formatted ("123 Main St NW" vs "123 Main Street Northwest"), and ~15-25% of geocoded addresses didn't cleanly land inside a Microsoft Open Buildings polygon. We built a fallback layer: where polygon joins failed, buildings render as geocoded point markers with vertical extrusion, preserving the 3D visual without faking footprint precision.
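The formatting problem can be illustrated with a small normalizer that collapses the two spellings quoted above into one join key. This is a simplified sketch under stated assumptions: the suffix and direction tables are a tiny hypothetical subset, and `normalizeAddress` is not the project's actual function. A naive token map like this can also over-normalize (a street literally named "North" would be shortened), so a real pipeline would need more care.

```typescript
// Hypothetical, deliberately small lookup tables.
const STREET_SUFFIXES = new Map<string, string>([
  ["street", "st"],
  ["avenue", "ave"],
  ["boulevard", "blvd"],
  ["drive", "dr"],
  ["road", "rd"],
]);

const DIRECTIONS = new Map<string, string>([
  ["northwest", "nw"],
  ["northeast", "ne"],
  ["southwest", "sw"],
  ["southeast", "se"],
]);

// Lowercase, strip punctuation, and abbreviate known suffix/direction tokens.
function normalizeAddress(raw: string): string {
  return raw
    .toLowerCase()
    .replace(/[.,#]/g, " ")
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .map((token) => STREET_SUFFIXES.get(token) ?? DIRECTIONS.get(token) ?? token)
    .join(" ");
}

const shortForm = normalizeAddress("123 Main St NW");
const longForm = normalizeAddress("123 Main Street Northwest");
console.log(shortForm === longForm); // both collapse to the same key
```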

Long-tail outliers in real benchmarking data. When we swapped synthetic data for ~150 real CBEEO records, the top of the priority queue suddenly featured a handful of buildings with implausibly high EUIs — almost certainly data-entry errors in the city's reporting. We added EUI bounds by property type to filter out values that fall outside physical reality (e.g. office EUI > 170 or warehouse EUI > 150), keeping the queue defensible to anyone who clicks the top results.
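A bounds filter of this kind might look like the sketch below. The two upper bounds are the examples quoted above; everything else is an assumption for illustration: the `BenchmarkRecord` shape, the rule that non-positive or non-finite EUIs are rejected, and the choice to pass through property types that have no bound listed here.

```typescript
// Hypothetical record shape for one benchmarking row.
interface BenchmarkRecord {
  id: string;
  propertyType: string;
  siteEUI: number;
}

// Upper bounds taken from the two examples in the text.
const MAX_EUI = new Map<string, number>([
  ["office", 170],
  ["warehouse", 150],
]);

function isPlausibleEUI(r: BenchmarkRecord): boolean {
  // Assumption: zero, negative, or non-finite EUIs are treated as bad data.
  if (!Number.isFinite(r.siteEUI) || r.siteEUI <= 0) return false;
  const max = MAX_EUI.get(r.propertyType);
  // Assumption: types without a listed bound pass through unfiltered.
  return max === undefined || r.siteEUI <= max;
}

// Made-up rows for illustration.
const rows: BenchmarkRecord[] = [
  { id: "ok-office", propertyType: "office", siteEUI: 95 },
  { id: "bad-office", propertyType: "office", siteEUI: 900 },
  { id: "bad-warehouse", propertyType: "warehouse", siteEUI: 151 },
  { id: "ok-hotel", propertyType: "hotel", siteEUI: 120 },
  { id: "zero", propertyType: "office", siteEUI: 0 },
];

const keptIds = rows.filter(isPlausibleEUI).map((r) => r.id);
console.log(keptIds);
```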

Property-type normalization. The city's labels ("Office - Medical," "Mixed Use - Office/Retail," "Multifamily - Senior Housing") are more granular than CBECS medians and our scoring buckets. We built a mapping layer that funnels city categories into seven scoring archetypes (office, multifamily, retail, warehouse, hotel, datacenter, hospital), dropping ambiguous types entirely rather than mismapping them — better to have fewer, cleaner buildings than more buildings with garbage outputs.
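As a rough sketch of such a mapping layer: the seven archetype names come from the text, but the specific assignments below are assumptions chosen for illustration (for instance, whether "Office - Medical" belongs under office), and `toArchetype` is a hypothetical name. Labels with no entry return `null` so the caller can drop them.

```typescript
// The seven scoring archetypes named in the text.
type Archetype =
  | "office"
  | "multifamily"
  | "retail"
  | "warehouse"
  | "hotel"
  | "datacenter"
  | "hospital";

// Hypothetical assignments, keyed by lowercased city label.
// "Mixed Use - Office/Retail" is intentionally absent: it is ambiguous.
const CITY_TO_ARCHETYPE = new Map<string, Archetype>([
  ["office - medical", "office"],
  ["multifamily - senior housing", "multifamily"],
]);

// Returns null for unmapped or ambiguous labels instead of guessing.
function toArchetype(cityLabel: string): Archetype | null {
  return CITY_TO_ARCHETYPE.get(cityLabel.trim().toLowerCase()) ?? null;
}

const cityLabels = [
  "Office - Medical",
  "Mixed Use - Office/Retail",
  "Multifamily - Senior Housing",
];

// Keep only labels that map cleanly to an archetype.
const mappedLabels = cityLabels.filter((label) => toArchetype(label) !== null);
console.log(mappedLabels);
```

Returning `null` rather than a default bucket keeps the "drop, don't mismap" rule explicit at the call site.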

Excluding the Emory/Druid Hills campus. Real geographic data brought in a cluster of academic buildings east of the four target districts that would have crowded the top of the priority queue under "Midtown office" labels. We excluded the region with a longitude cutoff and documented the exclusion in code. Filtering data is a methodology choice, not a bug.
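A documented cutoff can be as small as the sketch below. The cutoff value and the sample coordinates are hypothetical approximations chosen for illustration, not the values used in the project.

```typescript
// Hypothetical cutoff: buildings east of this longitude are excluded.
// Documented here because filtering is a methodology choice.
const EAST_LONGITUDE_CUTOFF = -84.35;

interface GeoPoint {
  name: string;
  lon: number;
  lat: number;
}

// In Atlanta, "east" means a less negative longitude.
function inStudyArea(p: GeoPoint): boolean {
  return p.lon <= EAST_LONGITUDE_CUTOFF;
}

// Approximate coordinates, for illustration only.
const points: GeoPoint[] = [
  { name: "midtown-sample", lon: -84.385, lat: 33.78 },
  { name: "emory-sample", lon: -84.323, lat: 33.79 },
];

const includedNames = points.filter(inStudyArea).map((p) => p.name);
console.log(includedNames);
```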

Making the 3D map feel like infrastructure software, not a consumer map. Default Mapbox dark styles read as "generic dashboard." We stripped the basemap heavily, drove building height by relief score instead of building size (decorrelating height from square footage), used a single accent color surgically, and animated constrained feeders with a slow opacity pulse. The aesthetic carries the credibility.
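Driving height from the score rather than the footprint can be expressed with a data-driven `fill-extrusion` layer. The sketch below builds the layer as a plain object so it runs without the Mapbox library. The layer id, the `reliefScore` feature property, the opacity, and the scale parameters are all assumptions, not the project's actual style.

```typescript
// Builds a plain object in the shape of a Mapbox GL JS "fill-extrusion" layer.
// Height is interpolated linearly from a feature's reliefScore property,
// so a high-scoring building is tall regardless of its square footage.
function reliefExtrusionLayer(
  sourceId: string,
  maxScore: number,
  maxHeightMeters: number
) {
  return {
    id: "relief-extrusion", // hypothetical layer id
    type: "fill-extrusion" as const,
    source: sourceId,
    paint: {
      "fill-extrusion-height": [
        "interpolate",
        ["linear"],
        ["get", "reliefScore"], // assumed feature property name
        0, 0,                   // score 0 maps to a flat building
        maxScore, maxHeightMeters,
      ],
      "fill-extrusion-opacity": 0.85, // illustrative value
    },
  };
}

const extrusionLayer = reliefExtrusionLayer("buildings", 5, 300);
console.log(JSON.stringify(extrusionLayer.paint["fill-extrusion-height"]));
```

In an app, an object like this would be handed to `map.addLayer(...)`. Under strict typing it may need a cast to the library's layer type.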

Accomplishments we're proud of

  • 96 real Atlanta buildings, scored end-to-end with measured energy data joined to real geographic coordinates, real property types, and a deterministic scoring model.
  • Provenance as a first-class feature, not a footnote. Every field in the UI carries a measured / modeled / synthetic tag. We surface uncertainty instead of hiding it — which is what utilities, regulators, and ratepayer advocates actually need to trust the output.
  • A genuine reframing of building energy intelligence. Most tools rank buildings by energy savings or carbon. Substrate ranks by peak-coincident grid capacity per dollar, which is what actually unlocks non-wires alternative deployment. The same building data, oriented around a different unit of value, becomes a different category of product.
  • End-to-end demo on real data in under a week. From data ingestion through a 3D interactive console with scenario simulation, built by a two-person team.

What we learned

  • The unit of value changes the product. Choosing to rank on capacity per dollar instead of carbon per dollar was the single highest-leverage decision we made. It changes who buys the product, what they pay for, and what makes the company defensible. The math is similar; the business is completely different.
  • Honesty about data limits is a feature. Our first instinct was to hide that the feeder layer was synthetic. The better instinct was to label it explicitly, cite the public-data gap, and explain how a real customer's interconnection study becomes our first legitimate constraint dataset. Surfacing the methodology made the product more credible, not less.
  • Real public data is messier than it looks. Atlanta's CBEEO ordinance produces one of the more rigorous municipal benchmarking datasets in the country, and it still has property-type inconsistencies, EUI outliers, geocoding gaps, and pre-1900 year-built placeholders. Building a production-grade pipeline meant treating data cleaning as a first-class engineering problem, not a preprocessing afterthought.
  • The right buyer isn't always the obvious one. Utilities are the natural customer for grid software, but they have slow procurement cycles and cost-recovery structures that don't reward demand-side investment. The acute buyers are large commercial customers and hyperscalers stuck in the interconnection queue — they'll pay directly for local capacity relief because the alternative is waiting years to plug in. That insight changed how we frame everything.
  • Carbon follows the money. Buildings don't get retrofitted because owners care about emissions. They get retrofitted because the economics work. Lead with capacity and dollars, and the carbon comes along for the ride. Lead with carbon, and you spend years trying to convince owners to do something they won't fund.

What's next for Substrate

  • Real feeder data via a design partner. A metro-Atlanta large-load customer or hyperscaler with an active interconnection study would become our first legitimate constraint dataset, routing around the utility's reluctance to share.
  • Expansion to additional benchmarking jurisdictions. New York, Boston, Seattle, and DC have similar building energy ordinances. The data pipeline is jurisdiction-agnostic — only the property-type mappings need re-tuning.
  • Integration with utility non-wires alternative procurement processes. Substrate as a vendor-neutral targeting layer that utilities, regulators, and ratepayer advocates can all reference in rate cases and resource planning dockets.
  • Outcome-tracked retrofit verification. Building post-retrofit measurement-and-verification on top of the same data pipeline — closing the loop from "this is the priority" to "this is what it actually delivered."
