Challenges we ran into

  • Picking the right dataset. Few datasets include specific hometown information for Olympic athletes.
  • Mapping born_region to states. Values arrive as full state names, two-letter codes, or numeric FIPS codes mixed in the same column, so a normalizer had to handle all three before any state could be shaded.
  • Keeping the small towns visible. An early version capped city totals globally, which meant drilling into Pennsylvania showed only Philadelphia and Pittsburgh. Removing that cap (and grouping by city + state instead of just city) brought the smaller hometowns back.
  • Wiring two runtimes. Express loads env vars one way and Flask another; Gemini chat returned 503s until we added python-dotenv to Flask so both processes read the same .env files.
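
The born_region and city-grouping fixes above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the lookup tables cover only three states, the born_city column name is an assumption, and the function names are hypothetical.

```python
from collections import Counter

# Illustrative subset only; a real table would cover every state and territory.
STATE_BY_NAME = {"pennsylvania": "PA", "california": "CA", "texas": "TX"}
STATE_BY_FIPS = {"42": "PA", "06": "CA", "48": "TX"}
VALID_CODES = set(STATE_BY_NAME.values())


def normalize_state(raw):
    """Return a two-letter state code, or None if the value is unrecognized."""
    if raw is None:
        return None
    # Numeric columns sometimes load as floats (42.0); treat those as integers.
    if isinstance(raw, float) and raw.is_integer():
        raw = int(raw)
    value = str(raw).strip()
    if not value:
        return None
    if value.isdigit():
        # FIPS state codes are two digits; numeric parsing drops the leading zero.
        return STATE_BY_FIPS.get(value.zfill(2))
    if len(value) == 2 and value.upper() in VALID_CODES:
        return value.upper()
    return STATE_BY_NAME.get(value.lower())


def city_totals(rows):
    """Count athletes per (city, state) pair, with no global cap on cities."""
    counts = Counter()
    for row in rows:
        state = normalize_state(row.get("born_region"))
        city = (row.get("born_city") or "").strip()  # hypothetical column name
        if state and city:
            counts[(city, state)] += 1
    return counts
```

Keying the counter on the (city, state) pair keeps same-named cities in different states apart, and leaving out a global top-N cut means a drill-down into one state can still list its smaller hometowns.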

Accomplishments that we're proud of

  • Showing that you don't have to be from a big city to compete for Team USA.
  • Integrating Gemini server-side (google-generativeai in Flask) so the UI stays simple and the API key never touches the bundle.
  • A clean, single-page experience with no roster of individuals — only aggregates — that still tells a real geographic story about Team USA.
  • A Gemini integration that stays grounded: it only answers about the aggregate JSON the UI is showing, including the active cohort.
  • Deploying a web app to Cloud Run for the first time, which turned out to be easy with the CLI!

What we learned

  • Giving an LLM structured context (state counts, city counts, cohort filters) produces noticeably better, more accurate answers than dumping raw rows, and it costs less.
  • Real public datasets are great but can have gaps in completeness; those gaps need to be filled as a proof of concept matures into an actual product.
  • The data you use is just as important as the story you are trying to tell.
  • Technical lesson: same-origin proxying is worth the boilerplate. Putting Express in front of Flask kept the frontend simple and made it easy to add new API routes (like /api/chat). We used Python for Kaggle-backed data loading and for calling Gemini server-side so secrets never ship to the browser.
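
As a rough sketch of the structured-context idea, the server can wrap each question with the aggregate JSON the UI is currently showing before it reaches the model. The function name, the JSON field names, and the sample numbers below are all invented for illustration and are not taken from the project or its dataset.

```python
import json


def build_grounded_prompt(question, aggregates):
    """Wrap a user question with the aggregate JSON the UI is showing."""
    context = json.dumps(aggregates, indent=2, sort_keys=True)
    return (
        "Answer using only the aggregate data below. "
        "If the answer is not in the data, say so.\n\n"
        f"DATA:\n{context}\n\n"
        f"QUESTION: {question}"
    )


# Made-up sample values, shaped like state counts, city counts, and a cohort filter.
aggregates = {
    "cohort": {"decade": "1990s"},
    "state_counts": {"PA": 12, "CA": 40},
    "city_counts": [{"city": "Erie", "state": "PA", "count": 2}],
}
prompt = build_grounded_prompt("Which state sent more athletes?", aggregates)
```

In a Flask handler, a string like this is what would be passed to the model (for example through generate_content on a google-generativeai GenerativeModel), so the model only ever sees a few compact aggregates and never raw athlete rows.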

What's next for Hometown Heroes

  • Fill in data gaps: the dataset we used has incomplete bio data for some athletes, and we'd like to update it. One way to do this would be with Gemini.
  • More Olympic metadata: we would also like to include event and medal count data for states.
  • Sport and event lenses: let the map answer "where does Team USA's track team come from?" or "which states feed the Winter Games?".
  • A timeline scrubber instead of a decade dropdown, so the map animates the geography of Team USA over time.
  • Coach and high school context: overlay where athletes trained, not just where they were born, to better honor the local programs that shaped them.
  • Share links: a URL that captures the active decade and selected state so people can share their own hometown's story.
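
One possible shape for share links, sketched in Python for consistency with the backend (the real feature would likely live in the frontend): the active decade and selected state are stored as query parameters and read back on load. The parameter names decade and state, the helper names, and the example.com URL are assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_share_url(base_url, decade=None, state=None):
    """Encode the active decade and selected state as query parameters."""
    params = {}
    if decade:
        params["decade"] = decade
    if state:
        params["state"] = state
    return f"{base_url}?{urlencode(params)}" if params else base_url


def parse_share_url(url):
    """Recover the decade and state from a share link; missing values are None."""
    query = parse_qs(urlsplit(url).query)
    return {
        "decade": query.get("decade", [None])[0],
        "state": query.get("state", [None])[0],
    }


url = build_share_url("https://example.com/", decade="1990s", state="PA")
view = parse_share_url(url)
```

Keeping the view state in the query string means a shared link needs no server-side storage; the page only has to parse the URL and apply the same filters.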
