Inspiration
Our project started with the realization that human-wildlife conflict along our coastlines is increasing, while our current methods of protecting marine life rely on static, outdated data. With rising coastal tourism and boat traffic, accidental habitat disruption and vessel strikes are major threats to seal populations. We realized park rangers and harbor masters need to know where seals are going to be, not just where they have been. We wanted to build a tool that shifts marine conservation from reactive to proactive.
What it does
Our platform is a data-driven predictive mapping tool. Instead of just logging past sightings, we use a logistic regression model to predict the probability of a seal encounter based on dynamic environmental conditions (like Sea Surface Temperature, ocean floor depth, and wind speed).
This data is visualized on a custom-built, interactive geospatial density map. Local authorities can use our dashboard to set dynamic boat speed limits or temporary habitat protections in high-probability zones before human traffic arrives.
How we built it
We approached this in three main phases: Data, Modeling, and Visualization.
The Data: We processed a massive dataset of over 79,000 global marine wildlife occurrence records, cleaning coordinates and filtering by species.
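The cleaning step can be sketched in Pandas roughly as follows; the column names and the inline sample rows here are illustrative assumptions, not the actual dataset's schema:

```python
import pandas as pd

# Illustrative stand-in for the raw occurrence data (the real dataset has 79,000+ rows).
raw = pd.DataFrame({
    "species": ["Phoca vitulina", "Phoca vitulina", "Balaenoptera musculus"],
    "latitude": [42.3, 999.0, 36.5],   # 999.0 simulates an invalid coordinate
    "longitude": [-70.9, -70.5, -122.0],
})

# Drop rows with missing or out-of-range coordinates
clean = raw.dropna(subset=["latitude", "longitude"])
clean = clean[clean["latitude"].between(-90, 90) & clean["longitude"].between(-180, 180)]

# Filter to the target species
seals = clean[clean["species"] == "Phoca vitulina"]
print(len(seals))  # → 1
```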
The Model: We used a two-stage modeling approach, with logistic regression as the first stage (chosen for its speed on the front end) to predict the probability of a seal being found at a given location. We trained and cross-validated multiple models with various interaction terms and environmental features, and the core of our probability prediction uses this logit function:
$$\begin{aligned} \text{logit}(P(Y = 1)) &= \beta_0 + \beta_1(\text{depth}) + \beta_2(\text{slope}) + \beta_3(\text{sst}) \\ &\quad + \beta_4(\text{wind\_speed}) + \beta_5(\text{wind\_speed})^2 \\ &\quad + \beta_6(\text{distance\_to\_shore\_km}) + \beta_7(\text{month\_sin}) + \beta_8(\text{month\_cos}) \end{aligned}$$
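Applying the inverse logit to that linear predictor gives the encounter probability. A minimal sketch, assuming the cyclical month encoding is `sin/cos(2π·month/12)`; the coefficient values below are illustrative placeholders, not our fitted weights:

```python
import numpy as np

def seal_probability(depth, slope, sst, wind_speed, dist_shore_km, month, beta):
    """Probability of a seal encounter from the logit model above.
    beta[0] is the intercept; beta[1..8] match the equation's terms in order."""
    month_sin = np.sin(2 * np.pi * month / 12)
    month_cos = np.cos(2 * np.pi * month / 12)
    z = (beta[0] + beta[1] * depth + beta[2] * slope + beta[3] * sst
         + beta[4] * wind_speed + beta[5] * wind_speed ** 2
         + beta[6] * dist_shore_km + beta[7] * month_sin + beta[8] * month_cos)
    return 1 / (1 + np.exp(-z))  # inverse logit

# Illustrative coefficients only -- the real values come from cross-validated fits.
beta = [0.5, -0.01, 0.2, -0.05, 0.3, -0.02, -0.1, 0.4, 0.1]
p = seal_probability(depth=30, slope=2, sst=12, wind_speed=5,
                     dist_shore_km=1.5, month=6, beta=beta)
```

The quadratic `wind_speed` term lets the model capture a non-monotonic effect, e.g. haul-out behavior peaking at moderate wind rather than rising forever.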
The Visualization: We designed an intuitive, user-centric UI in Figma, then used Python, Pandas, and Plotly to build a custom Mapbox geospatial scatter plot. We injected our exact brand hex codes directly into our Mapbox Studio style so the map perfectly matches our UI's Light Cream and Brand Teal aesthetic.
Challenges we ran into
- Geospatial Visualization & Performance: Visualizing 79,000 individual data points without degrading browser performance was a massive hurdle! Our initial WebGL heatmap approach was blocked by hardware-acceleration limitations in Chrome and Safari. We had to pivot our visualization strategy quickly, eventually linking a custom Mapbox Studio style URL through Plotly's `scatter_mapbox` function to render the exact overlapping density effect we needed.
- Full-Stack Integration: Integrating Mapbox into React introduced unexpected friction. We hit a version conflict with our Vite build that required downgrading the library, and our custom map style token had domain restrictions that temporarily blocked local development for the rest of the team. Additionally, establishing clean cross-origin communication between our React frontend and FastAPI backend took more iteration than anticipated.
- UI/UX Translation: While building our interactive map in Mapbox Studio, we initially struggled to customize it to match our exact vision. Translating our high-fidelity Figma designs into functional React components under a strict time limit was challenging, especially when managing CSS specificity conflicts and getting the dynamic model data to render correctly in the UI.
- Workflow & Version Control: Coordinating across three fast-moving, simultaneous workstreams (Data Science, Backend, and Frontend) while keeping a shared Git repository clean and conflict-free added a constant layer of complexity throughout the hackathon.
- Deployment & Containerization Limits: Deploying our API to Render presented a major infrastructure challenge due to strict free-tier resource constraints. While our core logistic regression model was lightweight enough to bake directly into a Docker container, the underlying geospatial dependencies (like massive `.shp` shapefiles) exceeded the container size limits. To solve this, we rethought our deployment architecture, configuring external volumes to mount and feed the spatial data into the container at runtime.
Accomplishments that we're proud of
We are incredibly proud of bridging the gap between Data Science and UI/UX. Often, datathon projects are purely mathematical, but we successfully built a custom Mapbox integration that looks exactly like our Figma prototype while accurately rendering tens of thousands of data points.
What we learned
- Geospatial Visualization & Feature Engineering: We learned a massive amount about geospatial data visualization, specifically how to manipulate Python libraries like Plotly and GeoPandas to render custom vector maps. On the machine learning side, we realized just how crucial it is to properly engineer environmental features to ensure a predictive model is actually viable for real-world conservation.
- Bridging Design and Development: For most of us, this was our first time designing and coding simultaneously. We learned how to iterate on our UI incredibly fast to match our Figma prototype under extreme time pressure, and we discovered how powerful tools like Figma Dev Mode are for effective collaboration.
- Full-Stack Coordination & Git Workflows: We learned firsthand exactly how much coordination full-stack development requires. From managing environment variables across different machines to keeping our Git repository clean while multiple team members pushed code simultaneously, we gained invaluable, real-world experience in technical project management.
What's next
In the future, we want to scale this model to include live API feeds for real-time weather and ocean temperature data. We also plan to expand our target variables to predict the migratory patterns of other vulnerable species like whales and sea turtles!