Inspiration

Construction sites are chaotic, dangerous, and constantly evolving. While AI has revolutionized text and image processing, it still struggles with spatial reality. Project managers currently rely on time-consuming manual walkthroughs and periodic reports that leave critical blind spots—blind spots that can mean the difference between safety and disaster.

We asked ourselves: What happens when AI finally learns to see in 3D?


What it does

Splatt transforms first-person POV videos from construction sites into comprehensive 3D spatial models. Upload multiple video perspectives, and our platform:

  • Reconstructs the entire construction site in 3D using 4D Gaussian Splatting technology
  • Maps object locations globally by combining camera positioning with AI-detected objects
  • Enables intelligent queries through RAG-powered semantic search
  • Tracks site evolution across different timestamps and angles
  • Identifies safety risks by cross-referencing detected objects with construction safety databases

All through an intuitive web interface—no need to attach cameras to hundreds of workers' heads.


How we built it

Our architecture combines cutting-edge spatial AI with semantic understanding:

Spatial Processing:

  • 4D Gaussian Splatting generates 3D scenes at specific timestamps using neural network optimization
  • Custom lightweight sequential processor handles multiple messy, real-world video inputs
  • Automated filtering removes motion blur and obstructed frames
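The frame filtering step above could look something like the following. This is a minimal sketch, not the project's actual filter: it assumes frames are already decoded to grayscale grids, uses the common variance-of-Laplacian sharpness heuristic, and the names `laplacian_variance`, `keep_sharp_frames`, and the threshold value are hypothetical.

```python
from typing import List

Frame = List[List[float]]  # grayscale image as rows of pixel intensities


def laplacian_variance(frame: Frame) -> float:
    """Variance of the 4-neighbour Laplacian; low values suggest a blurry frame."""
    responses = []
    for y in range(1, len(frame) - 1):
        for x in range(1, len(frame[0]) - 1):
            lap = (frame[y - 1][x] + frame[y + 1][x]
                   + frame[y][x - 1] + frame[y][x + 1]
                   - 4 * frame[y][x])
            responses.append(lap)
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def keep_sharp_frames(frames: List[Frame], threshold: float) -> List[int]:
    """Return indices of frames whose sharpness score meets the threshold."""
    return [i for i, f in enumerate(frames) if laplacian_variance(f) >= threshold]


# Toy demo: a checkerboard (strong edges) versus a flat gray frame (no detail).
sharp = [[255.0 if (x + y) % 2 == 0 else 0.0 for x in range(6)] for y in range(6)]
flat = [[128.0] * 6 for _ in range(6)]
kept = keep_sharp_frames([sharp, flat], threshold=100.0)  # threshold is an assumed value
```

In practice the threshold would need tuning per camera and lighting condition, and obstructed frames would need a separate check, since an occluding object can still be sharp.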

Intelligence Layer:

  • Gemini extracts object data and their camera-relative positions
  • 3072-dimensional embeddings, sized to match Gemini's hidden layers, enable precise semantic search
  • LangChain orchestrates secure data flow between components
  • RAG integration identifies leading safety risks in real time
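To illustrate the embedding-based search described above, here is a small sketch of ranking stored object descriptions by cosine similarity to a query vector. It is an illustration under assumptions: the 3-dimensional vectors and object labels are made-up stand-ins for real 3072-dimensional embeddings, and `cosine_similarity` and `top_k` are hypothetical helpers rather than part of any library used by the project.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: List[float], index: Dict[str, List[float]], k: int = 2) -> List[Tuple[str, float]]:
    """Return the k stored entries most similar to the query, best first."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy index: 3-dim vectors stand in for 3072-dim embeddings.
index = {
    "scaffolding near east wall": [0.9, 0.1, 0.0],
    "unsecured ladder": [0.7, 0.6, 0.1],
    "concrete mixer": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]  # would come from embedding the user's question
results = top_k(query, index, k=2)
```

A vector database performs the same ranking with an approximate index, so the brute-force loop here is only meant to show what "semantic search" computes.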

Data Infrastructure:

  • Supabase stores embeddings, object locations, and video metadata
  • Vector database enables semantic querying of spatial data
  • Combined camera poses + object data create comprehensive site maps
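Combining camera poses with camera-relative object positions comes down to a rigid transform. The sketch below assumes each pose is available as a 4x4 camera-to-world matrix; tools such as COLMAP store world-to-camera poses, which would need inverting first. The function name and the example pose are hypothetical.

```python
from typing import List, Sequence

Matrix4 = List[List[float]]


def camera_to_world(pose: Matrix4, point_cam: Sequence[float]) -> List[float]:
    """Apply a 4x4 camera-to-world pose to a camera-relative 3D point."""
    x, y, z = point_cam
    homogeneous = [x, y, z, 1.0]
    # Only the first three rows are needed; the last row of a rigid pose is [0, 0, 0, 1].
    return [sum(pose[r][c] * homogeneous[c] for c in range(4)) for r in range(3)]


# Assumed example: camera located at (10, 0, 2) in world coordinates,
# rotated 90 degrees about the z axis.
pose = [
    [0.0, -1.0, 0.0, 10.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 0.0, 1.0],
]
object_in_camera = (3.0, 0.0, 0.0)  # e.g. an object 3 units along the camera's x axis
object_in_world = camera_to_world(pose, object_in_camera)
print(object_in_world)  # [10.0, 3.0, 2.0]
```

Running the same object through poses from several videos and averaging the results is one simple way to reconcile multiple viewpoints into a single site map.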

The modular design allows easy integration of emerging research like EgoGaussian for improved egocentric video processing.


Challenges we ran into

  • Handling messy, real-world construction footage with motion blur and obstructions
  • Synchronizing multiple video perspectives into a coherent 3D model
  • Calculating global object positions from camera-relative data
  • Building a RAG system that could meaningfully query spatial information
  • Creating seamless data flow between Gaussian Splatting, Gemini, and vector storage

Accomplishments that we're proud of

  • Successfully integrated 4D Gaussian Splatting with LLM-based object recognition
  • Built an end-to-end pipeline that transforms raw video into queryable 3D intelligence
  • Created a modular architecture that showcases how existing AI infrastructure can achieve spatial reasoning
  • Demonstrated practical application for construction safety and project management
  • Made spatial analysis accessible through a web interface

What we learned

  • Spatial AI requires creative integration—no single model solves everything
  • Combining camera pose data with semantic understanding unlocks powerful capabilities
  • Vector embeddings can bridge 3D reconstruction and natural language queries
  • Real-world construction footage demands robust preprocessing
  • Modular architecture enables rapid experimentation with emerging research

What's next for Splatt

  • Integration with EgoGaussian for improved first-person video processing
  • Real-time processing for live site monitoring
  • Predictive analytics for workflow optimization
  • Automated compliance reporting for safety violations
  • Mobile app for on-site access
  • Multi-site comparison features for project managers overseeing multiple locations


Built With

  • 4d-gaussian-splatting
  • colmap
  • computer-vision
  • conda
  • cuda
  • gemini
  • langchain
  • nerfstudio
  • next.js
  • python
  • rag
  • spatial-ai
  • supabase
  • vast.ai
  • vector-database