🏗️ Inspiration

Construction professionals and civil engineers often rely on visual data to assess infrastructure, document materials, and ensure safety. We were inspired to create an AI-powered assistant that analyzes construction images and returns instant, structured feedback about visible components, materials, and equipment, cutting down on manual documentation.

🧠 What it does

GemStructify lets users upload a construction-related image (e.g., bridges, buildings, work sites) and uses Gemini AI to analyze the photo. It returns a structured breakdown in JSON format that categorizes:

🏗️ Structural Components (e.g., beams, trusses, slabs)

🧱 Building Materials (e.g., bricks, concrete, steel)

🧰 Construction Equipment (e.g., cranes, excavators)
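
To make the three categories concrete, here is a minimal sketch of what the structured breakdown could look like and how it might be normalized before display. The key names (`structural_components`, `building_materials`, `construction_equipment`) and the helper `normalize_analysis` are assumptions for illustration; the writeup names the categories but not the exact JSON keys.

```python
import json

# Hypothetical key names: the writeup lists the categories, not the JSON keys.
CATEGORIES = ("structural_components", "building_materials", "construction_equipment")


def normalize_analysis(raw_json: str) -> dict[str, list[str]]:
    """Parse a JSON string and keep only the three expected categories.

    Missing categories become empty lists; non-string or blank items are dropped.
    """
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    result: dict[str, list[str]] = {}
    for key in CATEGORIES:
        items = data.get(key, [])
        if not isinstance(items, list):
            items = []
        result[key] = [
            item.strip() for item in items if isinstance(item, str) and item.strip()
        ]
    return result


# Illustrative input only, not real model output.
sample = (
    '{"structural_components": ["truss", "deck slab"], '
    '"building_materials": ["steel", " concrete "], '
    '"notes": "ignored"}'
)
analysis = normalize_analysis(sample)
print(analysis)
```

Filling in missing categories with empty lists means the UI can always render all three sections, even when the model finds nothing for one of them.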

The app also provides an animated frontend experience, complete with JWT-based login and a clean UX that highlights the AI-generated findings in real time.

🔧 How we built it

Frontend: Built using Next.js, Tailwind CSS, and Framer Motion for animated UI

Backend: Powered by FastAPI, handles auth, uploads, and Gemini integration

Authentication: Secure JWT-based login system using MongoDB to store users

AI Analysis: Uses Gemini 1.5 Pro through LangChain to interpret image data

Deployment: The backend runs locally and is exposed through an ngrok tunnel, so the frontend can reach the API over HTTPS during development and testing

🚧 Challenges we ran into

Sending image data to Gemini in a format it could parse. We resolved this by converting images to base64 and structuring the messages to match the input format Gemini expects

Managing authentication and secure access to protected pages using JWT

Parsing the raw AI response and presenting it in a clean, readable, animated UI

Resolving CORS issues and getting the frontend and backend to communicate through ngrok
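
The base64 step from the first challenge can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the prompt wording, the helper name `build_image_message`, and the exact content-part layout (a text part plus an `image_url` part carrying a data URL) are all assumed. In a LangChain setup, a list like this would typically be passed as the content of a human message, but check the integration's documentation for the shape it accepts.

```python
import base64

# Hypothetical instruction text; the real prompt is not given in the writeup.
PROMPT = (
    "List the structural components, building materials, and construction "
    "equipment visible in this image. Respond with JSON only."
)


def build_image_message(image_bytes: bytes, mime_type: str = "image/jpeg") -> list[dict]:
    """Pair an instruction with an image embedded as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return [
        {"type": "text", "text": PROMPT},
        {"type": "image_url", "image_url": {"url": data_url}},
    ]


# Placeholder bytes standing in for an uploaded photo.
fake_image = b"\xff\xd8\xff\xe0 placeholder bytes"
content = build_image_message(fake_image)
print(content[1]["image_url"]["url"][:40])
```

Embedding the image as a data URL keeps the request self-contained, so the backend does not need to host the upload anywhere before calling the model.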

🏅 Accomplishments that we're proud of

Successfully integrated image upload and AI interpretation in a single app

Built a complete login-authenticated flow with live user session tracking

Delivered a fast, interactive, and animated user interface

Created a fully structured, categorized, and readable output from raw AI text
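
Turning raw AI text into structured output usually means pulling the JSON object out of whatever the model wraps around it. The sketch below shows one way this could be done; `extract_json` is a hypothetical helper, and the assumption that the model sometimes wraps JSON in a Markdown code fence or surrounding chatter is ours, not something the writeup states.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json(raw_text: str) -> dict:
    """Return the first JSON object found in model output.

    Looks inside a fenced block when one exists, otherwise scans the whole text
    from the first '{' to the last '}'.
    """
    fenced = FENCED_BLOCK.search(raw_text)
    candidate = fenced.group(1) if fenced else raw_text
    start = candidate.find("{")
    end = candidate.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    return json.loads(candidate[start : end + 1])


# Illustrative model reply, not a captured response.
reply = (
    "Here is the analysis:\n"
    + FENCE + "json\n"
    + '{"building_materials": ["steel"]}\n'
    + FENCE + "\nLet me know if you need more detail."
)
parsed = extract_json(reply)
print(parsed)
```

If parsing fails, raising an error lets the API return a clear failure instead of passing malformed text to the UI.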

📚 What we learned

How to send and format multimodal inputs (image + instruction) to Gemini AI

How to refine prompt design to generate structured, predictable JSON responses

How the JWT authentication flow works, and how to decode tokens directly on the frontend

Working with Framer Motion to create delightful animations that enhance UX
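
For readers curious what client-side JWT decoding involves, here is a small sketch of the idea in Python (the project itself uses the jwt-decode package in TypeScript). The helper names and the demo claims are made up for illustration. Decoding only reads the payload; it does not verify the signature, so the backend still has to validate the token on every protected request.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Read the claims from a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT has three dot-separated parts")
    payload_b64 = parts[1]
    # base64url segments omit '=' padding, so restore it before decoding.
    padding = "=" * (-len(payload_b64) % 4)
    payload_bytes = base64.urlsafe_b64decode(payload_b64 + padding)
    return json.loads(payload_bytes)


def _b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Hand-built demo token; the third segment is a placeholder, not a real signature.
demo_token = ".".join(
    [
        _b64url({"alg": "HS256", "typ": "JWT"}),
        _b64url({"sub": "demo-user", "exp": 1900000000}),
        "signature-not-checked",
    ]
)
claims = decode_jwt_payload(demo_token)
print(claims)
```

A frontend can use claims such as `exp` to decide when to redirect to the login page, while treating them as display hints only.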

🚀 What's next for GemStructify

🔍 Add real-time object detection for live construction monitoring

🏗️ Enable automatic safety checks or risk detection from site images

💾 Allow users to save and export analysis reports for documentation

🛠️ Expand to support construction plan drawings, drone footage, and 3D scans

🌐 Deploy to production with a proper domain and host backend using Render or Railway

🛠️ Built With

🧠 Gemini 1.5 Pro (via LangChain) – Multimodal AI for analyzing images

⚙️ FastAPI – Backend framework for handling auth, image uploads, and Gemini integration

🌐 Next.js – React framework for frontend with server components

💨 Tailwind CSS – Utility-first CSS for styling

🎞️ Framer Motion – For smooth, animated transitions in the UI

🧾 jwt-decode – Client-side JWT decoding for session control

🍃 MongoDB Atlas – Database for storing user credentials

🔐 OAuth2 & JWT – Secure authentication and access control

🌉 ngrok – Tunnels public traffic to the local backend during development

📦 Python – Backend language for FastAPI, JWT, and Gemini setup

🟨 TypeScript – Strongly-typed language for frontend logic

📤 FormData API – For sending uploaded images to the backend

📷 base64 Encoding – To embed images into Gemini’s input schema

🪄 LangChain – To orchestrate structured prompts with Gemini

💡 Vercel (optional) – For frontend deployment (Next.js hosting)

Updates

super sonic started this project — Mar 31, 2025 05:53 PM EDT