Textify

Input for MathPix API and Google Gemini Response
Profile of all stored outputs given a user
Outputted MathPix API and Google Gemini Response
Log In Page (Fully Functional)

Inspiration

We were inspired to build Textify after noticing how often students struggle to understand textbook problems, handwritten notes, or screenshots that can’t easily be copied and pasted into online learning tools. Many of these materials, like physics formulas or math problems, are locked inside images or PDFs. We wanted to create a platform that bridges that gap: turning confusing pictures into editable, searchable, and analyzable text while also highlighting the most important keywords.

What it does

Textify allows users to upload any image or PDF that contains text, formulas, or diagrams. The system automatically runs OCR (Optical Character Recognition) to extract readable text, then uses AI to identify key terms in it.

Once converted, the user can:
- View the extracted text in an editable box
- Copy and paste the text for deeper research or problem-solving
- Save both the original image and the converted version to their library for future access
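The keyword highlighting mentioned above could be approximated as follows. This is only a minimal sketch, not Textify's actual code: the function name `highlight_keywords` and the `**` marker are hypothetical, and the keyword list is assumed to arrive from the AI step as plain strings.

```python
import re


def highlight_keywords(text: str, keywords: list[str], marker: str = "**") -> str:
    """Wrap each whole-word keyword occurrence in `marker` (case-insensitive)."""
    # Drop blank entries; try longer terms first so that a phrase such as
    # "kinetic energy" wins over the shorter "energy" at the same position.
    terms = sorted({k.strip() for k in keywords if k.strip()}, key=len, reverse=True)
    if not terms:
        return text
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    # Keep the original casing of the matched text inside the markers.
    return pattern.sub(lambda m: f"{marker}{m.group(0)}{marker}", text)


if __name__ == "__main__":
    print(highlight_keywords("Kinetic energy is energy of motion.",
                             ["energy", "kinetic energy"]))
```

The word-boundary match suits ordinary words and phrases; keywords that begin or end with symbols would need a different boundary rule.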

How we built it

We built Textify using Next.js for the frontend, hosted on Vercel, and connected it to a FastAPI backend hosted on Render.

Key components include:
- MathPix API for accurate text and formula extraction (OCR)
- Google Gemini API for generating summaries and highlighting keywords
- Azure Blob Storage for handling uploaded images
- TypeScript + Tailwind CSS for a clean, responsive UI
- MySQL database for saving user sessions and text conversions
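One way to picture how these components fit together on the backend is the sketch below. It is an illustration under stated assumptions rather than the project's real code: `Conversion` and `convert_upload` are hypothetical names, and the three services (image storage, OCR, keyword extraction) are passed in as plain callables so that no specific client API is assumed.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Conversion:
    """One saved result: where the image lives plus what was extracted."""
    image_url: str
    text: str
    keywords: list[str] = field(default_factory=list)


def convert_upload(
    image_bytes: bytes,
    store_image: Callable[[bytes], str],            # stands in for blob storage, returns a URL
    run_ocr: Callable[[bytes], str],                # stands in for the OCR service
    extract_keywords: Callable[[str], list[str]],   # stands in for the AI keyword step
) -> Conversion:
    """Store the upload, transcribe it, then extract keywords from the text."""
    if not image_bytes:
        raise ValueError("empty upload")
    image_url = store_image(image_bytes)
    text = run_ocr(image_bytes).strip()
    # Skip the keyword call when OCR found nothing to analyze.
    keywords = extract_keywords(text) if text else []
    return Conversion(image_url=image_url, text=text, keywords=keywords)


if __name__ == "__main__":
    # Stub services so the sketch runs without any network access.
    result = convert_upload(
        b"fake-image-bytes",
        store_image=lambda data: "https://example.invalid/images/1",
        run_ocr=lambda data: " F = ma ",
        extract_keywords=lambda text: ["force", "mass"],
    )
    print(result)
```

Injecting the services as callables keeps the flow testable with stubs; in a deployed app each callable would wrap the corresponding real client.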

Challenges we ran into

- Integrating multiple APIs (MathPix and Gemini) and synchronizing their outputs was complex, especially since each returns differently structured JSON.
- Handling cross-origin (CORS) issues between Vercel (frontend) and Render (backend).
- Managing file uploads and large image data efficiently.
- Ensuring that OCR worked accurately on both handwritten and printed equations.
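To illustrate the JSON mismatch, the sketch below flattens two differently shaped responses into one record. The response layouts are assumptions made for this example and should be checked against each API's current documentation: the OCR response is assumed to carry a top-level `text` field (and an `error` field on failure), the model response is assumed to nest its reply under `candidates[0].content.parts[*].text`, and the model is assumed to have been prompted to reply with a JSON object holding `summary` and `keywords`. All function names are hypothetical.

```python
import json
from typing import Any


def ocr_text(ocr_response: dict[str, Any]) -> str:
    """Pull the transcription out of an OCR response (assumed top-level "text")."""
    if "error" in ocr_response:
        raise ValueError(f"OCR failed: {ocr_response['error']}")
    return str(ocr_response.get("text", "")).strip()


def llm_text(llm_response: dict[str, Any]) -> str:
    """Join the text parts of the first candidate (assumed nested layout)."""
    candidates = llm_response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(p.get("text", "") for p in parts)


def merge_results(ocr_response: dict[str, Any], llm_response: dict[str, Any]) -> dict[str, Any]:
    """Return one flat record: {"text", "summary", "keywords"}."""
    text = ocr_text(ocr_response)
    raw = llm_text(llm_response)
    # Fall back to treating the whole reply as a prose summary.
    summary, keywords = raw.strip(), []
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = None
    if isinstance(parsed, dict):
        summary = str(parsed.get("summary", ""))
        kw = parsed.get("keywords")
        keywords = [str(k) for k in kw] if isinstance(kw, list) else []
    return {"text": text, "summary": summary, "keywords": keywords}
```

Normalizing both responses at one boundary means the frontend only ever sees a single shape, whichever service changes its format later.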

Accomplishments that we're proud of

- Successfully built a fully deployed end-to-end pipeline where users can upload an image and instantly receive a clean text transcription.
- Integrated AI-based keyword extraction to make the output more meaningful and interactive.
- Designed a modern, accessible UI with lightbox previews, live saving, and a profile library.
- Deployed both backend and frontend in production (Render + Vercel).

What we learned

- How to bridge frontend and backend communication using APIs hosted on different platforms.
- The power of combining OCR with AI-based summarization to make raw text more useful.
- How to deploy and manage cloud-hosted web apps under real-world constraints (timeouts, build commands, environment variables).
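As a rough illustration of the first and last points, the sketch below reads deployment settings from environment variables and decides which CORS headers to send for a given request origin. The variable names `ALLOWED_ORIGINS` and `REQUEST_TIMEOUT_S`, their defaults, and both function names are hypothetical. In a FastAPI app this check is normally delegated to CORS middleware rather than written by hand; the logic is spelled out here only to show what such a configuration does.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    allowed_origins: tuple[str, ...]
    request_timeout_s: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from the environment (variable names are assumptions)."""
    env = os.environ if env is None else env
    raw = env.get("ALLOWED_ORIGINS", "http://localhost:3000")
    # Comma-separated list; trailing slashes are dropped because browsers
    # send the Origin header without one.
    origins = tuple(o.strip().rstrip("/") for o in raw.split(",") if o.strip())
    timeout = float(env.get("REQUEST_TIMEOUT_S", "30"))
    if timeout <= 0:
        raise ValueError("REQUEST_TIMEOUT_S must be positive")
    return Settings(allowed_origins=origins, request_timeout_s=timeout)


def cors_headers(origin: Optional[str], settings: Settings) -> dict[str, str]:
    """Echo the origin back only when it is on the allowlist."""
    if origin is None or origin not in settings.allowed_origins:
        return {}
    return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
```

Keeping the allowlist in an environment variable lets the same backend build accept a local frontend in development and the deployed frontend in production.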

What's next for Textify

- Adding support for more file types (Word, PowerPoint, handwritten notes).
- Implementing text-to-speech so users can listen to extracted notes.
- Building AI-powered question-answering directly on the extracted content.
- Improving accuracy and speed with a fine-tuned OCR pipeline.
- Expanding the platform into a mobile version for quick note scanning.