Inspiration
We’ve all been TAs or graders and know grading is tedious, repetitive, and inconsistent. Clicking through thousands of rubric items wastes time and increases errors, while universities face high grading costs. GradeMate was built to make grading faster, fairer, and cheaper.
What it does
GradeMate is a full-stack AI-powered homework grader that turns problem sets into precise, editable rubrics and then grades student submissions—typed or handwritten—in minutes. Users can upload a problem set; GradeMate parses the structure, generates a detailed scoring breakdown for each problem and subproblem, and applies it consistently across submissions. Crucially, instructors stay in control: they can review and edit rubrics and individual scores, so the AI accelerates the process without becoming a black box. The result is faster, fairer feedback for students and significant time back for instructors and TAs.
How we built it
We built GradeMate as a full-stack prototype that pairs a Next.js/TypeScript frontend with a FastAPI Python backend, using Google Gemini for the LLM work. The backend accepts PDF uploads and sends them to the Gemini model (gemini-2.5-flash-lite), then runs a staged agent pipeline: a preprocessing agent converts assignment PDFs into strict JSON rubrics, an extraction agent turns student PDFs into faithful Markdown answers, and a grading agent returns per-item JSON scores with explanations and confidence values. These heavy LLM tasks run as FastAPI BackgroundTasks so endpoints stay responsive. The frontend polls the assignment/submission endpoints to detect when background tasks finish, displays the student PDF side by side with editable rubric-driven grading cards, and lets instructors review, adjust, and persist grades, combining automation with transparent human-in-the-loop control.
Challenges we ran into
We discovered that real submission data is messy: inconsistent formatting, handwriting, and scanning artifacts pushed us toward more robust parsing strategies. LLM output is also non-deterministic, so prompt engineering and output validation were essential to keep results consistent enough for downstream use.
Accomplishments that we're proud of
We designed and engineered a fully functional prototype with both a web interface and a backend, integrated via Gemini API calls. We tested it on 15+ diverse problem sets (linear algebra, discrete math, Foundations of Computer Science, and more) and 35+ real student submissions, delivering high accuracy and cutting grading effort by over 95%.
What we learned
Prompt design and output validation are crucial. LLMs need strict guardrails to be reliable in production. We also learned that workflow design matters as much as model quality—breaking grading into smaller, staged agents (preprocessing, extraction, scoring) dramatically improves accuracy compared to a single monolithic prompt. Building a smooth human-in-the-loop experience taught us that transparency is key for instructor trust; even highly accurate AI still needs clear explanations and easy overrides.
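The staged-agent idea can be shown in miniature. In this sketch each stage is a stub with a narrow input/output contract (the actual LLM calls are elided, and the rubric/answer shapes are illustrative assumptions):

```python
from dataclasses import dataclass

@dataclass
class Submission:
    pdf_path: str

def preprocess_rubric(assignment_pdf: str) -> dict:
    """Stage 1: assignment PDF -> strict JSON rubric (LLM call elided)."""
    return {"1a": {"max_points": 5, "criteria": "correct row reduction"}}

def extract_answers(submission: Submission) -> dict[str, str]:
    """Stage 2: student PDF -> Markdown answers keyed by item (elided)."""
    return {"1a": "Row-reduced to the identity, so the matrix is invertible."}

def grade(rubric: dict, answers: dict[str, str]) -> dict[str, float]:
    """Stage 3: score each answered item against the rubric (elided)."""
    return {item: rubric[item]["max_points"] for item in answers}

def run_pipeline(assignment_pdf: str, submission: Submission) -> dict[str, float]:
    # Each stage has its own contract, so failures are localized and
    # each prompt can be validated independently of the others.
    rubric = preprocess_rubric(assignment_pdf)
    answers = extract_answers(submission)
    return grade(rubric, answers)
```

Splitting the work this way is what made validation tractable for us: a bad rubric is caught at stage 1 rather than surfacing as a mysterious grading error two stages later.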
What's next for GradeMate
Next, we plan to implement a multi-layer data model, introducing a Class module that groups assignments. We also want to automatically highlight the sections of a student submission that correspond to specific subquestions and rubric items. In addition, we plan to integrate user feedback loops so the system adapts to each instructor's grading style, improving personalization and trust over time.
Built With
- fastapi
- gemini
- langchain
- next.js
- python
- tailwind
- typescript
