Inspiration

Every developer has code they shipped and can't explain anymore. You wrote the diff, merged the PR, reviewed the comments — and six months later you can't say why you made the decisions you did. Tutorials don't fix this. They teach general concepts, not the specific ones you already paid for by writing them. We wanted a tool that turns what you already built into something you actually retain. The best way to know if you understand something is to say it out loud — not type it, not recognize it from a list, but articulate it. So we built BananaDuck: it watches your merged PRs, pulls out the CS concept in each one, and quizzes you on it with your own voice, on a schedule that adapts to how well you know it.

What it does

BananaDuck is a spaced-repetition voice quiz built from the code you actually merged. You sign in with GitHub, hit Sync, and the app extracts concepts from your merged PRs — things like memoization, idempotency, backpressure — and turns each one into a quiz card tied to your specific implementation. When a card comes due, you speak your answer out loud, and the app transcribes it, grades it, and schedules your next review automatically. Every review gets a calendar block so you never have to remember to come back. The whole loop — sync, concept extraction, voice quiz, grading, rescheduling, calendar event — runs end-to-end against real services with no mocked steps.

How we built it

We built BananaDuck as a FastAPI backend with a Next.js frontend, connected through a set of sponsor integrations that each own one part of the pipeline. GitHub OAuth signs you in, and the GitHub API supplies your merged PRs. Bear-2 compresses each diff before it reaches Claude, cutting token cost by 35–62% per PR while preserving the variable names and structure that make the questions specific to your code. Claude extracts concepts from the compressed diff and grades your spoken answers. Deepgram transcribes the audio. Redis stores all quiz state and serves pre-cached cards so the quiz itself is instant. Poke schedules your next review on your calendar. Sentry traces the full pipeline so every PR's journey from diff to scheduled review is visible in one timeline.

Challenges

Getting idempotency right across syncs was harder than expected. A sync that crashed partway through had to know, on restart, which PRs it had already processed, and we did not want to rely on a local state file, so we built a Redis-based hash that is checked before any API call. We also ran into a session bug where GitHub access tokens were silently dropped between the frontend and backend, causing the dashboard to fall back to mock data without any visible error. And mid-hackathon, we caught an IDOR vulnerability: calendar IDs were being read from the request body instead of being resolved server-side. We fixed it and wrote a regression test before the demo.
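To make the idempotency idea concrete, here is a minimal sketch of a sync loop that checks a processed-PR fingerprint before doing any expensive work. It is an illustration under assumptions, not our production code: the key name `sync:processed`, the fingerprint fields, and the choice of a Redis set (`SISMEMBER` / `SADD`) are all hypothetical, and `InMemorySet` is a stand-in so the example runs without a Redis server.

```python
import hashlib
from typing import Callable, Iterable

PROCESSED_KEY = "sync:processed"  # hypothetical Redis key name


class InMemorySet:
    """Stand-in for the two Redis set commands used below (SISMEMBER, SADD)."""

    def __init__(self) -> None:
        self._sets: dict[str, set[str]] = {}

    def sismember(self, key: str, member: str) -> bool:
        return member in self._sets.get(key, set())

    def sadd(self, key: str, member: str) -> int:
        members = self._sets.setdefault(key, set())
        added = member not in members
        members.add(member)
        return int(added)


def pr_fingerprint(repo: str, number: int, merge_sha: str) -> str:
    """Stable hash identifying one merged PR (the chosen fields are an assumption)."""
    raw = f"{repo}#{number}@{merge_sha}".encode()
    return hashlib.sha256(raw).hexdigest()


def sync_prs(store, prs: Iterable[dict], process: Callable[[dict], None]) -> list[int]:
    """Process each PR at most once; return the PR numbers handled in this run."""
    handled = []
    for pr in prs:
        fp = pr_fingerprint(pr["repo"], pr["number"], pr["merge_sha"])
        if store.sismember(PROCESSED_KEY, fp):
            continue  # finished in an earlier (possibly crashed) sync
        process(pr)  # the expensive API calls would happen here
        store.sadd(PROCESSED_KEY, fp)  # mark as done only after success
        handled.append(pr["number"])
    return handled
```

Marking a PR only after `process` succeeds means a crash mid-PR causes that one PR to be retried on the next sync, while everything already finished is skipped. A client with the same `sismember` and `sadd` methods could be passed in place of the stand-in.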

Accomplishments that we're proud of

We're proud that the full product loop is real. Every integration (GitHub, Bear-2, Claude, Deepgram, Redis, Poke, Sentry) is live, not mocked, and every leg of the pipeline has been tested. We built graceful degradation into every step so that if any one service is unavailable, the rest of the loop still works. And we caught a real security bug and shipped the fix during the hackathon, which felt like proof the codebase was being taken seriously.
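As an illustration of what graceful degradation can look like for one step, the sketch below wraps a pipeline step so that a failure is logged and a fallback value is used instead. The helper names (`try_step`, `prepare_diff`) and the specific fallback (sending the uncompressed diff when compression fails) are assumptions made for the example, not a description of our exact code.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("pipeline")  # hypothetical logger name


def try_step(name: str, step: Callable[[], T], fallback: T) -> T:
    """Run one pipeline step; on any failure, log it and return the fallback."""
    try:
        return step()
    except Exception:
        log.warning("step %r failed, using fallback", name, exc_info=True)
        return fallback


def prepare_diff(diff: str, compress: Callable[[str], str]) -> str:
    """Compress the diff if the compressor is reachable, else pass it through as-is."""
    return try_step("compress", lambda: compress(diff), fallback=diff)
```

With this shape, an outage in the compression service costs extra tokens for that PR but does not stop concept extraction from running.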

What's next

We want to refine how concepts are extracted and how questions are written to get closer to the experience we imagined. After the demo, the SM-2 review intervals go back to being measured in days, and we plan to add a calendar view so your full review schedule is visible at a glance. Longer term, we want team workspaces, so a tech lead can see the concepts their whole org keeps re-shipping and catch recurring anti-patterns before they become habits.
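For readers unfamiliar with SM-2, here is a minimal sketch of the commonly cited form of the algorithm, with the interval unit pulled out as a parameter. The `CardState` fields, the function names, and the `unit` parameter are assumptions for illustration; in particular, `unit` is just one way a build could swap days for something shorter, and the 0 to 5 quality scale is the standard SM-2 grading scale rather than a statement about how our grader scores answers.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CardState:
    repetitions: int = 0   # consecutive successful reviews
    interval: int = 0      # length of the current interval, in `unit`s
    ease: float = 2.5      # SM-2's starting easiness factor


def sm2_review(state: CardState, quality: int) -> CardState:
    """Return the card's next state after a review graded 0 (blank) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality >= 3:
        if state.repetitions == 0:
            interval = 1
        elif state.repetitions == 1:
            interval = 6
        else:
            interval = round(state.interval * state.ease)
        repetitions = state.repetitions + 1
    else:
        # A failed answer restarts the card at the shortest interval.
        repetitions, interval = 0, 1
    miss = 5 - quality
    ease = state.ease + (0.1 - miss * (0.08 + miss * 0.02))
    return CardState(repetitions, interval, max(1.3, ease))


def next_due(state: CardState, now: datetime,
             unit: timedelta = timedelta(days=1)) -> datetime:
    """Convert the abstract interval into a due time using the chosen unit."""
    return now + state.interval * unit
```

Keeping the unit outside the scheduling math means the same card state works whether intervals are counted in days or in a shorter unit, which is the kind of switch described above.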