💡 Inspiration
In classrooms and online lectures, students often struggle to focus while taking notes, especially when the content is lengthy, fast-paced, or in a different language. We wanted to create something that helps learners capture, summarize, and understand information with little effort, even across language barriers. That’s how AI Note Generator was born: a system that turns audio lectures into summarized, translated, and spoken notes with one click.
⚙️ What it does
AI Note Generator is an AI-powered ETL (Extract–Transform–Load) pipeline for educational audio. It allows users to:
🎙️ Upload lecture or meeting audio
🧠 Transcribe speech into text
✍️ Summarize it into bullet notes
📚 Generate flashcards for revision
🌍 Translate the notes into the user’s preferred language (via Gemini API)
🔊 Convert summaries back into speech for audio playback
In short, it’s your AI study assistant that listens, summarizes, and teaches you back!
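To make the ETL idea concrete, here is a minimal sketch of how the stages could be chained. It is an illustration, not the project's actual code: the `Notes` container, `run_pipeline`, and the three `fake_*` stages are hypothetical names, and the stages return canned data where a real build would call a speech recognizer, a summarizer, and a translation API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Notes:
    """Data handed from one pipeline stage to the next (hypothetical container)."""
    audio_path: str
    transcript: str = ""
    summary: List[str] = field(default_factory=list)
    translated: List[str] = field(default_factory=list)


# A stage takes the notes so far and returns them with one more field filled in.
Stage = Callable[[Notes], Notes]


def run_pipeline(notes: Notes, stages: List[Stage]) -> Notes:
    """Apply each stage in order, feeding the output of one into the next."""
    for stage in stages:
        notes = stage(notes)
    return notes


def fake_transcribe(notes: Notes) -> Notes:
    # Placeholder for the extract step: a real stage would run speech-to-text.
    notes.transcript = (
        "Photosynthesis converts light into chemical energy. "
        "It occurs in chloroplasts."
    )
    return notes


def fake_summarize(notes: Notes) -> Notes:
    # Placeholder for the transform step: treat each sentence as one bullet.
    notes.summary = [s.strip() for s in notes.transcript.split(".") if s.strip()]
    return notes


def fake_translate(notes: Notes) -> Notes:
    # Placeholder for translation: tag each bullet instead of calling an API.
    notes.translated = [f"[es] {line}" for line in notes.summary]
    return notes


if __name__ == "__main__":
    result = run_pipeline(
        Notes("lecture.wav"),
        [fake_transcribe, fake_summarize, fake_translate],
    )
    for line in result.translated:
        print(line)
```

Keeping every stage behind the same signature means a step such as flashcard generation or text-to-speech can be added or swapped without touching the others.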
🛠️ How we built it
We used:
Flask for the backend API and server
SpeechRecognition and pydub for speech-to-text processing
Sumy and NLTK for text summarization
Custom Flashcard Generator for extracting Q&A pairs
Google Gemini API for language translation
gTTS and pyttsx3 for text-to-speech conversion
Flask-CORS for frontend integration
Hosted locally with modular Python scripts for ETL steps
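As an illustration of what extracting Q&A pairs can look like, the sketch below turns simple definitional sentences ("X is Y") into flashcards. This is an assumed heuristic, not necessarily how our generator works; `make_flashcards` and its regular expressions are hypothetical and use only the Python standard library.

```python
import re
from typing import List, Tuple

# Leading articles are lowercased so the question reads naturally.
ARTICLES = ("A ", "An ", "The ")


def make_flashcards(text: str) -> List[Tuple[str, str]]:
    """Return (question, answer) pairs for sentences shaped like 'X is/are Y'."""
    cards = []
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        sentence = sentence.rstrip(".!?").strip()
        # Split on the first standalone 'is' or 'are' in the sentence.
        match = re.match(r"^(.+?)\s+(is|are)\s+(.+)$", sentence)
        if not match:
            continue  # Not a definition, so no card is produced.
        term, verb, definition = match.groups()
        if term.startswith(ARTICLES):
            term = term[0].lower() + term[1:]
        question = f"What {verb} {term}?"
        answer = definition[0].upper() + definition[1:] + "."
        cards.append((question, answer))
    return cards


if __name__ == "__main__":
    sample = (
        "A neuron is a nerve cell. "
        "Mitochondria are the powerhouses of the cell. "
        "Remember to revise chapter three."
    )
    for question, answer in make_flashcards(sample):
        print(question, "->", answer)
```

A rule like this only catches plain definitions, so it skips sentences such as the last one in the sample. Covering more sentence shapes would need extra patterns or a language model.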
🚧 Challenges we ran into
Handling audio files of different formats and qualities
Balancing accuracy vs. speed in transcription and summarization
Integrating multiple AI components into a smooth pipeline
Managing Gemini API rate limits and consistent translation quality
Building an architecture that remains modular, scalable, and lightweight
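For the rate-limit challenge in particular, one common approach is to retry with exponential backoff. The sketch below shows the general pattern under stated assumptions: `RateLimitError` is a hypothetical stand-in for whatever error a translation client raises when it is throttled, and `call_with_backoff` is an illustrative helper, not part of any Gemini SDK.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error a translation client might raise when throttled."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with exponentially growing waits."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # Out of retries: surface the error to the caller.
            # Double the wait on each retry and add random jitter so that
            # parallel requests do not all retry at the same moment.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise RuntimeError("unreachable")  # The loop always returns or raises.


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_translate() -> str:
        # Simulated API call that is throttled twice before succeeding.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise RateLimitError()
        return "hola"

    print(call_with_backoff(flaky_translate, base_delay=0.01))
```

The `sleep` parameter is injectable so the retry logic can be tested without real waiting.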
🏆 Accomplishments that we're proud of
Developed a fully functional AI ETL pipeline within hackathon time limits
Integrated speech recognition, summarization, translation, and TTS seamlessly
Created a tool that promotes accessible learning, especially for multilingual users
Successfully built a Flask-based backend that can be extended with a frontend or mobile app
📚 What we learned
Working with AI language models (Gemini) for translation
Applying ETL logic beyond data engineering, to audio workflows
Designing efficient pipelines for multi-step AI tasks
Improving collaboration and debugging modular AI systems under time pressure
🚀 What's next for AI Note Generator
🔹 Build a React-based frontend for real-time upload & visualization
🔹 Add voice-controlled commands (e.g., “summarize this lecture”)
🔹 Introduce note organization with tagging & cloud sync
🔹 Train a custom summarization model fine-tuned for educational data
🔹 Deploy on AWS or Hugging Face Spaces for global access
Built With
- css
- flask
- gemini-api
- html
- javascript
- natural-language-processing
- nltk
- pydantic
- pydub
- python
- speech-recognizer
- werkzeug

