🎧 EduHacks AI Note-Generator: Backend (Flask + Gemini) & Frontend (HTML, CSS & JS)

An intelligent AI Note-Generator that extracts, summarizes, translates, and speaks back notes from audio, powered by Flask, the Gemini API, and AI-powered ETL (Extract–Transform–Load) processing.


🚀 Overview

EduHacks AI Note-Generator is a Flask-based backend application that automates lecture and meeting note-taking. You simply upload an audio file, and it performs a full AI-powered ETL pipeline:

  1. 🎀 Transcription: converts speech to text.
  2. 🧠 Summarization: generates bullet-point summaries using LexRank.
  3. 🗂 Flashcards: creates intelligent Q&A flashcards.
  4. 🔊 Text-to-Speech (TTS): converts summaries back to audio.
  5. 🌍 Translation: translates summaries into your preferred language using the Gemini API (default: Urdu 🇵🇰).
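Conceptually, the five stages above compose into a single pipeline function. The sketch below uses placeholder callables rather than the project's actual module APIs (the real implementations live in speech_to_text.py, summarizer.py, flashcards.py, tts.py, and translator.py; the signatures here are illustrative only):

```python
# Hedged sketch of the ETL pipeline: each stage is passed in as a plain
# callable, so the orchestration logic is independent of any one backend.

def run_pipeline(audio_path, transcribe, summarize, make_flashcards,
                 synthesize, translate, target_lang="ur"):
    """Run the full audio -> notes pipeline and return one result dict."""
    transcript = transcribe(audio_path)           # 1. speech to text
    bullets = summarize(transcript)               # 2. bullet-point summary
    cards = make_flashcards(transcript)           # 3. Q&A flashcards
    audio_out = synthesize(" ".join(bullets))     # 4. summary back to audio
    translated = translate(" ".join(bullets), target_lang)  # 5. translation
    return {
        "transcript": transcript,
        "bullets": bullets,
        "flashcards": cards,
        "translated_summary": translated,
        "audio_summary": audio_out,
    }
```

Because the stages are plain callables, each one can be swapped or stubbed out in tests without touching the orchestration.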

🧩 Tech Stack

| Layer | Tools / Libraries |
|---|---|
| Backend | Flask, Flask-CORS, Werkzeug |
| AI & NLP | SpeechRecognition, Sumy, NLTK |
| Text-to-Speech | pyttsx3 (offline), gTTS (online fallback) |
| Translation | Google Gemini API |
| Audio Processing | pydub |
| Data Validation | Pydantic |
| Runtime | Python 3.9+ |

βš™οΈ Installation & Setup

1️⃣ Clone the Repository

git clone https://github.com/ck-ahmad/EduHacks_AI_Note_Creator.git
cd EduHacks_AI_Note_Creator

2️⃣ Create a Virtual Environment

python -m venv .venv
source .venv/bin/activate      # Windows: .venv\Scripts\activate

3️⃣ Install Dependencies

pip install -r requirements.txt

4️⃣ Configure Gemini API Key

Create a .env file (or set the variable directly in your environment):

API_Key_2=YOUR_GEMINI_API_KEY

You can get your API key from:
🔗 https://aistudio.google.com/app/apikey
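A minimal sketch of reading the key at startup, assuming the `API_Key_2` variable name from the step above (loading the `.env` file itself would additionally need a package such as python-dotenv; the helper name here is hypothetical, not from the project code):

```python
import os


def load_gemini_key(var_name="API_Key_2"):
    """Read the Gemini API key from the environment, failing loudly if unset."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set {var_name} in your environment or .env file")
    return key
```

Failing at startup with a clear message is friendlier than a cryptic authentication error on the first translation request.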

5️⃣ Download NLTK Data (First Run)

python -c "import nltk; nltk.download('punkt'); nltk.download('punkt_tab')"

6️⃣ Run the App

python app.py

Server runs on 👉 http://localhost:5000


🧠 API Endpoints

✅ Health Check

GET /health

Response:

{
  "status": "ok",
  "message": "EduHacks AI Note-Taker backend running"
}

🎀 Upload Audio (Full ETL Pipeline)

POST /api/upload

Form Data:

  • file: audio file (.mp3, .wav, .m4a, etc.)
  • lang: optional target language code (default "ur" for Urdu)

Response:

{
  "transcript": "...",
  "bullets": ["...", "..."],
  "flashcards": [
    {"question": "...", "answer": "..."}
  ],
  "translated_summary": "...",
  "files": {
    "transcript_txt": "/outputs/xxxx_transcript.txt",
    "summary_txt": "/outputs/xxxx_summary.txt",
    "flashcards_txt": "/outputs/xxxx_flashcards.txt",
    "translated_summary_txt": "/outputs/xxxx_summary_translated_ur.txt",
    "audio_summary": "/outputs/xxxx_summary.mp3"
  }
}
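The `files` paths in the response all share a per-upload ID prefix. A hedged sketch of how such paths might be assembled (the naming convention is inferred from the example response above, not taken from the project's source; the function name is hypothetical):

```python
def output_paths(upload_id, lang="ur"):
    """Build the /outputs paths for one upload, mirroring the response example."""
    base = f"/outputs/{upload_id}"
    return {
        "transcript_txt": f"{base}_transcript.txt",
        "summary_txt": f"{base}_summary.txt",
        "flashcards_txt": f"{base}_flashcards.txt",
        "translated_summary_txt": f"{base}_summary_translated_{lang}.txt",
        "audio_summary": f"{base}_summary.mp3",
    }
```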

🎧 Outputs

Files are saved in the /outputs folder:

  • transcript.txt → raw speech-to-text output
  • summary.txt → bullet summary
  • flashcards.txt → generated Q&A pairs
  • summary_translated_ur.txt → translated summary
  • summary.mp3 → AI-generated audio version

You can access them directly via:

GET /outputs/<filename>

🗣 Supported Audio Formats

  • .wav
  • .mp3
  • .m4a
  • .aac
  • .ogg
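A server accepting uploads would typically whitelist these extensions before running the pipeline. A small sketch of such a check (the helper name and constant are illustrative, not from the project code):

```python
import os

# Extensions from the supported-formats list above.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".aac", ".ogg"}


def is_supported_audio(filename):
    """Return True when the filename carries one of the supported extensions."""
    return os.path.splitext(filename.lower())[1] in ALLOWED_EXTENSIONS
```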

🌍 Translation with Gemini

The backend uses Gemini 2.5 Flash via the official google-genai SDK.

Example (translator.py):

```python
from google import genai


class GeminiTranslator:
    def __init__(self, api_key, model="gemini-2.5-flash"):
        self.client = genai.Client(api_key=api_key)
        self.model = model

    def translate(self, text, target_language):
        prompt = (
            f"Translate the following text to {target_language}:\n\n"
            f"{text}\n\nOnly return translation."
        )
        response = self.client.models.generate_content(
            model=self.model, contents=prompt
        )
        return response.text.strip()
```
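Network calls to the Gemini API can fail or time out, and a failed translation should not sink the rest of the pipeline. One defensive pattern is a wrapper that falls back to the untranslated text; this wrapper is a suggestion, not part of the project, and is written against a generic callable so it could wrap `GeminiTranslator.translate` or any other translator:

```python
def translate_or_fallback(translate_fn, text, target_language):
    """Try the translator; on any error, return the original text unchanged."""
    try:
        return translate_fn(text, target_language)
    except Exception:
        # Degrade gracefully: an untranslated summary beats a failed request.
        return text
```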

🧾 Folder Structure

```
📦 EduHacks_AI_Note_Creator
│
├── app.py
├── translator.py
├── speech_to_text.py
├── summarizer.py
├── flashcards.py
├── tts.py
│
├── uploads/            # Uploaded audio files
├── outputs/            # Generated files (txt, mp3)
├── templates/
│   └── home.html       # Optional frontend
├── static/
│
├── requirements.txt
└── README.md
```

🧰 Requirements.txt

flask
flask-cors
werkzeug
pydub
SpeechRecognition
sumy
nltk
pyttsx3
gTTS
pydantic==2.8.2
google-genai

💡 Future Enhancements

  • 🔹 Multi-language UI (frontend translation toggle)
  • 🔹 Database support (store notes, summaries, and metadata)
  • 🔹 Real-time transcription (WebSocket)
  • 🔹 User authentication & dashboards
  • 🔹 Audio segmentation for longer files
  • 🔹 React or Next.js frontend integration

πŸ§‘β€πŸ’» Contributors

| Name | Role |
|---|---|
| Ahmad | Developer & ML Integrator |
| Aizazullah | Frontend Development Assistant |

🏁 License

This project is licensed under the MIT License and is free for educational and personal use.
