Skip to content

YuvamKumarSWE/StudyForgeAI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

100 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

๐Ÿš€ StudyForgeAI: Intelligent Multi-Source Study Guide Generator

Executive Overview

StudyForgeAI is a sophisticated AI-powered educational platform that revolutionizes how students and professionals synthesize knowledge from diverse information sources. This full-stack application leverages cutting-edge natural language processing and Google's Gemini 2.5 Flash Lite AI model to automatically generate comprehensive, well-structured study guides from PDFs, web articles, YouTube videos, and raw text inputsโ€”all through an elegant, modern web interface.


๐ŸŽฏ What It Does

Core Functionality

StudyForgeAI transforms the overwhelming task of consolidating learning materials into a seamless, intelligent workflow:

  1. Multi-Modal Content Ingestion: Users can upload PDFs, paste URLs to web articles, provide YouTube video links, and input raw textโ€”all simultaneously through an intuitive dashboard
  2. Intelligent Content Extraction: The system employs specialized extraction engines for each content type, preserving context and structure
  3. AI-Powered Topic Analysis: Using Google's Gemini AI, the platform identifies unique topics, removes redundancy, and intelligently categorizes information
  4. Automated Study Guide Generation: Produces professionally formatted markdown study guides with summaries, key points, detailed content, and table of contents
  5. Real-Time Processing: Comprehensive logging and error handling ensure transparent, reliable processing with detailed progress tracking

๐Ÿ—๏ธ Technology Stack

Frontend Architecture (Modern React Ecosystem)

  • React 19.2.0 - Latest React with concurrent features and automatic batching
  • Vite 7.2.2 - Lightning-fast build tool with Hot Module Replacement (HMR)
  • Material-UI (MUI) 7.3.5 - Enterprise-grade component library with custom theming
  • React Router DOM 7.9.6 - Client-side routing with protected route authentication
  • Axios 1.13.2 - HTTP client with custom interceptors for API communication
  • Tailwind CSS 4.1.17 - Utility-first CSS framework for responsive design
  • ESLint - Code quality and consistency enforcement

Frontend Highlights:

  • Protected routing with session-based authentication
  • Custom axios interceptors for centralized error handling and request/response transformation
  • Responsive Material Design system with dark mode support via custom theming
  • Real-time loading states with animated overlays and smooth scroll-to-output functionality
  • Modular component architecture promoting reusability and maintainability

Backend Architecture (Production-Grade Python)

  • FastAPI - Modern, high-performance asynchronous web framework
  • Uvicorn - Lightning-fast ASGI server with async support
  • Pydantic 2.5.0 - Data validation using Python type annotations
  • Python-dotenv - Environment variable management for secure configuration
  • CORS Middleware - Configurable cross-origin resource sharing for secure frontend-backend communication

Content Extraction Powerhouses:

  • PyMuPDF (fitz) - High-performance PDF text extraction with page-level granularity
  • Trafilatura - State-of-the-art web scraping with automatic main content detection
  • YouTube Transcript API - Multi-language transcript extraction with automatic fallback to generated captions

AI & ML:

  • Google GenAI (Gemini 2.5 Flash Lite) - Advanced language model for topic extraction and content synthesis with AI-driven deduplication
  • Custom retry logic with exponential backoff and intelligent rate limit handling
  • Token-aware processing to optimize API usage and prevent quota exhaustion

๐Ÿง  The AI Pipeline: A Deep Dive

Phase 1: Multi-Source Content Extraction

The system processes each source sequentially with comprehensive error handling and detailed logging:

PDF Processing:

  • Asynchronous file upload handling via FastAPI's UploadFile
  • Page-by-page extraction with PyMuPDF preserving formatting
  • Automatic whitespace normalization and paragraph detection
  • Per-page error isolation ensuring partial extraction on corrupted PDFs

Web Article Extraction:

  • Trafilatura-powered content extraction with automatic main content detection
  • Metadata extraction (title, author, publication date)
  • Fallback mechanisms for sites with complex HTML structures
  • URL validation and protocol enforcement

YouTube Transcript Extraction:

  • Video ID extraction via regex patterns supporting all YouTube URL formats
  • Multi-language transcript support with auto-generated fallback
  • Intelligent text formatting with paragraph segmentation every 6 entries
  • Handles private videos, disabled captions, and geo-restrictions gracefully

Raw Text Processing:

  • Direct text input with sanitization and validation
  • Support for structured notes and user annotations

Phase 2: Content Consolidation

All extracted content is concatenated into a unified document with basic normalization:

Combined Output โ†’ [PDF Text] + [Article Content] + [Video Transcripts] + [User Text]
                 โ†“
         Whitespace Normalization
                 โ†“
         Single Unified Document

Normalization includes:

  • Paragraph break normalization (consistent line spacing)
  • Excessive whitespace removal
  • Text structure preservation

Request Tracking: Each processing request receives a unique request_id (timestamp-based) for complete traceability through centralized logging

Phase 3: AI-Powered Topic Extraction & Deduplication

Gemini 2.5 Flash Lite analyzes the consolidated content using advanced prompt engineering to identify unique topics and remove redundant information:

Prompt Engineering Strategy:

Role: Study guide assistant specialized in content deduplication
Task: 
  1. Identify all main topics covered
  2. Extract ALL unique text content related to each topic
  3. Remove ONLY exact duplicates or near-identical phrases
  4. Keep different explanations if they provide unique value
  5. Consolidate related information under appropriate topics
Output: JSON object {topic: unique_content}

Key Features:

  • AI-driven deduplication: The LLM intelligently identifies and removes duplicate content through prompt instructions (no algorithmic deduplication)
  • JSON extraction with markdown code block fallback parsing
  • Rate limiting with 2-second minimum delay between API calls
  • Intelligent retry logic extracting retry delays from API error messages
  • Exponential backoff with 5 retry attempts and up to 45-second waits for rate limits

Phase 4: Study Guide Synthesis

Single-API-Call Optimization: The entire study guide is generated in one comprehensive request to minimize latency and API costs

Adaptive Guide Generation:

  • Concise (< 2,000 characters): Quick overviews with essential points
  • Standard (2,000-10,000 characters): Balanced depth and breadth
  • Comprehensive (> 10,000 characters): Exhaustive coverage with detailed analysis

Structured Output:

{
  "overview": "2-3 sentence introduction",
  "topics": [
    {
      "topic": "Topic Name",
      "original_content": "Full content",
      "summary": "2-3 sentence summary",
      "key_points": ["point 1", "point 2", ...]
    }
  ],
  "metadata": {
    "total_topics": 5,
    "guide_type": "comprehensive",
    "content_length": 15000
  }
}

Phase 5: Markdown Formatting & Delivery

Professional Document Structure:

  • Title and overview section with metadata
  • Automatic table of contents with anchor links
  • Emoji-enhanced section headers for visual hierarchy
  • Detailed content with proper paragraph spacing
  • Topic separators and study success footer

Frontend Rendering:

  • Custom markdown parser supporting headers (H1-H3), lists, bold text, and horizontal rules
  • Responsive typography with Material-UI theming
  • Scrollable container with smooth animations
  • Success indicators and error handling with actionable messages

๐Ÿ”ฅ Standout Features

1. Enterprise-Grade Error Handling

  • Comprehensive logging system with colored console output (logger.py)
  • Timestamp-based request ID tracking across the entire pipeline
  • Detailed exception messages with actionable user guidance
  • Graceful degradation: if some sources fail, successfully processes remaining sources
  • Per-source error isolation with detailed logging (success/failure tracking)

2. Intelligent Rate Limit Management

  • Automatic retry delay extraction from API error messages
  • Global rate limiting preventing API quota exhaustion
  • User feedback on rate limit issues with retry suggestions

3. Security & Authentication

  • Environment-based password protection
  • Session-based authentication with protected routes
  • Optional user API key support for personal quota management
  • CORS configuration with environment-specific origins

4. Developer Experience

  • Modular service-oriented architecture (separation of concerns)
  • Type hints and Pydantic schemas for API contracts
  • Comprehensive logging at DEBUG, INFO, WARNING, and ERROR levels
  • Test suite with isolated test files for each service

5. User Experience

  • Real-time loading indicators with custom video overlays
  • Auto-scroll to output section on completion
  • Material Design system with consistent spacing and colors
  • API key input field with helpful documentation links
  • Collected items preview showing all sources before processing

๐Ÿ“Š Technical Achievements

โœ… Async/Await Architecture: Non-blocking I/O for file upload handling
โœ… Sequential Processing with Error Isolation: Per-source error handling preventing complete failures
โœ… AI-Driven Content Deduplication: Intelligent prompt engineering for LLM-based deduplication
โœ… Intelligent Prompt Engineering: Specialized prompts for topic extraction vs. synthesis
โœ… Single-Call Optimization: Entire study guide generated in one API request
โœ… Comprehensive Logging: 50+ log statements tracking every operation
โœ… Token Optimization: Adaptive guide complexity based on content length
โœ… Cross-Platform Support: Unified PDF handling for file paths and uploaded streams
โœ… Modular Design: 15+ reusable services and components
โœ… Production-Ready: Environment configuration, CORS setup, and deployment-ready structure


๐ŸŽ“ Impact

StudyForgeAI represents a paradigm shift in knowledge synthesis, transforming hours of manual note-taking and content organization into a streamlined, AI-powered workflow that takes minutes. By intelligently combining multiple content sources and leveraging state-of-the-art language models, this platform empowers learners to focus on understanding and retention rather than tedious consolidation tasks.

The sophisticated AI pipeline, enterprise-grade architecture, and polished user experience demonstrate mastery of full-stack development, advanced API integration, and modern web application best practicesโ€”making StudyForgeAI a standout portfolio piece showcasing both technical depth and practical utility.


๐Ÿš€ Future Plans

We're excited about where we're taking StudyForgeAI next.

1. Full User Authentication & Accounts

Students will be able to save guides, history, and personalized settings.

2. RAG-Powered In-Pipeline Model

We plan to integrate ChromaDB to store embeddings of user content.

This will:

  • Allow users to process much larger inputs
  • Speed up guide generation
  • Enable multi-session learning and fast retrieval

3. New Learning Tools

  • AI-generated flashcards
  • Personalized study plans
  • Topic quizzes
  • Progress tracking dashboards
  • Collaborative study features

Our long-term vision is to turn StudyForgeAI into a full AI study companion that not only generates notes, but actively helps students learn smarter every day.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors