
Docuer

An AI-Powered, Adaptive & Context-Aware, TikTok-Style Personalized Learning Platform

Transform any documentation website or Google Drive content into bite-sized, interactive courses with intelligent knowledge graphs and adaptive learning paths.


Features

Core Capabilities

  • Documentation-to-Course Transformation: Automatically converts technical documentation into structured, personalized learning courses
  • AI-Personalized Content: Generates bite-sized articles (160 words max) tailored to your experience level, interests, and learning goals
  • Interactive Knowledge Graph: Visual Neo4j-powered graph showing topic relationships, prerequisites, and learning paths
  • Adaptive Learning Paths: AI-curated course sequences that adapt to your profile and progress
  • Progress Tracking & Analytics: Behavioral analytics tracking completion rates, time spent, quiz scores, and learning patterns
  • Interactive Quizzes: Auto-generated assessments with multiple difficulty levels for each article
  • Google Drive Integration: Import and learn from your personal Google Drive documents
  • TikTok-Style Interface: Swipeable, mobile-first learning experience for modern learners

Intelligent Features

  • Hash-Based Content Sharing: Multiple users accessing the same documentation share cached content (cost-efficient)
  • Semantic Relationship Detection: Automatically discovers connections between topics based on concept overlap
  • Two-Phase Content Crawling: AI recommends relevant pages based on your profile before full crawling
  • Dangling Node Auto-Connection: Ensures fully connected knowledge graphs with no isolated topics
  • Behavioral Learning Insights: Identifies your preferred topics, areas of struggle, and learning patterns
  • Rate-Limited AI Processing: Built-in throttling prevents API errors and manages costs

User Experience

  • 5-Step Personalized Onboarding: Set experience level, goals, interests, and time commitment
  • Multiple Course Creation Modes:
    • Simple URL mode (auto-crawl entire documentation)
    • Advanced mode (AI-recommended page selection)
    • Google Drive import
  • Real-Time Progress Dashboard: Track completion rates, streaks, and performance
  • AI Learning Assistant: Built-in chatbot for questions and learning support
  • Fullscreen Graph Navigation: Explore interconnected topics visually

How to Use

Getting Started

  1. Login / Create Account

    • Use demo accounts (Alice - beginner, Bob - advanced) or create your own
    • Note: Current version uses prototype authentication
  2. Complete Onboarding

    • Select your experience level (Beginner, Intermediate, Advanced)
    • Choose learning goals (Career advancement, Skill development, Personal interest, etc.)
    • Pick interests (Web Development, Machine Learning, DevOps, etc.)
    • Set time commitment (hours per week)
    • Add a bio for additional personalization

Creating Courses

Method 1: Simple URL Course Creation

  1. Navigate to "Courses" page
  2. Click "Create New Course" or "+"
  3. Enter documentation URL (e.g., https://docs.python.org)
  4. Click "Create Course"
  5. AI will automatically:
    • Crawl all pages
    • Extract topics and structure
    • Generate personalized articles
    • Build knowledge graph
    • Create learning path

Method 2: Advanced Course Creation (Selective)

  1. Click "Create Advanced Course"
  2. Enter documentation URL
  3. View AI-recommended pages based on your profile
  4. Select/deselect pages you want to include
  5. Click "Create Course with Selected Pages"
  6. AI processes only selected content

Method 3: Google Drive Import

  1. Click "Google Drive" integration
  2. Connect your Google Drive account (OAuth)
  3. Browse your Drive files
  4. Select documents to import
  5. Create course from selected documents

Learning Experience

  1. Browse Courses: View all your created courses with progress indicators
  2. Start Learning: Click on a course to begin
  3. Navigate Articles:
    • Read bite-sized 160-word articles
    • Swipe or use navigation buttons
    • Mark articles as complete
  4. Take Quizzes: Test your knowledge with auto-generated questions
  5. Explore Knowledge Graph:
    • Click graph icon to view topic relationships
    • Navigate by clicking nodes
    • See prerequisites and related topics
    • Fullscreen mode for detailed exploration
  6. Track Progress: View analytics dashboard with:
    • Completion percentage
    • Learning streaks
    • Quiz performance
    • Time spent learning
    • Preferred topics

Managing Content

  • Edit Course: Update course name, description, or metadata
  • Delete Course: Remove courses you no longer need
  • Sync Google Drive: Manually trigger sync for updated documents
  • Edit Profile: Update your learning preferences anytime

Setup

Prerequisites

  • Node.js: Version 20 or higher
  • npm or yarn: Package manager
  • Neo4j Database: AuraDB (cloud) or local instance
  • API Keys: Gemini, Firecrawl, Supermemory (see below)

Installation

  1. Clone the Repository

    git clone <repository-url>
    cd docuer
  2. Install Dependencies

    npm install
    # or
    yarn install
  3. Configure Environment Variables

    Copy the example environment file:

    cp .env.example .env

    Fill in your API keys and credentials in .env:

    # Firecrawl API for web scraping
    FIRECRAWL_API_KEY=your_firecrawl_api_key_here
    
    # Cohere API for topic extraction (optional fallback)
    COHERE_API_KEY=your_cohere_api_key_here
    
    # Google Gemini API for content generation
    GEMINI_API_KEY=your_gemini_api_key_here
    
    # Neo4j Database for knowledge graph
    NEO4J_URI=neo4j+s://your-instance.neo4j.io
    NEO4J_USERNAME=neo4j
    NEO4J_PASSWORD=your_neo4j_password_here
    
    # Supermemory API for user behavior tracking
    SUPERMEMORY_API_KEY=your_supermemory_api_key_here
    SUPERMEMORY_BASE_URL=https://api.supermemory.ai
  4. Set Up Neo4j Database

    Option A: Neo4j AuraDB (Recommended for Production)

    • Sign up at Neo4j AuraDB
    • Create a free instance
    • Copy connection URI, username, and password to .env

    Option B: Local Neo4j

    # Using Docker
    docker run \
      --name neo4j \
      -p 7474:7474 -p 7687:7687 \
      -e NEO4J_AUTH=neo4j/your_password \
      neo4j:latest

    Set in .env:

    NEO4J_URI=neo4j://localhost:7687
    NEO4J_USERNAME=neo4j
    NEO4J_PASSWORD=your_password
  5. Initialize Neo4j Schema (Optional)

    The application automatically creates constraints and indexes on first run, but you can also create them manually:

    // Unique constraints
    CREATE CONSTRAINT course_id IF NOT EXISTS FOR (c:Course) REQUIRE c.id IS UNIQUE;
    CREATE CONSTRAINT topic_id IF NOT EXISTS FOR (t:Topic) REQUIRE t.id IS UNIQUE;
    CREATE CONSTRAINT article_id IF NOT EXISTS FOR (a:Article) REQUIRE a.id IS UNIQUE;
    CREATE CONSTRAINT user_id IF NOT EXISTS FOR (u:User) REQUIRE u.id IS UNIQUE;
    
    // Indexes for performance
    CREATE INDEX article_category IF NOT EXISTS FOR (a:Article) ON (a.category);
    CREATE INDEX article_importance IF NOT EXISTS FOR (a:Article) ON (a.importance);
    CREATE INDEX article_difficulty IF NOT EXISTS FOR (a:Article) ON (a.difficulty);
  6. Run Development Server

    npm run dev
    # or
    yarn dev

    Open http://localhost:3000 in your browser.

Obtaining API Keys

Firecrawl API (Web Scraping)

  1. Visit Firecrawl
  2. Sign up for an account
  3. Generate API key from dashboard
  4. Free tier: 500 credits/month

Google Gemini API (AI Content Generation)

  1. Visit Google AI Studio
  2. Sign in with Google account
  3. Create API key
  4. Free tier: 1,500 requests/day (Gemini 2.0 Flash)

Supermemory API (Memory & Behavior Tracking)

  1. Visit Supermemory
  2. Sign up for developer account
  3. Generate API key from settings
  4. Note: Check current pricing/free tier

Cohere API (Optional - Fallback)

  1. Visit Cohere
  2. Sign up for account
  3. Get API key from dashboard
  4. Free tier: 100 requests/minute

Neo4j (Knowledge Graph Database)

  • Free tier: Neo4j AuraDB Free
  • Includes: 200k nodes, 400k relationships, 50MB storage

Production Deployment

Note: Current authentication is prototype-only. For production:

  1. Implement Proper Authentication

    • Replace hardcoded auth in lib/services/auth.ts
    • Use NextAuth.js, Auth0, or similar
    • Implement JWT or session-based auth
  2. Security Hardening

    • Add API rate limiting
    • Implement CORS policies
    • Use environment-based secrets management
    • Add input validation and sanitization
    • Enable HTTPS only
  3. Deploy to Vercel (Recommended)

    npm run build
    vercel deploy

    Set environment variables in Vercel dashboard.

  4. Database Considerations

    • Use Neo4j AuraDB for managed hosting
    • Set up automated backups
    • Configure connection pooling
  5. Monitoring & Logging

    • Add error tracking (Sentry, LogRocket)
    • Monitor API usage and costs
    • Set up performance monitoring

Technology Stack

Frontend

  • Framework: Next.js 16.0.1 (App Router)
  • UI Library: React 19.2.0
  • Component Library: HeroUI (formerly NextUI)
  • Styling: Tailwind CSS 4
  • Animations: Framer Motion
  • State Management: Zustand with localStorage persistence
  • Graph Visualization: react-force-graph-2d
  • Markdown Rendering: react-markdown

Backend & Services

  • Runtime: Node.js 20+
  • API Routes: Next.js API Routes
  • Type Safety: TypeScript 5
  • Validation: Zod

External Services

  • AI Content Generation: Google Gemini 2.0 Flash
  • Web Scraping: Firecrawl
  • Knowledge Graph: Neo4j
  • Memory & Behavior: Supermemory
  • Fallback AI: Cohere (optional)

How Technologies Are Used

Supermemory (Memory & Behavior Tracking)

Purpose: Primary storage for documentation content, user behavior analytics, and Google Drive integration.

Key Responsibilities:

  • Documentation Caching: Stores scraped content using hash-based deduplication
    • Shared containers for multi-user efficiency (doc_{hash})
    • Reduces API costs by preventing redundant crawls
    • Memories stored with source URLs and metadata
  • User Behavior Tracking: Records all learning actions
    • Article views and completions
    • Time spent on each article
    • Quiz attempts and scores
    • Bookmarks and favorites
    • Navigation patterns
  • Learning Analytics: Analyzes behavioral data to identify:
    • Preferred topics and learning styles
    • Areas where user struggles
    • Fast-learning patterns
    • Optimal content difficulty
  • Google Drive Integration: Manages OAuth connections
    • Stores connection credentials per user
    • Tracks imported documents
    • Monitors sync status and schedules
  • Profile Storage: Maintains user preferences
    • Experience level, goals, interests
    • Time commitment and learning schedule
    • Content preferences

Container Strategy:

// Shared documentation (multi-user)
`doc_${hashUrl(documentationUrl)}`

// User-specific behavior
`user_${userId}`
`user_${userId}_course_${courseId}`

// Google Drive sync
`user_${userId}_gdrive_${connectionId}`
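
A minimal sketch of how the shared `doc_` container tag might be derived. The normalization and the internals of `hashUrl` here are assumptions for illustration, not the project's actual code; only the `doc_` prefix convention comes from this README:

```typescript
import { createHash } from "node:crypto";

// Derive a shared Supermemory container tag from a documentation URL.
// Normalizing first means "www." and trailing slashes don't defeat deduplication,
// so multiple users importing the same docs hit the same cached container.
function docContainerTag(url: string): string {
  const u = new URL(url);
  const normalized =
    u.hostname.replace(/^www\./, "") + u.pathname.replace(/\/$/, "");
  const hash = createHash("sha256").update(normalized).digest("hex").slice(0, 16);
  return `doc_${hash}`;
}
```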

API Integration:

  • Add memories: supermemory.add(content, containerTag, metadata)
  • Search memories: supermemory.search(query, containerTag)
  • Track actions: Custom behavior logging functions
  • Retrieve analytics: Query memories by action type and timestamp
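
The behavior-logging calls above might be wrapped like this. The `MemoryClient` interface and the event payload shape are assumptions; only the `add(content, containerTag, metadata)` signature and the `user_{id}_course_{id}` container convention come from this README:

```typescript
// Hypothetical wrapper around the Supermemory client in lib/services/supermemory.ts.
interface MemoryClient {
  add(
    content: string,
    containerTag: string,
    metadata?: Record<string, unknown>,
  ): Promise<void>;
}

type LearningAction = "article_view" | "article_complete" | "quiz_attempt";

// Record one learning action in the user's per-course container.
async function trackAction(
  client: MemoryClient,
  userId: string,
  courseId: string,
  action: LearningAction,
  detail: Record<string, unknown> = {},
): Promise<void> {
  await client.add(
    JSON.stringify({ action, ...detail }),
    `user_${userId}_course_${courseId}`,
    { action, timestamp: Date.now() },
  );
}
```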

Neo4j (Knowledge Graph Database)

Purpose: Stores course structure, topic relationships, and generates intelligent learning paths.

Schema Design:

Node Types:

  • Course: {id, name, description, sourceUrl, createdAt}
  • Topic: {id, name, description, category, importance, difficulty}
  • Article: {id, title, content, order, difficulty, estimatedTime, keywords}
  • User: {id, username, email, level, goals, interests}

Relationship Types:

  • CONTAINS: Course→Topic, Course→Article, Topic→Article
  • PREREQUISITE: Topic→Topic, Article→Article (directed, enforces learning order)
  • RELATED_TO: Topic↔Topic (undirected, with strength weight 0-1)
  • ENABLES: Reverse of PREREQUISITE (auto-created)
  • COMPLETED: User→Article (with timestamp, score, timeSpent)

Intelligent Features:

  1. Semantic Connection Detection

    • Analyzes topic names for similarity (edit distance, common words)
    • Detects concept overlap using keyword matching
    • Assigns relationship strength based on semantic closeness
    • Auto-connects dangling nodes to prevent isolation
  2. Personalized Learning Path Generation

    // Scoring algorithm (pseudocode)
    score = (
      0.35 * difficultyMatch(article.difficulty, user.level) +
      0.25 * interestMatch(article.keywords, user.interests) +
      0.25 * goalMatch(article.category, user.goals) +
      0.15 * article.importance
    )
    
    // Order by prerequisite depth (topological sort)
    // Return top N articles matching user profile
  3. Prerequisite Enforcement

    • Topological sorting ensures correct learning order
    • Locked articles until prerequisites complete
    • Dynamic path updates based on completion
  4. Knowledge Graph Analytics

    • Identifies central topics (high betweenness centrality)
    • Detects learning bottlenecks (many prerequisites)
    • Suggests related content based on graph traversal
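
The scoring pseudocode above can be made concrete as follows. The numeric scales (difficulty and level on 1-3, importance on 1-10) and the match helpers are illustrative stand-ins, not the project's actual implementations:

```typescript
interface Article {
  difficulty: number; // 1 = beginner … 3 = advanced (assumed scale)
  keywords: string[];
  category: string;
  importance: number; // 1-10, as in the graph schema
}

interface UserProfile {
  level: number; // 1 = beginner … 3 = advanced (assumed scale)
  interests: string[];
  goals: string[];
}

// Fraction of the article's keywords that appear in the user's interests.
const overlap = (a: string[], b: string[]): number => {
  const set = new Set(b.map((s) => s.toLowerCase()));
  return a.length ? a.filter((k) => set.has(k.toLowerCase())).length / a.length : 0;
};

function scoreArticle(article: Article, user: UserProfile): number {
  const difficultyMatch = 1 - Math.abs(article.difficulty - user.level) / 2;
  const goalMatch = user.goals.includes(article.category) ? 1 : 0;
  return (
    0.35 * difficultyMatch +
    0.25 * overlap(article.keywords, user.interests) +
    0.25 * goalMatch +
    0.15 * (article.importance / 10)
  );
}
```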

Query Patterns:

  • Create course structure: Batch node creation with relationships
  • Get learning path: Cypher query with user profile scoring
  • Mark completion: Create COMPLETED relationship with metadata
  • Find related articles: Graph traversal with relationship weights
  • Get progress: Count completed vs total articles per course
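
The prerequisite ordering used in these queries amounts to a topological sort over PREREQUISITE edges; a minimal Kahn's-algorithm sketch (node IDs and the edge representation are illustrative):

```typescript
// Order article IDs so every prerequisite comes before its dependents.
// Edge [a, b] means "a is a prerequisite of b".
function topoOrder(nodes: string[], edges: [string, string][]): string[] {
  const indegree = new Map(nodes.map((n) => [n, 0]));
  const next = new Map<string, string[]>(nodes.map((n) => [n, []]));
  for (const [a, b] of edges) {
    next.get(a)!.push(b);
    indegree.set(b, (indegree.get(b) ?? 0) + 1);
  }
  const queue = nodes.filter((n) => indegree.get(n) === 0);
  const order: string[] = [];
  while (queue.length) {
    const n = queue.shift()!;
    order.push(n);
    for (const m of next.get(n) ?? []) {
      indegree.set(m, indegree.get(m)! - 1);
      if (indegree.get(m) === 0) queue.push(m);
    }
  }
  return order; // shorter than nodes.length ⇒ the graph contains a cycle
}
```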

Google Gemini (AI Content Generation)

Model: Gemini 2.0 Flash (gemini-2.0-flash)

Purpose: Primary AI engine for content generation, topic extraction, and personalization.

Key Responsibilities:

  1. Personalized Article Generation

    • Input: Raw documentation + user profile (level, interests, goals)
    • Output: 160-word bite-sized article tailored to user
    • Prompt engineering: Adjusts complexity, examples, and tone based on profile
    • Format: Structured markdown with key concepts highlighted
  2. Topic Hierarchy Extraction

    • Input: Scraped documentation from Supermemory
    • Output: Hierarchical topic structure with categories
    • Identifies: Main topics, subtopics, dependencies
    • Replaces: Previous Cohere-based extraction (consolidated AI provider)
  3. Knowledge Graph Generation

    • Input: Extracted topics and content
    • Output: Semantic relationships with strength scores
    • Detects:
      • Prerequisites (Topic A must be learned before Topic B)
      • Related concepts (similar or complementary topics)
      • Difficulty progression (beginner → advanced)
    • Assigns importance scores (1-10) per topic
  4. Quiz Generation

    • Input: Article content + difficulty level
    • Output: 3-5 multiple choice questions
    • Difficulty tiers:
      • Easy: Recall and recognition
      • Medium: Application and understanding
      • Hard: Analysis and synthesis
    • Includes explanations for correct answers
  5. Content Filtering & Recommendation

    • Two-Phase Crawling: Analyzes documentation index
      • Scores pages based on user profile relevance
      • Recommends top N pages before full crawl
      • User can review and select pages
    • Learning Path Selection: Chooses optimal article sequence
      • Considers user's current knowledge level
      • Balances difficulty progression
      • Aligns with stated learning goals
  6. AI Chat Assistant

    • Answers user questions during learning
    • Provides additional context and examples
    • Clarifies confusing concepts
    • Suggests related articles
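
For the quiz generation described above, the AI's JSON output has to be validated before use. The project uses Zod for this; the field names and this hand-rolled check are illustrative, grounded only in the "3-5 multiple choice questions with explanations" spec:

```typescript
// Assumed shape of one generated quiz question.
interface QuizQuestion {
  question: string;
  options: string[];
  answerIndex: number;
  explanation: string;
}

// Accept only an array of 3-5 well-formed multiple-choice questions.
function isValidQuiz(data: unknown): data is QuizQuestion[] {
  if (!Array.isArray(data) || data.length < 3 || data.length > 5) return false;
  return data.every(
    (q) =>
      typeof q?.question === "string" &&
      Array.isArray(q?.options) &&
      q.options.length >= 2 &&
      Number.isInteger(q?.answerIndex) &&
      q.answerIndex >= 0 &&
      q.answerIndex < q.options.length &&
      typeof q?.explanation === "string",
  );
}
```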

Rate Limiting Implementation:

// Built-in throttling to prevent API errors
const RATE_LIMIT = 9; // requests per minute
const RATE_WINDOW = 60000; // 1 minute in ms

// Automatic queuing and retry logic
// Prevents 429 errors and manages costs
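
One way to implement the throttle behind those constants is a sliding window: keep the timestamps of recent calls and, when the window is full, wait until the oldest one expires. This is a sketch of the pattern, not the service's actual queue/retry code:

```typescript
// Pure decision: given timestamps of recent calls, fire now or wait how long?
function rateDecision(recent: number[], now: number, limit: number, windowMs: number) {
  const inWindow = recent.filter((t) => now - t < windowMs);
  return inWindow.length < limit
    ? { waitMs: 0, inWindow }
    : { waitMs: windowMs - (now - inWindow[0]), inWindow };
}

// Wrapper that delays each call until the window allows it.
function makeThrottle(limit = 9, windowMs = 60_000) {
  let recent: number[] = [];
  return async function throttled<T>(call: () => Promise<T>): Promise<T> {
    const { waitMs, inWindow } = rateDecision(recent, Date.now(), limit, windowMs);
    recent = inWindow;
    if (waitMs > 0) await new Promise((r) => setTimeout(r, waitMs));
    recent.push(Date.now());
    return call();
  };
}
```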

API Integration:

  • Structured prompts with JSON schema responses
  • Error handling with fallbacks
  • Token usage optimization (160-word limit reduces costs)
  • Streaming for real-time chat responses

Firecrawl (Web Scraping)

Purpose: Robust, AI-powered web scraping for documentation websites.

Capabilities:

  1. Single Page Scraping

    • Extracts markdown and HTML content
    • Handles JavaScript-rendered pages
    • Retrieves metadata (title, description, keywords)
    • Retry logic for failed requests
  2. Site Mapping

    • Crawls entire website to discover all URLs
    • Respects robots.txt and sitemap.xml
    • Returns structured list of pages with metadata
    • Filters by patterns (e.g., only /docs/* pages)
  3. Selective Crawling (Two-Phase Mode)

    • Phase 1: Map site and get page previews
    • Phase 2: User selects pages, then full scrape
    • Reduces API usage for large documentation sites
  4. Content Processing

    • Cleans HTML and converts to markdown
    • Preserves code blocks and formatting
    • Extracts headings for topic detection
    • Removes navigation and boilerplate

Use Cases in Docuer:

  • Simple course creation: Scrape all pages automatically
  • Advanced course creation: Map site → AI recommends → user selects → scrape
  • Content updates: Re-scrape changed pages
  • Google Drive alternative: For public documentation

API Integration:

// Single page scrape
firecrawl.scrapeUrl(url, { formats: ["markdown", "html"] });

// Site mapping
firecrawl.map(url, { includeSubdomains: false });

// Batch scraping
firecrawl.scrapeUrls(selectedUrls);

Error Handling:

  • Automatic retries on failure
  • Fallback to HTML if markdown extraction fails
  • Handles rate limits with exponential backoff
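
The retry-with-exponential-backoff pattern used here can be sketched generically; the attempt count and base delay are illustrative defaults, not Firecrawl's actual settings:

```typescript
// Delay doubles on each attempt: base, 2*base, 4*base, …
const backoffDelay = (attempt: number, baseMs: number): number => baseMs * 2 ** attempt;

// Retry a failing async call, sleeping between attempts; rethrow after the last.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < attempts - 1) {
        await new Promise((r) => setTimeout(r, backoffDelay(attempt, baseMs)));
      }
    }
  }
  throw lastError;
}
```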

Cohere (Fallback AI)

Model: Command R

Purpose: Backup AI provider for topic extraction when Gemini or Supermemory is unavailable.

Current Usage: Minimal

  • Legacy fallback for topic extraction
  • Most functionality migrated to Gemini for consistency
  • Maintained for redundancy and testing

Potential Use Cases:

  • A/B testing content generation quality
  • Cost optimization (cheaper model for simple tasks)
  • Geographic availability fallback

Architecture Overview

Data Flow: Course Creation

User Input (URL)
    ↓
Firecrawl (Scrape Pages)
    ↓
Supermemory (Cache Content with hash-based deduplication)
    ↓
Gemini (Extract Topics + Generate Personalized Articles)
    ↓
Neo4j (Build Knowledge Graph with Semantic Relationships)
    ↓
Zustand Store (Local State for UI)
    ↓
User Interface (Course Ready)

Data Flow: Learning Path Generation

User Profile (Level, Interests, Goals)
    ↓
Neo4j Query (Score Articles by Relevance)
    ↓
Topological Sort (Order by Prerequisites)
    ↓
Personalized Learning Path (Ordered Article IDs)
    ↓
User Interface (Display Sequential Articles)

Data Flow: Progress Tracking

User Completes Article/Quiz
    ↓
Neo4j (Mark COMPLETED Relationship)
    ↓
Supermemory (Log Behavior: timestamp, score, timeSpent)
    ↓
Analytics Aggregation (Both Sources)
    ↓
Dashboard (Completion %, Streaks, Quiz Scores, Insights)
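
The streak figure on that dashboard can be computed from the logged activity dates; this is an illustrative sketch (the real aggregation over Neo4j and Supermemory data may differ):

```typescript
// Count consecutive active days ending at `today`.
// Dates are ISO day strings (YYYY-MM-DD), UTC.
function currentStreak(activityDays: string[], today: string): number {
  const days = new Set(activityDays);
  let streak = 0;
  let d = new Date(today + "T00:00:00Z");
  while (days.has(d.toISOString().slice(0, 10))) {
    streak++;
    d = new Date(d.getTime() - 86_400_000); // step back one day
  }
  return streak;
}
```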

Project Structure

docuer/
├── app/
│   ├── api/                          # API Routes
│   │   ├── articles/                 # Article management
│   │   ├── courses/                  # Course CRUD operations
│   │   │   ├── create/              # Simple course creation
│   │   │   ├── create-advanced/     # Two-phase course creation
│   │   │   └── learning-path/       # Personalized path generation
│   │   ├── quiz/                     # Quiz generation and submission
│   │   ├── integrations/             # External service integrations
│   │   │   └── google-drive/        # Google Drive OAuth and import
│   │   ├── analytics/                # User analytics
│   │   └── chat/                     # AI chatbot
│   ├── components/                   # React Components
│   │   ├── Layout.tsx               # Main app layout with sidebar
│   │   ├── GoogleDriveIntegration.tsx
│   │   ├── KnowledgeGraphVisualization.tsx
│   │   ├── Chatbot.tsx
│   │   └── QuizModal.tsx
│   ├── courses/                      # Course pages
│   │   └── [courseId]/
│   │       ├── page.tsx             # Course overview
│   │       └── [articleId]/page.tsx # Article viewer
│   ├── onboarding/                   # User onboarding flow
│   ├── profile/                      # User profile management
│   ├── login/                        # Authentication
│   └── page.tsx                      # Home page
├── lib/
│   ├── services/                     # External service clients
│   │   ├── auth.ts                  # Authentication (prototype)
│   │   ├── firecrawl.ts             # Web scraping
│   │   ├── gemini.ts                # AI content generation
│   │   ├── neo4j.ts                 # Knowledge graph database
│   │   ├── supermemory.ts           # Memory & behavior tracking
│   │   └── cohere.ts                # Fallback AI
│   ├── store/                        # State management
│   │   └── useStore.ts              # Zustand store
│   └── utils/                        # Utility functions
├── .env.example                      # Environment variables template
├── package.json                      # Dependencies
├── tsconfig.json                     # TypeScript configuration
├── tailwind.config.ts               # Tailwind CSS configuration
└── next.config.js                   # Next.js configuration

Development

Running Tests

npm run test        # Run unit tests
npm run test:e2e    # Run end-to-end tests

Linting & Formatting

npm run lint        # Run ESLint
npm run format      # Format with Prettier

Build for Production

npm run build       # Create optimized production build
npm run start       # Start production server

Known Limitations & Roadmap

Current Limitations

  • Authentication: Prototype-only with hardcoded users (not production-ready)
  • Rate Limits: Gemini limited to 9 requests/minute
  • Scraping: Some documentation sites may block Firecrawl
  • Mobile: Optimized for mobile but desktop experience needs refinement

Roadmap

  • Production authentication (NextAuth.js, OAuth)
  • Real-time collaboration (multiple users in same course)
  • Spaced repetition system for quizzes
  • Video content integration
  • Mobile app (React Native)
  • Offline mode with service workers
  • Advanced analytics (learning velocity, knowledge retention)
  • Social features (share courses, leaderboards)
  • Plugin system for custom content sources
  • Multi-language support

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

MIT License


Support

For issues, questions, or feature requests, please open an issue on the repository.


Acknowledgments

Built with Next.js, React, Neo4j, Google Gemini, Firecrawl, Supermemory, and Cohere.


Made with ❤️ for modern learners
