aaronjshelen/connect-the-docs

Document Knowledge Graph Analyzer

An interactive 3D knowledge graph visualization tool that analyzes document relationships using AI-powered semantic analysis. Upload multiple documents and discover connections, shared themes, and conceptual relationships visualized in an immersive 3D graph.

🌟 Features

📊 Interactive 3D Visualization

  • Force-directed graph layout with optimized node spacing
  • Hover interactions - nodes scale up and glow when hovered
  • Click to focus - smooth camera animation zooms to selected nodes
  • Orbit controls - rotate and zoom around focused nodes
  • Auto-rotate mode - automated graph rotation for presentations

🤖 AI-Powered Analysis

  • OpenAI GPT-4o-mini integration for intelligent document analysis
  • Dynamic theme extraction - no predefined categories, so it can analyze any content
  • Semantic similarity using OpenAI embeddings (text-embedding-3-small)
  • Relationship detection - automatically finds conceptual connections
  • AI Chatbot - ask questions about your document relationships

📈 Advanced Analytics

  • Optimized storage system - O(1) lookups with bidirectional indexing
  • Theme hierarchies - identifies main themes and subthemes
  • Connection strength scoring - quantifies relationship strength
  • Shared theme analysis - finds documents with common concepts
  • Real-time search - filter nodes by document or theme names

🎨 Modern UI

  • Dark theme with glassmorphism effects
  • Responsive sidebar with custom scrollbars
  • Live processing progress indicators
  • Detailed node information overlays
  • Hover tooltips for quick node identification

🚀 Quick Start

Prerequisites

  • Modern web browser (Chrome, Firefox, Edge, Safari)
  • OpenAI API key

Installation

  1. Clone the repository

    git clone https://github.com/YOUR_USERNAME/document-knowledge-graph.git
    cd document-knowledge-graph
  2. Configure your API key

    Edit config.js and add your OpenAI API key:

    window.CONFIG = {
      OPENAI_API_KEY: "your-api-key-here",
    };
  3. Open the application

    Simply open enhanced-analyzer.html in your web browser!

    No build process required - this is a pure client-side application.
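
    Because a missing or placeholder key only shows up once the first API request fails, a small guard can surface the problem earlier. The following is a minimal sketch, not code from the repository: `getApiKey` is a hypothetical helper, and it assumes the `CONFIG` object has the shape shown in step 2.

```javascript
// Hypothetical helper (not part of the repository): read the key from a
// CONFIG-shaped object and fail early if it is missing or still the placeholder.
function getApiKey(config) {
  const key = config && config.OPENAI_API_KEY;
  if (typeof key !== "string" || key.trim() === "" || key === "your-api-key-here") {
    throw new Error("Set OPENAI_API_KEY in config.js before analyzing documents");
  }
  return key;
}

// In the browser this would be called as getApiKey(window.CONFIG).
const exampleConfig = { OPENAI_API_KEY: "example-key-for-illustration" };
const apiKey = getApiKey(exampleConfig);
```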

📖 Usage

1. Upload Documents

  • Click "Choose Files" button
  • Select multiple text files (.txt)
  • Files are loaded instantly in the browser

2. Process Documents

  • Click "Analyze Documents" button
  • AI analyzes themes, definitions, and relationships
  • Progress updates show analysis stages

3. Explore the Graph

  • Hover over nodes to see quick info
  • Click nodes to focus and view details
  • Drag to rotate around focused node
  • Scroll to zoom in/out
  • 🏠 Reset button returns to overview

4. Search & Filter

  • Use search bar to find specific documents/themes
  • Click search results to focus on nodes
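
One way to picture the search step is as a case-insensitive substring match over node names. This is a sketch under assumptions, not the application's actual code: the `searchNodes` function and the `{ name, type }` node shape are hypothetical.

```javascript
// Hypothetical sketch of name-based filtering; the node shape is assumed.
function searchNodes(nodes, query) {
  const q = query.trim().toLowerCase();
  if (q === "") return nodes; // an empty query leaves the graph unfiltered
  return nodes.filter((node) => node.name.toLowerCase().includes(q));
}

const exampleNodes = [
  { name: "Solar_Forecasting_Trends.txt", type: "document" },
  { name: "Wind_Integration_Challenges.txt", type: "document" },
  { name: "solar forecasting", type: "theme" },
];
const matches = searchNodes(exampleNodes, "Solar");
```

Here `matches` contains the solar document and the solar theme, since the comparison ignores case.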

5. Chat with AI

  • Ask questions about your documents
  • Example: "What themes do these documents share?"
  • AI has full context of your graph analysis

🎮 Controls

| Action | Control |
|---|---|
| Rotate camera | Drag with mouse |
| Zoom in/out | Scroll wheel |
| Select node | Click node |
| Reset view | 🏠 button |
| Toggle auto-rotate | 🔄 button |
| Manual zoom | ➕ ➖ buttons |

🏗️ Architecture

Core Components

theme-storage-system.js

Optimized data structure for O(1) lookups:

  • Bidirectional Maps for themes ↔ documents
  • Efficient relationship queries
  • Memory-optimized storage
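
The bidirectional idea can be sketched with two `Map`s of `Set`s, one per lookup direction, so that both "themes of a document" and "documents with a theme" are single hash lookups on average. This is an illustration only: the `ThemeIndex` class and its method names are hypothetical and are not taken from theme-storage-system.js.

```javascript
// Hypothetical sketch of a bidirectional theme/document index.
class ThemeIndex {
  constructor() {
    this.themesByDocument = new Map(); // document id -> Set of theme names
    this.documentsByTheme = new Map(); // theme name -> Set of document ids
  }

  // Record the link in both directions so either side can be looked up directly.
  add(docId, theme) {
    if (!this.themesByDocument.has(docId)) this.themesByDocument.set(docId, new Set());
    if (!this.documentsByTheme.has(theme)) this.documentsByTheme.set(theme, new Set());
    this.themesByDocument.get(docId).add(theme);
    this.documentsByTheme.get(theme).add(docId);
  }

  themesOf(docId) {
    return this.themesByDocument.get(docId) ?? new Set();
  }

  documentsWith(theme) {
    return this.documentsByTheme.get(theme) ?? new Set();
  }

  // Themes present in both documents; iterates over the smaller set.
  sharedThemes(docA, docB) {
    let a = this.themesOf(docA);
    let b = this.themesOf(docB);
    if (a.size > b.size) [a, b] = [b, a];
    return [...a].filter((theme) => b.has(theme));
  }
}

const index = new ThemeIndex();
index.add("solar.txt", "forecasting");
index.add("solar.txt", "renewable energy");
index.add("wind.txt", "renewable energy");
const shared = index.sharedThemes("solar.txt", "wind.txt");
```

The trade-off is that every link is stored twice, which costs memory in exchange for constant-time lookups in both directions.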

semantic-analyzer.js

AI-powered semantic analysis:

  • OpenAI embeddings integration
  • Cosine similarity calculations
  • Conceptual connection detection
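
Cosine similarity between two embedding vectors is the dot product divided by the product of their lengths. The sketch below shows the standard formula; `cosineSimilarity` and `isSemanticallyConnected` are hypothetical names, and treating the threshold as inclusive is an assumption rather than something the project documents.

```javascript
// Standard cosine similarity; the function names are illustrative only.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero for empty signals
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Assumption: two documents count as connected when similarity meets the threshold.
function isSemanticallyConnected(a, b, threshold = 0.7) {
  return cosineSimilarity(a, b) >= threshold;
}

const parallel = cosineSimilarity([1, 2, 3], [2, 4, 6]);   // same direction
const orthogonal = cosineSimilarity([1, 0], [0, 1]);       // unrelated directions
```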

enhanced-prompts.js

Structured LLM prompts for:

  • Theme extraction (no predefined categories)
  • Definition identification
  • Relationship analysis

document-connection-analyzer.js

Main orchestrator that:

  • Processes documents through AI pipeline
  • Builds optimized graph structure
  • Generates analysis reports
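
To make the graph-building step concrete, here is a minimal sketch that turns per-document theme lists into nodes and weighted links. It is not the orchestrator's actual logic: `buildGraph` is a hypothetical function, and scoring strength as the Jaccard overlap of theme sets is an assumption chosen for illustration.

```javascript
// Hypothetical sketch. themesByDoc maps a document id to an array of theme names.
// Strength is the Jaccard overlap of the two theme sets (an assumed metric).
function buildGraph(themesByDoc, minStrength = 0.4) {
  const ids = Object.keys(themesByDoc);
  const nodes = ids.map((id) => ({ id, type: "document" }));
  const links = [];
  for (let i = 0; i < ids.length; i++) {
    for (let j = i + 1; j < ids.length; j++) {
      const a = new Set(themesByDoc[ids[i]]);
      const b = new Set(themesByDoc[ids[j]]);
      const shared = [...a].filter((theme) => b.has(theme));
      const unionSize = a.size + b.size - shared.length;
      const strength = unionSize === 0 ? 0 : shared.length / unionSize;
      if (strength >= minStrength) {
        links.push({ source: ids[i], target: ids[j], strength, shared });
      }
    }
  }
  return { nodes, links };
}

const graph = buildGraph(
  {
    "solar.txt": ["renewable energy", "forecasting"],
    "wind.txt": ["renewable energy", "grid integration"],
    "beyblades.txt": ["toys"],
  },
  0.3
);
```

With these inputs the solar and wind documents share one of three distinct themes, so they are linked with strength 1/3, while the third document stays unconnected.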

enhanced-analyzer.html

React + Three.js frontend:

  • 3D graph visualization
  • Interactive UI components
  • Real-time state management

🔧 Configuration

Edit the config object in enhanced-analyzer.html (lines 476-483):

{
  minThemeConfidence: 0.7,           // Minimum confidence for theme extraction
  maxThemesPerDocument: 8,            // Maximum themes per document
  semanticSimilarityThreshold: 0.7,   // Threshold for semantic connections
  connectionStrengthThreshold: 0.4,   // Minimum connection strength
  enableSemanticAnalysis: true,       // Enable embeddings analysis
  enableHierarchicalThemes: true      // Enable theme hierarchies
}
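
As a rough illustration of how these thresholds could act as filters, the sketch below applies three of them to sample data. The `selectThemes` and `selectConnections` helpers and the `{ name, confidence }` and `{ strength }` object shapes are hypothetical, and whether the real comparisons are inclusive is an assumption.

```javascript
// Values copied from the config object above; only the keys used here are included.
const config = {
  minThemeConfidence: 0.7,
  maxThemesPerDocument: 8,
  connectionStrengthThreshold: 0.4,
};

// Keep the highest-confidence themes that clear the minimum, capped per document.
function selectThemes(themes, cfg) {
  return themes
    .filter((theme) => theme.confidence >= cfg.minThemeConfidence)
    .sort((x, y) => y.confidence - x.confidence)
    .slice(0, cfg.maxThemesPerDocument);
}

// Drop connections weaker than the configured minimum strength.
function selectConnections(connections, cfg) {
  return connections.filter((c) => c.strength >= cfg.connectionStrengthThreshold);
}

const keptThemes = selectThemes(
  [
    { name: "grid optimization", confidence: 0.75 },
    { name: "weather", confidence: 0.5 },
    { name: "solar forecasting", confidence: 0.9 },
  ],
  config
);
const keptConnections = selectConnections([{ strength: 0.35 }, { strength: 0.6 }], config);
```

Raising `minThemeConfidence` or `connectionStrengthThreshold` in a scheme like this produces a sparser graph with fewer, stronger links.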

📁 Project Structure

document-knowledge-graph/
├── enhanced-analyzer.html          # Main application (React + Three.js)
├── config.js                       # API key configuration
├── theme-storage-system.js         # Optimized storage data structure
├── semantic-analyzer.js            # AI semantic analysis engine
├── enhanced-prompts.js             # LLM prompt templates
├── document-connection-analyzer.js # Main analysis orchestrator
├── GRAPH_CONTROLS_GUIDE.md         # User control guide
├── README.md                       # This file
└── test files/                     # Sample documents for testing
    ├── beyblades.txt
    ├── climate_change_and_fish_migration_patterns.txt
    ├── Energy_Analytics_and_Grid_Optimization.txt
    ├── Marine_Biodiversity_and_Coral_Reef_Fish.txt
    ├── Solar_Forecasting_Trends.txt
    ├── sustainable_fisheries_and_ocean_conservation.txt
    └── Wind_Integration_Challenges.txt

🎯 Use Cases

  • Research Analysis - Find connections between academic papers
  • Content Organization - Understand relationships in documentation
  • Knowledge Discovery - Uncover hidden themes across documents
  • Literature Review - Visualize research landscapes
  • Documentation Management - Organize technical documentation
  • Creative Writing - Track themes and character relationships

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📝 TODO

  • Adjust LLM prompt for better responses
    • Improve theme extraction accuracy
    • Better handling of edge cases
    • More nuanced relationship detection
  • Fix UI
    • Improve responsive design for mobile
    • Better error message styling
    • Enhanced loading states
    • Accessibility improvements (ARIA labels, keyboard navigation)

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • OpenAI - GPT-4o-mini and text-embedding-3-small APIs
  • Three.js - 3D graphics library
  • React - UI framework

📞 Support

For questions or issues, please open an issue on GitHub.


Built with ❤️ using React, Three.js, and OpenAI

About

Project for CalHacks '25
