
colin-mclaughlin/visulo


Visulo - AI-Powered Study Coach

Visulo is a voice-first AI tutor that sees what you're studying. Capture screenshots and voice notes, then ask questions about your content and get answers from a local or cloud AI model.

Features

  • 📸 Screenshot Capture: Capture any screen content with OCR text extraction
  • 🎤 Voice Notes: Push-to-talk voice recording with transcription
  • 🧠 AI-Powered Answers: Get grounded answers from your captured content
  • 🔍 Semantic Search: Find relevant information using vector embeddings
  • ⚙️ Flexible Providers: Choose between local and cloud AI providers

AI Provider Options

Embedding Providers

  • Local Python (Default): Uses sentence-transformers locally (privacy-focused)
  • OpenAI: Cloud-based embeddings (requires API key)
  • Stub: Development/testing mode

LLM Providers

  • Ollama (Default): Local LLM via Ollama (privacy-focused)
  • OpenAI GPT-3.5: Cloud-based LLM (requires API key)
  • Stub: Development/testing mode

Quick Start

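Prerequisites

  • Node.js and npm (for npm install and the dev scripts)
  • Rust toolchain (required by Tauri)
  • Python 3.7+ (optional, for local embeddings)
  • Ollama (optional, for a local LLM)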

Installation

  1. Clone the repository

    git clone <repository-url>
    cd visulo
  2. Install dependencies

    npm install
  3. Set up local embeddings (optional but recommended)

    python setup_local_embeddings.py
  4. Set up Ollama (optional but recommended)

    # Install Ollama from https://ollama.ai
    # Then pull a model:
    ollama pull llama3
  5. Run the development server

    npm run tauri:dev

Configuration

Using Local Providers (Privacy-Focused)

  1. Local Embeddings:

    • Run python setup_local_embeddings.py to install sentence-transformers
    • In Settings → Embedding Provider → Select "Local Python"
  2. Local LLM:

    • Install Ollama
    • Pull a model: ollama pull llama3
    • In Settings → LLM Provider → Select "Ollama"

Using OpenAI (Cloud-Based)

  1. Get an OpenAI API key from the OpenAI Platform
  2. In Settings → Enter your API key
  3. Select "OpenAI" for both Embedding and LLM providers

Usage

Basic Workflow

  1. Capture Content:

    • Click the camera button or press F9 for screenshots (works globally)
    • Hold the microphone button or press Ctrl+Space for voice notes
  2. AI Processing:

    • OCR extracts text from screenshots
    • Content is automatically indexed with embeddings
    • AI generates contextual answers
  3. Ask Questions:

    • Your captured content becomes searchable
    • AI provides grounded answers with citations
    • View source snippets for each answer

Keyboard Shortcuts

  • F9: Quick screenshot (works globally, even when app is minimized)
  • Ctrl+Space: Push-to-talk (hold to record)

Architecture

Backend (Rust/Tauri)

  • Capture System: Screenshot capture, OCR processing
  • Indexing Service: Text chunking, embedding generation, SQLite storage
  • Retrieval Service: Semantic search, answer generation
  • Provider System: Modular AI providers (local/cloud)
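
The text chunking step in the Indexing Service can be pictured with a minimal sketch. The function name chunk_words, the word-based splitting, and the overlap parameter are assumptions made for illustration; the project's actual chunking strategy may differ.

```rust
/// Split text into overlapping chunks of at most `max_words` words.
/// `overlap` words are repeated between consecutive chunks so that a
/// sentence cut at a boundary still appears whole in one of them.
/// (Hypothetical helper, not taken from the Visulo codebase.)
fn chunk_words(text: &str, max_words: usize, overlap: usize) -> Vec<String> {
    assert!(max_words > 0, "max_words must be positive");
    assert!(overlap < max_words, "overlap must be smaller than max_words");

    let words: Vec<&str> = text.split_whitespace().collect();
    let step = max_words - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;

    while start < words.len() {
        let end = (start + max_words).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break; // the last chunk reached the end of the text
        }
        start += step;
    }
    chunks
}
```

Each returned chunk would then be embedded and stored separately, so a small overlap trades a little extra storage for better recall at chunk boundaries.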

Frontend (React/TypeScript)

  • Capture Interface: Screenshot and voice capture UI
  • History Sidebar: Real-time processing indicators
  • Settings Panel: Provider configuration
  • Toast System: User feedback and notifications

Data Flow

Screenshot/Voice → OCR/Transcription → Text Chunks → Embeddings → SQLite
                                                                      ↓
Query → Query Embedding → Similarity Search → Top-K Chunks → LLM → Grounded Answer
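
The similarity search and top-K steps of the query path can be sketched as follows. This assumes cosine similarity over in-memory f32 vectors; the names cosine_similarity and top_k are hypothetical, and the real Retrieval Service may score or store embeddings differently.

```rust
/// Cosine similarity between two vectors.
/// Returns 0.0 for mismatched lengths or zero-norm input.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Return (index, score) for the `k` stored embeddings most similar
/// to `query`, best match first.
fn top_k(query: &[f32], stored: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .enumerate()
        .map(|(i, emb)| (i, cosine_similarity(query, emb)))
        .collect();
    // Sort by score, highest first.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

The returned indices would map back to stored text chunks, which are then passed to the LLM as context for the grounded answer.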

Development

Project Structure

visulo/
├── src/                       # React frontend
│   ├── components/            # UI components
│   └── styles/                # Tailwind CSS
├── src-tauri/                 # Rust backend
│   └── src/                   # Tauri application logic
├── public/                    # Static assets
└── setup_local_embeddings.py  # Setup script

Building

# Development
npm run tauri:dev

# Production build
npm run tauri:build

Adding New Providers

  1. Embedding Provider:

    • Add variant to EmbeddingProvider enum
    • Implement in EmbeddingService::generate_embedding()
    • Update frontend settings UI
  2. LLM Provider:

    • Add variant to LLMProvider enum
    • Implement in LLMService::generate_grounded_answer()
    • Update frontend settings UI
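
As a rough illustration of the embedding-provider steps above, the sketch below adds a variant and a matching arm. Only the names EmbeddingProvider, EmbeddingService, and generate_embedding come from this README; the variant names, the struct layout, the synchronous signature, the error type, and the helper functions are all assumptions.

```rust
/// Hypothetical mirror of the `EmbeddingProvider` enum.
/// Variant names are assumptions based on the provider list.
#[derive(Debug, Clone, Copy, PartialEq)]
enum EmbeddingProvider {
    LocalPython,
    OpenAI,
    Stub,
    /// Step 1: the newly added variant (illustrative name).
    MyNewProvider,
}

struct EmbeddingService {
    provider: EmbeddingProvider,
}

impl EmbeddingService {
    /// Assumed signature; the real method may be async or use a
    /// different error type.
    fn generate_embedding(&self, text: &str) -> Result<Vec<f32>, String> {
        match self.provider {
            EmbeddingProvider::Stub => Ok(stub_embedding(text, 8)),
            // Step 2: route the new variant to its implementation.
            EmbeddingProvider::MyNewProvider => my_new_provider_embedding(text),
            EmbeddingProvider::LocalPython | EmbeddingProvider::OpenAI => {
                Err(format!("{:?} is not wired up in this sketch", self.provider))
            }
        }
    }
}

/// Deterministic placeholder vector derived from the bytes of the text.
fn stub_embedding(text: &str, dims: usize) -> Vec<f32> {
    let mut v = vec![0.0_f32; dims];
    for (i, byte) in text.bytes().enumerate() {
        v[i % dims] += byte as f32 / 255.0;
    }
    v
}

/// Placeholder for the new provider; a real one would call its model
/// or API here instead of reusing the stub.
fn my_new_provider_embedding(text: &str) -> Result<Vec<f32>, String> {
    if text.is_empty() {
        return Err("cannot embed empty text".to_string());
    }
    Ok(stub_embedding(text, 8))
}
```

Because the match is exhaustive, the compiler flags every place that still needs an arm once a variant is added. The same pattern would apply to LLMProvider and LLMService::generate_grounded_answer().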

Troubleshooting

Local Embeddings Issues

  • Ensure Python 3.7+ is installed
  • Run python setup_local_embeddings.py to install dependencies
  • Check that sentence-transformers is properly installed

Ollama Issues

  • Ensure Ollama is running: ollama serve
  • Check available models: ollama list
  • Pull a model if needed: ollama pull llama3
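
A quick way to confirm that the Ollama server is reachable is to try opening a TCP connection to it. The sketch below assumes Ollama is listening on its default address, 127.0.0.1:11434; the helper name is_port_open is hypothetical, and this is not necessarily how Visulo checks the connection.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Return true if a TCP connection to `addr` succeeds within `timeout`.
/// An unparsable address counts as unreachable.
fn is_port_open(addr: &str, timeout: Duration) -> bool {
    match addr.parse::<SocketAddr>() {
        Ok(sock) => TcpStream::connect_timeout(&sock, timeout).is_ok(),
        Err(_) => false,
    }
}
```

For example, is_port_open("127.0.0.1:11434", Duration::from_millis(500)) returning false suggests that ollama serve is not running or is bound to a different address.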

OpenAI Issues

  • Verify API key is correct and has credits
  • Check network connectivity
  • Ensure API key has appropriate permissions

General Issues

  • Check console logs for detailed error messages
  • Verify all dependencies are installed
  • Try restarting the application

Privacy & Security

  • Local Mode: All processing happens on your device when using Local Python + Ollama
  • API Keys: Stored locally, never transmitted except to respective services
  • Data Storage: All captures stored locally in SQLite database
  • No Telemetry: No usage data is collected or transmitted

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

License

[Add your license here]

Support

For issues and questions:

  • Check the troubleshooting section above
  • Review console logs for error details
  • Open an issue on GitHub with detailed information

Note: This is an early version. Some features may be experimental or require additional setup.
