Visulo is a voice-first AI tutor that sees what you're studying. Capture screenshots and voice notes, then ask questions about your content using real AI models.
- Screenshot Capture: Capture any screen content with OCR text extraction
- Voice Notes: Push-to-talk voice recording with transcription
- AI-Powered Answers: Get grounded answers from your captured content
- Semantic Search: Find relevant information using vector embeddings
- Flexible Providers: Choose between local and cloud AI providers

Embedding providers:
- Local Python (Default): Uses sentence-transformers locally (privacy-focused)
- OpenAI: Cloud-based embeddings (requires API key)
- Stub: Development/testing mode

LLM providers:
- Ollama (Default): Local LLM via Ollama (privacy-focused)
- OpenAI GPT-3.5: Cloud-based LLM (requires API key)
- Stub: Development/testing mode
- Node.js (v16 or higher)
- Rust (latest stable)
- Python 3.7+ (for local embeddings)
- Clone the repository: `git clone <repository-url>`, then `cd visulo`
- Install dependencies: `npm install`
- Set up local embeddings (optional but recommended): `python setup_local_embeddings.py`
- Set up Ollama (optional but recommended): install Ollama from https://ollama.ai, then pull a model with `ollama pull llama3`
- Run the development server: `npm run tauri:dev`
- Local Embeddings:
  - Run `python setup_local_embeddings.py` to install sentence-transformers
  - In Settings → Embedding Provider → Select "Local Python"
- Local LLM:
  - Install Ollama
  - Pull a model: `ollama pull llama3`
  - In Settings → LLM Provider → Select "Ollama"
- OpenAI (cloud):
  - Get an OpenAI API key from OpenAI Platform
  - In Settings → Enter your API key
  - Select "OpenAI" for both Embedding and LLM providers
- Capture Content:
  - Click the camera button or press `F9` for screenshots (works globally)
  - Hold the microphone button or press `Ctrl+Space` for voice notes
- AI Processing:
  - OCR extracts text from screenshots
  - Content is automatically indexed with embeddings
  - AI generates contextual answers
- Ask Questions:
  - Your captured content becomes searchable
  - AI provides grounded answers with citations
  - View source snippets for each answer

Keyboard shortcuts:
- `F9`: Quick screenshot (works globally, even when the app is minimized)
- `Ctrl+Space`: Push-to-talk (hold to record)

Backend (Rust):
- Capture System: Screenshot capture, OCR processing
- Indexing Service: Text chunking, embedding generation, SQLite storage
- Retrieval Service: Semantic search, answer generation
- Provider System: Modular AI providers (local/cloud)

Frontend (React):
- Capture Interface: Screenshot and voice capture UI
- History Sidebar: Real-time processing indicators
- Settings Panel: Provider configuration
- Toast System: User feedback and notifications
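
The Indexing Service steps listed above (text chunking, embedding generation, SQLite storage) can be sketched roughly as follows. This is an illustration in Python, not Visulo's actual Rust implementation: the function names, the `chunks` table schema, the chunk sizes, and the `stub_embedding` placeholder are all assumptions made for the example.

```python
import json
import sqlite3
from typing import List


def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> List[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def stub_embedding(chunk: str, dims: int = 8) -> List[float]:
    """Placeholder embedding: a normalized character-bucket histogram.

    A real provider (for example sentence-transformers) would replace this.
    """
    vec = [0.0] * dims
    for ch in chunk.lower():
        vec[ord(ch) % dims] += 1.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]


def index_capture(conn: sqlite3.Connection, source: str, text: str) -> int:
    """Chunk the text, embed each chunk, and store the rows in SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        "id INTEGER PRIMARY KEY, source TEXT NOT NULL, "
        "content TEXT NOT NULL, embedding TEXT NOT NULL)"
    )
    chunks = chunk_text(text)
    conn.executemany(
        "INSERT INTO chunks (source, content, embedding) VALUES (?, ?, ?)",
        [(source, c, json.dumps(stub_embedding(c))) for c in chunks],
    )
    conn.commit()
    return len(chunks)
```

Calling `index_capture(sqlite3.connect(":memory:"), "screenshot-1", ocr_text)` returns the number of chunks stored. Embeddings are serialized as JSON text here only to keep the sketch dependency-free; a real index might use a binary column or a vector extension instead.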
Screenshot/Voice → OCR/Transcription → Text Chunks → Embeddings → SQLite
                                                                    ↓
Query → Query Embedding → Similarity Search → Top-K Chunks → LLM → Grounded Answer
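
The query side of this flow (similarity search, top-K selection, then a grounded prompt for the LLM) might look like the sketch below. It assumes cosine similarity as the ranking metric and a numbered-source prompt format; neither is confirmed by this README, and `top_k_chunks` and `build_prompt` are hypothetical names used only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query_embedding: Sequence[float],
    indexed: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[float, str]]:
    """Rank (text, embedding) pairs by similarity to the query and keep the best k."""
    scored = [(cosine_similarity(query_embedding, emb), text) for text, emb in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question: str, hits: List[Tuple[float, str]]) -> str:
    """Assemble a prompt that asks the LLM to answer only from numbered sources."""
    context = "\n".join(f"[{i}] {text}" for i, (_, text) in enumerate(hits, start=1))
    return (
        "Answer using only the numbered sources below and cite them.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Numbering the retrieved chunks in the prompt is one simple way to let the model cite its sources, which is what makes the source snippets for each answer traceable.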
visulo/
├── src/                       # React frontend
│   ├── components/            # UI components
│   └── styles/                # Tailwind CSS
├── src-tauri/                 # Rust backend
│   └── src/                   # Tauri application logic
├── public/                    # Static assets
└── setup_local_embeddings.py  # Setup script
- Development: `npm run tauri:dev`
- Production build: `npm run tauri:build`

- Embedding Provider:
  - Add variant to `EmbeddingProvider` enum
  - Implement in `EmbeddingService::generate_embedding()`
  - Update frontend settings UI
- LLM Provider:
  - Add variant to `LLMProvider` enum
  - Implement in `LLMService::generate_grounded_answer()`
  - Update frontend settings UI
- Ensure Python 3.7+ is installed
- Run `python setup_local_embeddings.py` to install dependencies
- Check that sentence-transformers is properly installed
- Ensure Ollama is running: `ollama serve`
- Check available models: `ollama list`
- Pull a model if needed: `ollama pull llama3`
- Verify API key is correct and has credits
- Check network connectivity
- Ensure API key has appropriate permissions
- Check console logs for detailed error messages
- Verify all dependencies are installed
- Try restarting the application
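
Several of the checks above can be run in one pass with a small diagnostic script. The sketch below is not part of Visulo; the helper names are hypothetical. It assumes Ollama is listening on its default local address (`http://localhost:11434`) and uses Ollama's `/api/tags` endpoint, which lists locally available models.

```python
import importlib.util
import json
import sys
import urllib.request
from typing import List, Optional

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address (assumed unchanged)


def python_ok(minimum=(3, 7)) -> bool:
    """True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def module_installed(name: str) -> bool:
    """True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def parse_model_names(payload: dict) -> List[str]:
    """Extract model names from an /api/tags style response."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def list_ollama_models(base_url: str = OLLAMA_URL, timeout: float = 2.0) -> Optional[List[str]]:
    """Return local model names, or None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(json.load(resp))
    except (OSError, ValueError):
        # Connection refused, timeout, or a response that is not valid JSON.
        return None


if __name__ == "__main__":
    print("Python 3.7+:", python_ok())
    print("sentence-transformers installed:", module_installed("sentence_transformers"))
    models = list_ollama_models()
    if models is None:
        print("Ollama: not reachable (try `ollama serve`)")
    else:
        print("Ollama models:", ", ".join(models) or "none (try `ollama pull llama3`)")
```

The script only reports status; it does not install or start anything, so it is safe to run before opening an issue.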
- Local Mode: All processing happens on your device when using Local Python + Ollama
- API Keys: Stored locally, never transmitted except to respective services
- Data Storage: All captures stored locally in SQLite database
- No Telemetry: No usage data is collected or transmitted
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
[Add your license here]
For issues and questions:
- Check the troubleshooting section above
- Review console logs for error details
- Open an issue on GitHub with detailed information
Note: This is an early version. Some features may be experimental or require additional setup.