Inspiration
The inspiration for Notemind came from the need for an intelligent, self-enhancing memory application that can seamlessly process diverse types of content such as links, files, images, audio, and video. Existing tools often lack self-organization capabilities and struggle to provide powerful, context-aware search, which motivated creating a solution to bridge that gap.
What it does
Notemind is a multimodal note-taking application designed to simplify and enhance how users manage their notes. Key functionalities include:
- Converting audio to text using Azure OpenAI's Whisper model.
- Extracting structured content from documents using Azure Document Intelligence.
- Performing OCR and generating descriptive metadata for images.
- Supporting note creation, editing, and management with markdown syntax.
- Recording audio notes and storing them seamlessly.
- Offering robust, AI-powered search across all media types using Azure Search and CosmosDB with vector embeddings.
- Automatically generating metadata like tags and titles using LLMs to improve organization.
- Enabling interaction with an AI chatbot that can retrieve and analyze information from notes conversationally.
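The features above imply a per-media-type processing pipeline: each uploaded item is enriched by a different Azure service before indexing. A minimal sketch of how such routing could look (the function and pipeline names here are illustrative, not Notemind's actual code):

```python
def route_note(mime_type: str) -> str:
    """Map an uploaded item's MIME type to the enrichment pipeline
    that runs before indexing (pipeline names are illustrative)."""
    if mime_type.startswith("audio/"):
        return "whisper-transcription"      # Azure OpenAI Whisper -> text
    if mime_type.startswith("image/"):
        return "vision-ocr-and-caption"     # Azure Vision OCR + description
    if mime_type == "application/pdf":
        return "document-intelligence"      # structured content extraction
    if mime_type.startswith("video/"):
        return "audio-track-transcription"  # extract audio, then transcribe
    return "plain-text-ingest"              # markdown notes, links, etc.
```

Whatever the pipeline produces (transcript, OCR text, extracted structure) is then embedded and indexed so the search and chatbot features can reach it.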
How I built it
- Frontend: Developed with React and TypeScript for a clean, interactive UI/UX.
- Backend: Built with Python, leveraging Azure's suite of AI services:
  - Azure OpenAI for transcription and text generation.
  - Azure Document Intelligence for structured data extraction.
  - Azure Vision for OCR and image processing.
  - Azure Blob Storage for media management.
  - Azure AI Search and/or CosmosDB for vector-embedded, hybrid search.
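As a sketch of the hybrid-search piece, this is roughly what a combined keyword + vector query body looks like for the Azure AI Search REST API. The field names (`contentVector`, `id`, `title`, `tags`) are assumptions about the index schema, not Notemind's actual code:

```python
import json

def build_hybrid_query(text: str, embedding: list[float], k: int = 5) -> str:
    """Build a request body combining BM25 keyword search with a k-NN
    vector query; Azure AI Search fuses both rankings (hybrid search)."""
    body = {
        "search": text,                     # keyword (BM25) component
        "vectorQueries": [{
            "kind": "vector",
            "vector": embedding,            # e.g. from an embedding model
            "fields": "contentVector",      # assumed vector field name
            "k": k,
        }],
        "select": "id,title,tags",          # assumed index fields
        "top": k,
    }
    return json.dumps(body)
```

The same embedding used at index time must be used for the query vector; the hybrid combination lets exact keyword matches and semantically similar notes both surface.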
Challenges I ran into
Time Constraints: Starting the project around December 20, well after the hackathon opened on November 1, left a very tight timeline.
Backend Code Quality: While functional, the backend would benefit from better structure, refactoring, and expanded AI functionality; improving it is a future goal.
Feature Completion: Limited time resulted in some planned features being postponed or only partially implemented.
Accomplishments that I'm proud of
UI/UX Design: Delivered a clean and user-friendly interface that aligns with the vision of an intelligent note-taking application.
Working Features: Successfully implemented content extraction for documents and images, along with automatic metadata generation.
Hybrid Search: Integrated a robust search mechanism combining AI-driven vector embeddings and traditional indexing.
General Idea Realization: Created a working prototype that showcases the core concept and potential of Notemind.
What I learned
GitHub Copilot: Discovered Copilot's recent features while building the project, and they are great!
Azure Ecosystem: Gained in-depth knowledge of Azure’s AI services, including their capabilities and limitations.
Multimodal Integration: Learned how to combine diverse AI models and services to create a seamless user experience.
What's next for Notemind
- Backend Refactoring: Improve code quality, modularity, and performance for easier scalability and maintenance.
- Feature Completion: Finalize the audio transcription workflow and address any remaining issues with file extraction.
- Advanced AI Agent: Enhance the AI chatbot with tools for more in-depth analysis and insights into notes.
- Deployment Automation: Streamline deployment to Azure, making it a truly cloud-native solution.
- Search & Analytics: Add advanced analytics and visualization for notes, improving knowledge retrieval and insights.
- Enhanced Collaboration: Explore real-time collaboration features for teams and shared note management.
- Multiuser & Security: Add multi-user support with proper authentication and access control.
Built With
- ai
- async
- azure
- fastapi
- llm
- python
- react
- typescript
- vision
- whisper