Inspiration

The inspiration behind Pythagorean came from a frustration with modern file sharing: you send documents, but nobody has time to read them closely or absorb what is inside. We wanted to build something beyond Dropbox: a platform where files become living memories that are instantly searchable, summarizable, and understandable by both humans and AI.

What it does

Pythagorean turns a file into a semantic memory that supports natural language querying, contextual chat, and automatic summarization. Advances in AI embeddings and vector databases (especially Chroma and open-source models like E5) are what make this possible. Building it taught us the power and limits of retrieval-augmented generation and the importance of persistent, stateful memory in collaborative workflows.
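
As a rough illustration of the retrieval step in retrieval-augmented generation, here is a minimal sketch. It is not Pythagorean's code: the `embed` function is a toy word-count stand-in for a real embedding model such as E5, the in-memory list stands in for a vector database such as Chroma, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def build_prompt(query, chunks, k=2):
    """Assemble an LLM prompt from the retrieved chunks (the 'augmented' part)."""
    context = "\n".join(f"- {text}" for _, text in top_k(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


chunks = [
    "The invoice total is 4200 dollars due in March.",
    "The team met to discuss the product roadmap.",
    "Shipping address: 12 Elm Street.",
]
print(build_prompt("What is the invoice total?", chunks, k=1))
```

A real embedding model would match on meaning rather than shared words, but the flow is the same: embed the question, rank the stored chunks, and pass only the best matches to the language model.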

How I built it

We engineered a desktop app that converts documents into vector embeddings and stores them in Chroma. Each file generates a secure, shareable link. The recipient opens this link inside a conversational iframe (think "ChatGPT for your Dropbox") where they can ask questions, receive AI summaries, extract key data, and chat with the memory of the document. Our stack combines Electron, FastAPI (Python), Chroma, and React, with Microsoft E5 powering the embeddings.
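
The write-up does not say how the shareable links are secured. One common approach, sketched here purely as an assumption, is an HMAC-signed token that carries the file ID and an expiry time. `SECRET_KEY`, `make_share_token`, and `verify_share_token` are hypothetical names, not part of the project.

```python
import base64
import hashlib
import hmac
import time

# Hypothetical server-side secret; in practice load it from configuration.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def _sign(payload):
    """Return a URL-safe HMAC-SHA256 signature for the payload bytes."""
    digest = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode().rstrip("=")


def make_share_token(file_id, ttl_seconds=3600, now=None):
    """Build a token of the form '<file_id>.<expiry>.<signature>'."""
    issued = time.time() if now is None else now
    payload = f"{file_id}.{int(issued + ttl_seconds)}"
    return f"{payload}.{_sign(payload.encode())}"


def verify_share_token(token, now=None):
    """Return the file ID if the token is authentic and unexpired, else None."""
    try:
        file_id, expires, signature = token.rsplit(".", 2)
        expires_at = int(expires)
    except ValueError:
        return None  # malformed token
    expected = _sign(f"{file_id}.{expires}".encode())
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    current = time.time() if now is None else now
    return file_id if current <= expires_at else None


token = make_share_token("doc-123", ttl_seconds=60)
print(verify_share_token(token))  # the file ID while the link is still valid
```

A signed token only proves the link was issued by the server; restricting a link to specific recipients would additionally require authentication on the chat page.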

Challenges I ran into

Key challenges included finding lossless, efficient chunking algorithms for diverse file types, optimizing retrieval speed at scale, and designing a chat interface that kept memory both persistent and secure. We also grappled with privacy controls, ensuring shared memories were available only to intended users, and with balancing usability against developer extensibility.
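
To make the chunking challenge concrete, here is a minimal sketch of fixed-size chunking with overlap over plain text. It reads "lossless" as "no characters are dropped, so the original text can be rebuilt from the chunks", which is an interpretation rather than something the write-up defines. The function names and default sizes are hypothetical.

```python
def chunk_text(text, size=200, overlap=40):
    """Split text into (start_offset, chunk) pairs of at most `size` characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append((start, text[start:start + size]))
        if start + size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


def reassemble(chunks):
    """Rebuild the original text from offset-tagged chunks."""
    text = ""
    for start, piece in chunks:
        # Drop the overlapping tail already written, then append this chunk.
        text = text[:start] + piece
    return text


sample = "abcdefghij" * 5
pieces = chunk_text(sample, size=20, overlap=5)
print(len(pieces), reassemble(pieces) == sample)
```

Storing the start offset with each chunk is what makes the round trip possible; structured formats such as spreadsheets or PDFs would need format-aware splitting before a step like this.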

Accomplishments that I'm proud of

  • Built an end-to-end platform that transforms static files into living, conversational AI memories.
  • Engineered seamless integration between desktop, API, and web chat interfaces.
  • Enabled instant AI-powered summaries and natural language querying of any uploaded file.
  • Implemented secure, shareable links for intelligent file access without friction.
  • Leveraged state-of-the-art open-source tools (Chroma, E5 embeddings) for scalable semantic memory.
  • Created a persistent memory layer to track file-based conversations and insights.
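
As an illustration of the persistent memory layer in the last item, the sketch below stores per-file chat history in SQLite. The write-up does not describe the storage actually used, so the schema and the `open_memory`, `remember`, and `recall` names are assumptions.

```python
import sqlite3


def open_memory(path=":memory:"):
    """Open (or create) the conversation store; pass a file path to persist it."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               file_id TEXT NOT NULL,
               role TEXT NOT NULL,
               content TEXT NOT NULL
           )"""
    )
    return conn


def remember(conn, file_id, role, content):
    """Append one chat message to the history of a file."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO messages (file_id, role, content) VALUES (?, ?, ?)",
            (file_id, role, content),
        )


def recall(conn, file_id, limit=20):
    """Return the most recent messages for a file, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE file_id = ? "
        "ORDER BY id DESC LIMIT ?",
        (file_id, limit),
    ).fetchall()
    return list(reversed(rows))


conn = open_memory()
remember(conn, "doc-1", "user", "What is the invoice total?")
remember(conn, "doc-1", "assistant", "The total is 4200 dollars.")
print(recall(conn, "doc-1"))
```

Keying history by file ID lets each shared file carry its own conversation, which can then be replayed into the prompt on the next question.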

What's next for Pythagorean

  • Expand file type support (images, spreadsheets, videos) for universal memory.
  • Optimize retrieval speed and lossless compression for larger datasets.
  • Release public APIs and SDKs for developers to build on our platform.
  • Add more advanced AI agents for summarization, Q&A, and data extraction.
  • Build team-focused features: shared workspaces, annotatable chats, enterprise access controls.
  • Prepare for a beta and public launch with real users, iterating based on feedback.
