InvestiCAT

Inspiration

As students, we began by exploring everyday problems that lacked complete solutions. Our initial brainstorming led us to the concept of timeline visualization, though we weren't immediately certain how to apply it meaningfully. Through targeted research into professional workflows, we discovered a significant pain point in the legal industry: attorneys spend weeks manually creating case chronologies from discovery documents. This process involves cross-referencing hundreds of pages across contracts, emails, depositions, and court filings, work that is both time-intensive and error-prone. We realized this was a strong opportunity to apply AI automation. Instead of spending billable hours on document organization, lawyers could focus on legal analysis and strategy. The timeline concept suddenly had a clear purpose: transform scattered legal evidence into structured, interactive chronologies that preserve the precision and source attribution critical to legal work. This discovery shaped our mission to build InvestiCAT, not just another timeline tool, but a specialized solution for legal professionals facing document overload in complex cases.

What it does

InvestiCAT transforms scattered legal documents into intelligent, interactive timelines. The system:

Processes multiple document formats (PDF, DOCX) using advanced text extraction
Extracts timeline events automatically using OpenAI's API
Identifies key information including dates, participants, locations, and legal significance
Creates structured data compatible with the Neo4j graph database for relationship mapping
Generates exportable timelines suitable for legal briefs and court presentations
Enables intelligent timeline querying through a CedarOS-powered conversational interface that answers questions like "What happened between January and March?" or "Who was involved in the contract negotiations?"
Offers natural-language interaction, allowing legal professionals to explore timeline data conversationally rather than manually searching through events
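
As a minimal sketch of the first step in the list above (multi-format text extraction), the snippet below dispatches on file extension. It assumes the pdfplumber and python-docx libraries named in our build notes; the function names and the `EXTRACTORS` table are hypothetical and only illustrate the idea, not our exact code.

```python
from pathlib import Path


def extract_pdf_text(path: Path) -> str:
    """Concatenate the text of every page in a PDF."""
    import pdfplumber  # third-party: pip install pdfplumber

    with pdfplumber.open(str(path)) as pdf:
        # extract_text() can return None for pages without a text layer.
        return "\n".join(page.extract_text() or "" for page in pdf.pages)


def extract_docx_text(path: Path) -> str:
    """Concatenate the text of every paragraph in a DOCX file."""
    import docx  # third-party: pip install python-docx

    document = docx.Document(str(path))
    return "\n".join(paragraph.text for paragraph in document.paragraphs)


# Hypothetical dispatch table keyed by lowercase file extension.
EXTRACTORS = {".pdf": extract_pdf_text, ".docx": extract_docx_text}


def extract_text(path: str) -> str:
    """Pick an extractor by extension and return the document's plain text."""
    file_path = Path(path)
    extractor = EXTRACTORS.get(file_path.suffix.lower())
    if extractor is None:
        raise ValueError(f"Unsupported document format: {file_path.suffix}")
    return extractor(file_path)
```

The third-party imports sit inside the functions so that an unsupported file fails fast with a clear error before either library is loaded.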

How we built it

Backend Architecture:

Document Processing: Python-based ETL pipeline using pdfplumber and python-docx for text extraction
AI Event Extraction: An OpenAI model with custom prompts optimized for legal document analysis
Data Structure: Neo4j knowledge graph database with nodes for Documents, Events, Dates, Locations, Entities, and Users
API Layer: RESTful endpoints for document upload and timeline data retrieval
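
To make the graph structure more concrete, here is a rough sketch of how one extracted event could be turned into a parameterized Cypher statement. The node labels follow the list above (in singular form); the relationship types (`EXTRACTED_FROM`, `OCCURRED_ON`, `OCCURRED_AT`, `INVOLVED_IN`), the property names, and the `TimelineEvent` class are assumptions made for illustration, not our actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TimelineEvent:
    """Hypothetical shape of one extracted event."""
    event_id: str
    description: str
    date: str  # ISO 8601 string, e.g. "2021-03-03"
    document: str
    location: Optional[str] = None
    entities: list = field(default_factory=list)


def to_cypher(event: TimelineEvent) -> tuple:
    """Build a (query, parameters) pair that upserts the event and its links."""
    clauses = [
        "MERGE (d:Document {name: $document})",
        "MERGE (e:Event {id: $event_id})",
        "SET e.description = $description",
        "MERGE (dt:Date {value: $date})",
        "MERGE (e)-[:EXTRACTED_FROM]->(d)",
        "MERGE (e)-[:OCCURRED_ON]->(dt)",
    ]
    params = {
        "document": event.document,
        "event_id": event.event_id,
        "description": event.description,
        "date": event.date,
    }
    if event.location:
        clauses += [
            "MERGE (l:Location {name: $location})",
            "MERGE (e)-[:OCCURRED_AT]->(l)",
        ]
        params["location"] = event.location
    if event.entities:
        # WITH carries the event forward before unwinding the entity names.
        clauses += [
            "WITH e",
            "UNWIND $entities AS entity_name",
            "MERGE (en:Entity {name: entity_name})",
            "MERGE (en)-[:INVOLVED_IN]->(e)",
        ]
        params["entities"] = list(event.entities)
    return "\n".join(clauses), params
```

Using `MERGE` rather than `CREATE` means re-processing the same document should not duplicate nodes, and passing values as parameters keeps document text out of the query string. The resulting pair could be handed to a Neo4j driver session, for example `session.run(query, params)` in the official Python driver.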

Frontend Development:

Cedar Framework: React-based interface using Cedar-OS for advanced AI features
Timeline Visualization: An interactive timeline
Filtering System: Real-time event filtering by document source, date range, and content
Responsive Design: Optimized for desktop use in legal environments
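
The filtering logic itself is simple to sketch. Our frontend is React-based, so the Python below only illustrates the logic rather than the actual component code; the `Event` fields and the `filter_events` signature are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical timeline event as the UI might receive it."""
    when: date
    source: str  # name of the document the event came from
    description: str


def filter_events(
    events: Iterable[Event],
    *,
    source: Optional[str] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
    text: Optional[str] = None,
) -> list:
    """Keep events matching every filter that is set, in chronological order."""
    needle = text.lower() if text else None
    matches = []
    for event in events:
        if source is not None and event.source != source:
            continue
        if start is not None and event.when < start:
            continue
        if end is not None and event.when > end:
            continue
        if needle is not None and needle not in event.description.lower():
            continue
        matches.append(event)
    return sorted(matches, key=lambda e: e.when)
```

A question such as "What happened between January and March?" maps onto this as `filter_events(events, start=date(2024, 1, 1), end=date(2024, 3, 31))`, with the year here chosen purely as an example.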

Key Technical Decisions:

Separated document-level processing from investigation management for scalable architecture
Implemented fallback pattern matching for offline operation when OpenAI API is unavailable
Used environment variables for secure API key management
Designed schema to support multi-document investigations while processing individual files
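
The fallback and API-key decisions can be sketched together. The example below is a simplified illustration under several assumptions: the environment variable is named `OPENAI_API_KEY`, the date patterns are a small sample rather than our full set, and `llm_extract` is a hypothetical callable standing in for the OpenAI-backed extractor.

```python
import os
import re
from typing import Callable, Optional

# Sample date formats only: ISO dates, US-style slashes, and "Month D, YYYY".
DATE_PATTERN = re.compile(
    r"\b(?:\d{4}-\d{2}-\d{2}"
    r"|\d{1,2}/\d{1,2}/\d{4}"
    r"|(?:January|February|March|April|May|June|July|August|September"
    r"|October|November|December)\s+\d{1,2},\s+\d{4})\b"
)


def split_sentences(text: str) -> list:
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_events_fallback(text: str) -> list:
    """Offline extraction: one event per date match, described by its sentence."""
    events = []
    for sentence in split_sentences(text):
        for match in DATE_PATTERN.finditer(sentence):
            events.append(
                {"date": match.group(0), "description": sentence, "method": "pattern"}
            )
    return events


def extract_events(text: str, llm_extract: Optional[Callable] = None) -> list:
    """Use the LLM extractor when a key is configured, else pattern matching."""
    api_key = os.environ.get("OPENAI_API_KEY")  # assumed variable name
    if api_key and llm_extract is not None:
        try:
            return llm_extract(text)
        except Exception:
            # Any API failure (network, quota, timeout) drops to the fallback.
            pass
    return extract_events_fallback(text)
```

Reading the key from the environment keeps it out of source control, and routing every failure to the same fallback keeps the pipeline usable offline, at the cost of less detailed events.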

Challenges we ran into

One of our main struggles was connecting all of the moving parts (an OpenAI model for event extraction, a Python ETL pipeline for processing, Neo4j for graph storage, and CedarOS for the frontend) into a seamless system, all while maintaining the accuracy and precision that legal documents demand.

Accomplishments that we're proud of

A highlight for us was building a complete end-to-end pipeline in just a weekend. Seeing everything come together into a working demo showed us the real impact this tool could have for legal professionals.

What we learned

Our understanding of how to use LLMs effectively in development grew significantly through both hands-on implementation and workshops throughout the hackathon. We also saw how combining AI with graph databases and visualization frameworks can create practical, trustworthy solutions for real industries. Most importantly, we learned a lot about the potential of AI frameworks and tools like CedarOS and Mastra, with which agents can work behind the scenes to power tools and streamline workflows.

What's next for InvestiCAT

Moving forward, we will add more interactive visualizations for analysis, introduce collaboration features for legal teams, and expand AI-to-data interactions with CedarOS integrations. With the explainability and interpretability of Neo4j's graph database structure, developers can extend InvestiCAT with custom visualizations, integrate new data sources, and seamlessly link LLMs to the graph for new context-aware features.