Inspiration

Finding meaningful connections in data is tough. Existing tools often fall short, especially when it comes to explaining why a correlation exists. Data Disco tackles this by combining correlation analysis with AI-generated explanations to reveal hidden trends across datasets.

What it does

Data Disco helps you uncover correlations in your data. Upload your datasets, and our engine analyzes them, provides insights, and calculates correlation scores. Interactive graphs let you visualize your data and explore the findings.
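This write-up does not say which correlation measure the engine uses. As a rough illustration only, assuming a Pearson coefficient, a correlation score between two numeric columns could be computed like this (the function name `pearson` is hypothetical, not taken from the project):

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Assumption: Data Disco's actual scoring method is not documented here;
    Pearson is used purely as a common example of a correlation score.
    Returns None when either sequence is constant (score is undefined).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        return None  # a constant column has no defined correlation

    return cov / math.sqrt(var_x * var_y)
```

A score near 1 or -1 indicates a strong linear relationship, while a score near 0 indicates little or none. It says nothing about why the relationship exists, which is the gap the AI-generated insights are meant to fill.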

How we built it

We built Data Disco using a modern tech stack:

  1. Frontend (Plotly): Interactive data visualization and user interface.
  2. Backend (FastAPI/Python): Handles data processing, correlation analysis, and API calls.
  3. Database (MongoDB): Stores data and analysis results.
  4. AI Integration (Gemini API): Provides AI-driven insights.
  5. Data Handling (CSV npm module, Python's csv library): Parses and processes uploaded CSV files.

The frontend sends data to the backend using Axios. The backend parses it with Python's built-in csv library and uses pymongo to interact with MongoDB. FastAPI endpoints handle calls to the Gemini API, which provides the insights. Our pairing algorithm supports the graph visualizations, letting users and statisticians explore relationships between datasets.
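The parsing and pairing steps are not spelled out here, so the following is only a sketch under stated assumptions: that uploaded CSV text has a header row, that only fully numeric columns are analysed, and that "pairing" means considering every pair of such columns. The helper names `parse_numeric_columns` and `column_pairs` are hypothetical.

```python
import csv
import io
from itertools import combinations


def parse_numeric_columns(csv_text):
    """Parse CSV text with Python's built-in csv module.

    Returns {column_name: [float, ...]} for every column whose values all
    parse as numbers. Assumption: non-numeric columns are simply skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in columns:
            columns[name].append(row[name])

    numeric = {}
    for name, values in columns.items():
        try:
            numeric[name] = [float(v) for v in values]
        except (TypeError, ValueError):
            continue  # text column or short row: leave it out
    return numeric


def column_pairs(columns):
    """List every unordered pair of column names (hypothetical pairing rule)."""
    return list(combinations(columns, 2))


sample = "city,temp,sales\nA,10,100\nB,20,210\n"
numeric = parse_numeric_columns(sample)
pairs = column_pairs(numeric)
```

Each pair could then be scored and handed to the frontend as one graph. Enumerating all pairs grows quadratically with the number of columns, so a real implementation might filter or rank pairs first.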

Challenges we ran into

Working with new technologies like the Gemini API was a challenge. Time management was also a factor; we spent a bit too long deciding on a specific challenge to tackle. Bringing together a team with diverse backgrounds required extra effort in explaining new concepts and tools. Data handling, while simplified by the csv npm module on the frontend and Python's built-in csv library on the backend, still required careful handling of varied CSV layouts and potential inconsistencies in the data.
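To make the kind of inconsistency concrete, here is a small sketch of tolerant cell parsing. It is not the project's code: the missing-value markers and the treatment of commas as thousands separators are assumptions chosen for illustration, and `clean_cell` is a hypothetical helper.

```python
# Assumed spellings of "no value"; a real dataset may need a different list.
MISSING_MARKERS = {"", "na", "n/a", "nan", "null", "none", "-"}


def clean_cell(raw):
    """Convert one raw CSV cell to a float, or None if missing or unparsable.

    Assumption: commas are thousands separators ("1,234.5"), which is wrong
    for locales that use a comma as the decimal mark.
    """
    if raw is None:
        return None
    text = raw.strip()
    if text.lower() in MISSING_MARKERS:
        return None
    try:
        return float(text.replace(",", ""))
    except ValueError:
        return None  # free text or an unexpected format
```

Returning None rather than raising lets the caller decide whether to drop the row, drop the column, or report the problem to the user.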

Accomplishments that we're proud of

We built a functional platform that can identify and explain correlations across datasets. We're proud of integrating Gemini for AI-powered insights and creating a visualization tool that helps users understand complex relationships. Having a working prototype despite the challenges is a significant achievement in itself.

What we learned

This project gave us hands-on experience with Plotly for visualization and Google's Gemini API. We learned about the complexities of integrating AI into a data analysis workflow and the importance of clear communication within a diverse team. We also gained experience building a full-stack application end to end: CSV data handling, backend development with FastAPI, database integration with MongoDB, and API interactions.

What's next for Data Disco

We plan to expand the range of accepted datasets beyond CSV files. We also want to improve the accuracy and depth of the AI-generated insights and refine the user interface based on feedback.
