Inspiration

Inspired by a LinkedIn article on the importance of AI-human collaboration in pioneering future AI models, and by my dad, who discussed the need for AI to verify the data it was being fed.

What it does

SafeData is a tiered database that stores AI training data, paired with an API that returns an array of information based on the inputs developers request.
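To make the tiered idea concrete, here is a minimal in-memory sketch of how a tiered read might behave. Everything in it is an assumption for illustration: the `Record` fields, the convention that tier 1 is the most trusted, the sample entries, and the `query_records` helper are hypothetical and not taken from the actual SafeData schema.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Record:
    """One hypothetical training-data entry (field names are assumed)."""
    source: str
    tier: int  # assumed convention: 1 = most trusted, larger = less trusted
    text: str


# Made-up sample data, only for demonstrating the query below.
RECORDS = (
    Record("peer-reviewed journal", 1, "Vaccines reduce disease transmission."),
    Record("news outlet", 2, "New vaccine study published this week."),
    Record("anonymous forum post", 3, "My opinion on vaccines."),
)


def query_records(records: Sequence[Record], max_tier: int, keyword: str = "") -> List[Record]:
    """Return records whose tier is at most max_tier and whose text contains keyword.

    Results come back as a list (the "array" a developer would receive),
    ordered from most trusted to least trusted.
    """
    needle = keyword.lower()
    hits = [r for r in records if r.tier <= max_tier and needle in r.text.lower()]
    return sorted(hits, key=lambda r: r.tier)
```

Calling `query_records(RECORDS, max_tier=2, keyword="vaccine")` would return only the tier 1 and tier 2 entries, so a developer can decide how much trust a training run requires.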

How we built it

It was built on MongoDB Atlas, whose Vector Search feature retrieves documents by embedding similarity, which makes it well suited to fast lookup of AI training data. The project is intended for use by other developers, so no front end was built. Instead, it is a read-only API that developers query to access the data.
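As a rough sketch of what a read-only query against Atlas Vector Search could look like, the helper below builds an aggregation pipeline around the `$vectorSearch` stage. The index name, the `embedding` and `tier` field names, and the `build_vector_search_pipeline` function are assumptions for illustration, not the project's actual configuration. Filtering on `tier` also assumes that field is declared as a filter field in the vector search index.

```python
from typing import Any, Dict, List, Sequence


def build_vector_search_pipeline(
    query_vector: Sequence[float],
    max_tier: int,
    limit: int = 5,
    index_name: str = "safedata_vector_index",  # hypothetical index name
    path: str = "embedding",                    # hypothetical embedding field
) -> List[Dict[str, Any]]:
    """Build a read-only aggregation pipeline for Atlas Vector Search."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": list(query_vector),
                # Consider more candidates than we return; Atlas caps this at 10000.
                "numCandidates": min(limit * 20, 10000),
                "limit": limit,
                # Assumed convention: lower tier number = more trusted source.
                "filter": {"tier": {"$lte": max_tier}},
            }
        },
        {
            # Return only the fields a caller needs, plus the similarity score.
            "$project": {
                "_id": 0,
                "text": 1,
                "source": 1,
                "tier": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)` on a collection in an Atlas cluster. Because the pipeline only reads, it fits the read-only design described above.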

Challenges we ran into

Originally, the intent was to train an open-source LLM. However, due to the amount of data required and the limited timeframe, this demonstration was scrapped to prioritize refining the concept and the database design.

Accomplishments that we're proud of

I am proud of the idea. While some may consider the visualization dull, I hope other developers will find this concept as valuable as I do.

What we learned

I learned a lot about how LLMs are designed, and how industry-leading AI models such as Google Gemini and ChatGPT generate responses from the ground up.

What's next for SafeData

SafeData intends to grow alongside AI models. As OpenAI teased in September 2023, ChatGPT is intended to gather information directly in real time by browsing the internet. SafeData intends to use its tiered-source database as a way for AIs to select credible data sources while browsing the internet to fulfill queries.
