Inspiration
Inspired by a LinkedIn article on the importance of AI-human collaboration in pioneering future AI models, and by my dad, who discussed the need for AI to verify the data it is being fed.
What it does
SafeData is a tiered database that stores AI training data, paired with an API that returns an array of information based on the inputs developers request.
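To make the idea concrete, here is a minimal in-memory sketch of a tiered store and a query that returns an array of results. The `Record` fields, the sample rows, the `query` function, and the convention that tier 1 is the most trusted are all assumptions made for illustration; they are not SafeData's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One piece of training data (hypothetical schema)."""
    text: str
    source: str
    tier: int  # assumed convention: 1 = most trusted, larger = less trusted


# Made-up sample rows, purely for illustration.
RECORDS = [
    Record("Water boils at 100 degrees Celsius at sea level.", "peer-reviewed journal", 1),
    Record("Sourdough starters need regular feeding.", "established news outlet", 2),
    Record("My cat can predict the weather.", "anonymous forum post", 3),
]


def query(records, max_tier, keyword=None):
    """Return a list of dicts for records at or above the requested trust level.

    max_tier: the least trusted tier the caller is willing to accept.
    keyword:  optional case-insensitive substring filter on the text.
    """
    results = []
    for record in records:
        if record.tier > max_tier:
            continue  # source is less trusted than the caller allows
        if keyword and keyword.lower() not in record.text.lower():
            continue  # text does not mention the keyword
        results.append(
            {"text": record.text, "source": record.source, "tier": record.tier}
        )
    return results
```

A caller that only wants the most trusted material would pass `max_tier=1`, while a caller willing to accept lower-tier sources would raise that number.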
How we built it
It was built on the MongoDB Atlas platform, whose Vector Search feature allows rapid retrieval of information, which suits AI training. This project is intended for use by other developers, so no front end was considered. Instead, it is a read-only API to which developers can submit queries to access data.
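As a rough sketch of what a read-only query against Atlas Vector Search could look like, the function below builds an aggregation pipeline that uses the `$vectorSearch` stage. The index name `safedata_index`, the field names `embedding`, `text`, `source`, and `tier`, and the idea of filtering by tier are assumptions for illustration, not the project's actual configuration. The filter also assumes `tier` has been declared as a filter field in the vector index.

```python
def build_vector_search_pipeline(query_vector, max_tier, limit=5, num_candidates=100):
    """Build a MongoDB aggregation pipeline for a tier-filtered vector search.

    All index and field names here are hypothetical placeholders.
    """
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")

    return [
        {
            "$vectorSearch": {
                "index": "safedata_index",     # hypothetical index name
                "path": "embedding",           # hypothetical vector field
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
                # Keep only documents at or above the requested trust level.
                "filter": {"tier": {"$lte": max_tier}},
            }
        },
        {
            # Return only the readable fields plus the similarity score.
            "$project": {
                "_id": 0,
                "text": 1,
                "source": 1,
                "tier": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With a PyMongo collection in hand, the pipeline would be passed to `collection.aggregate(pipeline)`. Because the pipeline only reads, it fits the read-only design described above.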
Challenges we ran into
Originally, the intent was to train an open-source LLM. However, due to the amount of data required and the limited timeframe, this demonstration was scrapped to prioritize refining the concept and the database design.
Accomplishments that we're proud of
I am proud of the idea. While some may consider the visualization dull, I hope other developers will find this concept as valuable as I do.
What we learned
I learned a lot about how LLMs are designed, and how industry-leading AI models such as Google Gemini and ChatGPT generate responses from the ground up.
What's next for SafeData
SafeData intends to grow with AI models. As OpenAI teased in September of 2023, ChatGPT intends to gather information directly in real time by browsing the internet. SafeData intends to offer its tiered-source database as a way for AIs to select credible data sources while browsing the internet to fulfill queries.
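One way this could work, sketched under heavy assumptions: a browsing model looks up each candidate URL's host in a tier table and keeps only the sources that meet a trust threshold. The domains, the tier numbers, and the `UNKNOWN_TIER` fallback below are placeholders, not a real credibility list.

```python
from urllib.parse import urlparse

# Hypothetical tier table: 1 = most trusted. Placeholder domains only.
SOURCE_TIERS = {
    "example-journal.org": 1,
    "example-news.com": 2,
}
UNKNOWN_TIER = 99  # assumed fallback for hosts missing from the table


def tier_for_url(url):
    """Look up the trust tier for a URL by its host name."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # treat www.example.com the same as example.com
    return SOURCE_TIERS.get(host, UNKNOWN_TIER)


def credible_urls(urls, max_tier):
    """Keep only the URLs whose source tier is at or above the threshold."""
    return [url for url in urls if tier_for_url(url) <= max_tier]
```

Treating unknown hosts as the least trusted tier is a conservative design choice in this sketch; a real system might instead queue them for review.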