Inspiration
Machine learning engineers who work with open source data have no effective search engine for finding open source datasets. They also have no effective tool that reminds them when a dataset has been modified. Our goal with DataGeddon is to solve both problems with a scalable, practical solution.
What it does
DataGeddon is a search engine that parses open source datasets and returns the most relevant ones for a query. It also sends reminders about datasets that have been recently modified. This helps engineers who actively consume datasets through an API account for changes that are critical to their systems.
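The reminder side can be pictured as comparing each dataset's last-modified time with the time a subscriber was last notified. The following is only a minimal sketch of that idea: the `DatasetRecord`, `Subscription`, and `Reminder` shapes and the `findPendingReminders` function are hypothetical names made up for illustration, not DataGeddon's actual schema or code.

```typescript
// Hypothetical shapes; the real DataGeddon schema is not described in this write-up.
interface DatasetRecord {
  id: string;
  name: string;
  lastModified: Date; // when the dataset's source last changed
}

interface Subscription {
  datasetId: string;
  phone: string;
  lastNotified: Date; // when this subscriber was last reminded
}

interface Reminder {
  phone: string;
  datasetName: string;
  modifiedAt: Date;
}

// Return one reminder for each subscription whose dataset changed
// after that subscriber was last notified.
function findPendingReminders(
  datasets: DatasetRecord[],
  subscriptions: Subscription[],
): Reminder[] {
  // Index datasets by id so each subscription is a single lookup.
  const byId = new Map<string, DatasetRecord>();
  for (const dataset of datasets) {
    byId.set(dataset.id, dataset);
  }

  const reminders: Reminder[] = [];
  for (const sub of subscriptions) {
    const dataset = byId.get(sub.datasetId);
    if (!dataset) continue; // subscription points at an unknown dataset
    if (dataset.lastModified.getTime() > sub.lastNotified.getTime()) {
      reminders.push({
        phone: sub.phone,
        datasetName: dataset.name,
        modifiedAt: dataset.lastModified,
      });
    }
  }
  return reminders;
}
```

Keeping the comparison in a pure function like this would make it easy to run on a schedule and to test without a database or an SMS provider attached.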
How we built it
We used TypeScript and React for the frontend, with the react-router-dom package for routing between pages. We used Twilio's API to send SMS notifications to users, and CockroachDB to store the data and update older records.
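As an illustration of the notification step, the sketch below builds the text of a reminder SMS. It is an assumption-laden example, not our production code: `buildReminderSms`, `SmsPayload`, the message wording, and the 160-character cap are all hypothetical. The `to`, `from`, and `body` field names are chosen to match the parameters that Twilio's message-creation call accepts.

```typescript
// Field names mirror the to/from/body parameters used when creating a Twilio message.
interface SmsPayload {
  to: string;
  from: string;
  body: string;
}

// Assumed cap for this sketch: the length of a single GSM-7 SMS segment.
const MAX_BODY_LENGTH = 160;

// Build the SMS payload for one "dataset was modified" reminder.
function buildReminderSms(
  to: string,
  from: string,
  datasetName: string,
  modifiedAt: Date,
): SmsPayload {
  const day = modifiedAt.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  let body = `DataGeddon: dataset "${datasetName}" was modified on ${day}.`;
  if (body.length > MAX_BODY_LENGTH) {
    // Truncate and mark the cut so the message stays within one segment.
    body = body.slice(0, MAX_BODY_LENGTH - 3) + "...";
  }
  return { to, from, body };
}
```

With Twilio's Node helper library, a payload shaped like this could be passed to `client.messages.create(payload)`; the client setup and credentials are omitted here.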
Challenges we ran into
We had trouble building custom select and multi-select components that were flexible enough for our search engine. We also faced a hard engineering problem: getting the search engine's parser to read effectively through all of the open source datasets.
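To make the search problem concrete, here is a simplified sketch of how a keyword query and a multi-select tag filter might combine. It is not our actual parser: `DatasetEntry`, `tokenize`, `searchDatasets`, and the scoring weights are hypothetical and chosen only for illustration.

```typescript
// Hypothetical catalog entry; the real metadata fields are not listed in this write-up.
interface DatasetEntry {
  name: string;
  description: string;
  tags: string[];
}

// Split text into lowercase alphanumeric tokens.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// Rank entries by keyword overlap: a query token found in the name scores 2,
// otherwise a token found in the description scores 1.
// Entries must carry every tag chosen in the multi-select filter.
function searchDatasets(
  entries: DatasetEntry[],
  query: string,
  selectedTags: string[] = [],
): DatasetEntry[] {
  const queryTokens = tokenize(query);
  const scored: { entry: DatasetEntry; score: number }[] = [];

  for (const entry of entries) {
    const hasAllTags = selectedTags.every((tag) => entry.tags.indexOf(tag) !== -1);
    if (!hasAllTags) continue;

    const nameTokens = new Set(tokenize(entry.name));
    const descriptionTokens = new Set(tokenize(entry.description));
    let score = 0;
    for (const token of queryTokens) {
      if (nameTokens.has(token)) score += 2;
      else if (descriptionTokens.has(token)) score += 1;
    }
    if (score > 0) scored.push({ entry, score });
  }

  // Highest score first.
  scored.sort((a, b) => b.score - a.score);
  return scored.map((item) => item.entry);
}
```

Passing an empty `selectedTags` array leaves the filter off, which is one way a single function could serve both the plain search box and the multi-select component.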
Accomplishments that we're proud of
We solved the primary problem, which was to build an effective search engine for machine learning engineers. We were also able to work on a highly scalable platform with TypeScript on both the frontend and the backend.
What we learned
Data engineering and machine learning require a lot of attention to detail, and this project gave us a lot of insight into what engineers are looking for. We also learned a lot about scalability in search engines and how to optimize our queries.
What's next for R&D Data Inventory
We want to incorporate OpenAI's API to generate custom datasets based on search results.
Built With
- cockroachdb
- css
- html
- javascript
- react
- twilio
- typescript