Inspiration
I am a co-founder of a startup called Orgspace. We help all people leaders, in HR and beyond, make better organizational decisions. It is imperative that our customers can use AI to make better people decisions. However, our customers are rightly skeptical of sharing their entire base of people data with an outside AI provider, regardless of that vendor's promises about data security.
What it does
Private GPT operates on the principle of "give an AI a virtual fish and it eats for a day; teach an AI to virtual fish and it eats forever". Rather than sending GPT-4 lots of data to provide context for answering questions, we do the following:
- Give GPT-4 the question along with a description of the data the question is about, rather than the data itself.
- Ask GPT-4 for an algorithm, written as code that we can run ourselves, that answers the question over data matching that description.
- Run the code provided by GPT-4 over the actual data in a context of our choosing. In any production use case, this would likely be a sealed container with no outside access, so that any "breakout" code provided by the AI is rendered inert.
- Take the resulting answer and the original question back to GPT-4 and ask it to provide a human-readable answer.
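The four steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `buildCodePrompt`, `runGeneratedCode`, `buildAnswerPrompt`, `answerPrivately`, and the injected `askModel` function are all hypothetical, and the prompt wording is assumed. The `new Function` call only stands in for the sealed container described above and provides no isolation by itself.

```typescript
// Hypothetical names throughout; a sketch of the flow, not the project's code.
type Row = { [field: string]: unknown };
type Schema = { [field: string]: string };
// Any function that sends a prompt to the model and resolves with its reply.
type AskModel = (prompt: string) => Promise<string>;

// Steps 1-2: the prompt carries the question and a description of the data,
// never the rows themselves.
function buildCodePrompt(question: string, schema: Schema): string {
  const fields = Object.keys(schema)
    .map((name) => `- ${name}: ${schema[name]}`)
    .join("\n");
  return [
    "Each record in the array named rows has these fields:",
    fields,
    "Write the body of a JavaScript function that receives rows",
    `and returns the answer to this question: ${question}`,
    "Reply with code only.",
  ].join("\n");
}

// Step 3: run the returned code over the real data.
// WARNING: new Function is NOT a sandbox. It is a placeholder for a sealed
// container with no outside access.
function runGeneratedCode(code: string, rows: Row[]): unknown {
  const fn = new Function("rows", code) as (rows: Row[]) => unknown;
  return fn(rows);
}

// Step 4: only the computed result and the original question go back.
function buildAnswerPrompt(question: string, result: unknown): string {
  return [
    `Question: ${question}`,
    `Computed result: ${JSON.stringify(result)}`,
    "Write a short, human-readable answer.",
  ].join("\n");
}

async function answerPrivately(
  question: string,
  schema: Schema,
  rows: Row[],
  askModel: AskModel
): Promise<string> {
  const code = await askModel(buildCodePrompt(question, schema));
  const result = runGeneratedCode(code, rows);
  return askModel(buildAnswerPrompt(question, result));
}
```

Passing `askModel` in as a parameter keeps the sketch independent of any particular client library and makes it easy to substitute a fake model when testing.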
How we built it
The app is a simple Next.js service running on Vercel.
Challenges we ran into
When we asked GPT-4 to recontextualize the answer, it would sometimes bring in outside, outdated opinions based on its training cutoff. For example, our COVID dataset has more recent data than GPT-4 knows about, so it would sometimes dispute the answer. We had to add "assume the answer provided is correct" to the recontextualization step.
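A recontextualization prompt carrying that instruction might look something like the sketch below. Only the phrase "assume the answer provided is correct" comes from our experience above; the function name `buildRecontextPrompt` and the rest of the wording are assumptions made for illustration.

```typescript
// Hypothetical helper: builds the prompt for the recontextualization step.
// Only the "assume the answer provided is correct" instruction is taken from
// the write-up; the surrounding wording is illustrative.
function buildRecontextPrompt(question: string, answer: unknown): string {
  return [
    "The answer below was computed directly from the user's own data,",
    "which may be more recent than your training data.",
    "Assume the answer provided is correct.",
    "Restate it as a short, human-readable reply without disputing it.",
    "",
    `Question: ${question}`,
    `Answer: ${JSON.stringify(answer)}`,
  ].join("\n");
}
```

Serializing the answer with `JSON.stringify` keeps structured results (objects, arrays, numbers) unambiguous in the prompt text.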
In the category of dumb technical challenges, it is not as easy as it should be to infer the structure of a large dataset without jumping through some hoops. Thankfully, GPT-4 helped us with an algorithm that seems to do a decent job of that.
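To give a sense of what structure inference can involve, here is one simple approach. It is not the algorithm GPT-4 gave us, just a sketch under stated assumptions: the dataset is an array of flat records, and the names `inferSchema` and `describeValue` are hypothetical. It visits a roughly evenly spaced sample of rows and records which value types appear for each field.

```typescript
type Row = { [field: string]: unknown };

// Classify a value a little more usefully than typeof does by itself.
function describeValue(value: unknown): string {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

// Hypothetical helper: walk a roughly evenly spaced sample of rows and collect
// the set of value types seen for each field name.
function inferSchema(rows: Row[], sampleSize: number = 100): { [field: string]: string } {
  const seen: { [field: string]: string[] } = {};
  const step = Math.max(1, Math.floor(rows.length / sampleSize));
  for (let i = 0; i < rows.length; i += step) {
    const row = rows[i];
    for (const field of Object.keys(row)) {
      const kinds = seen[field] || (seen[field] = []);
      const kind = describeValue(row[field]);
      if (kinds.indexOf(kind) < 0) kinds.push(kind);
    }
  }
  // Join the types per field into a short description, e.g. "number | string".
  const schema: { [field: string]: string } = {};
  for (const field of Object.keys(seen)) {
    schema[field] = seen[field].slice().sort().join(" | ");
  }
  return schema;
}
```

The resulting field-to-type map is the kind of description that can be sent to the model in place of the data. Sampling trades accuracy for speed, so a rare type that only appears in unsampled rows would be missed.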
Example queries
- Ask questions for COVID data like "what region has the highest COVID exposure"
- Ask questions for organizational data like "who is the highest rated person in Python across my org"
Accomplishments that we're proud of
We are proud to have stumbled onto a way for AI to be useful to any organization where keeping data private is a high-priority concern. This will make AI usable by a broader array of organizations and businesses for which privacy is critical and/or data residency is important.
What we learned
- AI is phenomenal at solving problems that combine understanding language with translating that understanding into idempotent code. Within a year, nobody should be hand-writing queries in basic query languages.
- Being up to date with recent information is critical to any AI being seen as trustworthy.
What's next for Private GPT
- We will be integrating this technique into our engine that answers general organizational questions like:
-- "what is the average number of direct reports per manager"
-- "who is the person most knowledgeable about AI within 3 time zones of X"
-- "who are the top 10% of people most at risk of burnout in my engineering org"
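As an illustration of the kind of code the model might hand back for the first of these questions, consider the sketch below. The record shape (`id`, `managerId`) and the function name `averageDirectReports` are assumptions rather than our real data model, and a "manager" is taken to mean anyone with at least one direct report.

```typescript
// Hypothetical record shape; the real organizational data model may differ.
interface Person {
  id: string;
  managerId: string | null; // null for someone with no manager
}

// Average number of direct reports, counting only people who manage at
// least one person.
function averageDirectReports(people: Person[]): number {
  const counts: { [managerId: string]: number } = {};
  for (const person of people) {
    if (person.managerId !== null) {
      counts[person.managerId] = (counts[person.managerId] || 0) + 1;
    }
  }
  const managers = Object.keys(counts);
  if (managers.length === 0) return 0;
  let total = 0;
  for (const managerId of managers) total += counts[managerId];
  return total / managers.length;
}
```

Code like this runs entirely next to the data, so only the single resulting number would need to go back to the model for phrasing.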
Built With
- node.js
- openai
- typescript
- vercel