Inspiration
In AP Seminar, I had to research questions by breaking them down into multiple lenses and stakeholders. This program is my answer to that task: it automatically splits a large topic or question into smaller, more specific topics and names the relationships between them. Many tools, especially since the advent of large language models (LLMs) like GPT, can provide a lot of data about a topic. What is missing is proper organization of that knowledge into entities and their relationships. I wanted to build a tool that, given a topic, uses various tools, especially AI, to create a knowledge graph.
What it does
The application takes a topic or question from the user, uses AI to fetch Wikipedia articles similar to that input, and then calls the GPT API to organize the results into a knowledge graph.
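Fetching articles "similar to the input" is a vector similarity search: the query and each article are represented as vectors, and the closest articles win. Here is a minimal sketch of that idea in plain Python. The three-dimensional vectors, the `TOY_INDEX` titles, and the helper names are made up for illustration; a real embedding model produces much longer vectors, and this is not the project's actual search code.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Rank documents by similarity to the query and keep the best k."""
    scored = [
        (title, cosine_similarity(query_vec, vec))
        for title, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy index: article title -> made-up embedding vector.
TOY_INDEX = {
    "Climate change": [0.9, 0.1, 0.0],
    "Renewable energy": [0.7, 0.3, 0.1],
    "Baroque music": [0.0, 0.1, 0.9],
}

query = [0.8, 0.2, 0.0]  # stand-in for the embedded user question
results = top_k(query, TOY_INDEX, k=2)
for title, score in results:
    print(f"{title}: {score:.3f}")
```

The two climate-related titles rank above the unrelated one because their vectors point in nearly the same direction as the query.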
How I built it
- The application takes a topic or a question as input from the user
- It then uses the textai model, running on Hugging Face infrastructure, to run a vector search for Wikipedia articles similar to the topic or question
- It then calls the OpenAI GPT API (GPT-3.5), with the fetched Wikipedia article supplied as the prompt context (prompt engineering)
- GPT-3.5 recognizes named entities and the relationships between them, and returns the results as JSON
- Finally, it uses the networkx and matplotlib libraries to display the JSON as a graph of nodes and edges
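The last step, turning the JSON into a drawn graph, can be sketched as follows. The JSON shape (a list of objects with `source`, `relation`, and `target` keys), the sample triples, and the function names are assumptions for illustration; the actual format returned by the model depends on the prompt and is not shown in this write-up.

```python
import json

import networkx as nx

# Hypothetical model output: a JSON list of entity-relationship triples.
SAMPLE_JSON = """
[
  {"source": "Marie Curie", "relation": "discovered", "target": "Polonium"},
  {"source": "Marie Curie", "relation": "won", "target": "Nobel Prize in Physics"},
  {"source": "Polonium", "relation": "named after", "target": "Poland"}
]
"""


def build_graph(raw_json):
    """Parse the triples and add one labeled, directed edge per triple."""
    triples = json.loads(raw_json)
    graph = nx.DiGraph()
    for triple in triples:
        graph.add_edge(
            triple["source"], triple["target"], relation=triple["relation"]
        )
    return graph


def draw_graph(graph, path="knowledge_graph.png"):
    """Render the graph to a static PNG with relationship names on the edges."""
    import matplotlib

    matplotlib.use("Agg")  # file-only backend, no display window needed
    import matplotlib.pyplot as plt

    pos = nx.spring_layout(graph, seed=42)  # fixed seed for a repeatable layout
    nx.draw(graph, pos, with_labels=True, node_color="lightblue",
            node_size=2000, font_size=8)
    edge_labels = nx.get_edge_attributes(graph, "relation")
    nx.draw_networkx_edge_labels(graph, pos, edge_labels=edge_labels, font_size=7)
    plt.savefig(path, bbox_inches="tight")
    plt.close()


graph = build_graph(SAMPLE_JSON)
print(graph.number_of_nodes(), "nodes,", graph.number_of_edges(), "edges")
```

Calling `draw_graph(graph)` writes the PNG. Storing the relationship as an edge attribute keeps the entity names as plain node keys, which makes later lookups and traversals simpler.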
Challenges I ran into
- Both textai and GPT are prone to hallucination, which lowers the quality of the knowledge graph
- The application depends on the quality of the user's input (prompt). If the prompt is too specific, the similarity search does not find suitable Wikipedia articles. If the prompt is too general, textai and GPT do not return accurate results
- OpenAI GPT API calls are not free, so I had to keep a close eye on the cost
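One common way to keep paid API usage down while iterating is to cache responses so that an identical prompt is only billed once. The sketch below shows that idea; it is not necessarily what this project did. `CachedClient` and `fake_model` are hypothetical names, and `fake_model` stands in for whatever function actually sends the prompt to the paid API.

```python
import hashlib


class CachedClient:
    """Wrap a model-calling function and reuse answers for repeated prompts."""

    def __init__(self, call_model):
        self._call_model = call_model  # function: prompt (str) -> response (str)
        self._cache = {}
        self.paid_calls = 0  # how many times the wrapped function really ran

    def complete(self, prompt):
        # Hash the prompt so long article contexts make compact cache keys.
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._call_model(prompt)
            self.paid_calls += 1
        return self._cache[key]


def fake_model(prompt):
    """Stand-in for the real, billed API call (assumption for this sketch)."""
    return f"response to: {prompt}"


client = CachedClient(fake_model)
first = client.complete("Extract entities about Marie Curie")
second = client.complete("Extract entities about Marie Curie")
print(client.paid_calls)  # the second call is served from the cache
```

The cache here lives in memory only; persisting it to disk would let it survive between runs.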
Accomplishments that I am proud of
- The sheer volume of high-quality data that this tool can process is noteworthy. It has access to all of Wikipedia and to the latest LLM to build the knowledge graph.
- It organizes the nodes and edges accurately and builds a large network of relationships. This is very helpful for finding indirect relationships between nodes that are far removed from each other
- Given a good prompt, both the vector similarity search that identifies the Wikipedia articles and the extraction of named entities and their relationships proved highly accurate.
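Finding an indirect relationship between two distant nodes amounts to a path search over the graph. The sketch below uses networkx's `shortest_path` on a small hand-built example. The sample edges and the `explain_path` helper are illustrative assumptions, not output from the tool.

```python
import networkx as nx

# Small hand-built graph; each edge stores its relationship name.
graph = nx.DiGraph()
graph.add_edge("Marie Curie", "Polonium", relation="discovered")
graph.add_edge("Polonium", "Poland", relation="named after")
graph.add_edge("Poland", "Europe", relation="located in")


def explain_path(graph, start, end):
    """Describe the chain of relationships linking two entities."""
    # Search without regard to edge direction, since a connection is
    # interesting whichever way the original relationship points.
    path = nx.shortest_path(graph.to_undirected(), start, end)
    steps = []
    for a, b in zip(path, path[1:]):
        if graph.has_edge(a, b):
            steps.append(f"{a} -[{graph[a][b]['relation']}]-> {b}")
        else:
            steps.append(f"{b} -[{graph[b][a]['relation']}]-> {a}")
    return steps


steps = explain_path(graph, "Marie Curie", "Europe")
for step in steps:
    print(step)
```

"Marie Curie" and "Europe" share no direct edge, but the path through "Polonium" and "Poland" surfaces how they are connected.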
What I learned
- I learned the technical details of different AI models (like GPT and textai)
- I was surprised by the power of LLMs. A couple of years ago, this project could not have been completed.
- I learned how to string together best-of-breed libraries and models to get the best results
What's next for AI LLM Generated Knowledge Graph
I spent most of my time developing the back end. With more time, I would have developed a browser-based, Verbwire-based Web3 UI where the user can enter a topic or question and visualize the graph. In addition, the graph is currently a static image (PNG); I would like to make it interactive, so that the user can expand, collapse, and traverse it through the Verbwire UI. Lastly, I would transition away from Wikipedia, since anyone can edit it.