What it does

YapTrack is a real-time meeting guide that forces you to close the loop on all of your ideas.

Inspiration

One of our team members spent 5 hours in a meeting just to MOVE SOME COLUMNS IN A DATABASE. The meeting should have been two hours, but it got stretched out because every time we proposed a small change, we weren't able to keep track of all the prior requirements our solution had to satisfy and the implications of the change. This created a feedback loop where we kept following the chain of ideas, forgetting why we changed something, and reverting back and forth. Five software engineers, costing about $500 per hour in company resources, got dragged out for 3 extra hours ($1,500 in unnecessary costs) just to make a decision they had considered from the start.

Topic fluctuations constantly arise in our meetings, and we are tired of wasting time trying to remember how to circle back to a concrete starting point. So we set out to solve this problem by building a meeting guide that highlights questions that have not yet been explored to completion, along with the previous decisions that informed our decision-making process.

How we built it

We take speech audio and convert it into text in real time. Groq's transcription requires complete audio files, so to provide a real-time experience we send many small audio segments: we continuously slice the audio stream into 5-second increments and transcribe each one. When a slice boundary falls in the middle of a word, we query an LLM to predict the intended sentence structure. For example, when the split transcription looked like "A", "wire", the LLM could use the context of the sentence to infer that we meant "Acquire". This assistive LLM merging lets the transcript build seamlessly on earlier segments as more speech-to-text information arrives. As information accumulates, we use llama-index with an LLM to extract entity-relationship-entity triplets, which become the nodes and edges of our knowledge graph. Once the knowledge graph is built, we visualize both the current idea branches and the full graph. We also re-evaluate the entities for additional correlation edges by embedding them with a transformer model and thresholding cosine similarity scores.
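Roughly, the slice-transcribe-merge loop looks like the sketch below. This is a simplified illustration rather than our exact code: it assumes the Groq Python SDK's Whisper transcription and chat endpoints, and `stream_five_second_chunks()` is a hypothetical stand-in for our audio capture.

```python
# A simplified sketch of our slice -> transcribe -> merge loop, not the exact repo code.
# Assumes the Groq Python SDK's Whisper transcription and chat endpoints;
# stream_five_second_chunks() is a hypothetical stand-in for audio capture.
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

def transcribe_chunk(wav_bytes: bytes) -> str:
    """Send one 5-second slice to Groq's Whisper endpoint and return its text."""
    result = client.audio.transcriptions.create(
        file=("chunk.wav", wav_bytes),
        model="whisper-large-v3",
    )
    return result.text

def merge_segments(transcript_so_far: str, new_segment: str) -> str:
    """Ask an LLM to stitch the new fragment onto the running transcript,
    repairing words cut at the slice boundary (e.g. "A" + "wire" -> "Acquire")."""
    response = client.chat.completions.create(
        model="llama-3.3-70b-versatile",  # illustrative model choice
        messages=[
            {
                "role": "system",
                "content": "Merge the new fragment into the transcript, fixing words "
                           "split across the boundary. Return only the merged text.",
            },
            {
                "role": "user",
                "content": f"Transcript: {transcript_so_far}\nNew fragment: {new_segment}",
            },
        ],
    )
    return response.choices[0].message.content

transcript = ""
for wav_bytes in stream_five_second_chunks():  # hypothetical audio source
    transcript = merge_segments(transcript, transcribe_chunk(wav_bytes))
```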

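The triplet-extraction step can be sketched with llama-index like this; it assumes the `llama-index-llms-groq` integration and the core `KnowledgeGraphIndex` API, and the model name and settings are illustrative rather than what runs in the repo.

```python
# A hedged sketch of triplet extraction; assumes the llama-index-llms-groq
# integration and the core KnowledgeGraphIndex API, with illustrative settings.
from llama_index.core import Document, KnowledgeGraphIndex, Settings
from llama_index.llms.groq import Groq as GroqLLM

Settings.llm = GroqLLM(model="llama-3.3-70b-versatile")  # illustrative model choice

def extract_triplets(transcript: str):
    """Extract (source entity, edge, destination entity) triplets from the transcript."""
    index = KnowledgeGraphIndex.from_documents(
        [Document(text=transcript)],
        max_triplets_per_chunk=10,
        include_embeddings=False,
    )
    # The resulting graph holds the nodes and edges we push into Neo4j.
    return index.get_networkx_graph()
```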

For the backend, we used Neo4j, a graph database, to keep track of both the initial node relationships extracted from the text and the additional correlation edges derived after the knowledge graph was generated.
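As a rough illustration of that persistence layer, triplets could be written with the official neo4j Python driver like this; the connection details and the `Entity`/`RELATES` schema are placeholders, not necessarily the real schema.

```python
# Sketch of persisting triplets with the official neo4j Python driver;
# the URI, credentials, and Entity/RELATES schema are placeholders.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def add_triplet(source: str, relation: str, target: str) -> None:
    """MERGE keeps the graph free of duplicate nodes while adding a labelled edge."""
    query = (
        "MERGE (a:Entity {name: $source}) "
        "MERGE (b:Entity {name: $target}) "
        "MERGE (a)-[:RELATES {label: $relation}]->(b)"
    )
    with driver.session() as session:
        session.run(query, source=source, target=target, relation=relation)

add_triplet("greg", "likes", "sleeping on the couch")
```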

Repo: Next.js frontend with a Flask backend, styled with Aceternity UI.
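As a rough sketch of how the pieces connect (not the repo's actual API), the Flask backend could expose an endpoint that the Next.js client posts each 5-second slice to:

```python
# Rough sketch of the Flask side of the pipeline; the /transcribe route and the
# "audio" form field are illustrative names, not the repo's actual API.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/transcribe", methods=["POST"])
def transcribe():
    wav_bytes = request.files["audio"].read()   # one 5-second slice from the Next.js client
    segment = transcribe_chunk(wav_bytes)        # Groq transcription (see the sketch above)
    # ...merge the segment, update the knowledge graph, return the new nodes and edges...
    return jsonify({"segment": segment})
```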

Challenges we ran into

Similarity search and similarity thresholding: When generating entities and relationships with our LLM, we realized that some nodes might be closely correlated even though that correlation is never stated in a sentence. To catch these, we used BERT (Bidirectional Encoder Representations from Transformers), a pre-trained language model developed by Google, to semantically encode all of our entities. That lets us compute cosine similarities on the embeddings and surface new correlations in the knowledge graph: whenever a pair scores above a threshold, we create a correlation edge (see the sketch after this section). This is potentially useful for discovering new relationships or merging highly related nodes that represent near-identical ideas.

Redundancies in knowledge graph entity generation: We queried Llama for (source entity -> edge -> destination entity) mappings. Though it immediately began to form pairings, it often created highly similar ones like ("greg", "likes", "sleeping on the couch") and ("greg", "enjoys", "spending time sleeping on the couch"), a redundancy we wanted to avoid when querying the graph. We used three techniques to mitigate these redundancies:

- Feeding the graph database into our RAG queries, giving the LLM context for matching new entities to closely related existing ones.
- Prompt engineering with examples of closely matching inputs being generalized into the same entities.
- Vector cosine similarity scores: depending on the use case of the knowledge graph (i.e. learning, brainstorming, requirements analysis), we set different similarity thresholds.

Knowledge graph real-time visual display: Because the knowledge graph was changing dynamically with updates from the real-time audio stream, it was difficult to keep the UI live. To address this issue, we had to extensively research React components and their operations.
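Here is a minimal sketch of that correlation-edge pass; `bert-base-uncased`, mean pooling, and the 0.85 threshold are illustrative choices rather than our exact configuration.

```python
# Minimal sketch of the correlation-edge pass; bert-base-uncased, mean pooling,
# and the 0.85 threshold are illustrative choices.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(entities: list[str]) -> torch.Tensor:
    """Mean-pool BERT token embeddings into one vector per entity."""
    batch = tokenizer(entities, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state          # (batch, tokens, 768)
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

def correlation_edges(entities: list[str], threshold: float = 0.85):
    """Yield entity pairs whose cosine similarity crosses the threshold."""
    vectors = torch.nn.functional.normalize(embed(entities), dim=1)
    scores = vectors @ vectors.T
    for i in range(len(entities)):
        for j in range(i + 1, len(entities)):
            if scores[i, j] >= threshold:
                yield entities[i], entities[j], float(scores[i, j])

for a, b, score in correlation_edges(
    ["sleeping on the couch", "spending time sleeping on the couch"]
):
    print(a, "<->", b, round(score, 2))
```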

Accomplishments that we're proud of

- Making use of the Neo4j graph database, which let us query data with the Cypher query language to extract paths and edges with less logic on the backend and more logic in-database (see the Cypher sketch below).
- Using Groq and llama-index to quickly streamline speech recognition into a knowledge graph database.
- Leveraging prompt engineering to get full power from our LLMs.
- Finding more discrete relationships using transformer-model embeddings of text entities.
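For illustration, the kind of in-database path extraction Cypher makes easy looks like this, reusing the placeholder `Entity`/`RELATES` schema from the persistence sketch above; the real schema may differ.

```python
# Illustrative Cypher path query, reusing the placeholder Entity/RELATES schema
# from the persistence sketch above; the real schema may differ.
PATH_QUERY = """
MATCH path = (a:Entity {name: $start})-[:RELATES*1..3]->(b:Entity)
RETURN [n IN nodes(path) | n.name] AS idea_chain
"""

with driver.session() as session:  # driver from the earlier Neo4j sketch
    for record in session.run(PATH_QUERY, start="greg"):
        print(" -> ".join(record["idea_chain"]))
```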

What we learned

- How to use Groq for speech recognition and llama-index for node extraction.
- How to fine-tune LLM prompts for specific output formats.
- How to embed text information using machine learning.

What's next for YapTrack

- Adding a larger list of prompt options for knowledge-graph styles, or letting users write their own prompts for knowledge-graph entities to fit their specific needs.
- Merging nodes based on the correlation scores from similarity search.

Built With

Next.js, Flask, Aceternity UI, Neo4j, Groq, llama-index, BERT
