Inspiration

The idea for the Kobby app was inspired by my frequent attendance at tech conferences, startup networking events, and other professional gatherings. During these events, I meet so many people, but I often struggle to remember their names just moments after our conversations. Like many, I’d catch myself thinking, "I'm just not good with names." This led to awkward moments where I had to pause, jot down their names, or risk losing the natural flow of networking.

I realized that if I had an app that could seamlessly capture and recall the names of people I meet, it would not only make interactions smoother but also help me appear more personable and professional. Interestingly, I noticed that a lot of people I interact with at these events face the same challenge. This realization drove me to create Kobby—a solution designed to effortlessly remember names, enhancing the overall networking experience.

What it does

The Kobby app uses artificial intelligence to help you remember the names of people you meet during conversations. When you introduce yourself to someone, the AI captures their name and saves it, along with important details like where and when you met. This makes it easy to recall who they are later on.

The information is stored on your phone and synced to your cloud account, ensuring you don’t lose it. After the event, you can review the list and use the AI to add additional notes or insights about the people you've met, helping you stay organized and connected.

How we built it

We built the app using SwiftUI to create the front-end, including an Apple Watch app that listens for names mentioned during conversations. For transcribing the audio, we used OpenAI's Whisper model, which converts the audio into text. Then, we used another OpenAI language model to extract the names from that text. Finally, the names are saved to Core Data on the iPhone, so they’re stored and easily accessible for later use.

We built a cloud function on GCP to handle the transcription and name extraction. The phone sends a request with the recorded audio to the cloud function, which responds with the names of the people in the conversation, and the function deletes the audio file after processing it.
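As a rough sketch of that flow, assuming hypothetical helper names (the production code uses OpenAI's SDK and GCP's functions framework, which are omitted here), the function saves the uploaded audio to a temp file, transcribes it, extracts names, and always deletes the file afterward:

```python
# Sketch of the cloud-function flow: function names and the
# comma-separated reply format are illustrative assumptions.
import json
import os
import tempfile


def parse_names(model_reply: str) -> list:
    """Turn the model's comma-separated reply (e.g. "Alice, Bob") into a list."""
    return [name.strip() for name in model_reply.split(",") if name.strip()]


def handle_request(audio_bytes: bytes, transcribe, extract) -> str:
    """Entry-point sketch: write the audio to a temp file, transcribe it,
    extract names, and delete the file whether or not processing succeeds.
    `transcribe` and `extract` stand in for the Whisper transcription and
    chat completion calls."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(audio_bytes)
        transcript = transcribe(path)          # audio file -> text
        names = parse_names(extract(transcript))  # text -> ["Alice", ...]
        return json.dumps({"names": names})
    finally:
        # The function never keeps the audio around after processing.
        if os.path.exists(path):
            os.remove(path)
```

The `finally` block mirrors the cleanup guarantee described above: the uploaded audio is removed even if transcription or extraction fails.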

Challenges we ran into

One of the challenges we faced was our initial plan to convert the Whisper model to Core ML, so it could run directly on the phone without making API calls. Unfortunately, the conversion failed due to incompatibility issues, so we had to switch to using OpenAI’s API directly.

We also encountered difficulties when trying to create a cloud function on GCP to clean the audio and remove background noise. The library we wanted to use wasn't compatible with other packages, so we pivoted to using a low-pass filter on the phone instead. However, this caused issues with saving the cleaned audio file properly, so we had to remove that part and plan a better solution.
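For reference, the on-device low-pass filtering amounts to attenuating high-frequency noise in the samples. The app's filter was written in Swift; a minimal single-pole (exponential smoothing) version, with an illustrative coefficient, looks like this in Python:

```python
def low_pass(samples, alpha=0.2):
    """Single-pole low-pass filter (exponential smoothing): each output
    sample blends the new input with the previous output, attenuating
    high-frequency noise. alpha is in (0, 1]; smaller means stronger
    smoothing. Illustrative only -- the app applied its filter in Swift."""
    out = []
    prev = 0.0
    for x in samples:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out
```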

Finally, deploying the cloud function presented its own set of challenges, as we ran into multiple model compatibility issues during the process.

Accomplishments that we're proud of

We’re proud of several key accomplishments in this project. First, we got a working MVP. We successfully integrated the Apple Watch to listen for names during conversations, adding convenience and enhancing the user experience. Another major achievement was the seamless use of OpenAI’s Whisper model for accurate transcription and name extraction, despite the initial challenges with model conversion. While we weren’t able to fully clean the audio file before processing, we learned valuable lessons about what approaches don’t work, and we now have a clearer direction on alternative methods to try. Additionally, we optimized the app’s performance across both the iPhone and Apple Watch, ensuring smooth syncing and data storage. These accomplishments helped us create a more reliable and user-friendly app, and we’re excited to keep improving it.

What we learned

Throughout the project, we gained valuable experience in several areas. We learned how to approach converting a model like Whisper to Core ML, which gave us insights into optimizing models for mobile devices. We also became proficient in using the Whisper model for audio transcription, which was a key part of capturing conversations.

In addition, we learned how to use OpenAI’s chat completion and transcription endpoints for name extraction through prompt engineering techniques. We also explored different methods for cleaning audio files, which taught us what works and what doesn’t in terms of improving audio quality. Lastly, we gained experience deploying Python-based cloud functions on Google Cloud Platform (GCP), enhancing our ability to handle backend processing efficiently.
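To illustrate the prompt-engineering side, a name-extraction request to a chat completion endpoint can be framed as a system prompt plus the transcript as the user message. The wording below is a hypothetical sketch, not the exact production prompt:

```python
def build_extraction_messages(transcript: str) -> list:
    """Build a chat-completion message list asking the model to return
    only the names of people who introduce themselves in the transcript.
    (Illustrative prompt; the production wording differs.)"""
    system = (
        "You extract personal names from conversation transcripts. "
        "Reply with a comma-separated list of names only. "
        "If no one introduces themselves, reply with an empty string."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": transcript},
    ]
```

Constraining the reply format ("comma-separated list of names only") is what makes the model's output easy to parse on the backend.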

What's next for Kobby

  • Improve audio accuracy: Update the cloud functions to clean audio files, ensuring better accuracy when extracting names.
  • Networking feature: Add a feature to connect users who are in the same space, suggesting potential contacts or people to meet, helping professionals expand their network.
  • Audio playback: Allow users to play back recorded audio files under each entry to help them remember parts of the conversation.
  • Memory game: Add a game feature to help users reinforce and memorize names.
  • Siri integration: Let users activate the app with Siri by saying, "Hey Siri, I’m meeting someone new," for hands-free name capturing.
  • Image Integration: Enhance the feature to capture not only names but also images of the people you connect with, allowing for a more comprehensive contact profile.
  • Expand to Android: Possibly develop the app for Android devices and Android smartwatches.
  • Smartwatch-free option: Add features that let users use the app without needing a smartwatch.
  • Launch on App Store: Deploy the app to the Apple App Store, gather feedback from users, and continue improving it.

Built With

  • core-ml
  • google-cloud-functions
  • openai
  • python
  • swift
  • whisper-llm