Gem Run

⚾ Gem Run: Your Personalized MLB Podcast Experience 🎙️

✨ Inspiration

The inspiration for Gem Run came from a simple idea: 💡 baseball fans deserve a better way to stay connected with their favorite teams and players! We envisioned a personalized podcast experience, tailored to individual interests, that could be generated on demand. 🙅‍♀️ No more sifting through generic sports updates, just pure baseball bliss. ⚾❤️

🎧 What it does

Gem Run is a multi-speaker MLB podcast generator 🤖 that allows you to create your own custom audio updates. Here's a breakdown:

Personalized Content: Select your favorite MLB team(s), specific players, a timeframe (last game, last 3 games, specific date), and even a game type (regular season, playoffs, etc.) 📅.
Multi-Speaker Narrative: Enjoy a dynamic listening experience with multiple AI-powered speakers: a play-by-play announcer 🗣️, a color commentator 🤔, and even simulated player quotes 💬.
Multi-Lingual Support: Stay updated in your preferred language with support for English 🇬🇧, Spanish 🇪🇸, and Japanese 🇯🇵.
Cloud Storage: Generated podcasts are stored in Google Cloud Storage, so audio is quick to generate and easy to come back to. ☁️
Firebase Authentication: Sign in with Firebase to get private, secure access to your personalized content. 🔐
Mobile App Access: The app is compiled with Capacitor for Android, allowing users to enjoy the podcast experience on their mobile devices. 📱
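
The selections listed above map naturally onto a small request object. The sketch below is illustrative only: the class name `PodcastRequest`, its field names, and the allowed values are assumptions drawn from the feature list, not Gem Run's actual code.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical option sets, taken from the feature list above.
TIMEFRAMES = {"last_game", "last_3_games", "specific_date"}
LANGUAGES = {"en", "es", "ja"}  # English, Spanish, Japanese


@dataclass
class PodcastRequest:
    """User selections that drive one podcast generation (illustrative)."""

    teams: List[str]
    players: List[str] = field(default_factory=list)
    timeframe: str = "last_game"
    game_date: Optional[str] = None  # ISO date, only used with "specific_date"
    game_type: str = "regular_season"
    language: str = "en"

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        problems = []
        if not self.teams:
            problems.append("select at least one team")
        if self.timeframe not in TIMEFRAMES:
            problems.append(f"unknown timeframe: {self.timeframe}")
        if self.timeframe == "specific_date" and not self.game_date:
            problems.append("specific_date requires game_date")
        if self.language not in LANGUAGES:
            problems.append(f"unsupported language: {self.language}")
        return problems
```

Validating the request before any generation work starts keeps bad input from reaching the model or the audio step.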

⚙️ How we built it

We combined several Google Cloud technologies into a complete data pipeline:

Data Ingestion:
- Data Source: We use the MLB Stats API to fetch game data, player details, and team information. 📊
- Pub/Sub: We publish this data to a Google Cloud Pub/Sub topic for reliable, scalable message handling. 📨
- Data Streaming: The topic streams the data into BigQuery, where it is easy to access and query. 🌊
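
The ingestion steps above can be sketched as follows. This is a minimal illustration under assumptions: the helper name `encode_game_message`, the field names in the sample record, and the attribute keys are hypothetical, and the publish call is only indicated in a comment because the source does not show it.

```python
import json
from typing import Dict, Tuple


def encode_game_message(game: dict) -> Tuple[bytes, Dict[str, str]]:
    """Turn one game record into a Pub/Sub-style payload and attributes.

    Pub/Sub message data must be bytes and attribute values must be strings,
    so the record is JSON-encoded and a few fields are copied into attributes.
    """
    data = json.dumps(game, sort_keys=True).encode("utf-8")
    attributes = {
        "game_pk": str(game["game_pk"]),
        "game_type": str(game.get("game_type", "unknown")),
    }
    return data, attributes


# Sample record with made-up values; real MLB Stats API responses are richer.
sample = {"game_pk": 123456, "game_type": "R", "home": "LAD", "away": "SF"}
data, attributes = encode_game_message(sample)

# With the google-cloud-pubsub client, publishing would look roughly like:
#   publisher.publish(topic_path, data, **attributes)
print(len(data), attributes)
```
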
Data Storage:
- User profiles, preferences, and podcast history are stored in Firestore for easy management of and access to personalized information. 🗄️
- Generated audio files are stored in Google Cloud Storage for reliable and efficient access. ☁️
Podcast Generation:
- Gemini 2.0 Flash/Pro models power podcast script creation, generating scripts from the user's parameters. 🧠
  - Google Search Grounding: We grounded the model with Google Search by confirming the date of a team's last game and using that date as an anchor to guide script generation. 🔎
- Google Cloud Text-to-Speech API is used to create speech using different voices for each speaker. 🗣️
- We deployed the MLB Multispeaker Podcast Agent with Reasoning Engine in Vertex AI. The deployment also includes evaluation (trajectory evaluation).
- LLM as Judge: To improve the prompt, we used Gemini as a script evaluator to rank script outputs on use-case fit, edge-case handling, multi-speaker flow, and script quality. Iterating on and testing the output led us to a more robust prompt.
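
To make the Text-to-Speech step above concrete, here is a minimal sketch of how a generated script might be split into per-speaker segments before synthesis. It assumes the script uses a "Speaker: text" line format; the speaker labels, the placeholder voice names, and the `split_script` helper are all hypothetical, not Gem Run's actual code.

```python
from typing import Dict, List, NamedTuple


class Segment(NamedTuple):
    speaker: str
    voice: str
    text: str


# Hypothetical speaker-to-voice table; the voice names are placeholders,
# not the voices Gem Run actually uses.
VOICES: Dict[str, str] = {
    "Play-by-play Announcer": "voice-announcer",
    "Color Commentator": "voice-commentator",
    "Player Quotes": "voice-player",
}
DEFAULT_VOICE = "voice-default"


def split_script(script: str) -> List[Segment]:
    """Split 'Speaker: line' text into per-speaker segments for synthesis."""
    segments: List[Segment] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Naive rule: everything before the first colon is the speaker label.
        speaker, sep, text = line.partition(":")
        if not sep:
            # A line without a label continues the previous segment.
            if segments:
                last = segments[-1]
                segments[-1] = last._replace(text=f"{last.text} {line}")
            continue
        speaker = speaker.strip()
        voice = VOICES.get(speaker, DEFAULT_VOICE)
        segments.append(Segment(speaker, voice, text.strip()))
    return segments
```

Each segment could then be synthesized on its own with the voice it was assigned, and the resulting audio clips joined in order.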
Frontend & Backend:
- Firebase Authentication: We use Firebase for user sign-up and sign-in processes.
- Cloud Run: The application is deployed on Google Cloud Run for a scalable and reliable backend. ⚙️
- Looker Studio: 📊 We displayed key metrics from evaluating the Reasoning Engine Agent.
- Google Cloud Logging (google.cloud.logging): 📝 This is used for logging events and errors within our Cloud Storage handler class. The cloud_logging.Client().logger('gcs-handler') line creates a logger dedicated to that class, so its operations are easy to track in the Google Cloud Console's Logs Explorer. 🔎
- Google's Vertex AI RAG engine: 📚❓ This is used to upload documents (PDF, text, etc.) and ask questions about their content. The engine searches the uploaded documents and generates answers based on the information found within them.
- Google Cloud Secret Manager (google.cloud.secretmanager): 🔐 This is used to securely store and retrieve the service account key. The secretmanager.SecretManagerServiceClient() line initializes the client, and access_secret_version retrieves the secret data. Storing credentials in Secret Manager is a security best practice. 👍
- Streamlit UI: The user-facing application is powered by Streamlit, for rapid and efficient user interface development. 🖼️
- gcloud CLI: We used gcloud commands for most operations, including building, deploying, and managing cloud resources. 🧰
- Poetry: All Python dependencies are managed with the Poetry dependency manager. 📦
- Capacitor: We used Capacitor to compile our web application into a native Android app, enabling mobile access to our podcast service. 📱
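
As a concrete illustration of the Secret Manager item above, the following sketch shows one way the secret lookup could be wrapped. The function names and the project and secret IDs are hypothetical; only `SecretManagerServiceClient` and `access_secret_version` come from the description above.

```python
def secret_version_name(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Build the resource name Secret Manager expects for a secret version."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


def load_secret(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Fetch a secret payload as text (needs google-cloud-secret-manager)."""
    # Imported lazily so the helper above stays usable without the library.
    from google.cloud import secretmanager

    client = secretmanager.SecretManagerServiceClient()
    name = secret_version_name(project_id, secret_id, version)
    response = client.access_secret_version(request={"name": name})
    # The payload arrives as bytes; decode it before parsing or using it.
    return response.payload.data.decode("utf-8")
```

A call such as `load_secret("my-project", "service-account-key")` (placeholder IDs) would return the stored key as a string, keeping the credential itself out of the repository and the container image.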

🤯 Challenges we ran into

We faced some interesting hurdles along the way:

Organizational Policy Issues: Organizational policies initially blocked access, and we had to work through complex permissions and authentication schemes to get past them. 🚧
API Rate Limits: We had to respect the MLB Stats API's rate limits and avoid sending excessive requests. ⏳
Dependency Management: We resolved complex dependency issues by carefully configuring both frontend and backend libraries. 🧩
Google Cloud errors: Figuring out which Google Cloud service to use and how to set its parameters correctly was difficult. 🤔
Authentication: Securing our Cloud Run application with a robust Firebase authentication and authorization mechanism was a complex task. 🔑
Mobile Compilation: Converting our web application to a native Android app using Capacitor required resolving various platform-specific issues and optimizations. 📲
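
For the API rate limit challenge above, one common approach is to enforce a minimum interval between requests. The write-up does not say how Gem Run handled this, so the sketch below is an assumption; the class name `MinIntervalThrottle` and the chosen interval are hypothetical.

```python
import time
from typing import Callable, Optional


class MinIntervalThrottle:
    """Ensure at least `min_interval` seconds separate consecutive calls."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock  # injectable so the throttle can be tested
        self._sleep = sleep
        self._last_call: Optional[float] = None  # time of the previous call

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self.min_interval - (now - self._last_call))
        if delay > 0:
            self._sleep(delay)
        # Record when this call is allowed to proceed.
        self._last_call = now + delay
        return delay
```

Calling `throttle.wait()` before each request to the MLB Stats API would then space requests out, for example with `MinIntervalThrottle(0.5)` for at most two requests per second.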

🎉 Accomplishments that we're proud of

Despite the challenges, we're proud of what we've accomplished:

Working Podcast Generator: We built an application that generates custom, multi-speaker podcasts driven entirely by user preferences! ✅
Multi-Language Support: We can now provide this functionality in three different languages, a very important aspect of the hackathon. 🌎
Cloud Native: We took a cloud-native approach built on Google Cloud services. ☁️
Data Pipeline: We have designed a pipeline that handles streaming data from various sources, with data processing, storage, and analysis. 🌊
Solid Authentication Flow: We implemented a secure authentication flow with Firebase, while using service accounts to communicate with other Google Cloud services. 🔐
Resilient System: We have created a robust and reliable application that can handle a large number of requests without unexpected crashes. 💪
Clear Architecture: We have set up a system that is now clearly structured and easy to extend with new features and functionality. 🏗️
Cross-Platform Deployment: We successfully compiled and deployed the app to Android using Capacitor, making our service accessible on mobile devices. 📱

📚 What we learned

This hackathon was a valuable learning experience:

Power of Cloud APIs: We've learned the immense power of Google Cloud APIs for AI and data management. 🚀
Attention to Detail: We learned the importance of carefully managing dependencies, permissions, and configurations so that all components work together as expected. 🧐
Importance of Planning: We learned the importance of planning for large projects and of tackling problems one at a time. 📝
Troubleshooting is key: Debugging and troubleshooting are key to implementing any software, and they require patience. 💯
Mobile Development: We gained valuable experience with Capacitor and the process of converting web applications to native mobile apps. 📱

🚀 What's next for Gem Run

We have many ideas on where to take Gem Run next:

Visualization: Add visualizations of the key stats covered in the podcasts. 📊
Podcast Recommendations: Provide users with more relevant recommendations for podcasts based on their listening history. ✨
User Feedback Mechanism: Give users a way to submit feedback that helps improve the quality of the podcasts. ✍️
More Granular Controls: Give users finer control over podcast generation, such as filtering for specific types of plays. ⚙️
More Languages: Add support for more languages to make it accessible to a wider audience. 🌎
More Personalization: Make the podcast experience even more personalized through richer user profiles and preferences. 👤
iOS Support: Extend our mobile app offerings by creating an iOS version using Capacitor. 🍎
Offline Mode: Add capabilities for users to download podcasts for offline listening on their mobile devices. 🔌

We believe Gem Run has the potential to revolutionize the way MLB fans connect with the game, and we are excited to continue working on this project! ⚾🚀