What it does
AI WAIFU brings a virtual assistant to life as an animated character that users can interact with through speech and text. The assistant understands and responds to user input, holds conversations, provides assistance, and even reacts to music or speech with lip-synced animation. It combines AI-driven conversation with expressive Live2D animation to create a seamless, engaging experience.
How we built it
The project was built using a combination of several technologies:
- PIXI.js for rendering the Live2D model and managing animations.
- JavaScript for interactivity, audio handling, and user input.
- Gemini Nano for the assistant's conversational abilities (a hedged sketch of the wiring appears after these sections).
- Web Audio API for analyzing audio and syncing the model's mouth movements with sound (see the lip-sync sketch below).
- React for the user interface, integrated with the rest of the frontend to create a polished experience.

Challenges we ran into
- Audio playback restrictions: browsers block audio autoplay without user interaction, so we had to detect a user gesture before starting playback (see the unlock snippet below).
- Live2D SDK updates: moving to newer versions of the Live2D SDK caused compatibility issues, particularly around setting parameters such as mouth movement.
- CORS issues: fetching resources (e.g., audio files and models) from other domains triggered CORS errors, which we worked around with a proxy and server configuration changes (see the proxy sketch below).
- Syncing AI and animation: coordinating AI responses with smooth animations in real time was difficult, especially keeping the lip-sync aligned with speech.

Accomplishments that we're proud of
- Integrating Live2D animation with AI-driven conversation to make the virtual assistant more engaging and interactive.
- Real-time lip-syncing of the assistant to music or speech, making it feel more lifelike.
- Cross-browser compatibility: despite the autoplay restrictions, the solution works in most browsers.
- Customizable user interactions: the assistant responds to a variety of gestures and actions, improving engagement.

What we learned
- Working around modern browsers' limits on audio playback and user interaction was a significant learning curve.
- Integrating an animation library like Live2D with an AI conversation system requires attention to both performance and user experience.
- We gained valuable experience managing resources (models, audio) in a web environment, handling CORS errors, and working with new API versions.

What's next for AI WAIFU
- Expanded features: more dynamic responses based on context and emotion, making the assistant more adaptive.
- Voice recognition: integrating speech-to-text so users can converse with the assistant more fluidly.
- Improved character customization: let users choose and customize the assistant's appearance and personality.
- Mobile support: optimized controls and better interaction mechanisms for mobile users.
- Multilingual support: conversing in multiple languages to make the assistant accessible to a global audience.
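Below is a minimal sketch of how the conversational layer can be wired to Gemini Nano through Chrome's experimental built-in Prompt API. The API has shipped behind flags and origin trials and its surface has changed between Chrome releases, so the names here (window.ai.languageModel, create, prompt) reflect one origin-trial shape rather than our exact implementation.

```javascript
// Conversational layer on Gemini Nano via Chrome's experimental Prompt API.
// Caveat: the API has lived behind flags/origin trials and its surface has
// changed between Chrome releases; window.ai.languageModel.create / prompt
// below reflect one origin-trial shape and may differ in other builds.
async function createAssistant() {
  if (!("ai" in window) || !window.ai.languageModel) {
    throw new Error("Built-in AI is not available in this browser");
  }
  return window.ai.languageModel.create({
    systemPrompt: "You are a cheerful virtual companion. Keep replies short.",
  });
}

async function chat(session, userText) {
  // prompt() resolves with the model's full reply as a string; the reply can
  // then be spoken aloud and fed into the lip-sync loop below.
  return session.prompt(userText);
}
```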
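The lip-sync works by measuring how loud the currently playing audio is and mapping that loudness onto the model's mouth-open parameter every frame. The sketch below assumes the pixi-live2d-display plugin and a Cubism 4 model whose mouth parameter uses the standard ID ParamMouthOpenY (older Cubism 2 models expose setParamFloat("PARAM_MOUTH_OPEN_Y", v) instead, which is exactly the kind of difference that bit us during the SDK transition). File paths and the gain factor are hypothetical.

```javascript
// Lip-sync sketch: map audio loudness (RMS) to the model's mouth-open parameter.
// Assumptions: the pixi-live2d-display plugin, a Cubism 4 model, and the
// standard parameter ID "ParamMouthOpenY"; file paths are hypothetical.
import * as PIXI from "pixi.js";
import { Live2DModel } from "pixi-live2d-display";

async function init() {
  const app = new PIXI.Application({ view: document.getElementById("canvas") });
  const model = await Live2DModel.from("models/waifu/waifu.model3.json");
  app.stage.addChild(model);

  const audioCtx = new AudioContext();
  const audio = new Audio("audio/reply.mp3");
  const source = audioCtx.createMediaElementSource(audio);
  const analyser = audioCtx.createAnalyser();
  analyser.fftSize = 256;
  source.connect(analyser);
  analyser.connect(audioCtx.destination);

  const samples = new Uint8Array(analyser.fftSize);

  // Every frame: estimate loudness and open the mouth proportionally.
  app.ticker.add(() => {
    analyser.getByteTimeDomainData(samples);
    let sum = 0;
    for (const s of samples) {
      const v = (s - 128) / 128; // re-center the 0..255 samples around zero
      sum += v * v;
    }
    const rms = Math.sqrt(sum / samples.length);
    const mouthOpen = Math.min(1, rms * 8); // crude gain, tuned per model
    model.internalModel.coreModel.setParameterValueById("ParamMouthOpenY", mouthOpen);
  });

  await audio.play();
}

init();
```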
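For the autoplay restriction, the AudioContext starts in a suspended state until the page receives a user gesture. One common way to implement the gesture detection mentioned above is to resume the context inside the first pointer or key event:

```javascript
// Audio can't start until the page gets a user gesture, so resume the
// AudioContext inside the first pointer or key event and then detach.
const audioCtx = new AudioContext();

function unlockAudio() {
  if (audioCtx.state === "suspended") {
    audioCtx.resume();
  }
  window.removeEventListener("pointerdown", unlockAudio);
  window.removeEventListener("keydown", unlockAudio);
}

window.addEventListener("pointerdown", unlockAudio);
window.addEventListener("keydown", unlockAudio);
```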
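One of the CORS workarounds mentioned above is a small same-origin proxy: the browser requests assets from our own server, which fetches them from the remote host and returns them with permissive headers. This Node/Express sketch is illustrative only; the endpoint name and allow-list are assumptions, not the project's actual setup.

```javascript
// Illustrative same-origin proxy (Node 18+ with Express): the page requests
// /proxy?url=... on its own origin and the server fetches the remote file.
// The endpoint name and allow-list are hypothetical.
import express from "express";

const ALLOWED_HOSTS = new Set(["cdn.example.com"]); // hypothetical host allow-list
const app = express();

app.get("/proxy", async (req, res) => {
  try {
    if (!req.query.url) return res.status(400).send("Missing url parameter");
    const target = new URL(req.query.url);
    if (!ALLOWED_HOSTS.has(target.hostname)) {
      return res.status(403).send("Host not allowed");
    }
    const upstream = await fetch(target); // global fetch in Node 18+
    res.set("Access-Control-Allow-Origin", "*");
    res.set("Content-Type", upstream.headers.get("content-type") || "application/octet-stream");
    res.send(Buffer.from(await upstream.arrayBuffer()));
  } catch (err) {
    res.status(502).send("Proxy error");
  }
});

app.listen(3000);
```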
