VocalAI

Inspiration

Whether you're hanging out with friends, feeling bored at home, or on a long car ride, singing is often something we find ourselves reaching for. One of the members of our team (Fiona) really enjoys singing, so we thought it would be a nice idea to look into how she could sound better, so that she and everyone around her can better enjoy her voice. We wanted to create a tool that allows singers to practice anywhere, with real-time advice. Vocal lessons are often very expensive, and being able to take them is often viewed as a privilege. To dispel that belief and help everyone like Fiona improve their singing, we've created VocalAI, an artificial-intelligence-based tool that can generate a karaoke version of any song and provide real-time feedback on your singing.

What it does

VocalAI offers three key features that aim to enhance your singing experience: karaoke creation, lyric checking through speech recognition, and pitch scoring. With VocalAI, you can turn any song into a personalized karaoke track. Whether you're practicing your favorite tunes or planning to perform them, VocalAI creates karaoke versions that stay aligned with the original recordings.

In addition to karaoke creation, VocalAI incorporates advanced speech recognition technology to ensure you're singing the right words. Our powerful speech recognition algorithms analyze your vocal delivery and compare it to the original lyrics, providing real-time feedback and guidance to ensure accurate pronunciation and word alignment. This feature helps you develop proper diction and ensures that your performance stays true to the original song.
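As a rough illustration of the lyric check described above, the sketch below aligns a recognized transcript against an expected lyric line. It is not VocalAI's actual implementation: the function names `normalize` and `compare_to_lyrics` are hypothetical, the transcript is assumed to arrive as plain text from some speech recognizer, and the alignment uses Python's standard `difflib` rather than whatever the project used.

```python
import difflib
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def compare_to_lyrics(recognized: str, lyrics: str) -> dict:
    """Align recognized words with the expected lyric line (hypothetical helper).

    Returns the fraction of expected words that were matched in order,
    plus the expected words that were replaced or skipped.
    """
    sung = normalize(recognized)
    expected = normalize(lyrics)
    matcher = difflib.SequenceMatcher(a=expected, b=sung, autojunk=False)

    # Count expected words that line up with the transcript.
    matched = sum(block.size for block in matcher.get_matching_blocks())

    # Collect expected words the singer replaced or left out.
    missed = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            missed.extend(expected[i1:i2])

    accuracy = matched / len(expected) if expected else 1.0
    return {"accuracy": accuracy, "missed_words": missed}


if __name__ == "__main__":
    print(compare_to_lyrics("twinkle twinkle little car",
                            "Twinkle, twinkle, little star"))
```

Matching whole words in sequence keeps the feedback simple ("you missed this word"); a real system would likely also need timing information to give feedback while the song is playing.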

Furthermore, VocalAI includes a valuable scoring system designed to evaluate your pitch accuracy while singing. By analyzing your vocal performance, our intelligent algorithms provide constructive feedback and precise scoring based on your pitch accuracy. This feature allows you to track your progress over time, identify areas for improvement, and refine your singing abilities.
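One common way to score pitch accuracy is to measure how far each sung note is from the target note in cents (100 cents is one semitone) and count the frames that land within a tolerance. The sketch below assumes that approach; the function names, the 50-cent tolerance, and the use of `None` for unvoiced frames are illustrative assumptions, not details taken from VocalAI's own algorithm.

```python
import math


def cents_off(sung_hz: float, target_hz: float) -> float:
    """Signed distance from the target pitch in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(sung_hz / target_hz)


def pitch_score(sung_hz, target_hz, tolerance_cents=50.0):
    """Percentage of voiced frames within the tolerance of the target pitch.

    Both inputs are per-frame frequency estimates in Hz. A frame is skipped
    when either value is None or non-positive (assumed to mean silence).
    """
    hits = 0
    total = 0
    for sung, target in zip(sung_hz, target_hz):
        if sung is None or target is None or sung <= 0 or target <= 0:
            continue  # skip unvoiced or silent frames
        total += 1
        if abs(cents_off(sung, target)) <= tolerance_cents:
            hits += 1
    return 100.0 * hits / total if total else 0.0


if __name__ == "__main__":
    sung = [440.0, 466.16, None, 440.0]   # second frame is about a semitone sharp
    target = [440.0, 440.0, 440.0, 440.0]
    print(round(pitch_score(sung, target), 1))
```

Working in cents rather than raw Hz makes the tolerance behave the same way for low and high notes, since pitch perception is roughly logarithmic in frequency.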

With VocalAI, you have access to an all-in-one platform that combines karaoke creation, speech recognition, and pitch scoring to elevate your singing experience. Whether you're a beginner seeking to improve or an experienced vocalist aiming for perfection, VocalAI provides the tools and guidance you need to unlock your full singing potential.

How we built it

VocalAI was built using a carefully selected combination of technologies to deliver a seamless singing experience. Here's why we utilized each tool:

React

We chose React for its ability to build dynamic and responsive user interfaces, ensuring a smooth and interactive experience for VocalAI users.

Tailwind CSS

To expedite the development process and maintain visual consistency, we opted for Tailwind CSS. Its utility-first approach gave us a comprehensive set of composable utility classes, so we could style components directly in the markup without writing much custom CSS.

Flask

Flask, being lightweight and flexible, was an ideal choice for building the server-side components of VocalAI. It allowed us to handle various functionalities such as song processing, user authentication, and scoring algorithms.

Figma

Figma proved invaluable for prototyping. It enabled our team to work together effectively, iterate on designs, and create visually appealing interfaces that met user expectations.

MongoDB

For efficient data storage and management, we turned to MongoDB, a flexible and scalable NoSQL database. It facilitated seamless storage and retrieval of user information, song data, and scoring records.

By combining React, Tailwind CSS, Flask, Figma, and MongoDB, with AWS planned for scaling, we built VocalAI as a feature-rich platform for practicing and improving your singing.

Challenges we ran into

During the development process of VocalAI, we encountered several challenges that tested our team's resilience and problem-solving abilities. Here are some notable obstacles we faced:

Health Issues

One significant challenge we encountered was when several team members fell ill simultaneously. This unexpected setback impacted our productivity and required us to reorganize tasks and responsibilities to ensure continuity and meet our development milestones.

GitHub Repository Deletion

Another hurdle we faced was the accidental deletion of our GitHub repository, which contained our codebase and project documentation. We had to quickly recover the lost work from backups and adopt stricter version control practices to prevent similar incidents in the future.

AWS

One major challenge arose when we attempted to leverage AWS for scaling our infrastructure to accommodate increased user demand. Unfortunately, we encountered persistent issues that prevented us from effectively utilizing AWS services. Despite our efforts to troubleshoot and optimize the configuration, the scalability features we had planned to implement were not fully realized.

Accomplishments that we're proud of

We are immensely proud of the notable achievements we have made in the domains of vocal pitch and speech recognition within VocalAI. Our dedicated efforts and technical expertise have allowed us to develop robust systems that accurately analyze and evaluate vocal performance. The successful implementation of advanced pitch detection algorithms has enabled us to provide precise feedback on pitch accuracy, aiding users in improving their singing abilities. Additionally, our cutting-edge speech recognition technology ensures accurate word alignment, allowing singers to deliver performances that stay true to the original lyrics. These achievements reflect our commitment to delivering a professional and high-quality singing experience through VocalAI.

What we learned

Throughout the development process of VocalAI, we gained valuable insights and knowledge that have shaped our understanding of building a successful platform. One key lesson we learned is the importance of efficient resource allocation and planning. We successfully completed the front-end development well ahead of schedule, which highlighted the significance of proper resource management.

This experience has shown us the potential for allocating more resources to the backend in future projects. By dedicating additional focus and attention to the backend development, we can ensure a well-rounded and robust platform that can handle increased user demand, maintain optimal performance, and accommodate future scalability requirements.

Furthermore, this project reinforced the importance of conducting thorough research and feasibility analysis before committing to specific technologies and platforms. It is crucial to anticipate potential challenges, such as the scaling issues we encountered with AWS, and have contingency plans in place to mitigate these obstacles effectively.

Additionally, effective communication and collaboration within the development team proved to be vital for problem-solving and overcoming challenges. Regular and open dialogue allowed us to address issues promptly, leverage each team member's expertise, and find innovative solutions.

Ultimately, this experience has taught us the significance of adaptability and flexibility in the face of unexpected hurdles. By embracing a proactive mindset and being prepared to adjust our plans as needed, we can navigate challenges effectively and deliver a high-quality product that meets and exceeds user expectations.

What's next for VocalAI

Next, our focus will be on further enhancing the VocalAI platform to provide an even better singing experience. Here are some key areas we will be prioritizing:

Backend Optimization

We will allocate additional resources and attention to optimizing the backend infrastructure. This will involve refining server configurations, improving database performance, and implementing efficient caching mechanisms. By doing so, we aim to enhance the overall speed, reliability, and scalability of VocalAI.

Integration of Advanced Features

We plan to integrate advanced features that go beyond pitch evaluation and speech recognition. This may include real-time harmonization, vocal effects, and interactive tutorials. By expanding the feature set, we strive to provide users with a comprehensive suite of tools to explore their creativity and elevate their singing performances.

User Experience Enhancements

Improving the user experience will be a key priority. We will focus on refining the interface, streamlining navigation, and implementing user feedback to ensure VocalAI remains intuitive, user-friendly, and enjoyable for singers of all skill levels.

Mobile Optimization

We recognize the importance of mobile accessibility. We will invest in optimizing VocalAI for mobile devices, ensuring a seamless and responsive experience across various screen sizes and operating systems. This will allow users to practice and perform on-the-go, further expanding the accessibility of the platform.

As we move forward, our goal is to continuously evolve and refine VocalAI, providing singers with an unparalleled experience that helps them grow, improve, and express their creativity with confidence.

Built With

Submitted to

JAMHacks 7
- Winner Best Game Hack

Created by

I worked on the front end and worked with APIs to provide access to songs and lyrics in LRC format, as well as on speech recognition and pitch detection and comparison.

Greg Ovis
Backend processing; worked with AI models and streamlined the process of searching for a song, downloading it, and playing it to the user.

Kevin Tang
Full-stack Developer & Machine Learning Enthusiast
Worked significantly on both sides of the stack; created backend functions, set up endpoints, coded several front-end pages and designed the algorithm for pitch recognition.

Arihan Sharma
Fiona Cai