Inspiration
We wanted to help close the digital divide between older and younger generations and ensure that everyone has access to the benefits of technology. The goal of the voice assistant is to give elderly people a user-friendly, accessible, and efficient way to learn about and use technology. By making technology more accessible, we aim to empower older people to stay connected with loved ones, access important information, and improve their quality of life.
What it does
With simple voice commands, you can get answers to your tech questions that are tailored to the comfort levels you report for various technologies. You can also take a picture of the device you are struggling with, and the app classifies it to give a more customized response. The web app is easy to use, with an intuitive UI and large, easy-to-read text.
How we built it
The backend was built using Python, Flask, and SQLAlchemy. We trained Cohere's generate API using sample inputs and the user's inputs about their comfort levels to produce more customized responses. For image classification, we built a convolutional neural network in PyTorch and trained it on the CIFAR-100 dataset, which contains 60,000 labeled images across 100 classes. We used React for the frontend and handled text-to-speech and speech-to-text with React-Speech-Kit. We also used React to capture an image and send it to the backend for classification. The frontend UI was based on a prototype designed in Figma.
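The write-up does not say how the sample inputs and comfort levels were passed to Cohere. One common approach is a few-shot prompt, sketched below under that assumption. The comfort scale, the field names, and the `build_prompt` helper are all hypothetical, and the actual call to Cohere is deliberately left out.

```python
# Hypothetical comfort scale; the project's real schema is not shown in the write-up.
COMFORT_LABELS = {
    1: "not comfortable",
    2: "somewhat comfortable",
    3: "very comfortable",
}


def describe_comfort(comfort_levels):
    """Turn {"smartphones": 1, ...} into readable text, sorted by name for stable output."""
    parts = [
        f"{COMFORT_LABELS[level]} with {tech}"
        for tech, level in sorted(comfort_levels.items())
    ]
    return "; ".join(parts)


def build_prompt(question, comfort_levels, examples=()):
    """Assemble a few-shot prompt: instructions, sample Q&A pairs, then the new question."""
    lines = [
        "You are a patient technology tutor for older adults.",
        f"The user is {describe_comfort(comfort_levels)}.",
        "",
    ]
    for sample_question, sample_answer in examples:
        lines += [f"Question: {sample_question}", f"Answer: {sample_answer}", ""]
    # End on an open "Answer:" so the model continues with its reply.
    lines += [f"Question: {question}", "Answer:"]
    return "\n".join(lines)
```

The resulting string would be sent as the prompt of a text generation request. Keeping the prompt assembly in a plain function makes it easy to adjust the wording without touching the API call.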
Challenges we ran into
We struggled a lot with building the neural network from scratch. This was our first time making our own machine learning model; in the past, we had only called existing APIs. Converting the image into a numerical tensor with the right shape for the first layer of the network was also difficult. Finally, we had trouble using React to access the camera and capture an image, and sending that image to the backend in a format suitable for processing.
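Both the transfer format and the tensor shape can be illustrated in plain Python. The sketch below assumes the frontend sends the capture as a base64 data URL, which the write-up does not confirm, and the helper names are hypothetical. PyTorch convolution layers expect input shaped (batch, channels, height, width), while decoded images usually arrive as interleaved RGB values in height, width, channel order. In practice an image library and PyTorch would do this work; the pure-Python version only shows the reordering.

```python
import base64


def decode_data_url(data_url):
    """Split a 'data:image/jpeg;base64,...' string into (mime_type, raw_bytes)."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


def rgb_bytes_to_nchw(pixels, height, width):
    """Convert interleaved RGB bytes (HWC order) to a nested list shaped [1][3][H][W].

    Values are scaled from 0..255 to 0.0..1.0.
    """
    if len(pixels) != height * width * 3:
        raise ValueError("pixel buffer does not match height * width * 3")
    channels = [
        [
            [pixels[(row * width + col) * 3 + channel] / 255.0 for col in range(width)]
            for row in range(height)
        ]
        for channel in range(3)
    ]
    return [channels]  # leading batch dimension of size 1
```

Note that the bytes returned by `decode_data_url` are still an encoded JPEG or PNG file. They need to be decoded into raw pixels and resized to the network's input size (32x32 for CIFAR-100 images) before a conversion like `rgb_bytes_to_nchw` applies.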
Accomplishments that we're proud of
We are very proud of building a fully functioning and accurate neural network for image classification. We faced various issues during its creation, but it now works well and has the functionality to significantly aid users. We are also proud of collaborating effectively using GitHub: even though one of our teammates had never used it before, we still managed to communicate and divide the tasks well.
What we learned
We learned a lot about machine learning practices and techniques, neural networks, and using large datasets to train a model. We also learned to use React effectively for features such as text-to-speech, speech-to-text, and image capture.
What's next for TechTutor
Our goal is to find a better dataset for training our model, namely one that is more relevant and has more images of technology. We also plan to make our text-to-speech sound less robotic for a better user experience.
Built With
- cohere
- figma
- flask
- python
- pytorch
- react
- react-speech-kit
- sqlalchemy