Inspiration
"How many of you dreamt of being an astronaut as a kid?" While 1 out of 5 children want to become astronauts, the dream fades as we grow up. Determined to nurture and sustain these dreams, we built Astromatch to match users of any age with astronauts in history who have a similar background!
What it does
Astromatch is an AI-powered website that matches any user to three astronauts in history who have a similar education, occupation, or interests. Users can also view how compatible they are with the typical roles astronauts in history have assumed: commander, flight engineer, pilot, payload specialist, mission specialist, and journalist.
How we built it
Starting from the International Astronaut Database and an additional dataset from Kaggle, we used Python's BeautifulSoup and LLM APIs to scrape each astronaut's Wikipedia page, retrieving their alma maters, occupations, interests, and more. We then combined regular expressions with LLMs to clean this unstructured data and structure it as JSON. A two-layer deep learning network was then trained on the embedded vectors of this JSON data, allowing us to transform users' text input on the website into vectors. The personalized vectors of every astronaut and every role are saved and used to find each user's best matches!
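To make the matching step concrete, here is a minimal sketch of how saved vectors could be ranked against a user's vector. It assumes cosine similarity as the comparison measure, which is our illustrative choice here rather than a confirmed detail of the project, and the names `cosine_similarity`, `top_matches`, and the toy 3-dimensional vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(user_vector, saved_vectors, k=3):
    """Return the k (name, score) pairs with the highest similarity to the user."""
    scored = [
        (name, cosine_similarity(user_vector, vector))
        for name, vector in saved_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors made up for illustration; real embeddings would have many more dimensions.
saved_vectors = {
    "astronaut_a": [0.9, 0.1, 0.0],
    "astronaut_b": [0.1, 0.9, 0.2],
    "astronaut_c": [0.7, 0.3, 0.1],
    "astronaut_d": [0.0, 0.2, 0.9],
}
user_vector = [0.8, 0.2, 0.0]

matches = top_matches(user_vector, saved_vectors)
for name, score in matches:
    print(f"{name}: {score:.3f}")
```

The same function could rank the saved role vectors (commander, pilot, and so on) by passing a different dictionary.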
Challenges we ran into
Cleaning unstructured data was especially challenging since there is no method that can perfectly handle all the edge cases. Categorising and evaluating data also posed a challenge due to time and resource limitations.
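As an illustration of the kind of rule-based cleaning involved, the sketch below turns raw scraped strings into JSON-ready lists. The field names, sample values, and the `clean_field` helper are hypothetical, and the regular expressions are only one possible set of rules, not the ones we actually used.

```python
import json
import re


def clean_field(raw):
    """Split a raw scraped string into a list of clean values."""
    # Drop bracketed citation markers such as [1] or [citation needed].
    text = re.sub(r"\[[^\]]*\]", "", raw)
    # Drop parenthetical notes such as degree abbreviations, before splitting,
    # so that commas inside parentheses do not create extra items.
    text = re.sub(r"\([^)]*\)", "", text)
    # Split on commas, semicolons, or newlines.
    parts = re.split(r"[,;\n]", text)
    # Collapse repeated whitespace and discard empty pieces.
    return [" ".join(part.split()) for part in parts if part.strip()]


# Made-up record for illustration only.
raw_record = {
    "alma_mater": "Example University (BS)[1], Sample Institute of Technology (MS, PhD)[2]",
    "occupation": "Test pilot;  engineer [citation needed]",
}

structured = {field: clean_field(value) for field, value in raw_record.items()}
print(json.dumps(structured, indent=2))
```

Even this small example has an edge case it cannot handle: a name that itself contains a comma, such as "University of California, Berkeley", would be split into two items.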
Accomplishments that we're proud of
We successfully used LLMs to systematically scrape large amounts of data in a short time, which allowed us to train a functional deep learning model.
What we learned
We learnt that the patterns of real-world data are often hidden, and that a substantial amount of data pre-processing is essential. We also realised the potential of representing unstructured data numerically.
What's next for Astromatch: "Naut" just connections
We will gather more data on active astronauts, and especially astronauts in training, such as their early lives or their views on the world, to strengthen our deep learning model.
Built With
- api
- beautiful-soup
- flask
- gensim
- llm
- nltk
- python
- react
- replit
- scikit-learn
- scraping
