Distribution of Dinosaur Length in Late Triassic
Distribution of Dinosaur Length in Early Jurassic
Distribution of Dinosaur Length in Mid Jurassic
Distribution of Dinosaur Length in Late Jurassic
Distribution of Dinosaur Length in Early Cretaceous
Distribution of Dinosaur Length in Late Cretaceous
Bar Chart Race: Distribution of Dinosaur Count per Location Over Time
Inspiration
Coming into this project, we wanted to make something similar to the bar chart races we often see on YouTube and TikTok, and that was our first inspiration. Because we wanted to work with numerical data, we started by making histograms that showed the distribution of dinosaur lengths in each of the dinosaur eras. We then had time left over, so we returned to our original idea and built a working bar chart race of dinosaur counts per location over time, measured in millions of years ago.
What it does
Our project has two components. The first displays six histograms, one for the distribution of dinosaur lengths in each of the dinosaur eras (Late Triassic, Early Jurassic, Mid Jurassic, Late Jurassic, Early Cretaceous, and Late Cretaceous). The second shows a bar chart race of dinosaur counts per location over time, measured in millions of years ago.
How we built it
The four members of our team divided up the work: one of us worked on the front end, one worked on the back end and on connecting the server side with the API, and the other two worked on the data science code. For the data science side, we found our own dataset rather than using the ones provided in the starter pack, because we knew we wanted to build a bar chart race and perform EDA with data visualizations, and that called for more numerical data. We then cleaned the dataset to fit the format we wanted: we split columns that held more than one type of data into multiple columns, dropped NaN values, converted whole columns to the types we wanted to work with, added new columns with derived data, and more. Next, we performed EDA on the cleaned data, using matplotlib to create a histogram for each dinosaur era that shows the distribution of dinosaur lengths in that era. Finally, we imported bar_chart_race to build a bar chart race that displays dinosaur counts per location over time, measured in millions of years ago.
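The cleaning and histogram steps described above can be sketched roughly as follows. This is a minimal illustration, not our actual code: the sample rows, the column names (`period`, `length`, `lived_in`), the text formats inside them, and the helper names `clean` and `plot_length_histograms` are all assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical raw rows; the real dataset's columns and formats may differ.
raw = pd.DataFrame({
    "name": ["dino_a", "dino_b", "dino_c", "dino_d", "dino_e"],
    "period": [
        "Late Triassic 227-210 million years ago",
        "Late Jurassic 155-145 million years ago",
        "Late Jurassic 150-145 million years ago",
        "Late Cretaceous 76-74 million years ago",
        None,
    ],
    "length": ["3.0m", "21.0m", None, "9.0m", "2.0m"],
    "lived_in": ["Argentina", "USA", "Tanzania", "Canada", "China"],
})


def clean(df):
    """Drop incomplete rows, split the mixed 'period' column, convert types."""
    out = df.dropna(subset=["period", "length"]).copy()
    # Split one column holding two kinds of data (era name and year range).
    parts = out["period"].str.extract(
        r"^(?P<era>[A-Za-z ]+?)\s+(?P<start_mya>\d+)-(?P<end_mya>\d+)"
    )
    out["era"] = parts["era"]
    out["start_mya"] = pd.to_numeric(parts["start_mya"])
    out["end_mya"] = pd.to_numeric(parts["end_mya"])
    # Convert "21.0m" style strings to numeric metres.
    out["length_m"] = pd.to_numeric(out["length"].str.rstrip("m"))
    # Add a derived column: midpoint of the time range, in millions of years ago.
    out["mid_mya"] = (out["start_mya"] + out["end_mya"]) / 2
    return out.drop(columns=["period", "length"])


def plot_length_histograms(df, bins=10):
    """Draw one histogram of dinosaur length per era, side by side."""
    eras = list(df["era"].unique())
    fig, axes = plt.subplots(1, len(eras), figsize=(4 * len(eras), 3), squeeze=False)
    for ax, era in zip(axes[0], eras):
        ax.hist(df.loc[df["era"] == era, "length_m"], bins=bins)
        ax.set_title(f"Distribution of Dinosaur Length in {era}")
        ax.set_xlabel("Length (m)")
        ax.set_ylabel("Number of dinosaurs")
    fig.tight_layout()
    return fig


cleaned = clean(raw)
fig = plot_length_histograms(cleaned)
```

Calling `fig.savefig("lengths.png")` afterwards would write the panels to an image file; the file name is arbitrary.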
Challenges we ran into
Making the bar chart race was a challenge: bar_chart_race was a package none of us had worked with before, so its formatting and input requirements were difficult to understand. While we got the bars animating quickly, we had trouble getting the labels to display and update correctly (e.g. the "200 Million Years Ago" label). We spent a few hours working out the syntax and formatting for that, but we figured it out in the end.
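For readers facing the same problem, here is a hedged sketch of the input shape bar_chart_race expects and of one way the package supports a custom period label. The sample records, the column names `lived_in` and `mya`, the helper names, and the choice of per-period (rather than cumulative) counts are assumptions for illustration, not a record of what our code does. The package wants a "wide" DataFrame: one row per time period, one column per bar. Its `period_fmt` parameter accepts a format string in which `x` is the current index value.

```python
import pandas as pd

# Hypothetical cleaned records: one row per dinosaur, with the location it
# lived in and a representative time in millions of years ago (mya).
records = pd.DataFrame({
    "lived_in": ["Argentina", "USA", "USA", "Canada", "USA", "Canada"],
    "mya": [220, 150, 150, 75, 75, 75],
})


def to_wide_counts(df):
    """Count dinosaurs per (period, location) and pivot to wide format."""
    wide = df.groupby(["mya", "lived_in"]).size().unstack(fill_value=0)
    # Oldest period first, so the animation runs forward in time.
    return wide.sort_index(ascending=False)


def render_race(wide, filename="t_race.mp4"):
    """Render the race. Needs the third-party bar_chart_race package, and
    ffmpeg on the PATH to write an .mp4 file."""
    import bar_chart_race as bcr  # imported lazily; not needed for the pivot

    return bcr.bar_chart_race(
        df=wide,
        filename=filename,
        # `x` receives the current period's index value.
        period_fmt="{x:.0f} Million Years Ago",
        title="Distribution of Dinosaur Count per Location Over Time",
    )


wide = to_wide_counts(records)
print(wide)
```

Calling `render_race(wide)` then produces the animation. Because the index here is numeric rather than a datetime index, the curly-brace form of `period_fmt` is the one that applies.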
Accomplishments that we're proud of
On the data science side, we are proud that we got the bar chart race to display and work as intended. We initially planned to show only the histograms of dinosaur lengths per era (Late Triassic, Early Jurassic, Mid Jurassic, Late Jurassic, Early Cretaceous, and Late Cretaceous), but because we had extra time, we challenged ourselves to create an animated bar chart race as well. None of us had worked with bar_chart_race before, so its formatting and requirements were hard to understand. After 5 hours, we got it to work and display all the information we wanted, which we are proud of.
What we learned
We learned that we can always step out of our comfort zone and try things we have never seen or touched before. For example, none of us knew how to make a bar chart race, but we still ended up building one.
What's next for T-Race
We're going to try to connect the front end and back end to the server.