AI dataset

...

Inspiration

Our inspiration came from various machine learning tutorials. Each of us started with a different approach and method, and we later combined and tested our algorithms together to find the one with the highest accuracy.

What it does

We agreed that the random forest classification (RFC) algorithm performed best of all the algorithms we tried. It had the highest training accuracy and test accuracy, which showed its potential for learning from and analyzing the data given the correct data clustering, so we focused on RFC and made it our final submission.

How we built it

We built the RFC model with the following steps:

split the dataset and preprocess the data
build the model
train the model
evaluate the model with metrics
tune the hyperparameters
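
The steps above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not our actual submission: the synthetic data from make_classification, the median imputation step, the step names "impute" and "rf", and the small hyperparameter grid are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption): the project's real dataset is not shown here.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# 1. Split the dataset, keeping the class balance the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# 2. Build the model: a preprocessing step followed by the random forest.
#    Median imputation is a placeholder for whatever preprocessing the data needs.
model = Pipeline(
    steps=[
        ("impute", SimpleImputer(strategy="median")),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ]
)

# 3. Train the model.
model.fit(X_train, y_train)

# 4. Evaluate with metrics (accuracy on the training and test sets).
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))

# 5. Tune the hyperparameters with a small, illustrative grid search.
param_grid = {"rf__n_estimators": [50, 100], "rf__max_depth": [None, 5]}
search = GridSearchCV(model, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)
best_params = search.best_params_
tuned_test_acc = accuracy_score(y_test, search.best_estimator_.predict(X_test))

print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy: {test_acc:.3f}")
print(f"best parameters: {best_params}")
print(f"tuned test accuracy: {tuned_test_acc:.3f}")
```

Keeping the preprocessing inside the pipeline means it is refit on each cross-validation fold during tuning, so no information from the held-out fold leaks into training.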

Challenges we ran into

choosing the best algorithm based on performance
analyzing the data and selecting the most important columns to feed to the algorithm
improving the performance of the RFC model
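
One possible way to approach the first two challenges is sketched below: compare candidate algorithms by cross-validated accuracy, then rank columns by the random forest's feature importances. The two comparison models (logistic regression and k-nearest neighbors), the synthetic data, and the choice to keep three columns are assumptions made for illustration; they are not necessarily what we used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption): 8 columns, only 3 of them informative.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=3, n_redundant=0, random_state=0
)

# Hypothetical candidate algorithms to compare.
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbors": KNeighborsClassifier(),
}

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {
    name: cross_val_score(estimator, X, y, cv=5, scoring="accuracy").mean()
    for name, estimator in candidates.items()
}
best_name = max(scores, key=scores.get)

# Rank columns by impurity-based importance from a fitted random forest.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
importances = forest.feature_importances_
ranked_columns = sorted(
    range(X.shape[1]), key=lambda i: importances[i], reverse=True
)
top_columns = ranked_columns[:3]  # keep the three highest-ranked columns

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print(f"best by cross-validated accuracy: {best_name}")
print(f"top columns by importance: {top_columns}")
```

Cross-validation gives a steadier comparison than a single train/test split, which matters when the candidates score close to one another.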

Accomplishments that we're proud of

we tried a variety of algorithms at the beginning stage
the RFC model is promising and has the highest accuracy of the algorithms we tested
our way of analyzing input data is efficient

What we learned

how to split and cluster the original data
more ML algorithms and how to code them
how to test and improve ML algorithms

What's next for the AI dataset

There should be a better way to clean the data. With more time, we would try to make the input data cleaner to achieve higher accuracy.

Updates

yushi gan posted an update — Mar 12, 2023 09:58 AM EDT

yushi gan started this project — Mar 12, 2023 09:56 AM EDT
