Please use the Google Drive links below to access our final submission and full documentation!

What it does

The Baker Hughes Challenge presents a common problem in modern algorithmic modeling: building accurate data representations with limited computational power. In support of a mission to improve energy efficiently and safely, models that accurately predict product lifespan are critical for maintenance and repair procedures. The following documentation outlines our process for preparing data for model training on an electric motor.

How we built it

The final report can be found here: https://drive.google.com/file/d/1JMLWMsQDiDozFRWFXhn752RjBwWNU_9C/view?usp=sharing

A full write-up and documentation of our design process may be found here: https://drive.google.com/drive/folders/1I0dAvUnOGW33BLSLVn-N_0BPp0K12vEa?usp=sharing

The final project presentation can be found here: https://drive.google.com/file/d/1OocvVIj5e6Kwx19Tlhgwdx1I6v3cADn5/view?usp=sharing

Challenges we ran into

Our team had the most trouble converting 2D scatter plots into 3D histograms that display the frequency of data points across various domains. The two main sticking points were keeping bins with a value of 0 from displaying on the graph and tweaking the area covered by each histogram bar.
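As a rough illustration of the zero-bin problem, here is a minimal sketch (not our submitted code) that bins two variables with NumPy and keeps only the non-empty bins. The function name `histogram_bars` and the synthetic `x1`/`x2` data are hypothetical and exist only for this example.

```python
import numpy as np

def histogram_bars(x1, x2, bins=10):
    """Return bar corners, footprints, and heights for non-empty 2D bins."""
    counts, x_edges, y_edges = np.histogram2d(x1, x2, bins=bins)
    # Lower-left corner of every bin, laid out on a grid that matches `counts`.
    x_pos, y_pos = np.meshgrid(x_edges[:-1], y_edges[:-1], indexing="ij")
    # Width and depth of every bin, broadcast to the same grid shape.
    dx = np.diff(x_edges)[:, None] * np.ones_like(counts)
    dy = np.diff(y_edges)[None, :] * np.ones_like(counts)
    keep = counts > 0  # drop empty bins so zero-height bars are never drawn
    return x_pos[keep], y_pos[keep], dx[keep], dy[keep], counts[keep]

# Synthetic stand-in data (an assumption, not the motor dataset).
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
x, y, dx, dy, heights = histogram_bars(x1, x2, bins=8)
```

The returned arrays line up with the arguments of Matplotlib's `bar3d(x, y, z, dx, dy, dz)`, with `np.zeros_like(heights)` as `z` and `heights` as `dz`. Multiplying `dx` and `dy` by a factor below 1 is one way to shrink the area each bar covers.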

Another major issue was getting the uniform distribution to be even. A bug caused large, artificial frequency spikes near the extrema of the x1 and x2 values, and it took a long time to fix.

Accomplishments that we're proud of

The team is most proud of getting all distributions of data organized and completed, as well as all graphs and CSV files generated.

Additionally, this was the team's first data science project and its second-ever hackathon. Delegating the work effectively and leaving plenty of time for a write-up at the end was definitely a big accomplishment.

What we learned

Through the data downsampling project, the team learned data sampling techniques, primarily k-means clustering and convex hulls for domain sampling. The team also improved its Python and project management skills.
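The two techniques can be sketched in a few lines each. This is a simplified illustration under stated assumptions, not the code from our submission: the k-means loop uses plain random initialization rather than k-means++, the hull uses Andrew's monotone chain algorithm, and the names `kmeans_representatives` and `convex_hull` are hypothetical.

```python
import numpy as np

def kmeans_representatives(points, k, iters=20, seed=0):
    """Run Lloyd's k-means, then return indices of the real points nearest each centroid."""
    rng = np.random.default_rng(seed)
    # Random initialization (assumption: simpler than k-means++).
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # leave an empty cluster's centroid where it is
                centroids[j] = members.mean(axis=0)
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return np.unique(dists.argmin(axis=0))

def convex_hull(points):
    """Andrew's monotone chain; returns 2D hull vertices in counter-clockwise order."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b turns counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]
```

One plausible way to combine them is to keep the hull vertices, so the downsampled set still spans the full domain, and fill the interior with the k-means representatives.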

What's next for Motor Data Downsampling & Visualization

The next step is to develop more hybrid methods of data downsampling by tweaking parameters such as the relative importance of even distribution and density. Currently, the use of k-means++ means that an end user cannot tune the hybridization.
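To make the idea of a user-tunable hybrid concrete, here is one possible sketch. It is purely illustrative and not something we have built: the function `hybrid_downsample`, the `alpha` knob, and the grid-based density estimate are all assumptions introduced for this example.

```python
import numpy as np

def hybrid_downsample(points, n_keep, alpha=0.5, bins=10, seed=0):
    """Sample n_keep 2D points; alpha=0 follows data density, alpha=1 favors even coverage."""
    rng = np.random.default_rng(seed)
    lo, hi = points.min(axis=0), points.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero on a flat axis
    # Assign each point to a grid cell; clip so the maximum value stays in the last cell.
    cells = np.clip(np.floor((points - lo) / span * bins).astype(int), 0, bins - 1)
    cell_id = cells[:, 0] * bins + cells[:, 1]
    counts = np.bincount(cell_id, minlength=bins * bins)
    # A point in a crowded cell gets a smaller weight as alpha grows.
    weights = 1.0 / counts[cell_id] ** alpha
    weights /= weights.sum()
    chosen = rng.choice(len(points), size=n_keep, replace=False, p=weights)
    return points[np.sort(chosen)]

# Synthetic stand-in data (an assumption, not the motor dataset).
demo_points = np.random.default_rng(2).normal(size=(1000, 2))
demo_sample = hybrid_downsample(demo_points, n_keep=100, alpha=0.75)
```

With `alpha=0` every point is equally likely to be kept, so the sample mirrors the original density. With `alpha=1` every occupied cell carries the same total weight, which pushes the sample toward even coverage. Values in between blend the two.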
