CATsense

Inspiration

Do we really need to spend 15 minutes looking at each machine and manually filling out a 2–4 page inspection report? At a company like Caterpillar Inc., inspections are critical, but identification and documentation take up a huge portion of the time. Knowing the opportunity cost of filing documentation and managing resources, our team built CATsense to help CAT machine workers. With recent breakthroughs in computer vision and image recognition, we asked a simple question: how much of the documentation work can we take off the user's hands? Short answer? A lot.

What it does

CATsense lets you add specific CAT machines to your "garage" so you can keep track of your machines and the repairs they need. To make the lives of CAT machine workers easier, users can upload images and audio of a machine and its components, and the model analyzes the visuals to detect damage, wear, leaks, cracks, corrosion, missing components, and more. The AI then converts those findings into a professional, structured report that mirrors traditional inspection documentation and is organized around risk scores and confidence scores. To make documentation and repairs easier still, the report identifies which parts are problematic and links to the specific replacement part on Caterpillar's official website. Instead of spending 15 minutes writing, inspectors can review, adjust if necessary, and submit within minutes.
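To make the idea of a report built on risk scores and confidence scores concrete, here is a minimal sketch in Python. It is not the CATsense implementation: the `Finding` class, the `SEVERITY` weights, the `risk_score` formula (severity weight times confidence), and the `min_confidence` cutoff are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical severity weights per issue type (illustrative, not from CATsense).
SEVERITY = {
    "missing_component": 1.0,
    "crack": 0.9,
    "leak": 0.8,
    "corrosion": 0.6,
    "wear": 0.4,
}


@dataclass
class Finding:
    component: str
    issue: str
    confidence: float  # detection confidence, 0.0 to 1.0


def risk_score(finding):
    """Assumed formula: severity weight x confidence, scaled to 0-100."""
    weight = SEVERITY.get(finding.issue, 0.5)  # unknown issues get a middle weight
    return round(weight * finding.confidence * 100, 1)


def build_report(machine_id, findings, min_confidence=0.5):
    """Drop low-confidence findings and sort the rest by risk, highest first."""
    kept = [f for f in findings if f.confidence >= min_confidence]
    items = sorted(
        (
            {
                "component": f.component,
                "issue": f.issue,
                "confidence": f.confidence,
                "risk": risk_score(f),
            }
            for f in kept
        ),
        key=lambda item: item["risk"],
        reverse=True,
    )
    overall = max((item["risk"] for item in items), default=0.0)
    return {"machine_id": machine_id, "items": items, "overall_risk": overall}
```

An inspector-facing report could then list `items` in order, so the riskiest findings are reviewed first, while anything below the confidence cutoff is left for manual inspection.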

How we built it

We built our app using several languages and frameworks and integrated a number of APIs. Our backend is written in Python with FastAPI, and it uses AWS's Rekognition and Bedrock APIs to analyze the images and prepare the reports. We also used Anthropic's API to integrate a RAG model that analyzes the reports and generates suggestions for repairs and replacements. We used Google Cloud APIs to translate the website for accessibility, along with the Google Maps API. We used the ElevenLabs audio API to take in audio input so the system can analyze how the machine sounds. For our front end, we used React with JSX, HTML, Tailwind CSS, and TypeScript. We used Firebase as our database for accounts and AWS for the rest of the data, especially the garage feature.
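As a rough illustration of the image-analysis step, the sketch below shows one way a backend could call Rekognition's `detect_labels` and keep only labels of interest. The function names, the `watch_list` of label names, and the thresholds are assumptions rather than CATsense's actual code, and Rekognition's general-purpose labels may not cover every kind of machine damage. The `client` argument is expected to behave like `boto3.client("rekognition")`.

```python
def labels_to_findings(response, watch_list, min_confidence=70.0):
    """Keep only watch-listed labels from a DetectLabels-style response."""
    findings = []
    for label in response.get("Labels", []):
        name = label["Name"].lower()
        confidence = label["Confidence"]  # Rekognition reports 0-100
        if name in watch_list and confidence >= min_confidence:
            # Convert to a 0-1 confidence for downstream scoring.
            findings.append({"issue": name, "confidence": round(confidence / 100, 3)})
    return findings


def analyze_image(client, image_bytes, watch_list, min_confidence=70.0):
    """Send raw image bytes to Rekognition and filter the returned labels."""
    response = client.detect_labels(
        Image={"Bytes": image_bytes},
        MaxLabels=25,
        MinConfidence=min_confidence,
    )
    return labels_to_findings(response, watch_list, min_confidence)
```

Passing the client in as a parameter keeps the filtering logic testable without AWS credentials; in a FastAPI route, the uploaded file's bytes would be handed to `analyze_image` along with a real Rekognition client.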

Challenges we ran into

We ran into plenty of challenges, including one teammate's faulty code editor and faulty GitHub setup. We also misconfigured our AWS services multiple times, but we got through all of our problems with willpower and perseverance. Our hosting system also wouldn't properly host the app because of the many variable folders we have in the website.

Accomplishments that we're proud of

We integrated a lot of different APIs and learnt many frameworks along the way. We also learnt how AWS works and how we can integrate it into our own projects.

What we learned

We learnt a lot about how API keys work and how to integrate them properly. We learnt how to use AWS in our everyday projects. We learnt about RAG models and their wide range of functions and use cases. Lastly, we learnt a lot about the repair industry for heavy machinery and gained valuable insight into it.

What's next for CATsense

We are very passionate about CATsense and have many features we plan to implement in the future. One is an augmented reality scenario where, after detailed scanning, users can "enter" a machine's specific parts, such as the engine, and "tour" them to see where issues are likely to occur and how they can be fixed. We would also like to add features that track the time remaining before the next maintenance check and let users contact trusted CAT specialists and part dealerships through the website.