Gallery
- Comparing our LLMs in use!
- A brief introduction to how we do inspections
- The reported output and records of the inspection
- Hey, it's our take on CAT AI fully working. Play around with it at our booth!
- A cornerstone of our value: the knowledge connections we retain via Supermemory
Inspiration
Imagine standing in the Arctic cold, gloves on, trying to inspect a CAT excavator. Paper checklist in one hand, flashlight in the other. You check one component, then another. Did you check the cab air filter? You think you did. Maybe you didn’t. Somewhere in that gap, things get missed.
CAT Inspect is powerful. CAT AI is impressive. But both assume ideal conditions. Out here, no Wi-Fi, no signal. One technician for ten machines. Machines everywhere, problems everywhere. That is where inspections actually happen.
We wanted to close that gap. Not with another app, but with a system that connects all the digital pieces and actually works where the work is happening.
What it does
CATalyst is the baby of CAT Inspect and CAT AI.
It turns a first-day operator into a ten-year veteran on the spot.
Open the phone. Scan the machine. The official 43-item CAT 320–352 checklist loads instantly.
Hold the phone to a component. Speak what you see. The system responds.
PASS. MONITOR. CRITICAL.
Eyes stay on the machine.
Serious finding? Escalated automatically to Claude 3.5 Haiku on AWS Bedrock with the full machine history attached. Dealer notified. Parts ready. One tap opens parts.cat.com.
Walk past a cab air filter? CATalyst reminds you out loud. Every inspection step tracked. Every finding stored in Supermemory tagged to that asset. Third inspection flagging the same hydraulic hose? System already knows. It warns you before you even touch the tool.
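The repeat-finding warning can be sketched as a simple count over an asset's stored findings. This is our illustration, not the actual Supermemory query; the component names and the three-strike threshold are assumptions:

```python
from collections import Counter

REPEAT_THRESHOLD = 3  # assumed: third write-up of the same component triggers a warning

def recurring_findings(history, threshold=REPEAT_THRESHOLD):
    """history: list of (asset_id, component) tuples from prior inspections."""
    counts = Counter(component for _, component in history)
    return {c for c, n in counts.items() if n >= threshold}

# Hypothetical history for one asset
history = [
    ("EXC-320-001", "hydraulic hose"),
    ("EXC-320-001", "cab air filter"),
    ("EXC-320-001", "hydraulic hose"),
    ("EXC-320-001", "hydraulic hose"),
]
flags = recurring_findings(history)  # components to warn about before the inspection starts
```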
Faster inspections save money and reduce downtime. The knowledge behind the system makes it easier for new operators to step in and perform at a high level. CATalyst addresses the worker shortage while keeping operations efficient.
How we built it
FastAPI backend with offline-first architecture. Every finding queues locally and syncs to Supabase only after the inspection completes. The device works entirely on site. No signal, no problem.
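The queue-then-sync flow looks roughly like this minimal sketch. The schema and class names are ours for illustration; the real sync target is Supabase, stubbed here as a plain upload function:

```python
import json
import sqlite3

class FindingQueue:
    """Offline-first queue: findings land on local disk, sync happens later."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)  # local storage survives loss of signal
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS queue (id INTEGER PRIMARY KEY, payload TEXT)"
        )

    def enqueue(self, finding: dict) -> None:
        self.db.execute(
            "INSERT INTO queue (payload) VALUES (?)", (json.dumps(finding),)
        )
        self.db.commit()

    def sync(self, upload) -> int:
        """Called once the inspection completes and a connection exists."""
        rows = self.db.execute("SELECT id, payload FROM queue").fetchall()
        for row_id, payload in rows:
            upload(json.loads(payload))  # e.g. a Supabase insert
            self.db.execute("DELETE FROM queue WHERE id = ?", (row_id,))
        self.db.commit()
        return len(rows)

q = FindingQueue()
q.enqueue({"component": "cab air filter", "status": "MONITOR"})
sent = []
synced = q.sync(sent.append)  # nothing leaves the device until this call
```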
A local LLM runs on an Nvidia Jetson Nano, smart enough to classify most findings instantly. Cloud escalation goes to our flavor of CAT AI: AWS Bedrock running Claude 3.5 Haiku for high-stakes judgment.
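The local-first, cloud-on-escalation split reduces to a small routing function. The model calls are stubbed here with lambdas; in our stack the local step runs on the Jetson and escalation hits Claude 3.5 Haiku on Bedrock:

```python
SEVERITIES = ("PASS", "MONITOR", "CRITICAL")

def classify_finding(text, local_llm, cloud_llm, history=""):
    """Local model answers fast; only CRITICAL findings pay the cloud round-trip."""
    severity = local_llm(text)  # on-device judgment, works offline
    assert severity in SEVERITIES
    if severity == "CRITICAL":
        # High stakes: escalate with the machine's full history attached.
        return severity, cloud_llm(f"History:\n{history}\n\nFinding: {text}")
    return severity, None

# Stub models for illustration only
local = lambda text: "CRITICAL" if "leak" in text else "PASS"
cloud = lambda prompt: "escalated: " + prompt.splitlines()[-1]

severity, detail = classify_finding("hydraulic hose leak", local, cloud, history="...")
```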
Prompts are component-specific. Hydraulics get hydraulic prompts. Undercarriage gets undercarriage prompts. The library came directly from CAT’s pass/fail rules.
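The prompt routing itself is just a lookup. The strings below are placeholders; the real text comes from CAT's pass/fail library:

```python
# Placeholder prompt bodies; the actual rules come from CAT's library.
PROMPTS = {
    "hydraulics": "You are inspecting hydraulic components. Apply the hydraulic pass/fail rules: ...",
    "undercarriage": "You are inspecting the undercarriage. Apply the undercarriage pass/fail rules: ...",
}
DEFAULT_PROMPT = "Apply the general CAT inspection pass/fail rules: ..."

def prompt_for(component: str) -> str:
    """Component-specific prompt if one exists, general rules otherwise."""
    return PROMPTS.get(component, DEFAULT_PROMPT)
```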
Supermemory stores every finding by asset ID and injects prior history into the LLM context. Over time, machines build their own health narrative without anyone configuring it.
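The history injection has roughly this shape. Supermemory's actual API differs; here the store is mocked as a dict keyed by asset ID, and the finding strings are invented for illustration:

```python
store: dict[str, list[str]] = {}  # stands in for Supermemory, keyed by asset ID

def record(asset_id: str, finding: str) -> None:
    store.setdefault(asset_id, []).append(finding)

def build_context(asset_id: str, limit: int = 5) -> str:
    """Inject the asset's recent findings into the LLM prompt."""
    prior = store.get(asset_id, [])[-limit:]  # most recent findings only
    return "Prior findings:\n" + "\n".join(f"- {f}" for f in prior)

record("EXC-320-001", "hydraulic hose weeping at fitting")
record("EXC-320-001", "hydraulic hose weeping again, worse")
context = build_context("EXC-320-001")
```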
Frontend is Expo React Native. Phone is the field device. Camera, mic, speaker. The phone is the glasses.
Challenges we ran into
The biggest challenge was keeping focus. It was easy to get sucked into feature creep, chasing every cool idea, while the real goal was a niche but critical problem: helping inspectors perform better, faster, and safer. Staying disciplined and ruthless about scope was essential.
Second, orchestrating the frontend and backend into a single cohesive system was intense. One fix led to another, and as hours dwindled toward sunrise, we were integrating live video, voice, checklist tracking, LLM logic, and cloud escalation all at once. That pressure tested everything, from architecture to team patience, but it made the system stronger.
Accomplishments that we're proud of
We built a full-fledged system architecture for Caterpillar that turns first-day operators into veterans, on the spot.
The app also tracks every inspection step, escalates critical issues automatically, and produces official CAT-formatted reports.
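The report assembly step can be sketched as below. This is plain text for simplicity; the app renders the same structure to PDF with ReportLab, and the field layout here is illustrative, not CAT's official format:

```python
def format_report(asset_id: str, findings: list[tuple[str, str]]) -> str:
    """findings: list of (checklist item, status) pairs from one inspection."""
    lines = [f"CAT Inspection Report - Asset {asset_id}", "-" * 40]
    for item, status in findings:
        lines.append(f"{status:>8}  {item}")  # right-align status column
    return "\n".join(lines)

report = format_report(
    "320-001",
    [("Cab air filter", "PASS"), ("Hydraulic hose", "CRITICAL")],
)
```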
We made inspections faster, safer, and more accessible. New operators can step into a high-knowledge industry without years of experience.
Offline-first design with edge devices means it works where work actually happens: no signal, no problem.
What we learned
Dual LLM architecture works. Local for speed. Cloud for judgment. Field operations need both. You need answers now. You escalate when stakes are high. Two problems, two solutions, one orchestration layer connecting them.
Using real CAT data changed everything. Real 43-item form. Real pass/fail library. Real report format. Every decision grounded in how CAT actually operates in the field.
The phone as the field device was the right compromise: a cleaner story, similar architecture, and no unnecessary hardware complexity, skipping the Meta glasses for the time being.
What's next for CATalyst
- Full Meta glasses integration
- Live voice transcription via Whisper on-device
- Real CAT dealer API replacing the mock
- Fleet-wide pattern recognition across machines
- Jetson Nano for true edge deployment in remote, desolate environments
Built With
- aws-bedrock
- claude-3-5-haiku
- expo.io
- fastapi
- llava
- ollama
- python
- react-native
- reportlab
- supabase
- supermemory