Inspiration
I’m a mechanical engineer who codes. After working as a software engineer at NVIDIA and using AI tools like Cursor and Claude Code, going back to mechanical/hardware workflows felt like stepping into the past. I noticed that we need an in-the-field AI that understands parts, tools, and physical reality. That’s why I built R-Hat, a “Cursor for hardware”.
What it does
R-Hat is a live AI assistant delivered through AR. A web app streams video and audio to the Gemini Live API and overlays step-by-step guidance, checks, and quick actions back into your view. The UI is optimized for the Xreal’s ~46° FOV, using a “liquid-crystal” style. I also realized that pure black appears nearly transparent when viewed through the glasses, so I leveraged that to make the UI feel like floating glass widgets instead of big rectangles blocking your view.
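To make the black-is-transparent trick concrete, here is a hypothetical style object for one of the floating glass panels (the exact values are illustrative, not the project’s actual stylesheet). On an additive display like the Xreal Air, `#000000` emits no light, so a true-black fill reads as a hole in the view rather than an opaque rectangle; only the border and text actually glow.

```typescript
// Illustrative "floating glass" panel style for an additive AR display.
// Pure black (#000000) emits no light on the glasses, so the panel body
// appears transparent; the faint border and bright text do the glowing.
const glassPanelStyle = {
  background: "#000000",                          // transparent on the glasses
  border: "1px solid rgba(255, 255, 255, 0.35)",  // thin glowing outline
  borderRadius: "12px",
  color: "rgba(220, 240, 255, 0.95)",             // bright, legible text
  padding: "0.75rem 1rem",
} as const;

export default glassPanelStyle;
```

Passed as an inline `style` to a React component, this gives the widget effect without any compositing tricks: the display hardware does the transparency for free.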
How we built it
Hardware: Xreal Air + Khadas Edge2 + ArduCam (camera and mic) + 10,000 mAh portable battery. I 3D-printed enclosures with clips so everything mounts on a regular hat. The Khadas Edge2 is a mini computer running Linux that hosts the web app. The ArduCam connects to one of the USB ports for video and audio input, and the Xreal Air glasses connect to the Edge2’s DisplayPort-over-USB-C port, which lets them act as its monitor. The portable battery powers everything.
Realtime app loop: Camera + mic → WebRTC → Gemini Live streaming → token-by-token UI updates.
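The mic leg of that loop needs one conversion step worth showing. Per the Gemini Live API docs as I understand them, audio input should be raw 16-bit PCM at 16 kHz mono, while the Web Audio API hands you Float32 samples in [-1, 1]. A minimal sketch of that conversion (function name is mine, not from the project):

```typescript
// Convert Web Audio Float32 samples ([-1, 1]) to signed 16-bit PCM,
// the format the Gemini Live API expects for streamed audio input.
function float32ToPcm16(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}
```

In the app, chunks like this would be base64-encoded and sent over the WebSocket alongside JPEG frames from the camera; the token-by-token responses come back on the same connection.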
The UI follows best practices for AR content so that it isn’t too distracting and stays easy and comfortable to view.
Challenges we ran into
Figuring out X11 vs Wayland vs Weston on the Edge2 and getting consistent output to the Xreal. Sometimes you really do have to nuke and restart the display server.
Limited ports on the Khadas Edge2: I could not have the ArduCam, keyboard, and mouse connected at the same time for testing. That meant either constantly disconnecting and reconnecting the keyboard or camera, or learning all the hidden keyboard shortcuts to drive Linux without a mouse.
Setting up the WebSocket connection for the live audio and video stream with the Gemini Live API.
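Most of the WebSocket work is handling the server messages for token-by-token UI updates. A hedged sketch of that handler, assuming the server message shape (`serverContent.modelTurn.parts[].text`) I take from the Live API docs; the interface here is my own simplification, not the official SDK types:

```typescript
// Simplified shape of a Gemini Live API server message (assumption based
// on the Live API docs; the real messages carry more fields than this).
interface LiveServerMessage {
  serverContent?: {
    modelTurn?: { parts?: { text?: string }[] };
    turnComplete?: boolean;
  };
}

// Pull the incremental text out of one message so the UI can append it
// to the chat panel as it streams in.
function extractText(msg: LiveServerMessage): string {
  const parts = msg.serverContent?.modelTurn?.parts ?? [];
  return parts.map((p) => p.text ?? "").join("");
}
```

Each WebSocket `onmessage` would JSON-parse the frame, run it through something like `extractText`, and append the result to React state, which is what makes the overlay update token by token.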
Designing an AR UI that doesn’t annoy you: placing the chat interface where it is legible but doesn’t block too much of your view, and keeping the real-time video feed in a corner so you can see exactly what the AI is referring to, since the camera doesn’t always point exactly where you are looking due to its placement and angle.
Accomplishments that we're proud of
In a single day, I built the demo that Meta failed to showcase at their latest Meta Connect. Honestly, it is hard to explain to someone who hasn’t tried it, but having live AI assistance based on what you are seeing in the real world feels like peeking into the future. I am sincerely very proud of what I have managed to build so far for this project.
What we learned
I learned that working with hardware can be very tedious: sometimes you think your code is broken when in reality a cable was just loose. I also experienced first-hand the form-factor challenges of AR devices, and that designing UIs for AR is very different from designing desktop UIs; in AR, minimalism wins.
What's next for R-Hat
I would like to add spatial aspects to the UI so that you can anchor panels in space. I also want to let you upload more context for the task you’ll be working on, like part drawings and build instructions, and to develop a feature where the AI can surface relevant videos and images in the UI. But yeah, I am going to keep building this project and try to make it into a real product that changes engineering workflows forever.
Built With
- gemini
- linux
- react
- tailwind
- typescript
- xreal