Inspiration

Edith is inspired by the kind of assistant that feels invisible until you need it, then instantly capable. Like E.D.I.T.H. from Marvel, it sees what you see, understands what you mean, and takes action without pulling you out of the moment.

At conferences, great opportunities pass by because pulling out your phone breaks eye contact, kills momentum, and adds friction. We wanted to build a wearable assistant that keeps you present while still giving you live context, memory, and action.

What made this exciting for us was not just building a glasses demo, but building it around real agent infrastructure: ASI, OmegaClaw, and Agentverse. Instead of a single assistant hardcoded to do a few tricks, we wanted Edith to behave like an intelligent runtime that can reason, delegate, and grow by adding new skills.

What It Does

Edith is a real-time multimodal AI assistant that runs on Meta Ray-Ban glasses.

You speak naturally, the glasses see what you see, and Edith decides what to do by routing your request through OmegaClaw’s reasoning system and into the right ASI Agentverse skill.

That means Edith can handle both understanding and action:

- “Who is this?” looks at a name badge and responds with a concise professional summary in under 7 seconds.
- “Send a follow-up to the person I just met” triggers a Gmail-connected agent skill.
- “Book a reservation here” routes into browser automation through Playwright MCP, with a spoken confirmation preview before anything executes.

The important idea is that Edith is not just a voice interface. It is an agentic system built on ASI infrastructure, where OmegaClaw interprets intent and dispatches the task to specialized agents.

No phone. No typing. No broken eye contact.

How We Built It

We built Edith as a live multimodal pipeline across wearable hardware, real-time AI, and agent orchestration.

A Swift iOS app using the Meta DAT SDK streams audio and downsampled image frames from the glasses into Gemini Live through a FastAPI WebSocket bridge. Gemini Live handles real-time voice, vision, turn-taking, and speech output.
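As a rough sketch, the bridge endpoint looks something like the following. The message schema and the `forward_to_gemini` helper are illustrative placeholders, not our exact implementation:

```python
# bridge.py - FastAPI WebSocket endpoint that relays glasses audio/frames upstream.
# The {"type": "audio"|"frame", "data": <base64>} schema and forward_to_gemini()
# are illustrative assumptions; the real bridge wires these into a Gemini Live session.
import base64

from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()


async def forward_to_gemini(kind: str, payload: bytes) -> None:
    """Hypothetical hook that pushes audio chunks / JPEG frames into Gemini Live."""
    ...


@app.websocket("/glasses")
async def glasses_stream(ws: WebSocket) -> None:
    await ws.accept()
    try:
        while True:
            msg = await ws.receive_json()          # one audio chunk or one image frame
            payload = base64.b64decode(msg["data"])
            await forward_to_gemini(msg["type"], payload)
    except WebSocketDisconnect:
        pass
```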

From there, we integrated Gemini Live with OmegaClaw, which became the core reasoning layer of the system. Instead of treating OmegaClaw like a simple wrapper or REST endpoint, we built a custom channel adapter that plugs directly into the OmegaClaw-Core extension pattern. That was important because we wanted Gemini tool calls to flow into the actual MeTTa reasoning loop, where OmegaClaw can classify conversational intent, reason over available capabilities, and decide which agent skill to invoke.
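OmegaClaw's extension hooks are specific to OmegaClaw-Core, so the sketch below is purely illustrative: every class and method name is a hypothetical stand-in for the adapter pattern we followed, not the real API.

```python
# channel_adapter.py - illustrative only: all names are hypothetical stand-ins
# for the OmegaClaw-Core extension pattern, not its actual interfaces.
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class IncomingUtterance:
    text: str                                  # transcribed user request from Gemini Live
    tool_call: Dict[str, Any] | None = None    # raw Gemini tool-call payload, if any


class GeminiLiveChannelAdapter:
    """Bridges Gemini Live tool calls into the (hypothetical) OmegaClaw reasoning loop."""

    def __init__(self, dispatch_to_reasoner: Callable[[IncomingUtterance], str]):
        # dispatch_to_reasoner would hand the utterance to the MeTTa reasoning loop
        # and return the chosen skill invocation as a string.
        self._dispatch = dispatch_to_reasoner

    async def on_tool_call(self, name: str, args: Dict[str, Any]) -> str:
        utterance = IncomingUtterance(
            text=args.get("query", ""),
            tool_call={"name": name, **args},
        )
        return self._dispatch(utterance)
```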

On the ASI side, we used Agentverse as the execution layer for specialized actions. Edith delegates from OmegaClaw into our registered uAgents, including Gmail-connected workflows and locally hosted Playwright MCP automation. FastAPI coordinates the full loop between the glasses, Gemini Live, OmegaClaw, and Agentverse.
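To give a sense of scale, a registered skill agent is small. Here is a skeleton of the Gmail follow-up agent using the uAgents library; the message models and the `send_gmail` stub are illustrative, not our production code:

```python
# followup_agent.py - uAgents skeleton for the Gmail follow-up skill.
# FollowUpRequest/FollowUpResult and send_gmail() are illustrative stand-ins;
# only the uAgents registration/message pattern reflects the actual library.
from uagents import Agent, Context, Model


class FollowUpRequest(Model):
    to_name: str
    summary: str


class FollowUpResult(Model):
    ok: bool
    detail: str


agent = Agent(name="edith_followup", seed="edith followup seed phrase")


def send_gmail(to_name: str, summary: str) -> str:
    """Stub for the Gmail-connected workflow (OAuth + Gmail API in the real agent)."""
    return f"drafted follow-up to {to_name}"


@agent.on_message(model=FollowUpRequest)
async def handle_followup(ctx: Context, sender: str, msg: FollowUpRequest):
    detail = send_gmail(msg.to_name, msg.summary)
    await ctx.send(sender, FollowUpResult(ok=True, detail=detail))


if __name__ == "__main__":
    agent.run()
```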

One of the parts we’re proudest of is how extensible the architecture became. Adding a new capability is lightweight and composable:

- a MeTTa skill declaration in OmegaClaw
- a Python bridge function (sketched below)
- an Agentverse agent

That means Edith is not a closed assistant. It is an expandable ASI-powered agent system where new behaviors can be added cleanly.
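The bridge-function piece is essentially a thin async dispatcher. In the sketch below, the skill names, agent addresses, and the `send_to_agent` transport are all placeholders rather than our real registrations:

```python
# skill_bridge.py - illustrative bridge function: OmegaClaw resolves a skill name,
# this maps it onto a registered Agentverse agent. Names/addresses are placeholders.
SKILL_TO_AGENT = {
    "send_followup": "agent1q-followup-placeholder",
    "book_reservation": "agent1q-browser-placeholder",
}


async def dispatch_skill(skill: str, args: dict, send_to_agent) -> str:
    """Forward a resolved skill invocation to its Agentverse agent and return the reply.

    send_to_agent(address, payload) stands in for whatever transport the runtime
    exposes (e.g. a uAgents query helper); it is an assumption here.
    """
    address = SKILL_TO_AGENT.get(skill)
    if address is None:
        return f"No agent registered for skill '{skill}'"
    reply = await send_to_agent(address, args)
    return str(reply)
```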

Our Pipeline

Edith works as a live end-to-end pipeline across wearable hardware, multimodal AI, reasoning, and agent execution.

The interaction starts on the Meta Ray-Ban glasses, where the user speaks naturally and the device captures both audio and visual context. A Swift iOS app built on the Meta DAT SDK streams microphone audio and downsampled image frames through a FastAPI WebSocket bridge into Gemini Live, which handles real-time multimodal understanding, turn-taking, and speech generation.

From there, requests are routed into OmegaClaw through a custom channel adapter we built against the OmegaClaw-Core extension pattern. This was a key architectural decision: instead of treating OmegaClaw like a wrapper around tools, we connected it directly to the MeTTa reasoning loop so it could classify intent, reason over available skills, and decide which action path to take.

Once the request is understood, OmegaClaw delegates execution to our registered ASI Agentverse agents. Those agents handle specialized tasks such as sending follow-up emails through Gmail or performing browser actions through our locally hosted Playwright MCP setup. The result is then passed back through the same pipeline and spoken to the user through the glasses.
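The browser side talks to the locally hosted Playwright MCP server over a standard MCP client session. A rough sketch follows; tool names and arguments depend on the @playwright/mcp version, so treat them as assumptions:

```python
# browser_action.py - rough sketch of invoking the locally hosted Playwright MCP server
# via the `mcp` Python SDK. The "browser_navigate" tool name and its arguments vary
# by @playwright/mcp version, so treat them as assumptions.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server = StdioServerParameters(command="npx", args=["@playwright/mcp@latest"])


async def open_reservation_page(url: str) -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()            # discover what the server exposes
            print([t.name for t in tools.tools])
            await session.call_tool("browser_navigate", {"url": url})


if __name__ == "__main__":
    asyncio.run(open_reservation_page("https://example.com/reserve"))
```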

That full loop is what makes Edith feel seamless: see, speak, reason, act, respond.

How ASI and OmegaClaw Shaped the Project

ASI and OmegaClaw were not just supporting tools for us; they shaped the way we built Edith.

OmegaClaw gave us the reasoning backbone. It let us move beyond a basic “if user says X, call tool Y” pattern and instead build a system that can interpret natural voice input, route tasks intelligently, and scale as more skills are added.

ASI and Agentverse gave us the agent ecosystem. Rather than baking every action directly into the app, we built Edith so that capabilities live as specialized agents that can be registered, invoked, and extended over time. That made the project feel much closer to a real agent platform than a one-off demo.

In practice, that meant we were building agents throughout the project, not just stitching APIs together. We defined skills in OmegaClaw, built bridge functions to connect them into the runtime, and exposed actual executable capabilities through Agentverse.

Challenges We Ran Into

The biggest challenge was latency.

Our first pipeline pushed raw visual frames from the glasses into Gemini Live and took around 600ms end-to-end. Through four iterations of optimization, we reduced that to 50ms:

- v1: 600ms with unthrottled 24fps frame injection
- v2: 320ms by throttling to 1fps JPEG frames (sketched below)
- v3: 140ms by replacing naive HTTP polling with a direct OmegaClaw channel adapter
- v4: 50ms through full pipeline tuning

The second major challenge was integration complexity. OmegaClaw is not a standard REST API; it is a neural-symbolic agent built in MeTTa. Wiring Gemini’s live tool-calling behavior into that reasoning system meant understanding OmegaClaw’s internal skill dispatch model and building a real adapter into the reasoning loop, not just calling an endpoint.
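To make the v2 step concrete, the throttle itself is just a small gate on the bridge. In this sketch, `forward_frame()` is a placeholder for whatever pushes a frame into Gemini Live:

```python
# frame_throttle.py - illustrative 1 fps gate for incoming JPEG frames (the v2 change).
# forward_frame() stands in for whatever pushes a frame into Gemini Live.
import time

MIN_FRAME_INTERVAL = 1.0   # seconds -> at most 1 fps reaches the model
_last_sent = 0.0


async def maybe_forward_frame(jpeg_bytes: bytes, forward_frame) -> bool:
    """Drop frames that arrive faster than the throttle interval."""
    global _last_sent
    now = time.monotonic()
    if now - _last_sent < MIN_FRAME_INTERVAL:
        return False               # frame dropped
    _last_sent = now
    await forward_frame(jpeg_bytes)
    return True
```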

We also ran into low-level AVFoundation audio session issues on the glasses hardware, where incorrect teardown sequencing caused silent capture failures.

What We’re Proud Of

Most projects might integrate one or two advanced systems. Edith makes four work together in one live, usable pipeline:

- Meta Ray-Ban glasses
- Gemini Live
- OmegaClaw
- ASI Agentverse

What we’re especially proud of is that this was not just a demo of multimodal input. It was a full agentic system: wearable input, real-time reasoning, agent delegation, and tool execution working together end to end.

What’s Next

We want to expand Edith with more real-world ASI agent skills, including workflows like purchase assistance, meeting scheduling, and booth summarization.

We also want to add persistent event memory so Edith can remember who you met, what was discussed, and what follow-up actions matter later.

Longer term, we plan to open-source the OmegaClaw channel adapter template so other developers can build wearable agent experiences on top of ASI and OmegaClaw more easily.

Other Info

ASI:One chat: https://asi1.ai/chat/cc6aa473-6ec9-4d48-a768-3cc850fbd43d

Agentverse profiles:
