Inspiration
Most AI assistants today are fundamentally reactive. They wait for a prompt, respond with text, and stop there. Over time, we realized that this interaction model creates friction: users must remember what to ask, when to ask it, and how to phrase it, every single time.
In our own workflows, important actions often happened after context already existed: a calendar invite arrives, a ticket is booked, a document is shared, or a pull request is merged. These moments don't need another chatbot response; they need action.
That insight inspired Echo: a proactive AI assistant that notices events, learns over time, and executes meaningful workflows before you ask, always with your permission.
What We Built
Echo is an ambient, consumer-first AI assistant designed around two core concepts: Actions and Thoughts.
- Actions are multi-step workflows executed across connected tools like email, calendar, drive, and task managers. Echo proposes actions proactively, and users stay in control by approving, rejecting, or modifying them.
- Thoughts are persistent memory units that users save over time: links and notes for now, with screenshots, voice notes, and documents planned later. Echo uses them to understand the user's preferences and suggest relevant actions.
Example flow:
- Users sign up, complete onboarding, and connect everyday apps like Mail, Calendar, Drive, and To-Do, with full control to customize, modify, or revoke fine-grained access anytime. They can also choose whether to auto-execute or ask before execution.
- When activity occurs in a connected app (for example, a new calendar event), Echo picks up this trigger, gathers relevant context, and executes cross-app, multi-step actions automatically, such as checking availability, RSVPing, and preparing a meeting deck using past documents.
- Thoughts continuously augment Echo’s understanding, guiding and refining actions over time.
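The flow above hinges on the per-user choice between auto-execution and ask-before-execution. A minimal sketch of that gate, with illustrative names rather than Echo's actual implementation:

```typescript
// Hypothetical sketch of Echo's trigger handling. Depending on the user's
// setting, a proposed action either runs immediately or lands in the
// approval inbox. All names here are illustrative, not the real schema.

type ExecutionMode = "auto" | "ask_first";

interface Trigger {
  app: string;    // e.g. "calendar"
  event: string;  // e.g. "event.created"
}

interface ProposedAction {
  description: string;
  status: "executed" | "pending_approval";
}

function handleTrigger(trigger: Trigger, mode: ExecutionMode): ProposedAction {
  // In Echo, Gemini decides the concrete multi-step workflow here;
  // this sketch hard-codes a single example description.
  const description = `Handle ${trigger.app} ${trigger.event}: check availability and RSVP`;
  if (mode === "auto") {
    // A real implementation would execute the workflow via integrations here.
    return { description, status: "executed" };
  }
  // Otherwise the action waits in the user's approval inbox.
  return { description, status: "pending_approval" };
}

const action = handleTrigger({ app: "calendar", event: "event.created" }, "ask_first");
console.log(action.status); // "pending_approval"
```

The key design point is that execution mode is checked before any side effect, so "ask first" users can never be surprised by a silent action.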
How Gemini Is Used
- We use the Gemini-3 API via the Vercel AI SDK to decide what has happened and what should be done next. This forms the core of the application's reasoning: Gemini reads the incoming event, processes it, and suggests and performs tool calls via MCP.
- We also used Gemini in building the application itself via Cursor and GitHub Copilot.
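The reasoning step above is a tool-calling loop: the model reads an event, proposes tool calls, and the host executes them. In Echo this runs through the Vercel AI SDK and MCP; the self-contained sketch below stands in for the model with a stub, and the tool names are illustrative assumptions:

```typescript
// Simplified, self-contained sketch of a tool-calling loop. The real app
// delegates the decision to Gemini via the Vercel AI SDK; here a mock
// model proposes calls so the loop itself is runnable. Tool names are
// hypothetical, not Echo's actual MCP tool set.

interface ToolCall {
  tool: string;
  args: Record<string, string>;
}

// Tools the model is allowed to invoke.
const tools: Record<string, (args: Record<string, string>) => string> = {
  check_availability: (a) => `free at ${a.time}`,
  rsvp: (a) => `RSVP'd ${a.response}`,
};

// Stand-in for a Gemini call: reads the event and proposes tool calls.
function mockModel(event: string): ToolCall[] {
  if (event.includes("calendar")) {
    return [
      { tool: "check_availability", args: { time: "3pm" } },
      { tool: "rsvp", args: { response: "yes" } },
    ];
  }
  return []; // nothing actionable in this event
}

function processEvent(event: string): string[] {
  // The host executes each proposed tool call and collects the results.
  return mockModel(event).map(({ tool, args }) => tools[tool](args));
}

console.log(processEvent("calendar: team sync at 3pm"));
// → ["free at 3pm", "RSVP'd yes"]
```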
How We Built It
Frontend: React Native (Expo) for a fast, cross-platform mobile experience
Backend: Node.js with Express, handling triggers, orchestration, and permissions
Data: PostgreSQL via Supabase for structured data and user state
AI & Reasoning: We use the Gemini-3 model via the Vercel AI SDK for all reasoning and for deciding the actions Echo takes. This forms the heart of the system: it determines how and which actions are taken.
Memory: Long-term context is handled via Supermemory, which stores and retrieves user Thoughts (links, notes) using semantic embeddings. This memory layer allows Echo to reason with historical context rather than relying only on the current conversation. It also stores the user's action preferences over time so Echo can suggest better actions in the future.
Integrations & Actions: Composio is used to connect third-party tools, manage authentication and permissions, listen to real-world triggers, and execute multi-step workflows across apps like email, calendar, drive, and task managers.
Notifications: Firebase Cloud Messaging (Android) and Apple Push Notifications (iOS) are used to surface time-sensitive actions and confirmations without overwhelming the user.
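The memory layer described above boils down to embedding each Thought as a vector and retrieving the most similar ones for the current event. Supermemory handles this in Echo; the tiny hand-written embeddings and cosine ranking below are illustrative only:

```typescript
// Illustrative sketch of semantic retrieval over Thoughts: rank stored
// vectors by cosine similarity to a query vector and keep the top k.
// The 3-dimensional embeddings are toy values, not real model output.

interface Thought {
  text: string;
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

function topK(query: number[], thoughts: Thought[], k: number): Thought[] {
  return [...thoughts]
    .sort((t1, t2) => cosine(query, t2.embedding) - cosine(query, t1.embedding))
    .slice(0, k);
}

const thoughts: Thought[] = [
  { text: "Prefers morning meetings", embedding: [0.9, 0.1, 0.0] },
  { text: "Slide template for decks", embedding: [0.1, 0.9, 0.2] },
  { text: "Favourite lunch spot", embedding: [0.0, 0.2, 0.9] },
];

// A query vector standing in for an embedded "incoming meeting invite".
const best = topK([0.8, 0.3, 0.1], thoughts, 1);
console.log(best[0].text); // "Prefers morning meetings"
```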
At runtime, Echo listens for triggers from connected apps or user activity, retrieves relevant long-term memory, reasons about the next best action, and executes workflows through integrations. Users stay in control through an inbox-style approval system and a transparent action log.
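The inbox-style approval system and transparent action log mentioned above can be sketched as a small append-only log where an action must be explicitly approved before it may execute. The class and field names are assumptions for illustration, not Echo's actual data model:

```typescript
// Hypothetical sketch of a transparent action log with an approval gate.
// Every proposed action is recorded; execution is refused until approved.

type Status = "proposed" | "approved" | "rejected" | "executed";

interface LogEntry {
  id: number;
  summary: string;
  status: Status;
}

class ActionLog {
  private entries: LogEntry[] = [];
  private nextId = 1;

  propose(summary: string): number {
    const id = this.nextId++;
    this.entries.push({ id, summary, status: "proposed" });
    return id;
  }

  approve(id: number): void { this.setStatus(id, "approved"); }
  reject(id: number): void { this.setStatus(id, "rejected"); }

  // Returns false unless the action was approved first: no silent execution.
  execute(id: number): boolean {
    const e = this.entries.find((x) => x.id === id);
    if (!e || e.status !== "approved") return false;
    e.status = "executed";
    return true;
  }

  // Full history stays visible to the user.
  history(): LogEntry[] { return [...this.entries]; }

  private setStatus(id: number, status: Status): void {
    const e = this.entries.find((x) => x.id === id);
    if (e && e.status === "proposed") e.status = status;
  }
}

const log = new ActionLog();
const id = log.propose("RSVP yes to team sync");
console.log(log.execute(id)); // false: not approved yet
log.approve(id);
console.log(log.execute(id)); // true: runs now
```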
Challenges We Faced
Building a proactive assistant introduced challenges beyond typical chatbot design:
- Trust & Control: Acting without explicit commands required fine-grained permissions, clear explanations, and reversible actions.
- Context Retrieval: Long-term memory needed to be useful without becoming noisy, which required careful semantic indexing and retrieval strategies.
- Easy UX: Making onboarding and everyday use as simple as possible for any user, without requiring knowledge of the underlying integrations. Connecting an app should be as simple as an OAuth flow, with everything else handled by Echo.
Each challenge pushed us to think deeply about real-world usability rather than just model capability.
What We Learned
We learned that large language models are most powerful when they actually do things, not just talk about them. Proactivity, memory, and execution combined with user trust unlock an entirely new class of consumer AI experiences.
Echo represents our attempt to build the assistant we always wanted: one that understands context, respects control, and quietly makes life easier.
Built With
- composio
- express.js
- firebase-cloud-messaging
- google-gemini-models
- javascript
- node.js
- postgresql
- react-native-(expo)
- semantic-memory-retrieval
- supermemory
- typescript
- vector-embeddings