Inspiration

We built Eventide AI because adding real-life events to your calendar shouldn't require tedious manual input or scattered screenshots. From bulletin board flyers to Instagram posts and video invites, event information is everywhere—but extracting it, remembering details, and actually showing up is harder than it should be.

Our inspiration comes from wanting to bridge the gap between the moments when you discover something exciting out in the world and when you’re actually reminded to go. By combining the latest in multimodal AI with seamless mobile workflows, Eventide AI aims to make calendar capture effortless, accurate, and context-rich—so you never miss out because of lost details, copy-paste fatigue, or confusing event logistics.


What It Does

Eventide AI enables users to add events to their calendar from four distinct inputs:

  • Image (Flyer): Snap a photo of a flyer, poster, or screenshot. Eventide uses Gemini Vision to extract the key details: title, date, time, location, and description.
  • URL (Social/Event): Paste links from social platforms or event pages. The system expands them via oEmbed and Open Graph metadata to extract event details or media.
  • Video URL: Paste or share a video link; key frames and audio are extracted, transcribed, and processed (partial pipeline).
  • Pasted Text: Paste in plain event info for quick processing.
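Routing the four input kinds might look like the sketch below. The names (`InputKind`, `classifyInput`, the video host list) are illustrative assumptions, not Eventide's actual API:

```typescript
// Hypothetical sketch: deciding which extractor handles a given input.
type InputKind = "image" | "videoUrl" | "url" | "text";

const VIDEO_HOSTS = ["youtube.com", "youtu.be", "tiktok.com", "vimeo.com"];

function classifyInput(input: { imageUri?: string; text?: string }): InputKind {
  if (input.imageUri) return "image"; // camera capture or screenshot
  const text = (input.text ?? "").trim();
  try {
    const url = new URL(text);
    const host = url.hostname.replace(/^www\./, "");
    return VIDEO_HOSTS.some((h) => host === h || host.endsWith("." + h))
      ? "videoUrl"
      : "url";
  } catch {
    return "text"; // not a URL: treat as pasted event text
  }
}
```

Each branch then hands off to the matching extraction service (Gemini Vision, URL expander, video pipeline, or plain-text parsing).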

All data is enriched using the Google Maps Places and Time Zone APIs, which pin down the event's location and timezone precisely. Conflicts are checked against a shared calendar, and users review each event before the final save, reducing friction and making community engagement as simple as snapping a photo or sharing a link.
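The timezone step matters because a flyer's "7:00 PM" is a local wall-clock time. A minimal sketch of the conversion, using the `rawOffset` and `dstOffset` fields (in seconds) that the Google Time Zone API returns; the `localToUtc` helper itself is our illustration:

```typescript
// The field names mirror the Google Time Zone API response; the helper is ours.
interface TimeZoneResult {
  rawOffset: number;   // standard UTC offset, seconds
  dstOffset: number;   // daylight-saving adjustment, seconds
  timeZoneId: string;  // e.g. "America/New_York"
}

// Interpret `localWallClock` (e.g. "2025-06-14T19:00:00") as local time at the
// venue and return the corresponding UTC instant for calendar insertion.
function localToUtc(localWallClock: string, tz: TimeZoneResult): Date {
  const pretendUtcMs = Date.parse(localWallClock + "Z"); // read wall clock as if UTC
  const offsetMs = (tz.rawOffset + tz.dstOffset) * 1000;
  return new Date(pretendUtcMs - offsetMs);
}
```

For example, 7:00 PM in New York during daylight-saving time (offset -4h) becomes 23:00 UTC.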


How We Built It

Backend: Built in Node.js, TypeScript, and Express, with service modules for extraction (Gemini LLM, ffmpeg, url-expander), calendar management, and place/timezone enrichment. Integrations include Gemini 2.0 API (multimodal), Google Calendar, Google Maps Places/Time Zone, and YouTube downloads via yt-dlp and ffmpeg. Authentication uses a single service account for MVP simplicity.
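The extract-enrich-save flow described above can be sketched as composed service modules. The interfaces and `runPipeline` below are our guess at the shape, not Eventide's actual code:

```typescript
// Illustrative shapes for the three backend service modules.
interface ExtractedEvent { title: string; start: string; location?: string; }
interface EnrichedEvent extends ExtractedEvent { placeId?: string; timeZone?: string; }

interface Services {
  extract(raw: string): Promise<ExtractedEvent>;        // Gemini-backed in production
  enrich(e: ExtractedEvent): Promise<EnrichedEvent>;    // Places + Time Zone APIs
  save(e: EnrichedEvent): Promise<{ eventId: string }>; // Google Calendar insert
}

// Each stage is injectable, so extraction, enrichment, and calendar logic
// can be iterated and tested independently.
async function runPipeline(raw: string, s: Services): Promise<{ eventId: string }> {
  const extracted = await s.extract(raw);
  const enriched = await s.enrich(extracted);
  return s.save(enriched);
}
```

Because each stage is just an async function behind an interface, any one of them can be stubbed in tests or swapped out without touching the others.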

Mobile: Developed in Expo and React Native (TypeScript). Flows include: capture flyer, paste URL/text, review event, trigger calendar save. All user actions communicate cleanly with the backend for extraction and save.
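The capture → review → save flow maps naturally onto a small state machine, e.g. a reducer fed to React's `useReducer`. The state and action names below are illustrative, not Eventide's actual screens:

```typescript
// Hypothetical sketch of the mobile flow as a reducer.
type Screen = "capture" | "reviewing" | "saving" | "done";
type Action =
  | { type: "EXTRACTED" }  // backend returned parsed event details
  | { type: "CONFIRM" }    // user approved the review screen
  | { type: "SAVED" }      // calendar save succeeded
  | { type: "EDIT" };      // user went back to recapture

function flowReducer(state: Screen, action: Action): Screen {
  switch (action.type) {
    case "EXTRACTED": return state === "capture" ? "reviewing" : state;
    case "CONFIRM":   return state === "reviewing" ? "saving" : state;
    case "SAVED":     return state === "saving" ? "done" : state;
    case "EDIT":      return state === "reviewing" ? "capture" : state;
    default:          return state;
  }
}
```

Invalid transitions fall through to the current state, which keeps out-of-order backend responses from corrupting the UI.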

This modular architecture allowed us to iterate extraction, enrichment, and review logic quickly and independently.


Challenges We Ran Into

  • Complex video/URL extraction: Platform APIs were inconsistent: some required authentication or aggressively throttled requests, while scraping approaches proved brittle and carried terms-of-service risk we had to manage.
  • Handling long-form videos: Processing lengthy videos highlighted issues with timeouts, resource bottlenecks, and temp file management. Extracting key frames and transcribing audio reliably, without blocking the app, was especially challenging.
  • Reliable downloads: yt-dlp and ffmpeg needed to work cross-platform, and we had to tackle large and long videos while ensuring cleanup of temporary files.
  • Multimodal prompts: Building and tuning prompts for Gemini (Google Gemini Vision API) to consistently output structured JSON took multiple iterations.
  • Timezones/locations: Mapping vague event names to precise place IDs and timezones meant we needed robust logic and smart fallback routines.
  • Error handling: Long-running extractions, especially with videos and frames, often led to blocking errors; global error middleware and better timeout strategies were required.
  • API rate limits and retries: Frequent calls to Gemini and Google APIs required careful retry logic and circuit-breaker patterns to avoid failures.
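The retry and timeout handling mentioned in the last two points can be sketched as a small wrapper: exponential backoff between attempts, and a per-attempt timeout via `Promise.race`. The defaults and names here are illustrative, not the values we shipped:

```typescript
// Hedged sketch: retry with exponential backoff and a per-attempt timeout.
async function withRetry<T>(
  fn: () => Promise<T>,
  { attempts = 3, baseDelayMs = 500, timeoutMs = 10_000 } = {}
): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await Promise.race([
        fn(),
        new Promise<never>((_, reject) =>
          setTimeout(() => reject(new Error("timeout")), timeoutMs)
        ),
      ]);
    } catch (err) {
      lastErr = err;
      if (i < attempts - 1) {
        // back off 500ms, 1s, 2s, ... before the next attempt
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastErr;
}
```

Wrapping each Gemini and Google API call this way keeps transient failures and slow responses from cascading into user-visible errors; a full circuit-breaker would additionally stop calling a dependency that keeps failing.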

Accomplishments That We're Proud Of

  • End-to-end pipeline: instant image/text extraction, Google enrichment, seamless calendar creation.
  • Service account Calendar integration: avoids OAuth, keeps MVP workflow shareable and simple.
  • Modular backend: focused, testable, replaceable extraction/enrichment/calendar services.
  • Mobile UX: fast capture to review to save, with immediate feedback.
  • Documentation: onboarding and readiness checklists for rapid iteration.

What We Learned

  • Parallel processing is essential: We dramatically improved video extraction speed and reliability by parallelizing frame analysis and API calls. This reduced bottlenecks and made it possible to handle longer-form video events.
  • Multimodal AI unlocks robust extraction: Modern AI like Gemini Vision made pulling structured data from images, videos, and social posts radically easier than traditional OCR or rule-based parsing.
  • Service accounts speed up calendar workflows: Using Google service accounts enabled rapid MVP scheduling without complex OAuth flows.
  • Resilient error handling and timeouts: We learned to anticipate failures in heavy media and video tasks, implementing timeouts, retries, and graceful fallback so users always get a usable result.
  • Small modular services enable fast iteration: Single-purpose service modules allowed us to isolate and debug extraction, enrichment, and calendar logic quickly.
  • Early observability pays off: Adding structured logs and request tracing helped us diagnose bottlenecks and unexpected errors early in development.
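The parallelization lesson in the first bullet amounts to analyzing extracted frames with bounded concurrency, so long videos don't exhaust memory or API quotas. A minimal sketch (the helper name and limit are illustrative):

```typescript
// Run `worker` over `items` with at most `limit` in flight at once,
// preserving input order in the results.
async function mapConcurrent<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function run(): Promise<void> {
    // `next++` is safe here: JavaScript's event loop is single-threaded,
    // so index claiming is never interleaved mid-statement.
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, run));
  return results;
}
```

With, say, `limit = 4`, frame analysis calls overlap instead of running serially, while the cap keeps a long video from opening hundreds of simultaneous API requests.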

What's Next for Eventide AI

  • Complete video and URL extraction: add YouTube captions, platform-specific extractors, better frame/audio pipeline.
  • Improve resiliency: retries, timeouts, circuit-breakers, global error middleware with structured fallbacks.
  • UX/developer polish: clearer mobile error messages, a configurable API base URL, enhanced telemetry, updated architecture diagrams, and OpenAPI/Postman tooling.
  • Add tests/CI: robust integration testing, linting, type-checking workflows.
  • Product: smarter calendar deduplication, user-specific calendar via OAuth, analytics on failure/success rates.

Built With

Node.js, TypeScript, Express, Expo, React Native, Gemini 2.0 (multimodal), Google Calendar API, Google Maps Places and Time Zone APIs, yt-dlp, ffmpeg
