An intelligent assistive memory system that combines facial recognition, audio/video analysis, and AI-powered summarization to help users remember and track their daily interactions.
- Upload and store facial profiles with personal information
- Real-time face detection using OpenCV
- MongoDB-based profile storage with facial embeddings
- Raspberry Pi camera integration for live recognition
- Audio transcription using OpenAI Whisper
- AI-powered conversation summarization with Grok
- Concise 3-sentence summaries of interactions
- Frame-by-frame video processing
- AI-powered visual event detection and summarization
- Temporal flow analysis of activities
- ElevenLabs voice synthesis integration
- Matilda voice as default
- PCM audio format support for real-time playback
- Streaming and batch audio generation
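As a rough illustration of the embedding-based recognition described above, stored profiles could be matched against a query embedding by cosine similarity. This is a sketch, not the project's actual code: the helper name and the 0.6 threshold are illustrative.

```python
import numpy as np

def match_profile(query, stored_profiles, threshold=0.6):
    """Return the best-matching profile name by cosine similarity, or None.

    stored_profiles: dict mapping name -> 128-dim embedding (list or array).
    The 0.6 threshold is illustrative; tune it for the embedding model used.
    """
    q = np.asarray(query, dtype=float)
    q = q / np.linalg.norm(q)
    best_name, best_score = None, threshold
    for name, emb in stored_profiles.items():
        e = np.asarray(emb, dtype=float)
        score = float(q @ (e / np.linalg.norm(e)))  # cosine similarity
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

Returning `None` below the threshold lets the caller treat an unfamiliar face as "unknown" rather than forcing a nearest match.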
- Python 3.14
- Flask - REST API framework
- MongoDB - Profile and timeline storage
- OpenCV - Face detection and image processing
- NumPy - Array operations for embeddings
- OpenAI Whisper - Audio transcription
- Grok (xAI) - Video/audio summarization
- ElevenLabs - Text-to-speech synthesis
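For reproducible installs, the stack above could be captured in a `requirements.txt`. The package names below are inferred from the project's `pip install` command; version pins are left to the reader:

```
flask
pymongo
opencv-python
numpy
pillow
werkzeug
openai
elevenlabs
```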
```
tbd/
├── backend/
│   ├── app.py                  # Flask API for facial profiles
│   ├── api.py                  # AI service integrations (Grok, Whisper)
│   ├── elevenlabs_client.py    # Text-to-speech functionality
│   ├── memory_schema.py        # MongoDB database operations
│   ├── data_processing.py      # Data processing utilities
│   └── uploads/                # Uploaded photos storage
├── frontend/
│   └── main.py                 # Frontend application
└── .env                        # API keys (gitignored)
```
- Python 3.14+
- MongoDB installed and running
- API keys for:
  - OpenAI (Whisper)
  - Grok (xAI)
  - ElevenLabs
- Clone the repository

  ```bash
  git clone https://github.com/Christinetrr/tbd.git
  cd tbd
  ```

- Install dependencies

  ```bash
  pip install flask pymongo opencv-python numpy pillow werkzeug openai elevenlabs
  ```

- Configure environment variables

  Create a `.env` file in the project root:

  ```
  GROK_API_KEY=your_grok_api_key
  OPENAI_API_KEY=your_openai_api_key
  ELEVENLABS_API_KEY=your_elevenlabs_api_key
  ```

- Start MongoDB

  ```bash
  mongod --dbpath /path/to/your/data/directory
  ```

- Run the Flask server

  ```bash
  cd backend
  python app.py
  ```

  The server will start on http://127.0.0.1:5001
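Once the server is up, a quick smoke test against the `/api/health` endpoint can be sketched as follows. `server_healthy` is an illustrative helper, not part of the codebase:

```python
import requests

def server_healthy(base_url="http://127.0.0.1:5001"):
    """Return True if GET /api/health responds OK, False if unreachable."""
    try:
        return requests.get(base_url + "/api/health", timeout=2).ok
    except requests.RequestException:
        return False
```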
`POST /api/profiles/setup`

```
Content-Type: multipart/form-data

{
  "name": "John Doe",
  "relation": "Friend",
  "photo": <file>
}
```

`POST /api/profiles/recognize`

```
Content-Type: application/json

{
  "embedding": [128-dimensional array]
}
```

`GET /api/profiles`

`GET /api/profiles/<name>`

`DELETE /api/profiles/<name>`

`GET /api/health`

Register a profile:

```python
import requests

url = "http://127.0.0.1:5001/api/profiles/setup"
files = {"photo": open("person.jpg", "rb")}
data = {"name": "Jane Smith", "relation": "Friend"}
response = requests.post(url, files=files, data=data)
print(response.json())
```

Summarize an audio conversation:

```python
from api import summarize_audio

with open("conversation.mp3", "rb") as audio_file:
    summary = summarize_audio(audio_file)
print(summary)
```

Synthesize speech:

```python
from elevenlabs_client import text_to_speech

audio_data = text_to_speech(
    "Hello, how are you doing today?",
    output_path="greeting.pcm"
)
```

Summarize video frames:

```python
import cv2

from api import summarize_frames

# Read frames from video
frames = []
cap = cv2.VideoCapture("video.mp4")
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    frames.append(frame)
cap.release()

# Get summary
summary = summarize_frames(frames)
print(summary)
```

Profile document:

```
{
  "name": "John Doe",
  "relation": "Friend",
  "conversations": [
    {
      "timestamp": ISODate("2025-11-09T12:00:00Z"),
      "summary": "Discussed weekend plans"
    }
  ],
  "embedding": [128-dimensional array],
  "metadata": {
    "created_at": ISODate("2025-11-09T10:00:00Z"),
    "last_seen": ISODate("2025-11-09T12:00:00Z")
  }
}
```

Timeline document:

```
{
  "date": "2025-11-09",
  "events": [
    {
      "time": "14:30",
      "type": "interaction",
      "description": "Met with John Doe",
      "duration": 30
    }
  ]
}
```

The system supports Raspberry Pi camera integration for real-time facial recognition:
- Capture face from Pi camera
- Extract facial embedding on Pi
- Send embedding to the `/api/profiles/recognize` endpoint
- Receive matched profile information
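The steps above can be sketched as a small Pi-side helper. Only the request shape matches the `/api/profiles/recognize` endpoint; the camera and embedding calls in the comments are hypothetical placeholders:

```python
API_URL = "http://127.0.0.1:5001/api/profiles/recognize"

def build_recognize_payload(embedding):
    """Package a 128-dim embedding as the JSON body the recognize endpoint expects."""
    if len(embedding) != 128:
        raise ValueError("expected a 128-dimensional embedding")
    return {"embedding": [float(x) for x in embedding]}

# On the Pi, the capture loop would look roughly like:
#   frame = capture_frame()                # hypothetical camera helper
#   embedding = extract_embedding(frame)   # hypothetical embedding helper
#   requests.post(API_URL, json=build_recognize_payload(embedding))
```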
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Andrea Sato - Facial recognition API & MongoDB integration
- Christinetrr - API integration & video batching
- shrenik - Audio capture & Raspberry Pi integration
This project is part of an academic assignment.
- OpenAI for Whisper transcription API
- xAI for Grok video/audio analysis
- ElevenLabs for natural voice synthesis
- OpenCV for computer vision capabilities