
EleSynth

Bioacoustic workbench: isolate elephant WAV recordings from mechanical noise in the browser (spectrograms, playback, export). Backend: FastAPI, STFT / HPSS-style DSP, SciPy stack. Optional: Google Gemini for short noise explanations. License: MIT.

Contents

Overview

Problem and product

Field recordings mix infrasonic elephant rumbles with engines, wind, and gear. EleSynth lets researchers clean one file or many segments, compare before/after visually and audibly, and download cleaned WAV output.

Tech stack

| Layer | Technologies |
| --- | --- |
| Frontend | Next.js 16, React 18, TypeScript, Tailwind v4, shadcn/ui, Motion, Wavesurfer.js |
| Backend | FastAPI, Uvicorn, SciPy, NumPy, Librosa, google-generativeai, Pydantic |
| Audio | WAV in/out · backend/app/dsp.py |

Screenshots

Landing (home): value proposition and entry to analysis.

EleSynth landing page with hero copy and Begin Analysis

Marketing carousel: before/after spectrogram comparison (“See it”).

EleSynth carousel comparing raw and cleaned spectrograms

Bioacoustics Lab (/workbench): intake, session queue, and empty analysis canvas before files are processed.

EleSynth workbench with intake drop zone and empty analysis state

Prerequisites

| Requirement | Notes |
| --- | --- |
| Python 3.11+ | venv recommended |
| Node.js 20+ and npm | |
| Samples | .wav files |

Quick start

Use two terminals from the repo root.

Backend

  1. Create the venv:

    cd backend
    python -m venv .venv
  2. Activate (current directory must be backend/):

    | OS | Command |
    | --- | --- |
    | Windows | .venv\Scripts\activate |
    | macOS / Linux | source .venv/bin/activate |
  3. Install and start:

    pip install -r requirements.txt
    uvicorn app.main:app --reload --host 127.0.0.1 --port 8000
  4. Check health: GET http://127.0.0.1:8000/api/health should return {"status":"ok"}.

Frontend

  1. Install and configure:

    cd frontend
    npm install
  2. Env file (frontend/.env.local):

    | OS | Command |
    | --- | --- |
    | macOS / Linux | cp .env.example .env.local |
    | Windows | copy .env.example .env.local |

    Required:

    NEXT_PUBLIC_API_BASE_URL=http://127.0.0.1:8000
  3. Dev server:

    npm run dev
  4. Open http://localhost:3000 → Begin Analysis or /workbench → upload WAV → process → compare → download.

Production (frontend only)

cd frontend
npm run build
npm run start

Set NEXT_PUBLIC_API_BASE_URL to your deployed API (no trailing slash).

Deploy backend (Render)

The FastAPI app in backend/ is a good fit for a Render Web Service. The repo includes a Blueprint at render.yaml (Python Starter in oregon, rootDir: backend).

  1. In the Render dashboard: New → Blueprint, connect this repository, apply render.yaml.
  2. When prompted, set FRONTEND_ORIGIN to your deployed frontend origin(s), comma-separated, no trailing slash (for example https://your-app.vercel.app). This extends the CORS allowlist in backend/app/main.py.
  3. Optionally set GEMINI_API_KEY or GOOGLE_API_KEY for Gemini; omit if you only need DSP.
  4. After deploy, confirm GET https://<your-service>.onrender.com/api/health.
  5. On Vercel (or any Next host), set NEXT_PUBLIC_API_BASE_URL to that same API origin (no trailing slash).

Without a Blueprint: create a Web Service, set Root Directory to backend, build pip install -r requirements.txt, start uvicorn app.main:app --host 0.0.0.0 --port $PORT, health check path /api/health. Python version follows backend/runtime.txt (3.11.x).

SciPy and Librosa are memory-heavy; if the build or runtime fails with OOM, upgrade the instance type in Render. Very large upload bodies may hit platform limits; see Render request limits.

Deploy frontend (Vercel)

The Next.js app lives under frontend/. Vercel should build that directory only (the backend stays on Render).

  1. Import this GitHub repository in the Vercel dashboard (Add New → Project).

  2. Root Directory: set to frontend (monorepo). Framework preset should detect Next.js.

  3. Node.js: use 20.x (matches local dev).

  4. Environment variables (Production, and Preview if you use them):

    | Name | Value |
    | --- | --- |
    | NEXT_PUBLIC_API_BASE_URL | Your Render API origin, e.g. https://elesynth-api.onrender.com (no trailing slash) |

    Optional: set NEXT_PUBLIC_DIRECT_API_BASE_URL to the same URL if you rely on the fallback path in frontend/lib/api.ts.

  5. Deploy. Your site will be at https://<project>.vercel.app (or your custom domain).

  6. CORS: On Render, set FRONTEND_ORIGIN to that exact origin (comma-separated if you have several, e.g. production + a preview URL). Redeploy the API if you change it. The app already allows localhost:3000 by default in backend/app/main.py.

Preview deployments: each preview has its own *.vercel.app URL. Add each origin you need to FRONTEND_ORIGIN on Render, or test previews only against an API that allows those origins.

Uploads: Audio is sent from the browser directly to the Render API, not through Vercel’s serverless layer, so the Next app itself does not need a larger body limit for POST /api/process-audio.

Configuration

Environment variables are loaded in backend/app/main.py from backend/.env and the repo root .env (precedence is defined in code).

| Variable | Where | Purpose |
| --- | --- | --- |
| GEMINI_API_KEY or GOOGLE_API_KEY | Backend .env | Enables Gemini. If omitted, DSP still works; analysis text may be empty or static. |
| GEMINI_MODEL | Backend .env | Optional. Default gemini-2.5-flash. |
| FRONTEND_ORIGIN | Backend .env | Optional. Extra CORS origins, comma-separated. |
| NEXT_PUBLIC_API_BASE_URL | frontend/.env.local | API origin for fetch (no trailing slash). |
| NEXT_PUBLIC_DIRECT_API_BASE_URL | frontend/.env.local | Fallback if the primary is same-origin and the first call fails (default http://127.0.0.1:8000). |

Gemini responses are cached under backend/.cache/gemini_results.json (per file digest). Quota errors trigger a short API backoff.
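
The digest-keyed cache might look roughly like this. A sketch only (`cached_analysis` and `file_digest` are illustrative names); the real logic lives in backend/app/main.py:

```python
import hashlib
import json
from pathlib import Path

# Sketch: cache expensive analysis results keyed by a content digest of the
# uploaded file, persisted as JSON (as backend/.cache/gemini_results.json is).
CACHE_PATH = Path(".cache/gemini_results.json")

def file_digest(audio_bytes: bytes) -> str:
    return hashlib.sha256(audio_bytes).hexdigest()

def cached_analysis(audio_bytes: bytes, compute, cache_path: Path = CACHE_PATH) -> dict:
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    key = file_digest(audio_bytes)
    if key not in cache:
        cache[key] = compute(audio_bytes)  # the expensive (e.g. Gemini) call
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_text(json.dumps(cache))
    return cache[key]
```

Re-uploading the same bytes hits the cache and skips the API call entirely, which is what makes the quota backoff tolerable in practice.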

Usage

Routes

| Path | Role |
| --- | --- |
| / | Landing · app/page.tsx |
| /upload | Stage WAV · app/(app)/upload/page.tsx |
| /workbench | Queue, CSV, process, export · app/(app)/workbench/page.tsx |

(app) is a Next.js route group: shared app/(app)/layout.tsx (shell, session); the folder name is not part of the URL.

CSV (optional)

Columns: filename, start_time, end_time (seconds), aligned with staged files.
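
A CSV in this shape can be grouped into per-file segments with the standard library. `segments_by_file` is a hypothetical helper for illustration, not part of the repo:

```python
import csv
import io

# Sketch: parse the segment CSV described above
# (columns: filename, start_time, end_time, in seconds).
def segments_by_file(csv_text: str) -> dict[str, list[tuple[float, float]]]:
    out: dict[str, list[tuple[float, float]]] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        out.setdefault(row["filename"], []).append(
            (float(row["start_time"]), float(row["end_time"]))
        )
    return out

sample = "filename,start_time,end_time\nherd01.wav,12.5,18.0\nherd01.wav,40.0,55.5\n"
```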

API

POST /api/process-audio accepts .wav only.

System architecture

  1. Client: multipart/form-data with file. Shipped UI (frontend/lib/api.ts) sends file only; server still accepts optional metadata_json.
  2. Parse: load_audio_from_bytes → optional ProcessMetadata → windows_from_metadata. No metadata = one window from 0 to full duration.
  3. Process: process_clip in dsp.py (rumble band 10 Hz to 1000 Hz in code). Merge clips with gaps → build_spectrogram_preview → JSON + base64 WAV (ProcessAudioResponse, frontend/lib/types.ts).
  4. Gemini (optional): noise_analysis, research_metadata; cached by content hash.
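
Step 2's defaulting behavior can be sketched as follows. `windows` is a hypothetical stand-in for windows_from_metadata, under the assumption that out-of-range segments are clipped to the clip duration:

```python
# Sketch: with no metadata, one window spans the whole file;
# otherwise each segment is clipped to [0, duration] and empty
# windows are dropped. Assumed behavior, not the repo's exact code.
def windows(duration: float, segments=None) -> list[tuple[float, float]]:
    if not segments:
        return [(0.0, duration)]
    clipped = []
    for start, end in segments:
        start = max(0.0, start)
        end = min(duration, end)
        if end > start:
            clipped.append((start, end))
    return clipped
```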

API reference

GET /api/health

Returns JSON for connectivity checks.

POST /api/process-audio

Content-Type: multipart/form-data

| Field | Required | Description |
| --- | --- | --- |
| file | Yes | WAV audio |
| metadata_json | No | JSON: start_time + end_time, or segments: [{ start_time, end_time }, ...]. Omit to process the whole file once. |

Response: original_filename, cleaned_filename, sample_rate, duration_seconds, processed_windows, cleaned_audio_base64, before_spectrogram, after_spectrogram (times, frequencies, magnitude_db), noise_analysis, research_metadata.
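
A client-side sketch for persisting the cleaned WAV from that response (`save_cleaned_wav` is illustrative, not part of the repo; field names are taken from the reference above):

```python
import base64
from pathlib import Path

# Sketch: write the base64-encoded cleaned WAV from a /api/process-audio
# response to disk, named after the response's cleaned_filename field.
def save_cleaned_wav(response_json: dict, out_dir: Path) -> Path:
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / response_json["cleaned_filename"]
    out_path.write_bytes(base64.b64decode(response_json["cleaned_audio_base64"]))
    return out_path
```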

Repository structure

AAAM/
├── docs/
│   └── images/                  # README screenshots
├── render.yaml                  # Render Blueprint (FastAPI web service)
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI, CORS, routes, Gemini, cache
│   │   ├── dsp.py               # WAV I/O, process_clip, spectrograms
│   │   └── schemas.py           # Pydantic models
│   ├── requirements.txt
│   ├── runtime.txt              # Python version for Render (and similar hosts)
│   └── .env                     # optional (or use repo root .env)
├── frontend/
│   ├── app/
│   │   ├── layout.tsx           # root layout, fonts, providers
│   │   ├── page.tsx             # marketing home
│   │   ├── globals.css
│   │   └── (app)/               # route group (omitted from URL)
│   │       ├── layout.tsx       # app shell for /upload, /workbench
│   │       ├── upload/page.tsx
│   │       └── workbench/page.tsx
│   ├── components/              # layout/, marketing/, workbench, ui/
│   ├── lib/
│   │   ├── api.ts
│   │   └── types.ts
│   ├── public/
│   ├── package.json
│   └── .env.local               # from .env.example (gitignored)
├── isolate_elephant_vocalizations.py
├── _archive/Elephant/           # legacy reference only
├── LICENSE
└── README.md

Batch CLI

From repo root, same Python env as backend:

python isolate_elephant_vocalizations.py --audio-dir PATH_TO_WAVS --annotations PATH_TO_CSV --output-dir cleaned_audio

CSV: filename, start_time, end_time.

DSP notes

The pipeline targets low fundamentals (~10-20 Hz, with margin) and harmonics up to ~1000 Hz: harmonic-percussive separation plus band masking in the STFT domain, then inverse STFT back to WAV.
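
A minimal sketch of the band-masking step using SciPy's STFT. This simplified version omits the harmonic-percussive separation that process_clip also performs, and `band_mask` is an illustrative name, not the repo's API:

```python
import numpy as np
from scipy.signal import stft, istft

# Sketch: zero all STFT bins outside the rumble band, then invert.
def band_mask(audio: np.ndarray, sr: int, lo: float = 10.0, hi: float = 1000.0) -> np.ndarray:
    f, _, Z = stft(audio, fs=sr, nperseg=2048)
    Z[(f < lo) | (f > hi), :] = 0.0  # keep only bins inside [lo, hi] Hz
    _, cleaned = istft(Z, fs=sr, nperseg=2048)
    return cleaned[: len(audio)]
```

Because the Hann window satisfies the COLA constraint at the default 50% overlap, the inverse STFT reconstructs the in-band content almost exactly while the masked bins vanish.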

Troubleshooting

| Issue | Fix |
| --- | --- |
| Fetch / network errors in UI | Is the API running on :8000? Is NEXT_PUBLIC_API_BASE_URL=http://127.0.0.1:8000 set (no trailing slash)? Restart npm run dev after env changes. |
| CORS | Use localhost:3000 or 127.0.0.1:3000, or set FRONTEND_ORIGIN on the backend. |
| "Only .wav files are supported" | Input must be WAV. |
| No Gemini text | Set GEMINI_API_KEY or GOOGLE_API_KEY. On quota errors, wait for the backoff. DSP still works without Gemini. |
| ModuleNotFoundError (backend) | Activate the venv where you ran pip install. Run uvicorn from backend/. |
| Frontend TS / build errors | Run npm install in frontend/. Run next dev or next build to refresh next-env.d.ts. |

License

MIT License.
