Bioacoustic workbench: isolate elephant WAV recordings from mechanical noise in the browser (spectrograms, playback, export). Backend: FastAPI, STFT / HPSS-style DSP, SciPy stack. Optional: Google Gemini for short noise explanations. License: MIT.
- Overview
- Screenshots
- Prerequisites
- Quick start
- Deploy backend (Render)
- Deploy frontend (Vercel)
- Configuration
- Usage
- System architecture
- API reference
- Repository structure
- Batch CLI
- DSP notes
- Troubleshooting
- License
Field recordings mix infrasonic elephant rumbles with engines, wind, and gear. EleSynth lets researchers clean one file or many segments, compare before/after visually and audibly, and download cleaned WAV output.
| Layer | Technologies |
|---|---|
| Frontend | Next.js 16, React 18, TypeScript, Tailwind v4, shadcn/ui, Motion, Wavesurfer.js |
| Backend | FastAPI, Uvicorn, SciPy, NumPy, Librosa, google-generativeai, Pydantic |
| Audio | WAV in/out · backend/app/dsp.py |
Landing (home): value proposition and entry to analysis.
Marketing carousel: before/after spectrogram comparison (“See it”).
Bioacoustics Lab (/workbench): intake, session queue, and empty analysis canvas before files are processed.
| Requirement | Notes |
|---|---|
| Python | 3.11+ (venv recommended) |
| Node.js | 20+ and npm |
| Samples | .wav files |
Use two terminals from the repo root.
Backend (terminal 1):

- Create and enter the venv, install deps, run the API:

      cd backend
      python -m venv .venv

  - Activate (current directory must be backend/):

    | OS | Command |
    |---|---|
    | Windows | .venv\Scripts\activate |
    | macOS / Linux | source .venv/bin/activate |

  - Install and start:

        pip install -r requirements.txt
        uvicorn app.main:app --reload --host 127.0.0.1 --port 8000

- Check health: http://127.0.0.1:8000/api/health → {"status":"ok"}.
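If you prefer to script the health check, the following is a minimal sketch using only the Python standard library. The helper names (health_url, is_healthy, check_api) are illustrative and not part of the repository; the only facts taken from this README are the /api/health path and the {"status":"ok"} body.

```python
import json
from urllib.request import urlopen


def health_url(base_url: str) -> str:
    """Build the health endpoint URL, tolerating a trailing slash on the base."""
    return base_url.rstrip("/") + "/api/health"


def is_healthy(body: bytes) -> bool:
    """Return True when the body is the JSON object {"status": "ok"}."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers both invalid JSON and undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_api(base_url: str = "http://127.0.0.1:8000", timeout: float = 5.0) -> bool:
    """Fetch /api/health and report whether the API answered as expected."""
    try:
        with urlopen(health_url(base_url), timeout=timeout) as response:
            return is_healthy(response.read())
    except OSError:
        # Connection refused, timeouts, and HTTP errors all derive from OSError.
        return False
```

Calling check_api() with the backend running locally should return True; it returns False rather than raising when the server is down.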
Frontend (terminal 2):

- Install and configure:

      cd frontend
      npm install

  - Env file (frontend/.env.local):

    | OS | Command |
    |---|---|
    | macOS / Linux | cp .env.example .env.local |
    | Windows | copy .env.example .env.local |

    Required:

        NEXT_PUBLIC_API_BASE_URL=http://127.0.0.1:8000

  - Dev server:

        npm run dev

- Open http://localhost:3000 → Begin Analysis or /workbench → upload WAV → process → compare → download.
For a production build:

    cd frontend
    npm run build
    npm run start

Set NEXT_PUBLIC_API_BASE_URL to your deployed API (no trailing slash).
The FastAPI app in backend/ is a good fit for a Render Web Service. The repo includes a Blueprint at render.yaml (Python Starter in oregon, rootDir: backend).
- In the Render dashboard: New → Blueprint, connect this repository, apply render.yaml.
- When prompted, set FRONTEND_ORIGIN to your deployed frontend origin(s), comma-separated, no trailing slash (for example https://your-app.vercel.app). This extends the CORS allowlist in backend/app/main.py.
- Optionally set GEMINI_API_KEY or GOOGLE_API_KEY for Gemini; omit if you only need DSP.
- After deploy, confirm GET https://<your-service>.onrender.com/api/health.
- On Vercel (or any Next host), set NEXT_PUBLIC_API_BASE_URL to that same API origin (no trailing slash).
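To illustrate how a comma-separated FRONTEND_ORIGIN can extend a CORS allowlist, here is a small sketch. It is not the code in backend/app/main.py: the function names and the DEFAULT_ORIGINS list are assumptions, chosen to match the defaults this README mentions (localhost:3000 and 127.0.0.1:3000).

```python
import os
from typing import List, Mapping, Optional

# Assumed defaults for local development; the real list lives in backend/app/main.py.
DEFAULT_ORIGINS = ["http://localhost:3000", "http://127.0.0.1:3000"]


def parse_origins(raw: Optional[str]) -> List[str]:
    """Split a comma-separated origin list, trimming spaces and trailing slashes."""
    if not raw:
        return []
    origins: List[str] = []
    for part in raw.split(","):
        origin = part.strip().rstrip("/")
        if origin and origin not in origins:
            origins.append(origin)
    return origins


def build_allowlist(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the default origins plus any extras from FRONTEND_ORIGIN."""
    env = os.environ if env is None else env
    extra = parse_origins(env.get("FRONTEND_ORIGIN"))
    return DEFAULT_ORIGINS + [o for o in extra if o not in DEFAULT_ORIGINS]
```

A list like this is what FastAPI's CORSMiddleware takes as allow_origins. Browsers compare origins exactly, which is why a trailing slash or a missing preview URL shows up as a CORS failure.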
Without a Blueprint: create a Web Service, set Root Directory to backend, set the build command to pip install -r requirements.txt and the start command to uvicorn app.main:app --host 0.0.0.0 --port $PORT, and set the health check path to /api/health. Python version follows backend/runtime.txt (3.11.x).
SciPy and Librosa are memory-heavy; if the build or runtime fails with OOM, upgrade the instance type in Render. Very large upload bodies may hit platform limits; see Render request limits.
The Next.js app lives under frontend/. Vercel should build that directory only (the backend stays on Render).
- Import this GitHub repository in the Vercel dashboard (Add New → Project).
- Root Directory: set to frontend (monorepo). Framework preset should detect Next.js.
- Node.js: use 20.x (matches local dev).
- Environment variables (Production, and Preview if you use them):

  | Name | Value |
  |---|---|
  | NEXT_PUBLIC_API_BASE_URL | Your Render API origin, e.g. https://elesynth-api.onrender.com (no trailing slash) |

  Optional: set NEXT_PUBLIC_DIRECT_API_BASE_URL to the same URL if you rely on the fallback path in frontend/lib/api.ts.
- Deploy. Your site will be at https://<project>.vercel.app (or your custom domain).
- CORS: On Render, set FRONTEND_ORIGIN to that exact origin (comma-separated if you have several, e.g. production + a preview URL). Redeploy the API if you change it. The app already allows localhost:3000 by default in backend/app/main.py.
Preview deployments: each preview has its own *.vercel.app URL. Add each origin you need to FRONTEND_ORIGIN on Render, or test previews only against an API that allows those origins.
Uploads: Audio is sent from the browser directly to the Render API, not through Vercel’s serverless layer, so the Next app itself does not need a larger body limit for POST /api/process-audio.
Environment variables are loaded in backend/app/main.py from backend/.env and the repo root .env (order is defined in code).
| Variable | Where | Purpose |
|---|---|---|
| GEMINI_API_KEY or GOOGLE_API_KEY | Backend .env | Gemini on. Omit = DSP still works; analysis text may be empty/static. |
| GEMINI_MODEL | Backend .env | Optional. Default gemini-2.5-flash. |
| FRONTEND_ORIGIN | Backend .env | Optional. Extra CORS origins, comma-separated. |
| NEXT_PUBLIC_API_BASE_URL | frontend/.env.local | API origin for fetch (no trailing slash). |
| NEXT_PUBLIC_DIRECT_API_BASE_URL | frontend/.env.local | Fallback if primary is same-origin and first call fails (default http://127.0.0.1:8000). |
Gemini responses are cached under backend/.cache/gemini_results.json (per file digest). Quota errors trigger a short API backoff.
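The caching idea can be sketched as a JSON file keyed by a digest of the uploaded bytes. This is an illustration under assumptions, not the repository's code: the class name ResultCache, the use of SHA-256, and the layout of the JSON file are all hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def file_digest(data: bytes) -> str:
    """Hex SHA-256 of the file content (the hash actually used is an assumption)."""
    return hashlib.sha256(data).hexdigest()


class ResultCache:
    """Tiny JSON-file cache mapping a content digest to a stored result."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def _load(self) -> dict:
        if not self.path.exists():
            return {}
        try:
            return json.loads(self.path.read_text(encoding="utf-8"))
        except ValueError:
            return {}  # treat a corrupt cache file as empty

    def get(self, data: bytes) -> Optional[dict]:
        """Return the cached result for these bytes, or None on a miss."""
        return self._load().get(file_digest(data))

    def put(self, data: bytes, result: dict) -> None:
        """Store a result under the digest of these bytes."""
        entries = self._load()
        entries[file_digest(data)] = result
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
```

Keying by content rather than by filename means a re-upload of the same recording under a different name can reuse the stored explanation without another Gemini call.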
| Path | Role |
|---|---|
| / | Landing · app/page.tsx |
| /upload | Stage WAV · app/(app)/upload/page.tsx |
| /workbench | Queue, CSV, process, export · app/(app)/workbench/page.tsx |
(app) is a Next.js route group: shared app/(app)/layout.tsx (shell, session); the folder name is not part of the URL.
CSV annotation columns: filename, start_time, end_time (seconds), aligned with staged files.
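As a rough sketch of how such a CSV maps onto per-file segments, the snippet below groups rows by filename and serializes one file's segments in the segments shape that metadata_json accepts. The function names are hypothetical, and the rule that end_time must be greater than start_time is an assumption.

```python
import csv
import io
import json
from collections import defaultdict
from typing import Dict, List


def segments_by_file(csv_text: str) -> Dict[str, List[dict]]:
    """Group annotation rows by filename, sorted by start time."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        start = float(row["start_time"])
        end = float(row["end_time"])
        if end <= start:
            raise ValueError(f"end_time must exceed start_time: {row}")
        grouped[row["filename"].strip()].append(
            {"start_time": start, "end_time": end}
        )
    return {
        name: sorted(segs, key=lambda s: s["start_time"])
        for name, segs in grouped.items()
    }


def metadata_json_for(segments: List[dict]) -> str:
    """Serialize one file's segments as a metadata_json string."""
    return json.dumps({"segments": segments})
```

For example, two rows for a.wav produce a two-element list under the key "a.wav", ready to pass to metadata_json_for.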
POST /api/process-audio accepts .wav only.
- Client: multipart/form-data with file. Shipped UI (frontend/lib/api.ts) sends file only; server still accepts optional metadata_json.
- Parse: load_audio_from_bytes → optional ProcessMetadata → windows_from_metadata. No metadata = one window from 0 to full duration.
- Process: process_clip in dsp.py (rumble band 10 Hz to 1000 Hz in code). Merge clips with gaps → build_spectrogram_preview → JSON + base64 WAV (ProcessAudioResponse, frontend/lib/types.ts).
- Gemini (optional): noise_analysis, research_metadata; cached by content hash.
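The parse step above can be sketched roughly as follows. This is a hypothetical stand-in for windows_from_metadata, not its actual implementation: resolve_windows is an illustrative name, and clamping windows to the clip duration and dropping empty ones are assumptions.

```python
import json
from typing import List, Optional, Tuple

Window = Tuple[float, float]  # (start_seconds, end_seconds)


def resolve_windows(metadata_json: Optional[str], duration: float) -> List[Window]:
    """Turn optional metadata into a sorted list of processing windows."""
    if not metadata_json:
        # No metadata: process the whole file as a single window.
        return [(0.0, duration)]

    meta = json.loads(metadata_json)
    if "segments" in meta:
        raw = [(seg["start_time"], seg["end_time"]) for seg in meta["segments"]]
    else:
        raw = [(meta["start_time"], meta["end_time"])]

    windows: List[Window] = []
    for start, end in raw:
        # Assumed behavior: clamp to the clip and skip windows that end up empty.
        start = max(0.0, float(start))
        end = min(duration, float(end))
        if end > start:
            windows.append((start, end))
    return sorted(windows)
```

Each resulting window would then be handed to the per-clip processing step before the cleaned clips are merged.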
GET /api/health returns JSON for connectivity checks ({"status":"ok"}).
POST /api/process-audio request. Content-Type: multipart/form-data
| Field | Req | Description |
|---|---|---|
| file | Yes | WAV |
| metadata_json | No | JSON: start_time + end_time, or segments: [{ start_time, end_time }, ...]. Omit = whole file once. |
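For scripted uploads without the UI, a request with these two fields can be assembled with the standard library alone. The sketch below is one way to do it; build_multipart and build_request are hypothetical helpers, and the audio/wav part content type is an assumption about what the server tolerates.

```python
import json
import uuid
from typing import List, Optional, Tuple
from urllib.request import Request


def build_multipart(
    filename: str, wav_bytes: bytes, segments: Optional[List[dict]] = None
) -> Tuple[bytes, str]:
    """Encode file (and optional metadata_json) as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    if segments is not None:
        meta = json.dumps({"segments": segments})
        parts.append(
            (
                f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="metadata_json"\r\n\r\n'
                f"{meta}\r\n"
            ).encode("utf-8")
        )
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + wav_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(
    base_url: str,
    filename: str,
    wav_bytes: bytes,
    segments: Optional[List[dict]] = None,
) -> Request:
    """Prepare a POST to /api/process-audio without sending it."""
    body, content_type = build_multipart(filename, wav_bytes, segments)
    return Request(
        base_url.rstrip("/") + "/api/process-audio",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Passing the prepared request to urllib.request.urlopen sends it; the response body is the JSON document described next.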
Response: original_filename, cleaned_filename, sample_rate, duration_seconds, processed_windows, cleaned_audio_base64, before_spectrogram, after_spectrogram (times, frequencies, magnitude_db), noise_analysis, research_metadata.
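A client can recover the cleaned WAV from this response by decoding cleaned_audio_base64. The following sketch assumes the response has already been parsed into a dict; save_cleaned_audio and summarize are illustrative names, not functions from the repository.

```python
import base64
from pathlib import Path


def save_cleaned_audio(response: dict, out_dir: Path) -> Path:
    """Decode cleaned_audio_base64 and write it under out_dir."""
    wav_bytes = base64.b64decode(response["cleaned_audio_base64"])
    # Keep only the final path component so the file always lands inside out_dir.
    target = out_dir / Path(response["cleaned_filename"]).name
    out_dir.mkdir(parents=True, exist_ok=True)
    target.write_bytes(wav_bytes)
    return target


def summarize(response: dict) -> str:
    """One-line description built from the documented response fields."""
    return "{} -> {}: {:.2f} s at {} Hz".format(
        response["original_filename"],
        response["cleaned_filename"],
        response["duration_seconds"],
        response["sample_rate"],
    )
```

Base64 inflates the payload by roughly a third, so long recordings produce large JSON responses; writing the decoded bytes to disk promptly keeps memory use down in batch scripts.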
AAAM/
├── docs/
│ └── images/ # README screenshots
├── render.yaml # Render Blueprint (FastAPI web service)
├── backend/
│ ├── app/
│ │ ├── main.py # FastAPI, CORS, routes, Gemini, cache
│ │ ├── dsp.py # WAV I/O, process_clip, spectrograms
│ │ └── schemas.py # Pydantic models
│ ├── requirements.txt
│ ├── runtime.txt # Python version for Render (and similar hosts)
│ └── .env # optional (or use repo root .env)
├── frontend/
│ ├── app/
│ │ ├── layout.tsx # root layout, fonts, providers
│ │ ├── page.tsx # marketing home
│ │ ├── globals.css
│ │ └── (app)/ # route group (omitted from URL)
│ │   ├── layout.tsx # app shell for /upload, /workbench
│ │   ├── upload/page.tsx
│ │   └── workbench/page.tsx
│ ├── components/ # layout/, marketing/, workbench, ui/
│ ├── lib/
│ │ ├── api.ts
│ │ └── types.ts
│ ├── public/
│ ├── package.json
│ └── .env.local # from .env.example (gitignored)
├── isolate_elephant_vocalizations.py
├── _archive/Elephant/ # legacy reference only
├── LICENSE
└── README.md
From repo root, same Python env as backend:
    python isolate_elephant_vocalizations.py --audio-dir PATH_TO_WAVS --annotations PATH_TO_CSV --output-dir cleaned_audio

CSV: filename, start_time, end_time.
The pipeline targets low fundamentals (~10-20 Hz) with margin and harmonics up to ~1000 Hz. It applies harmonic-percussive separation and band masking in the STFT domain, then an inverse STFT back to WAV.
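To make the band-masking step concrete, here is a minimal sketch using SciPy's STFT. It is not the code in backend/app/dsp.py: it shows only the band mask and inverse STFT, omits harmonic-percussive separation entirely, and the function name, window length, and hard (binary) mask are assumptions.

```python
import numpy as np
from scipy.signal import istft, stft


def band_mask_filter(
    audio: np.ndarray,
    sample_rate: int,
    low_hz: float = 10.0,
    high_hz: float = 1000.0,
    nperseg: int = 1024,
) -> np.ndarray:
    """Zero STFT bins outside [low_hz, high_hz] and resynthesize the signal."""
    freqs, _, spec = stft(audio, fs=sample_rate, nperseg=nperseg)
    keep = (freqs >= low_hz) & (freqs <= high_hz)
    # Broadcast the per-frequency mask across all time frames.
    masked = spec * keep[:, np.newaxis]
    _, cleaned = istft(masked, fs=sample_rate, nperseg=nperseg)
    # The inverse transform can return trailing padding; trim to the input length.
    return cleaned[: len(audio)]
```

Frequency resolution is sample_rate / nperseg, so resolving fundamentals near 10 Hz calls for a long window relative to the sample rate (for example, about 7.8 Hz per bin at 8 kHz with 1024 samples). A hard mask like this one is the simplest option; softer masks reduce ringing at the band edges.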
| Issue | Fix |
|---|---|
| Fetch / network errors in UI | API on :8000? NEXT_PUBLIC_API_BASE_URL=http://127.0.0.1:8000 (no / at end)? Restart npm run dev after env changes. |
| CORS | Use localhost:3000 or 127.0.0.1:3000, or set FRONTEND_ORIGIN on the backend. |
| Only .wav files are supported | Input must be WAV. |
| No Gemini text | Set GEMINI_API_KEY or GOOGLE_API_KEY. Quota: wait for backoff. DSP still works without Gemini. |
| ModuleNotFoundError (backend) | Activate the venv where you ran pip install. Run uvicorn from backend/. |
| Frontend TS / build errors | npm install in frontend/. Run next dev or next build to refresh next-env.d.ts. |


